PRINTED FROM OXFORD HANDBOOKS ONLINE. © Oxford University Press, 2018. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

date: 24 June 2021

Abstract and Keywords

This chapter introduces a conceptual framework for the evaluation of natural language processing systems. It characterizes evaluation in terms of four dimensions: intrinsic versus extrinsic evaluation, stand-alone systems versus components, manual versus automated methods, and laboratory versus real-world conditions. A comparative overview of evaluation methods in major areas of NLP is provided, covering distinct applications such as information extraction, machine translation, automatic summarization, and natural language generation. The discussion of these applications emphasizes commonalities across evaluation methods. Next, evaluation of particular component technologies is discussed, addressing coreference, word sense disambiguation and semantic role labelling, and finally referring-expression generation. The chapter concludes with a brief assessment of the status of evaluation in NLP.

Keywords: intrinsic evaluation, extrinsic evaluation, black-box evaluation, glass-box evaluation, component evaluation, inter-annotator reliability, informativeness, quality, adequacy, fluency
