Abstract and Keywords
This chapter introduces a conceptual framework for the evaluation of natural language processing systems. It characterizes evaluation in terms of four dimensions: intrinsic versus extrinsic evaluation, stand-alone systems versus components, manual versus automated methods, and laboratory versus real-world conditions. A comparative overview of evaluation methods in major areas of NLP is provided, covering distinct applications such as information extraction, machine translation, automatic summarization, and natural language generation. The discussion of these applications emphasizes commonalities across evaluation methods. Next, evaluation of particular component technologies is discussed, addressing coreference, word sense disambiguation and semantic role labelling, and finally referring-expression generation. The chapter concludes with a brief assessment of the status of evaluation in NLP.