- The Oxford Handbook of Computational Linguistics
- Pragmatics and Dialogue
- Formal Grammars and Languages
- Text Segmentation
- Part-of-Speech Tagging
- Word-Sense Disambiguation
- Anaphora Resolution
- Natural Language Generation
- Speech Recognition
- Text-to-Speech Synthesis
- Finite-State Technology
- Statistical Methods
- Machine Learning
- Lexical Knowledge Acquisition
- Sublanguages and Controlled Languages
- Corpus Linguistics
- Tree-Adjoining Grammars
- Machine Translation: General Overview
- Machine Translation: Latest Developments
- Information Retrieval
- Information Extraction
- Question Answering
- Text Summarization
- Term Extraction and Automatic Indexing
- Text Data Mining
- Natural Language Interaction
- Natural Language in Multimodal and Multimedia Systems
- Natural Language Processing in Computer-Assisted Language Learning
- Multilingual On-Line Natural Language Processing
- Notes on Contributors
- Index of Authors
- Subject Index
Abstract and Keywords
The commercial success of natural language (NL) technology has made evaluation technically critical. The choice of evaluation method depends on the software life cycle, which typically charts four stages: research, advanced prototype, operational prototype, and product. At the prototype stage, embedded evaluation can prove helpful. Analysis components can be loosely grouped into segmentation, tagging, information extraction, and document threading. Output technologies such as text summarization can be evaluated in terms of intrinsic and extrinsic measures, the former checking for quality and informativeness, the latter for efficiency and acceptability in a given task. ‘Post-edit measures’, commonly used in machine translation, determine the amount of correction required to obtain a desirable output. Evaluation of interactive systems typically treats the system and the user as one team and must contend with subject variability: running enough subjects to obtain statistical validity incurs substantial costs. Evaluation is also a social activity: shared evaluation criteria create a community for internal technical comparison.
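The ‘post-edit measures’ mentioned above can be illustrated with a word-level edit distance: the minimum number of token insertions, deletions, and substitutions needed to turn system output into a human-corrected reference. This is a minimal sketch, not any evaluation campaign's official metric; the function name is ours.

```python
def post_edit_distance(system_output: str, reference: str) -> int:
    """Minimum token edits (insert/delete/substitute) to reach the reference.

    A crude proxy for the correction effort a human post-editor would spend
    on the system's output, as in machine-translation post-edit measures.
    """
    a, b = system_output.split(), reference.split()
    # Classic Levenshtein dynamic-programming table over tokens.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i          # delete all remaining output tokens
    for j in range(len(b) + 1):
        dp[0][j] = j          # insert all remaining reference tokens
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(a)][len(b)]
```

A perfect output scores 0; normalizing by reference length gives an error rate comparable across sentences, in the spirit of the translation edit rate (TER) family of measures.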
Lynette Hirschman is Chief Scientist for the Information Technology Center at MITRE in Bedford, Mass. Dr Hirschman received her Ph.D. in formal linguistics from the University of Pennsylvania in 1972. She has worked in both text and spoken language processing, with a strong emphasis on evaluation. She was involved in the Message Understanding Conferences, and was the organizer of the common data collection for the Air Travel Information System (ATIS) spoken language effort. Her current research interests include biolinguistics—the application of human language technology to bioinformatics—and reading comprehension tests as a vehicle for evaluation of language understanding systems.
Inderjeet Mani is Associate Professor of Linguistics and Head of the Computational Linguistics Program at Georgetown University in Washington, DC. His books include the co-edited volumes Advances in Automatic Text Summarisation (MIT Press, 1999), and The Language of Time: A Reader (Oxford University Press, to appear), as well as an authored book Automatic Summarisation (John Benjamins, 2001). His work on evaluation includes assisting the U.S. government with the TIPSTER-SUMMAC text summarisation evaluation. In addition to text summarisation, Dr. Mani's current research includes temporal information processing and building ontologies.