- The Oxford Handbook of Computational Linguistics
Abstract and Keywords
Information retrieval (IR) involves retrieving information from stored data in response to user queries or pre-formulated user profiles. The information can be in any format. IR typically proceeds through four broad stages: identification of text types, document preprocessing, document indexing, and query processing together with the matching of queries to documents. Although natural language processing (NLP) has a role to play in IR, the procedural complexity of IR makes it difficult to determine the stage at which NLP should be incorporated. The earliest attempts to connect NLP with IR were extremely ambitious: they proposed using concepts rather than terms, represented as complex structures to be compared using sophisticated algorithms. In its current state, IR remains useful for retrieving information from various thesauri and ontologies, both general-purpose lexical databases and those that categorize knowledge in particular scientific and trade domains. However, NLP has yet to demonstrate that it can substantially enhance IR.
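The last three stages named in the abstract (preprocessing, indexing, and query matching) can be illustrated with a minimal sketch. It is not the chapter's method: the stopword list, the sample documents, the function names, and the choice of TF-IDF weighting are all assumptions made for illustration, and the first stage (identifying text types) is skipped entirely.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for"}


def preprocess(text):
    """Document preprocessing: lowercase, keep letter runs, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Document indexing: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(preprocess(text)).items():
            index[term][doc_id] = tf
    return index


def search(query, index, n_docs):
    """Query processing and matching: rank documents by summed TF-IDF."""
    scores = defaultdict(float)
    for term in preprocess(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term occurs in no document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up sample collection.
docs = {
    "d1": "Indexing of documents for retrieval",
    "d2": "Speech recognition and synthesis",
    "d3": "Query processing in document retrieval",
}
index = build_index(docs)
ranked = search("retrieval of documents", index, len(docs))
print(ranked)
```

Because this sketch does no stemming or other linguistic normalization, it treats "document" and "documents" as unrelated terms, so only the first sample document matches both query terms.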
Evelyne Tzoukermann is a Research Staff Member at Bell Labs, Lucent Technologies. After completing her Ph.D. at the University of Paris, she obtained a Fulbright fellowship to pursue a postdoctoral year at Brown University. Dr Tzoukermann spent two years at the IBM Watson Research Center as a visiting scientist. Her research interests include computational linguistics, information retrieval, and text-to-speech synthesis.
Judith L. Klavans is Director of the Center for Information Access at Columbia University, an interdisciplinary research centre that focuses on building research projects on networked information access. She joined Columbia after a decade at the IBM T. J. Watson Research Center working on language technologies. Prior to this, she held a research fellowship at MIT. In 1980, she received her Ph.D. in Theoretical Linguistics from University College London.
Tomek Strzalkowski is an Associate Professor of Computer Science at the University at Albany (SUNY). His research interests include applications of natural language processing to information retrieval, automated summarization, open domain question answering, and natural language dialogue. Dr Strzalkowski is on the Programme Committee of the Text Retrieval Conference (TREC), where he is also co-Chair of the Question Answering track, and former Chair of the Natural Language track.