Abstract and Keywords
Speech recognition is concerned with converting the speech waveform, an acoustic signal, into a sequence of words. Today’s best-performing approaches are based on statistical modelling of the speech signal. This chapter provides an overview of the main topics addressed in speech recognition, namely acoustic-phonetic modelling, lexical representation, language modelling, decoding, and model adaptation. The focus is on methods used in state-of-the-art, speaker-independent, large-vocabulary continuous speech recognition (LVCSR). Some of the technological advances of the last decade are highlighted. The primary application areas for this technology were initially dictation tasks and interactive systems for limited-domain information access (usually referred to as spoken language dialogue systems). The last decade has witnessed wider coverage of languages, as well as growing interest in transcription systems for information archival and retrieval, media monitoring, automatic subtitling, and speech analytics. Some outstanding issues and directions for future research are discussed.
Jean-Luc Gauvain is a permanent CNRS researcher at LIMSI where he has been since 1983. He is head of the Spoken Language Processing Group. He received a doctorate in Electronics from the University of Paris XI in 1982. His research centres on speech recognition, language identification, audio indexing, and spoken language dialogue systems. He has over 160 publications and received the 1996 IEEE SPS Best Paper Award. He is a member of the IEEE Signal Processing Society's Speech Technical Committee.
Lori Lamel joined LIMSI as a permanent CNRS researcher in October 1991. She received her Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology (1988). Her research interests include speaker-independent, large-vocabulary continuous speech recognition, acoustic-phonetics, lexical and phonological modelling, speaker/language identification, audio indexing, and spoken language dialogue systems. She has over 140 publications, and is a member of the Speech Communication editorial board and the permanent council of ICSLP.