date: 22 June 2021

Abstract and Keywords

Speech recognition is concerned with converting the speech waveform, an acoustic signal, into a sequence of words. Today’s best-performing approaches are based on statistical modelling of the speech signal. This chapter provides an overview of the main topics addressed in speech recognition: acoustic-phonetic modelling, lexical representation, language modelling, decoding, and model adaptation. The focus is on methods used in state-of-the-art, speaker-independent, large-vocabulary continuous speech recognition (LVCSR). Some of the technological advances of the last decade are highlighted. The primary application areas for this technology were initially dictation tasks and interactive systems for limited-domain information access (usually referred to as spoken language dialogue systems). The last decade has witnessed wider coverage of languages, as well as growing interest in transcription systems for information archival and retrieval, media monitoring, automatic subtitling, and speech analytics. Some outstanding issues and directions for future research are discussed.

Keywords: speech recognition, statistical modelling, acoustic-phonetic modelling, pronunciation lexicon, language modelling, decoding

