Mathematical Foundations: Formal Grammars and Languages
Abstract and Keywords
This chapter provides an overview of classical formal language theory. It focuses on defining the fundamental concepts of language, grammar, and automata, and introduces some basic related notions. It also presents the hierarchical classification of formal grammars proposed by N. Chomsky in the 1950s, known as the Chomsky hierarchy. The location of natural languages in this hierarchy is discussed, together with the concept of mild context-sensitivity. The last part of the chapter briefly introduces other formalisms that have interesting linguistic and computational properties. The chapter concludes with a short review of grammatical inference, a subfield of machine learning that deals with learning grammars and languages from data.
Leonor Becerra-Bonache is an Associate Professor at Jean Monnet University (France). She works in the Machine Learning team at the Hubert Curien Laboratory. During her PhD studies, she worked with prestigious research groups at five universities: Rovira i Virgili University (Spain), University of Alicante (Spain), Waseda University (Japan), University of Maryland Baltimore County (USA), and Jean Monnet University (France). Her post-doctoral training includes a two-year stay at the Department of Computer Science at Yale University (USA) as a Marie Curie Researcher. Her main research interests are in the areas of Grammatical Inference, Machine Learning, Formal Language Theory, and Computational Linguistics.
Gemma Bel-Enguix holds a PhD in Linguistics from Rovira i Virgili University and has been a post-doctoral and research fellow at Georgetown, Milano-Bicocca, Aix-Marseille, and Rovira i Virgili. She has worked in a multidisciplinary area involving Computer Science, Biology, and Linguistics, developing models of natural computing to approach problems of natural language. Currently, she is a researcher at the Universidad Nacional Autónoma de México. Her research focuses on theoretical models of Natural Language Processing and the study of language from the perspective of Complex Systems.
M. Dolores Jiménez-López is an Associate Professor in the Departament de Filologies Romaniques at the Universitat Rovira i Virgili (Tarragona, Spain). She holds a PhD in linguistics. She worked for two years as a pre-doctoral fellow at the Computer and Automation Research Institute of the Hungarian Academy of Sciences in Budapest (Hungary). Her post-doctoral training includes a three-year stay at the Department of Computer Science of the University of Pisa (Italy). The application of formal models to natural language analysis is one of her main research topics.
Carlos Martín-Vide is a Full Professor at Rovira i Virgili University, Tarragona. His research areas include automata and language theory, molecular computing, theoretical computer science, and mathematical and computational linguistics, in which he has (co)authored more than 300 papers. He has been involved in the definition, operation, and monitoring of several European funding initiatives in support of fundamental research in mathematics and computer science.