- Oxford Handbooks in Linguistics
- Corpus Annotation: Methodology and Transcription Systems
Abstract and Keywords
This chapter has two aims: to explain the prerequisites for providing a phonetic/phonological annotation of speech data, and to present the different systems that can be used to encode the phonetic and phonological events present in the speech signal. Since phonetic/phonological annotation can be seen as the assignment of a label to a specific unit in the data, segmenting the speech signal and assigning labels are crucial tasks in the annotation process, regardless of the system chosen. In presenting the systems, we distinguish between those primarily designed to represent the segmental dimension of the speech signal and those that encode prosodic events such as stress, phrasing, and tonal or intonational patterns. We explore the advantages and limitations of each system by considering the types of speech data that one may want to annotate (standard data, or non-standard data such as acquisition data, pathological speech, etc.) as well as how much knowledge of the language spoken the annotator needs in order to transcribe speech successfully with that system (for example, whether its phonology is known).
Keywords: transcription systems, corpus annotation, prosodic transcription, segmental transcription, levels of representation, phonological representation, phonetic representation, orthographic transcription, audio annotation
Elisabeth Delais-Roussarie is a senior researcher at the CNRS, Laboratoire de Linguistique Formelle, Paris (Université Paris-Diderot). She has worked on several topics in sentence phonology, such as the modelling of intonation and accentual patterns in French, the phonology-syntax interface, and prosodic phrasing in French. Her recent work has focused on the development and evaluation of prosodic annotation systems and tools that facilitate corpus-based approaches to sentence phonology and to the L2 acquisition of prosody.
Brechtje Post is Lecturer in Phonetics and Phonology in the Department for Theoretical and Applied Linguistics, University of Cambridge. Her research interests include intonational phonetics and phonology, speech processing, prosodic phonology, and the acquisition of prosody.