Two Norwegian Speech Corpora: NoTa-Oslo and TAUS
Abstract and Keywords
This chapter presents two of the speech corpora developed at the Text Laboratory of the Department of Linguistics and Scandinavian Studies at the University of Oslo, both of which are available on the web. The two corpora, NoTa-Oslo and TAUS, consist of spontaneous speech from residents of Oslo: the NoTa-Oslo speech was recorded in 2005–2006 and the TAUS speech in 1972–1973. The web search interface is relatively simple to use, and the transcriptions are linked to audio files (for both NoTa-Oslo and TAUS) and video files (for NoTa-Oslo). NoTa-Oslo and TAUS are both multi-purpose corpora, designed to serve research tasks in several fields, such as phonology, morphology, syntax, semantics, dialogue, dialectology, sociolinguistics, lexicography, and language technology. This breadth makes the corpora very useful for most purposes, but it also means that they cannot immediately meet the demands of every research task. The authors present the two corpora, their content, and their web interface, and then discuss how they can be used for phonological research.
Kristin Hagen is Senior Engineer at the Text Laboratory, Department of Linguistics and Scandinavian Studies at the University of Oslo. For many years she has worked on the development of speech corpora such as NoTa-Oslo and the Nordic Dialect Corpus. She has also worked in other language technology domains, such as POS tagging (the Oslo-Bergen Tagger), parsing, and grammar checking. Her background is in both linguistics and computer science.
Hanne Gram Simonsen is Professor of Linguistics at the Department of Linguistics and Scandinavian Studies at the University of Oslo. Her research interests include language acquisition (in particular phonology, morphology, and lexicon) and instrumental and articulatory phonetics, as well as clinical linguistics (language disorders in children and adults). She has published on these topics in journals such as Journal of Child Language, Journal of Phonetics, and Clinical Linguistics and Phonetics.