List of corpora and databases
Links accessed in April 2012.
ARCHER = A Representative Corpus of Historical English Registers, version 3.1. 1990–93, 2002, 2007, 2010. Compiled under the supervision of Douglas Biber and Edward Finegan at Northern Arizona University, University of Southern California, University of Freiburg, University of Heidelberg, University of Helsinki, Uppsala University, University of Michigan, University of Manchester, Lancaster University, University of Bamberg, University of Zurich, University of Trier, University of Salford, and University of Santiago de Compostela. http://www.llc.manchester.ac.uk/research/projects/archer/.
B-Brown = B-Brown Corpus. In progress. English Department, University of Zurich. http://www.es.uzh.ch/Subsites/Projects/BBROWN.html.
BE06 = The British English 2006 corpus. 2008. Compiled by Paul Baker. Lancaster University. http://www.helsinki.fi/varieng/CoRD/corpora/BE06/index.html.
BLOB-1901 = Lancaster-1901 Corpus. In progress. Compiled by Nick Smith, Paul Rayson, and Geoffrey Leech. Lancaster University.
BLOB-1931 = Lancaster-1931 Corpus. 2003–6. Compiled by Geoffrey Leech, Paul Rayson, and Nick Smith. Lancaster University. http://www.helsinki.fi/varieng/CoRD/corpora/BLOB-1931/.
BoE = Bank of English Corpus (Cobuild Corpus). Distributed by Collins WordBanks Online. http://collinslanguage.com/content-solutions/wordbanks.
BNC = The British National Corpus, version 3 (BNC XML Edition). 2007. Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/.
Brown = A Standard Corpus of Present-Day Edited American English, for use with Digital Computers. 1964, 1971, 1979. Compiled by W. Nelson Francis and Henry Kučera. Brown University. http://www.helsinki.fi/varieng/CoRD/corpora/BROWN/.
BYU-BNC = BYU-BNC: The British National Corpus. 2004–. Interface by Mark Davies. http://corpus.byu.edu/bnc/.
CED = A Corpus of English Dialogues 1560–1760. 2006. Compiled under the supervision of Merja Kytö (Uppsala University) and Jonathan Culpeper (Lancaster University). http://www.helsinki.fi/varieng/CoRD/corpora/CED/index.html.
CEEC = Corpus of Early English Correspondence. 1998. Compiled by Terttu Nevalainen, Helena Raumolin-Brunberg, Jukka Keränen, Minna Nevala, Arja Nurmi, and Minna Palander-Collin. Department of English, University of Helsinki. http://www.helsinki.fi/varieng/CoRD/corpora/CEEC/index.html.
CEECS = Corpus of Early English Correspondence Sampler. 1998. Compiled by Jukka Keränen, Minna Nevala, Terttu Nevalainen, Arja Nurmi, Minna Palander-Collin, and Helena Raumolin-Brunberg. Department of English, University of Helsinki. http://www.helsinki.fi/varieng/CoRD/corpora/CEEC/ceecs.html.
CEEM = Corpus of Early English Medical Writing. In progress. Compiled under the supervision of Irma Taavitsainen and Päivi Pahta. University of Helsinki. See MEMT and EMEMT.
CHF = Corpus of Historical Fiction. 2010. Compiled by Bethany Gray. Northern Arizona University.
CIE = A Corpus of Irish English. 2003. Compiled by Raymond Hickey. University of Duisburg-Essen. http://www.uni-due.de/CP/CIE.htm.
CLMETEV = The Corpus of Late Modern English Texts (Extended Version). 2006. Compiled by Hendrik De Smet. Department of Linguistics, University of Leuven. http://www.helsinki.fi/varieng/CoRD/corpora/CLMETEV/.
CMSW = Corpus of Modern Scottish Writing. In progress. Principal investigator: John Corbett. University of Glasgow. http://www.scottishcorpus.ac.uk/cmsw.
COCA = The Corpus of Contemporary American English. 2008–. Compiled by Mark Davies. Brigham Young University. http://corpus.byu.edu/coca/.
COERP = Corpus of English Religious Prose. In progress. Compiled by Thomas Kohnen, Tanja Rütten, Ingvilt Marcoe, Kirsten Gather, and Dorothee Groeger. University of Cologne. http://www.helsinki.fi/varieng/CoRD/corpora/COERP/.
COHA = Corpus of Historical American English. 2010–. Compiled by Mark Davies. Brigham Young University. http://corpus.byu.edu/coha/.
CONCE = A Corpus of Nineteenth-Century English. 2000. Compiled by Merja Kytö (Uppsala University) and Juhani Rudanko (University of Tampere).
COOEE = Corpus of Oz Early English. 2004. Compiled by Clemens Fritz. Free University of Berlin. http://www.helsinki.fi/varieng/CoRD/corpora/COOEE/.
CoRD = Corpus Resource Database. 2007–. http://www.helsinki.fi/varieng/CoRD/index.html.
CSC = Corpus of Scottish Correspondence, 1500–1715. 2007. Compiled by Anneli Meurman-Solin. University of Helsinki. http://www.helsinki.fi/varieng/CoRD/corpora/CSC/index.html.
DCPSE = Diachronic Corpus of Present-Day Spoken English. 2002–4. Compiled under the supervision of Bas Aarts. Survey of English Usage, University College London. http://www.ucl.ac.uk/english-usage/projects/dcpse/.
DECTE = Diachronic Electronic Corpus of Tyneside English. In progress. Compiled under the supervision of Karen P. Corrigan. Newcastle University. http://research.ncl.ac.uk/decte/.
DOEC = Dictionary of Old English Corpus. Original release 1981 compiled by Angus Cameron, Ashley Crandell Amos, Sharon Butler, and Antonette diPaolo Healey. Release 2009 compiled by Antonette diPaolo Healey, Joan Holland, Ian McDougall, and David McDougall, with Xin Xiang. University of Toronto. http://www.helsinki.fi/varieng/CoRD/corpora/DOEC/index.html.
ECCO = Eighteenth-Century Collections Online. http://gale.cengage.co.uk/product-highlights/history/eighteenth-century-collections-online.aspx.
EMC = Corpus of Early Medieval Coin Finds and Sylloge of Coins of the British Isles databases (Fitzwilliam Museum, Cambridge). http://www.fitzmuseum.cam.ac.uk/coins/emc/.
EMEMT = Early Modern English Medical Texts. 2010. Compiled by Irma Taavitsainen (University of Helsinki), Päivi Pahta (University of Tampere), Martti Mäkinen (Svenska handelshögskolan), Turo Hiltunen, Ville Marttila, Maura Ratia, Carla Suhr, and Jukka Tyrkkö (University of Helsinki). http://www.helsinki.fi/varieng/CoRD/corpora/CEEM/EMEMTindex.html.
ESTC = English Short Title Catalogue. http://estc.bl.uk.
eWAVE = The electronic World Atlas of Varieties of English. 2011. Edited by Bernd Kortmann and Kerstin Lunkenheimer. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://www.ewave-atlas.org/.
FLOB/F-LOB = The Freiburg–LOB Corpus of British English. Original release 1999 compiled by Christian Mair (Albert-Ludwigs-Universität Freiburg). Release 2007 compiled by Christian Mair (Albert-Ludwigs-Universität Freiburg) and Geoffrey Leech (University of Lancaster). http://www.helsinki.fi/varieng/CoRD/corpora/FLOB/.
Frown = The Freiburg-Brown Corpus. Original release 1999 compiled by Christian Mair (Albert-Ludwigs-Universität Freiburg). Release 2007 compiled by Christian Mair (Albert-Ludwigs-Universität Freiburg) and Geoffrey Leech (University of Lancaster). http://www.helsinki.fi/varieng/CoRD/corpora/FROWN/.
Google Books (American English) Corpus. 2011–. Compiled by Mark Davies. Brigham Young University. http://googlebooks.byu.edu/.
The Gutenberg Archive. 2011. http://www.gutenberg.org/.
HC = Helsinki Corpus of English Texts. 1991. Compiled by Matti Rissanen (Project leader), Merja Kytö (Project secretary); Leena Kahlas-Tarkka, Matti Kilpiö (Old English); Saara Nevanlinna, Irma Taavitsainen (Middle English); Terttu Nevalainen, Helena Raumolin-Brunberg (Early Modern English). Department of English, University of Helsinki. http://www.helsinki.fi/varieng/CoRD/corpora/HelsinkiCorpus/index.html.
HCOS = Helsinki Corpus of Older Scots. 1995. Compiled by Anneli Meurman-Solin. Department of English, University of Helsinki. http://www.helsinki.fi/varieng/CoRD/corpora/HCOS/.
ICAME = International Computer Archive of Modern and Medieval English. http://icame.uib.no/.
ICE = The International Corpus of English, version 2. 2006. Coordinated by Gerald Nelson (University of Hong Kong). http://ice-corpora.net/ice/index.htm.
LAEME = A Linguistic Atlas of Early Middle English, 1150–1325. 2007. Compiled by Margaret Laing and Roger Lass. University of Edinburgh. http://www.lel.ed.ac.uk/ihd/laeme1/laeme1.html.
LAOS = A Linguistic Atlas of Older Scots, Phase 1: 1380–1500. 2008. Compiled by Keith Williamson. University of Edinburgh. http://www.lel.ed.ac.uk/ihd/laos1/laos1.html.
Lion = Literature Online. http://lion.chadwyck.co.uk/.
LOB = The Lancaster-Oslo/Bergen Corpus, original version. 1970–78. Compiled by Geoffrey Leech (Lancaster University), Stig Johansson (University of Oslo), and Knut Hofland (University of Bergen). http://www.helsinki.fi/varieng/CoRD/corpora/LOB/.
London Lives: 1690–1800. Crime, Poverty and Social Policy in the Metropolis. Version 1.0. 1 September 2010. http://www.londonlives.org/.
LSWE = The Longman Spoken and Written English Corpus. http://www.pearsonlongman.com/dictionaries/corpus/.
MEG-C = The Middle English Grammar Corpus, version 2011.1. Compiled by Merja Stenroos, Martti Mäkinen, Simon Horobin, and Jeremy J. Smith. University of Stavanger. http://www.uis.no/research/culture/the_middle_english_grammar_project/meg-c/.
MEMT = Middle English Medical Texts. 2005. Compiled by Irma Taavitsainen (University of Helsinki), Päivi Pahta (University of Tampere), and Martti Mäkinen (University of Stavanger). http://www.helsinki.fi/varieng/CoRD/corpora/CEEM/MEMTindex.html.
NECTE = The Newcastle Electronic Corpus of Tyneside English. 2005. Compiled by Karen Corrigan (Newcastle University), Hermann Moisl (Newcastle University), and Joan Beal (University of Sheffield). http://www.helsinki.fi/varieng/CoRD/corpora/NECTE/.
NECTE2 = The Newcastle Electronic Corpus of Tyneside English 2. In progress. Compiled by Karen Corrigan. Newcastle University. http://www.research.ncl.ac.uk/necte2/.
NEET = Network of Eighteenth-century English Texts. 2007. Compiled by Susan Fitzmaurice. University of Sheffield. http://sites.google.com/site/helontheweb/corpora.
NYT = Corpus of Historical Newspaper Writing (New York Times). 2010–11. Compiled by Bethany Gray. Northern Arizona University.
OBC = Old Bailey Corpus. In progress. Compiled under the supervision of Magnus Huber. University of Giessen. http://www.uni-giessen.de/oldbaileycorpus/index.php.
Old Bailey Online. Version 6.0. March 2011. http://www.oldbaileyonline.org/.
ONZE = Origins of New Zealand English Corpus. In progress. Compiled by the ONZE project team. University of Canterbury. http://www.lacl.canterbury.ac.nz/onze/index.html.
PASE = Prosopography of Anglo-Saxon England. 2010. http://www.pase.ac.uk/index.html/.
PCEEC = The Parsed Corpus of Early English Correspondence. 2006. Annotated by Ann Taylor, Arja Nurmi, Anthony Warner, Susan Pintzuk, and Terttu Nevalainen. Compiled by the CEEC Project Team. University of York and University of Helsinki. Distributed through the Oxford Text Archive. http://www.helsinki.fi/varieng/CoRD/corpora/CEEC/pceec.html.
PPCEME = Penn-Helsinki Parsed Corpus of Early Modern English. 2004. Compiled by Anthony Kroch, Beatrice Santorini, and Ariel Diertani. University of Pennsylvania. http://www.ling.upenn.edu/hist-corpora/PPCEME-RELEASE-2/index.html.
PPCMBE = Penn Parsed Corpus of Modern British English. 2010. Compiled by Anthony Kroch, Beatrice Santorini, and Ariel Diertani. University of Pennsylvania. http://www.ling.upenn.edu/hist-corpora/PPCMBE-RELEASE-1/index.html.
PPCME2 = Penn-Helsinki Parsed Corpus of Middle English, 2nd edn. 2000. Compiled by Anthony Kroch and Ann Taylor. University of Pennsylvania. http://www.ling.upenn.edu/hist-corpora/PPCME2-RELEASE-3/index.html.
PT = Corpus of Historical Science Writing (The Philosophical Transactions of the Royal Society). 2010–11. Compiled by Bethany Gray. Northern Arizona University.
S = The Electronic Sawyer. An Online Catalogue of Anglo-Saxon Charters. 2011. http://www.esawyer.org.uk/.
SAVE = The South Asian Varieties of English Corpus. 2011. Compiled by Joybrato Mukherjee, Tobias Bernaisch, Christopher Koch, and Marco Schilk. University of Giessen. http://www.uni-giessen.de/cms/faculties/f05/engl/ling/research/save.
SCOTS = Scottish Corpus of Texts and Speech. 2007. Compiled by Professor John Corbett (Principal Investigator), Dr. Wendy Anderson (Research Assistant 2004–7), Dr. Fiona Douglas (Research Assistant 2001–3), Dave Beavan (Computing Manager), Professor Christian Kay, Jean Anderson, Dr. Jane Stuart-Smith, Louise Sweeney, Cerwyss O'Hare, and Flora Edmonds. Department of English Language, University of Glasgow. http://www.scottishcorpus.ac.uk.
The Statesman. 2011. http://www.thestatesman.net/.
TEAMS = The Consortium for the Teaching of the Middle Ages. http://www.lib.rochester.edu/camelot/teams/tmsmenu.htm.
TIME = TIME Magazine Corpus. 2007–. Compiled by Mark Davies. Brigham Young University. http://corpus.byu.edu/time.
WebCorp = The Web as Corpus. Created, operated, and maintained by the Research and Development Unit for English Studies, School of English, Birmingham City University. http://www.webcorp.org.uk.
YCOE = The York-Toronto-Helsinki Parsed Corpus of Old English Prose. 2003. Compiled by Ann Taylor, Anthony Warner, Susan Pintzuk, and Frank Beths. Department of Language and Linguistic Science, University of York. http://www.helsinki.fi/varieng/CoRD/corpora/YCOE/.
ZEN = Zurich English Newspaper Corpus, version 1.0. 2004. Compiled by Udo Fries, Hans Martin Lehmann, Beni Ruef, Peter Schneider, Patrick Studer, Caren auf dem Keller, Beat Nietlispach, Sandra Engler, Sabine Hensel, and Franziska Zeller. English Department, University of Zurich. http://es-zen.unizh.ch.