Linguistic Corpora

A linguistic corpus is a set of representative written or oral texts, often with linguistic annotations, for use as data in certain types of linguistic analysis. These texts are usually selected through a method of sampling and are meant to be representative of a certain form of a language, either in a certain time period or over time.

This collection in the UNT Digital Library contains a number of corpora (the plural of "corpus") that have been licensed for use by members of the UNT community. For more on linguistic corpora and tools for working with linguistic data, see the linguistics subject guide, which includes contact information for UNT's subject librarian for linguistics.

Basic statistics about this collection.
Items: 27
Types: 1
Titles: 4
Partners: 1
Decades: 2
Languages: 2
Counties: 0
States: 0
Countries: 0
Usage: 6,058
Collection created: 4 years ago
Last updated: 3 years, 6 months ago

Latest Additions

Corpus of Contemporary American English (2020 update)

ETS Corpus of Non-Native Written English

Corpus of News on the Web (NOW) - June 2018

Corpus of News on the Web (NOW) - March 2018

Corpus of News on the Web (NOW) - April 2018

Corpus of News on the Web (NOW) - May 2018

Corpus of News on the Web (NOW) - February 2017

Corpus of News on the Web (NOW) - September 2017

Corpus of News on the Web (NOW) - January 2018

Corpus of News on the Web (NOW) - January 2010 to October 2016

Corpus of News on the Web (NOW) - May 2017


Cite This Collection

Here is our suggested citation. Consult an appropriate style guide for conformance to specific guidelines.

Linguistic Corpora in UNT Digital Library. University of North Texas Libraries. https://digital.library.unt.edu/explore/collections/LINGC/ accessed May 12, 2024.

Explore Holdings

Start browsing through the holdings of this collection in one of the following ways:

Partner

Resource Type

Languages

Decades

Titles
