nlp:corpus_analysis [2023/05/16 08:08] jmflanig
nlp:corpus_analysis [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 1:
 ====== Corpus Analysis ======
-Often considered a linguistics topic, //**corpus analysis**// is the study of language use in a corpus, often analyzing the distribution of various phenomena (phonological, lexical, syntactic, etc). Sometimes the analysis is performed comparing across time, languages, or different genres.
+Often considered a linguistics topic, //**corpus analysis**// is the study of language in a corpus, often analyzing the distribution of various phenomena (phonological, lexical, syntactic, etc). Sometimes the analysis is performed comparing across time, languages, or different genres.
  
 ===== Frequency Distribution and Zipf's Law =====
Line 11:
   * [[https://www.sciencedirect.com/science/article/pii/S0019995858902298|Miller et al 1959 - Length-frequency statistics for written English]], available [[https://www.sciencedirect.com/journal/information-and-control/vol/1/issue/4|here]]. A study of frequency statistics of words using the UNIVAC. Talks about types and tokens.  Introduces the terms "function words" and "content words" on p. 377 (p. 8 in the pdf).
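
To make the type/token distinction and the rank-frequency view behind Zipf's law concrete, here is a minimal Python sketch. The sample sentence, the regex tokenizer, and the function name ''rank_frequency'' are illustrative assumptions, not taken from the page or from the cited paper. Under Zipf's law, word frequency is roughly inversely proportional to rank, so the product rank × frequency should stay roughly constant on a large corpus; a toy sentence like this one is far too small to show that.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (token count, type count, words ranked by frequency).

    Tokens are running words; types are distinct word forms.
    The tokenizer is a deliberately crude assumption: lowercase
    runs of letters and apostrophes.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # most_common() sorts by descending frequency; ties keep
    # first-seen order.
    ranked = counts.most_common()
    return len(tokens), len(counts), ranked


# Hypothetical toy corpus, for illustration only.
sample = "the cat sat on the mat and the dog sat on the log"
n_tokens, n_types, ranked = rank_frequency(sample)

print("tokens:", n_tokens, "types:", n_types)
for rank, (word, freq) in enumerate(ranked, start=1):
    # rank * freq is the quantity Zipf's law predicts to be
    # roughly constant on large corpora.
    print(rank, word, freq, rank * freq)
```

On a real corpus, the ranked list can be plotted on log-log axes, where a Zipfian distribution appears as an approximately straight line.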
  
-==== People =====
-  * [[https://en.wikipedia.org/wiki/George_Armitage_Miller|George Armitage Miller]]
-  * [[https://en.wikipedia.org/wiki/George_Kingsley_Zipf|George Kingsley Zipf]]
+===== Books =====
+  * [[https://books.google.com/books?id=fzkQPKoFEb0C&pg=PA1|Word Frequency Distributions]], Harald (2002)
+
+===== People =====
+  * [[https://en.wikipedia.org/wiki/George_Armitage_Miller|George Miller]]
+  * [[https://en.wikipedia.org/wiki/George_Kingsley_Zipf|George Zipf]]
  
nlp/corpus_analysis.1684224498.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
