  Coverage Calculator v.1
      The percentage of corpus words covered by a word list
This program calculates the number of times the words on a list appear in a corpus, and the percentage of the corpus they account for. For example, a list of the 2000 most common word families is often said to 'cover' up to 80% of the words in a general corpus of English. The coverage figure refers to individual words ('running words' or 'tokens') appearing throughout a corpus. Headword lists can be expanded into family/lemma lists here. List coverage in texts can be calculated here (Demo 7). Known maximum for this routine as of late 2017: a 13k-word list against a 2.3m-word corpus.
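The coverage idea described above can be illustrated with a minimal sketch. This is not the calculator's own code: the function names `tokenize` and `coverage`, the simple regex tokenizer, and the toy list and corpus are all assumptions made for illustration. It counts how many running words (tokens) in a corpus match an item on the list, then expresses that count as a percentage of all tokens.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into running words (tokens).

    Assumption: a token is a run of letters, optionally with one
    internal apostrophe (e.g. "don't"). Real tools may differ.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(word_list, corpus_text):
    """Return (hits, total_tokens, percent) for a list against a corpus."""
    listed = {w.lower() for w in word_list}
    counts = Counter(tokenize(corpus_text))
    total = sum(counts.values())
    # Every occurrence of a listed word counts, not just its first.
    hits = sum(n for word, n in counts.items() if word in listed)
    percent = 100.0 * hits / total if total else 0.0
    return hits, total, percent

# Toy data, invented for this example only.
sample_list = ["the", "cat", "sat", "on"]
sample_corpus = "The cat sat on the mat. The dog sat on the log."

hits, total, percent = coverage(sample_list, sample_corpus)
print(f"{hits} of {total} tokens covered ({percent:.1f}%)")
# prints: 9 of 12 tokens covered (75.0%)
```

Note that this sketch matches exact word forms only. A family or lemma list would first need to be expanded so that inflected and derived forms (for example "sits" or "sitting" under "sit") are also on the list.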

COVERAGE RESEARCH: 1. Nation (2006) 2. Laufer & Ravenhorst (2010) 3. Schmitt, Jiang & Grabe (2011) 4. Schmitt, Cobb et al. (2015)

 

DEMO LISTS

AWL Heads
AWL Families
BNC 1k Fams
BNC 1k Lemmas
BNC-Coca 1k Fams
BNC-Coca 1k+2k Fams
NGSL 1k Lemmas
NGSL 1k+2k Lemmas

(1) Click or paste LIST/name
(2) Choose Corpus
(3) Click
(4) Result