Cover_Calc INPUT


Coverage Calculator v.3 4 | NEW: NFL-0 hyper-FAMS (May 2026)
The percentage of words in a corpus that appear in a list

This program calculates how often the words on a list appear in a corpus. A list of the 2,000 most common word families is often said to 'cover' up to 80% of the individual words (tokens) in a general corpus of English; that is, up to 80% of the tokens in the corpus belong to word families on the list.

Treatment of proper nouns is a checkbox option.
Headword lists can be expanded into family/lemma lists here.
List coverage in texts can be calculated here (Demo 7).
Known maximum for this routine (2024): a list of 13,000 words against a corpus of about 1 million words. Test corpora and texts will be reduced by the program if needed.
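The coverage figure described above can be illustrated with a minimal Python sketch. It is not the program's actual routine: the function names `tokenize` and `coverage`, the regular-expression tokenizer, and the optional `max_tokens` cut-off are all assumptions made for illustration, and the sketch matches exact word forms only rather than expanding headwords into families or lemmas.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is a run of letters, optionally followed by an
    apostrophe and more letters. The real tool may tokenize differently.
    """
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def coverage(word_list, corpus_text, max_tokens=None):
    """Return the percentage of corpus tokens that are on the word list.

    max_tokens is a hypothetical parameter that mimics a tool reducing an
    oversized text before counting.
    """
    list_words = {w.lower() for w in word_list}
    tokens = tokenize(corpus_text)
    if max_tokens is not None:
        tokens = tokens[:max_tokens]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in list_words)
    return 100.0 * hits / len(tokens)


if __name__ == "__main__":
    demo_list = "the is of and a an".split()
    demo_text = "The cat is on the mat and a dog is in the garden."
    # 7 of the 13 tokens are on the list.
    print(f"{coverage(demo_list, demo_text):.1f}% coverage")
```

With the six-word sample list and the short sentence above, 7 of 13 tokens are list words, so the sketch reports roughly 53.8% coverage. A family-based list would usually score higher on the same text, because inflected and derived forms would also count as hits.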

RESEARCH: 1. Nation (2006); 2. Laufer & Ravenhorst (2010); 3. Schmitt, Jiang & Grabe (2011); 4. Schmitt, Cobb et al. (2015); 5. Laufer (2020); 6. Cobb & Laufer (2021)

DEMO LISTS
BNC/COCA
1-2k 1-3k 1-4k
1-5k 1-6k 1-7k 1-8k
~ NUCLEAR 1-3k
As per Cobb & Laufer 2021
nfl-1 nfl-2 nfl-7

Nuc v.2 May '26

BNC/COCA HyperFams
1-2k || 1-3k || 1-4k || 1-5k
CURRENT PROJECT
French NUCLÉAIRE (1-3k)
LFNF - Listes de fréquence nucléaires françaises (French nuclear frequency lists)
As per Cobb, Lindqvist & Ramnas 2026
fr_lfnf-0
fr_lfnf-1
fr_lfnf-5
Updates may be available at Nuc. List Builder
(1) Select or paste + name a LIST

the is of and a an
(2) Choose Test Text
(Eng 1 Fr 2; chops to 350k)
(3)

(4) Check result

stats count