UPDATE 2025-12-12
Nuclear List Builder v.4.3
  Reduce a family list to frequent members
  + NEW - DERIVATIONS COUNT || FRENCH FAMILIES
+ Mobile
Jan '25
The BNC/Coca family lists are based on large corpora, with families made as complete as possible in order to classify every word of any text (in, e.g., VocabProfiles). But even K-1 to K-3 families may contain members that learners will never meet, or that appear mainly in specific text types (medicine, engineering). Hence the case for reducing these lists to their essentials for both initial and specialist learning.
    Nuclear List Builder "crosses" family lists against word frequencies in a smaller corpus (1-4 million words) to obtain a list of just the family members that are frequent in that corpus. Why is this interesting? Read a paper about this, or its summary. (*Parallel French study arrived 14 January 2026*)
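As a rough illustration of the "crossing" idea, the sketch below counts word frequencies in a small text and keeps only the family members that actually occur in it. It is not the tool's own code: the function names (`count_tokens`, `cross_families`), the simple tokenizer, and the toy families and corpus are all assumptions made for the example.

```python
import re
from collections import Counter


def count_tokens(text):
    """Lowercase the text and count alphabetic word tokens (naive tokenizer)."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def cross_families(families, corpus_text):
    """Return {headword: {member: count}} for members found in the corpus.

    `families` maps a headword to the list of its family members
    (hypothetical input format, chosen for this sketch).
    """
    freqs = count_tokens(corpus_text)
    crossed = {}
    for head, members in families.items():
        # Keep only members with at least one occurrence in the cross-corpus.
        counts = {m: freqs[m] for m in members if freqs[m] > 0}
        if counts:
            crossed[head] = counts
    return crossed


# Toy data, invented for illustration only.
families = {
    "accept": ["accept", "accepts", "accepted", "acceptable", "acceptability"],
    "nation": ["nation", "nations", "national", "nationhood"],
}
corpus = "They accepted the offer. Nations accept national rules; the nation accepted."

result = cross_families(families, corpus)
for head, counts in result.items():
    print(head, counts)
```

In this toy run, members such as "acceptability" and "nationhood" drop out because they never occur in the cross-corpus, which is the reduction the tool aims at on a much larger scale.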


(1) Start from full familized BNC/Coca (Eng) or Lonsdale/Le Bras' LFNF-0/3000 (Fr)

(2) Choose Cross-Corpus

User upload
(850k words; format .txt; encoding UTF-8)

 OR 

Stored corpus

(3) Click 'Make List' to see complete list

FIRST
Explore cutoffs
Or just
Fam sum


(4)

THEN
Choose
cut-offs

  (5) Cut-offs↓
Include only words >
of Fam

OR
Count > in Cross-Corpus

WITH OPTIONS:

Mark derived words "z_"

Show %

Fam sums

(6)

(7) Get Result   (8)
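
One plausible reading of the cut-offs and options in steps (5) to (7) can be sketched as follows. This is an assumption-laden illustration, not the tool's implementation: the function `apply_cutoffs`, its parameters, the idea that the percentage cut-off is a member's share of its family's total count, and the need to supply the set of derived words by hand are all hypothetical.

```python
def apply_cutoffs(family_counts, min_share=None, min_count=None,
                  derived=frozenset()):
    """Filter one family's members by a cut-off and report the family sum.

    family_counts: {member: count in the cross-corpus}
    min_share:     keep members whose share of the family sum is > this percent
    min_count:     keep members whose raw count is > this number
    derived:       members to be marked with the "z_" prefix (assumed input)
    Exactly one of min_share / min_count must be given (the page says "OR").
    """
    if (min_share is None) == (min_count is None):
        raise ValueError("give exactly one of min_share or min_count")

    fam_sum = sum(family_counts.values())
    kept = []
    # Most frequent members first; ties broken alphabetically.
    for word, n in sorted(family_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        share = 100.0 * n / fam_sum if fam_sum else 0.0
        if min_share is not None and share <= min_share:
            continue
        if min_count is not None and n <= min_count:
            continue
        label = "z_" + word if word in derived else word
        kept.append((label, n, round(share, 1)))
    return kept, fam_sum


# Toy counts, invented for illustration only.
counts = {"nation": 6, "national": 9, "nations": 3, "nationalism": 2}
derived_words = {"national", "nationalism"}

by_share, fam_sum = apply_cutoffs(counts, min_share=12, derived=derived_words)
by_count, _ = apply_cutoffs(counts, min_count=5, derived=derived_words)
print(fam_sum, by_share)
print(by_count)
```

Each kept entry carries the (possibly "z_"-marked) word, its count, and its percentage of the family sum, loosely mirroring the "Mark derived words", "Show %" and "Fam sums" options.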