The BNC/COCA family lists are based on large corpora, with families made as complete as possible in order to classify every word of any text (in, e.g., VocabProfiles). But even K-1 to K-3 families may contain members that learners will never meet, or that appear mainly in specific text types (medicine, engineering). Hence the case for reducing these lists to their essentials for both initial and specialist learning. Nuclear List Builder "crosses" family lists against word frequencies in a smaller (1-4 million word) corpus to obtain a list of just the family members that are frequent in that corpus.
Read a paper or summary about this. (*Parallel French application en route*)
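
The "crossing" step described above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name `build_nuclear_list`, the `min_freq` cutoff, the simple regex tokenizer, and the toy families are all assumptions made for the example, and the real tool's frequency criterion may differ.

```python
import re
from collections import Counter


def build_nuclear_list(families, corpus_text, min_freq=2):
    """Keep only the family members that occur at least min_freq times.

    families    -- dict mapping a headword to the list of its family members
    corpus_text -- the smaller reference corpus as one string
    min_freq    -- hypothetical cutoff for "frequent" (an assumption)
    """
    # Naive tokenizer: lowercase alphabetic strings, optional internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", corpus_text.lower())
    freqs = Counter(tokens)

    nuclear = {}
    for headword, members in families.items():
        # Counter returns 0 for members that never occur in the corpus.
        kept = [m for m in members if freqs[m.lower()] >= min_freq]
        if kept:  # drop families with no frequent members at all
            nuclear[headword] = kept
    return nuclear


# Toy data, invented for illustration only.
families = {
    "nation": ["nation", "national", "nationalise", "nationhood"],
    "cure": ["cure", "cures", "curable"],
}
corpus = (
    "The nation held a national vote. "
    "The nation chose a national anthem. A cure exists."
)
print(build_nuclear_list(families, corpus, min_freq=2))
```

With a real 1-4 million word corpus the cutoff would presumably be set much higher, or expressed as a rate per million words, so that only family members a learner is likely to meet in that text type survive.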