These routines chop words into prefixes, headwords/roots, and suffixes in various ways that have proven useful in language learning research. They make use of Perl's ability to identify linguistic patterns through its powerful regular expressions (regexes). (These routines are experimental: 98% accurate and 95% complete as of August 2014.)
- Affix Levels x Frequency List Builder (Affixes + independent headwords - unCLEAR)
Nation and his colleagues have worked out a hierarchy of derivational affixation relevant to learning English as a second language, according to which affixes are most used, most transparent, and most likely to be known at different stages of learning. These are the basis of the different family lists. This program lets you track five sets of affixes (or all of them together) through 20 k-lists (effectively the whole non-specialist lexicon of English).
NOTE that only affixes attached to a stable real independent word are handled at this point in this program's development (enflame but not enter or even endure).
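The restriction described in the NOTE can be illustrated with a minimal sketch. It is written in Python rather than the site's Perl, and the prefix list, the tiny word list, and the function name `strip_prefix` are all assumptions made for illustration, not the program's actual data or logic.

```python
import re

# Hypothetical sample data; a real run would draw on the affix levels and k-lists.
PREFIXES = ["en", "un", "re"]
KNOWN_WORDS = {"flame", "clear", "build"}

# Try longer prefixes first so that a longer prefix wins over a shorter one.
_alternatives = "|".join(re.escape(p) for p in sorted(PREFIXES, key=len, reverse=True))
PREFIX_RE = re.compile(r"^(" + _alternatives + r")(.+)$")


def strip_prefix(word):
    """Return (prefix, base) only if the remainder is a known independent word."""
    match = PREFIX_RE.match(word.lower())
    if match and match.group(2) in KNOWN_WORDS:
        return match.group(1), match.group(2)
    return None
```

With this sample data, `strip_prefix("enflame")` yields the pair `("en", "flame")`, while `strip_prefix("enter")` and `strip_prefix("endure")` yield `None`, because "ter" and "dure" are not independent words in the list.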
- 14 Master Words x Frequency search (Affixes + Greco-Latin roots - conCEIVE)
An influential notion in some ESL circles postulates that the morphological components of just 14 words are re-used in as many as 14,000 further words. Thus, if learners knew the morphology of these 14 words... Test this idea against the entire lexicon of general English. (Ref: Thompson, E. (1958). The “Master Word” approach to vocabulary training. Journal of Developmental Reading, 2(1), 62-66.)
Related:
• Recent English versions of Vocabprofile include a TYPES:FAMILIES index. Used longitudinally, this index is a powerful indicator of interlanguage morphology development (used to effect in a study by Horst & Collins 2006).
• List_Learn/French and the AWL section of List_Learn/English have routines that assemble all available morphologies for a given word-radical. For example, clicking on the AWL entry "academ" produces all the forms found in a corpus (academy, academic, etc.), plus information about the frequency of each (academic is easily the most common form), as well as any semantic distinctions between them (academic is indirectly related to academy).
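The radical lookup in the last item can be approximated with a short sketch. This is a Python illustration, not the List_Learn routine itself: the function name `forms_for_radical` and the sample sentence are invented for the example, and the sketch covers only form frequencies, not the semantic distinctions mentioned above.

```python
import re
from collections import Counter


def forms_for_radical(radical, text):
    """Count every word form in `text` that begins with `radical`, most frequent first."""
    pattern = re.compile(r"\b" + re.escape(radical) + r"[a-z]*\b", re.IGNORECASE)
    counts = Counter(m.group(0).lower() for m in pattern.finditer(text))
    return counts.most_common()


# Invented sample text standing in for a corpus.
sample = "The academic year suits academic staff; the academy welcomes academics."
forms = forms_for_radical("academ", sample)
```

In this toy corpus, "academic" comes out as the most frequent form (two occurrences), with "academy" and "academics" occurring once each.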
Based on research by Brown (1975) and Bauer & Nation (1993). Perl regexes by Tom Cobb - Université du Québec à Montréal