Why are 95% and 98% underlined on the VP output table?

An important and interesting question in vocabulary studies is the proportion of a text's words that must be known for the text to be understood.

Laufer's (1989) empirical research showed that learners who knew 95% of the words in a text would tend to score at least 60% on a comprehension quiz for the text, while Nation's (2006) corpus research showed that learners who knew 98% of the words would tend to score at least 70%.

Further, Laufer proposed that her 95% figure would correspond to knowing 5,000 word families for average texts, while Nation proposed that his 98% figure would correspond to 8,000 word families. These numbers will of course vary to some extent with the type of text.

The red lines on Lextutor-VP Compleat's summary table help you explore these findings with your own texts.

One practical use: the number of k-levels (1,000-word-family frequency bands) needed to reach the 95% line is probably a good indicator of text difficulty. If the 95% line is reached within the first 1,000 word families, this is a fairly basic text, and learners who know only 1,000 word families can make some sense of it. But if 95% is reached only after 5,000 or 6,000 word families, this is a text with complex vocabulary and not one for beginners, or if it is used with beginners, only as an intensive reading exercise with lots of look-ups and several readings.
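
The same check can be done by hand or with a few lines of code. The Python sketch below is only an illustration, not part of Lextutor: the function name k_level_for_coverage and the per-band percentages are hypothetical, and it assumes you have copied the coverage percentage for each k-level from a summary table into a list, K1 first.

```python
from itertools import accumulate


def k_level_for_coverage(band_percentages, threshold):
    """Return the 1-based k-level at which cumulative coverage first
    reaches `threshold` percent, or None if it is never reached.

    band_percentages: coverage (%) contributed by each 1,000-family
    band, ordered K1, K2, K3, ...
    """
    running_totals = accumulate(band_percentages)
    for level, cumulative in enumerate(running_totals, start=1):
        if cumulative >= threshold:
            return level
    return None


# Hypothetical per-band coverage figures, invented for illustration only.
example_bands = [80.0, 8.0, 4.0, 2.0, 1.5, 1.0, 1.0, 0.5, 0.5]

k95 = k_level_for_coverage(example_bands, 95.0)
k98 = k_level_for_coverage(example_bands, 98.0)
print(f"95% line reached at K{k95}; 98% line reached at K{k98}")
```

With these invented figures, the 95% line falls at K5 and the 98% line at K8, so the text would suit readers who know roughly 5,000 word families, with fuller comprehension at around 8,000.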


2018 Apr 21