F is for Frequency

Dr. Ken Beatty

Here are two key questions related to frequency: What are the most frequent words in the English language? and How frequently do we need to be exposed to new words in order to acquire them?

To answer the first question, certain words in every language appear more often than others. A common list of the most frequent 20 words in English typically includes the, be, and, of, a, in, to, have, to, it, I, that, for, you, he, with, on, do, say, and this.

Many would agree that these 20 words seem extremely common, but any comprehensive list of words tends to be drawn from a selective corpus, or body of words. The selective nature of any corpus influences which words appear most often. For example, the Cambridge International Corpus replaces some of the above top 20 words with uh, yeah, know, like, they, so, was, and but. Many teachers would take exception to teaching uh and yeah as among the most important words for their beginner students to learn.

The sources for corpora are often dialect-specific, for example, American English or British English, and may focus on written English or spoken English or a combination. Some corpora only collect words related to particular themes and time periods (e.g., 18th-century novels) or specific genres of speech or writing, such as medical English. Some corpora document dead or obscure forms of English, such as those drawn from Old English, Middle English, and Early English texts.

The practice of building corpora goes back to the Middle Ages, but truly useful corpora only appeared in the 1960s when computer databases could be used to quickly compile and search them. The Brown Corpus of 1961 featured what was then an astounding million words. Modern corpora are larger—much larger. The Google N-gram corpus is at 155 billion words and growing. Even with the largest databases, a computer can quickly determine which words are the most frequently used.
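To make that last point concrete, here is a minimal sketch of how a computer might count word frequencies. The sample text is invented for illustration and stands in for a real corpus, and the function name most_frequent_words is hypothetical; real corpus tools handle spelling variants, word forms, and far larger texts.

```python
import re
from collections import Counter

def most_frequent_words(text, n=5):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase the text and treat runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

# A tiny made-up sample standing in for a real corpus.
sample = (
    "The cat sat on the mat. The dog sat on the rug. "
    "The cat and the dog looked out of the window."
)
top = most_frequent_words(sample, 3)
print(top)
```

Even in this tiny sample, the function word the comes out far ahead of every content word.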

However, looking over the list of 20 words above, it seems that only a few meaningful sentences can be fashioned from them, for example, I have that. You say it. This is because the list contains few content words relative to function words.

Content words are the ones we first learn as children. Even before babies are taught the correct names for everyday people, places, and things, they will improvise and attach sounds to things of interest. In this way, a pet dog might be called an emphatic “Ba!”

A study by Hochmann, Endress & Mehler (2010) tried to trick children into learning a different part of speech (determiners such as a, an, the) instead of content words, but their experiment failed. The learning of content words seems to be more natural. It might be because young children are often monosyllabic. They get away with just using content words because their parents and peers are happy to infer the larger meanings. For example, a baby using the word milk is generally understood to be announcing her interest in having some.

If we can identify the most frequent words, should they be the ones we teach? It would seem sensible. Since 1953, when Michael West produced the General Service List of 2,300 words that were thought to be essential to communication, other lists have been introduced on a regular basis. Some, like Averil Coxhead’s Academic Word List, are more specialized. Textbooks tend to focus on such word lists to help students learn the most age- and subject-appropriate vocabulary.

But what about the second question, How frequently do we need to be exposed to new words in order to acquire them? In 1990, Paul Nation suggested that students need to be exposed to a new word between five and sixteen times in order to acquire it. But according to Gu (2003), researchers have since tested that hypothesis with wildly different findings.

Some researchers suggest that what matters is not how frequently students encounter words but the intervals at which they are exposed to them. Those in favor of interval theories suggest that having a schedule for reviewing flashcards of new vocabulary items is a useful task, and most teachers would agree. As each word becomes firmly set in memory, it can be discarded from the flashcard deck and new words added. Most textbooks recognize the importance of exposing students to both words and structures repeatedly. In publishing parlance, this is called recycling, and it’s something good teachers do naturally.
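For teachers who keep flashcards digitally, a review schedule of this kind might be sketched as follows. This is only an illustration in the style of the Leitner box system, not a method prescribed by the researchers mentioned here. The review gaps of 1, 3, and 7 days and the names Card and review are assumptions.

```python
from dataclasses import dataclass

# Assumed review gaps in days for each box; no specific schedule is given in the article.
INTERVALS = [1, 3, 7]

@dataclass
class Card:
    word: str
    box: int = 0      # index into INTERVALS
    due_day: int = 0  # day number on which the card is next reviewed

def review(card, remembered, today):
    """Update a card after a review; return False once it can leave the deck."""
    if not remembered:
        card.box = 0                  # forgotten words return to the shortest gap
    elif card.box == len(INTERVALS) - 1:
        return False                  # remembered at the longest gap: discard the card
    else:
        card.box += 1                 # remembered: wait longer before the next review
    card.due_day = today + INTERVALS[card.box]
    return True

deck = [Card("window"), Card("llama")]
review(deck[0], remembered=True, today=0)   # "window" moves up a box
review(deck[1], remembered=False, today=0)  # "llama" stays in the first box
print([(c.word, c.due_day) for c in deck])
```

A word that keeps being remembered is seen less and less often until it leaves the deck, which makes room for new words.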

Regardless of the method, part of the problem is that students have difficulties retaining new vocabulary items that they don’t perceive as being useful to them.

This probably explains why beginner students seem to acquire so many new words while advanced students struggle to expand their vocabularies. Consider the basic and high-frequency word window and the lower-frequency word llama. A beginner student learning the word window will have seen windows since birth and will continue to see them every day. Moreover, there will be frequent opportunities to hear and use the word window in everyday conversation: “Look out the window!” “Open the window, please.”

On the other hand, while the word llama might be part of a short story or an article on South American animal husbandry, it’s unlikely to be part of a learner’s everyday vocabulary unless the learner’s parents are llama herders.

This last point is not as silly as it sounds, as it helps to point out a fault in many frequency lists: Such lists fail to acknowledge individual or local vocabulary needs. Frequency lists are usually developed on national or international scales, and although they are perfectly suitable for most challenges students face in reading, writing, speaking, and listening, they don’t allow students to talk about many of the local and personal things that are most important to them.

Imagine the differences between two students living in the same country using the same generic textbooks. The first student lives in a small, snowy mountain village where skiing and other winter sports are the most common pastimes. The second student lives in a large tropical city where life revolves around the beach. Imagine the different foods they eat, the clothes they wear, and the transportation they use. It’s natural that they would need different vocabularies to narrate their respective lives.

Students need to be exposed to high-frequency vocabulary and exposed to it in meaningful ways. Beyond reading and hearing new words, students require opportunities to use vocabulary in speaking and writing tasks. More importantly, they need to be exposed to the vocabulary that they need to explore and explain their everyday lives.

“Oh! Look out the window. There’s a llama!”

Tasks for Teachers

1. Create a personal and local dictionary

Work with students to create your own frequency list of words that your students need to learn based on their needs. These include personal needs (e.g., particular sports) and local needs (e.g., language to describe their neighborhood). Do this in an online document set out with each letter of the alphabet. You and your students will be able to use and expand your dictionary of personal and local vocabulary over many years.

2. Test the hypothesis that new words are easier to remember if they’re contextual to the students’ lives

  • Consider the level of your students and teach them ten low-frequency words, five that name things they are likely to see every day, and five that they are unlikely to encounter. For example, the low-frequency word for the end of a shoelace is aglet. Another five-letter, low-frequency word students are also unlikely to know is reeve, a nautical verb meaning to pass a line through a hole, eye, or block. Students are likely to see aglets every day but unlikely to come across anyone reeving a line.
  • With a group of young beginners, you might teach the names of ten animals, five they are likely to see and five they are not. Try to keep the letter count the same for both sets of words.
  • With advanced students you might consider using Nadsat, the Russian-based invented language created by Anthony Burgess for his novel A Clockwork Orange. Dictionaries of Nadsat are available online.
  • Follow Paul Nation’s suggestion of exposing the students to each word five to sixteen times. After a week, and after a month or more, test the students’ memory of the words to see if there are differences between their retention of obscure words they see in their everyday contexts compared with obscure words that they do not encounter.
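Once the test scores are in, the comparison itself is simple arithmetic. The sketch below assumes each student is scored out of five on each set of words. The scores shown are placeholders invented to illustrate the calculation, not real results, and the variable names are hypothetical.

```python
from statistics import mean

# Placeholder scores: words recalled out of 5 by each of three students.
# These numbers are invented for illustration only.
everyday_scores = [4, 5, 3]  # obscure words seen in everyday contexts (e.g., aglet)
unseen_scores = [2, 3, 1]    # obscure words not encountered (e.g., reeve)

# A positive difference means the everyday set was remembered better on average.
difference = mean(everyday_scores) - mean(unseen_scores)
print(f"Average difference: {difference} words out of 5")
```

With a small class, a difference like this is only suggestive. Repeating the task with other word sets would show whether the pattern holds.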

Tasks for Students

1. Be a researcher and do a corpus task

  • Ask students to search online and locate the British National Corpus (BNC).
  • In the search box, have each student type in a different everyday word, such as field. The BNC will produce 50 random examples of the word in use (different ones are generated each time you search), each with a sentence showing the context.
  • Have students examine the context sentences and see what different meanings there are for each word. Which meanings are common and easy to understand? Which are uncommon and difficult to understand?
  • Ask students to compare their lists with those of other students.

2. Play a vocabulary game that encourages peer teaching

  • Have each student prepare a list of ten difficult words with definitions taken from books the students are currently reading (rather than combing the dictionary for obscure language).
  • One student begins by sharing a word that she knows and asks if any other student can define it. Students are welcome to guess.
  • When a correct answer is given, the student who has given it can go next or can be awarded a point.

If you try these activities and have feedback, suggestions or questions, Dr. Ken Beatty would love to hear from you.


Acknowledgement: I thank my graduate student, David Penton, for observations on the informal language in the Cambridge International Corpus.

Dr. Ken Beatty, TESOL Professor at Anaheim University in California, has also taught in Asia, Canada, and the Middle East, and lectures widely on language teaching and learning from the elementary through university levels. He has given 200+ teacher-training sessions in 25 countries and is author of 130+ textbooks, including books in the Pearson series Learning English for Academic Purposes (LEAP).