Word Frequencies

A word frequency list groups the words of a language by how often they occur in a given body of text. We take the frequency values from the code published with the following research:

Robyn Speer, Joshua Chin, Andrew Lin, Sara Jewett, & Lance Nathan. (2018, October 3). LuminosoInsight/wordfreq: v2.2. Zenodo.

The Zipf frequency of a word is the base-10 logarithm of the number of times it appears per billion words. For example, a word with a Zipf value of 6 appears about once in every 1,000 words of text. A word with a value of 4.5 appears about once in every 32,000 words on average, since 10^4.5 is roughly 31,600 occurrences per billion words. Values range from 0 to 8; the table below lists the values from 1 to 7.
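The arithmetic behind the Zipf scale can be sketched in a few lines of Python. This is only an illustration of the definition above, not code taken from wordfreq; the function names `zipf_from_count` and `words_per_occurrence` are hypothetical and chosen for this example.

```python
import math


def zipf_from_count(count: int, total_words: int) -> float:
    """Zipf value of a word seen `count` times in a text of `total_words` words.

    Hypothetical helper: scales the count to occurrences per billion
    words, then takes the base-10 logarithm.
    """
    if count <= 0 or total_words <= 0:
        raise ValueError("count and total_words must be positive")
    per_billion = count * 1e9 / total_words
    return math.log10(per_billion)


def words_per_occurrence(zipf: float) -> float:
    """Average number of words of text per single occurrence of the word.

    Hypothetical helper: inverts the definition, since a Zipf value z
    means 10**z occurrences per billion words.
    """
    return 1e9 / 10 ** zipf


if __name__ == "__main__":
    for z in (3, 4.5, 6):
        print(f"Zipf {z}: about once per {words_per_occurrence(z):,.0f} words")
```

Under these assumptions, `words_per_occurrence(6)` gives 1000.0, and a word counted once in a 1,000-word text has a Zipf value of 6.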

We excluded phrases and other forms of word groups from this list because their frequency values are inconsistent.

According to the research, the frequency data for English was collected from the following sources:

  • Wikipedia
  • Subtitles
  • News
  • Books
  • Web texts
  • Twitter
  • Reddit

Zipf value   Appearance
1            once per 100 million words
2            once per 10 million words
3            once per 1 million words
4            once per 100 thousand words
5            once per 10 thousand words
6            once per 1 thousand words
7            once per 100 words