How Many Words Do You Actually Know?

You may have a rough sense of your level in a language.

You can read some articles, follow some podcasts, hold some conversations. But ask a more concrete question (how many words do you actually know?) and most people would struggle to answer.

That question is harder than it looks. Vocabulary size clearly matters, but measuring it in a meaningful way is not straightforward.

That is why I built the Vocabulary Size Estimator, a new free tool in the Lab.

A small tool for a difficult question

The idea is simple.

You choose a language, and the tool shows you a short checklist of words drawn from different frequency bands — from very common vocabulary to much rarer items. You mark the words you recognise, and the tool returns:

  • an estimated receptive vocabulary size
  • a confidence range
  • a CEFR-level approximation
  • a breakdown by frequency band

At the moment, it supports English, German, Spanish, French, and Russian.

The whole test takes only a few minutes, but it is designed to produce something more useful than a vague impression.

How the estimate works

Behind the scenes, the tool uses precomputed lemma frequency tables grouped into five Zipf-frequency bands. Each band contributes a small sample of real words, along with a few pseudowords — plausible-looking non-words included to detect overclaiming. 
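
To make the banding step more concrete, here is a minimal sketch in Python. It is not the tool's actual code. The Zipf value is computed in the usual way, as log10 of occurrences per billion tokens, but the band cut-offs, the function names (zipf_value, band_of, build_checklist), the sample sizes, and the choice to mix pseudowords into the list as a whole rather than per band are all assumptions made for illustration.

```python
import math
import random

def zipf_value(count, corpus_tokens):
    """Zipf scale: log10 of occurrences per billion tokens."""
    return math.log10(count / corpus_tokens * 1e9)

def band_of(zipf):
    """Map a Zipf value to one of five bands (0 = most common).

    The cut-offs are illustrative assumptions, not the tool's real ones.
    """
    if zipf >= 5.0:
        return 0
    if zipf >= 4.0:
        return 1
    if zipf >= 3.0:
        return 2
    if zipf >= 2.0:
        return 3
    return 4

def build_checklist(lemma_counts, corpus_tokens, pseudowords,
                    per_band=2, n_pseudo=2, seed=0):
    """Return a shuffled list of (word, band) pairs; band is None for pseudowords."""
    rng = random.Random(seed)
    bands = {b: [] for b in range(5)}
    for lemma, count in lemma_counts.items():
        bands[band_of(zipf_value(count, corpus_tokens))].append(lemma)

    items = []
    for b, lemmas in bands.items():
        # Take a small random sample of real words from each band.
        chosen = rng.sample(sorted(lemmas), min(per_band, len(lemmas)))
        items.extend((word, b) for word in chosen)

    # Mix in a few pseudowords so overclaiming can be detected later.
    fakes = rng.sample(sorted(pseudowords), min(n_pseudo, len(pseudowords)))
    items.extend((word, None) for word in fakes)
    rng.shuffle(items)
    return items
```

In a real setup the bands would be precomputed once per language, so only the sampling would run when someone takes the test.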

If a user marks too many pseudowords as familiar, the result is flagged as unreliable. Otherwise, the estimate is adjusted using the pseudoword false-positive rate and reported with Wilson-score confidence intervals. 
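
A sketch of how this scoring step could look is given here. The Wilson score interval is the standard formula. The correction shown, (hit rate - false-alarm rate) / (1 - false-alarm rate), is the classic correction for guessing used in yes/no vocabulary tests, and I am assuming it here for illustration; the exact adjustment in the tool may differ. The 0.3 reliability threshold and the names wilson_interval, correct_for_guessing, and score_band are likewise hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def correct_for_guessing(rate, false_alarm_rate):
    """Classic yes/no correction, clamped to [0, 1] (assumed, not confirmed)."""
    if false_alarm_rate >= 1.0:
        return 0.0
    adjusted = (rate - false_alarm_rate) / (1.0 - false_alarm_rate)
    return min(1.0, max(0.0, adjusted))

def score_band(known, shown, pseudo_marked, pseudo_shown, max_false_alarm=0.3):
    """Score one band; flag the result if too many pseudowords were marked."""
    if shown <= 0:
        raise ValueError("shown must be positive")
    fa = pseudo_marked / pseudo_shown if pseudo_shown else 0.0
    if fa > max_false_alarm:  # hypothetical threshold
        return {"reliable": False, "false_alarm_rate": fa}
    low, high = wilson_interval(known, shown)
    return {
        "reliable": True,
        "false_alarm_rate": fa,
        "known_share": correct_for_guessing(known / shown, fa),
        # Applying the same correction to the interval ends is an approximation.
        "low": correct_for_guessing(low, fa),
        "high": correct_for_guessing(high, fa),
    }
```

For example, score_band(8, 10, 1, 5) treats 8 of 10 real words and 1 of 5 pseudowords as marked, so the raw 80% is pulled down to reflect a 20% false-alarm rate, while score_band(8, 10, 3, 4) is simply flagged as unreliable.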

The band sizes are also capped at realistic maxima, so the result is not inflated by the long tail of proper names, abbreviations, misspellings, and other marginal items found in large corpora. 
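
The effect of the caps is easiest to see in a small sketch. The function name estimate_vocabulary and every number in the usage note are made up for illustration; the real band sizes and caps come from the tool's own frequency tables.

```python
def estimate_vocabulary(band_sizes, band_caps, known_shares):
    """Sum the estimated known words per band, capping each band's size first.

    band_sizes:   raw number of lemmas per band in the frequency table
    band_caps:    assumed realistic maximum per band
    known_shares: corrected share of words known per band (0.0 to 1.0)
    """
    total = 0.0
    for size, cap, share in zip(band_sizes, band_caps, known_shares):
        total += min(size, cap) * share
    return round(total)
```

With hypothetical sizes [1000, 5000, 20000, 80000, 900000], caps [1000, 5000, 15000, 30000, 40000], and known shares [1.0, 0.9, 0.6, 0.2, 0.05], the capped estimate is 22,500 words. Without the caps, the rarest band alone would add 45,000 words from a 5% recognition rate, which is exactly the kind of inflation the caps are meant to prevent.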

So the goal is not to generate an impressive number. It is to arrive at a more plausible one.

Why I made it

I wanted a vocabulary test that would be short enough to take on a coffee break, but still serious enough to be worth taking.

Many vocabulary tests online are either too casual to mean much or too opaque in how they arrive at their results. I wanted something more transparent: a compact test based on word frequency, with a correction for overclaiming, and an output that tells you a bit more than a single raw score.

No short test can measure vocabulary knowledge perfectly. But it can still give you a useful reference point — especially if you are trying to think more clearly about reading ability, lexical coverage, or your current stage in a language.

Try it

The Vocabulary Size Estimator is now live in the Lab.

Try it, see what it gives you, and compare the result with your own intuition.

Sometimes a rough estimate is already enough to make the question feel more concrete.