Word Frequency Analyzer
Paste your text to get word frequencies, Zipf's Law visualization, type-token ratio, and hapax legomena analysis. All processing happens in your browser.
Drop a .txt file here or click to upload
Zipf's Law Visualization
Log-log plot of word rank vs. frequency — natural language follows a power law
Frequency Table
| Rank | Word | Count | % | Cum. % | Distribution |
|---|---|---|---|---|---|
Zipf's Law states that in a natural-language corpus, a word's frequency is approximately inversely proportional to its frequency rank. The most frequent word appears roughly twice as often as the second most frequent, three times as often as the third, and so on.
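As a rough illustration of the rank-frequency relationship, the sketch below ranks words by count, computes the count an idealized Zipf distribution would predict at each rank, and estimates the slope of the log-log plot (about -1 for ideal Zipfian data). The names `tokenize`, `rank_frequency`, `zipf_expected`, and `loglog_slope` are hypothetical, and the tokenizer is a simplifying assumption rather than this tool's actual implementation.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumed tokenizer: lowercase ASCII letter runs, allowing inner apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def rank_frequency(text: str) -> list[tuple[int, str, int]]:
    # Rank 1 is the most frequent word; ties keep first-seen order.
    counts = Counter(tokenize(text))
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def zipf_expected(top_count: int, rank: int) -> float:
    # Idealized Zipf prediction: count at a rank = top count / rank.
    return top_count / rank


def loglog_slope(counts: list[int]) -> float:
    # Least-squares slope of log(count) against log(rank).
    # Expects at least two positive counts, sorted in descending order.
    xs = [math.log(r) for r in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    sample = "the the the the the the cat cat cat sat sat mat"
    for rank, word, count in rank_frequency(sample):
        print(rank, word, count, zipf_expected(6, rank))
```

Real text rarely matches the prediction exactly; the slope is a summary of how close the overall trend is, and it is only meaningful on reasonably long texts.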
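Type-Token Ratio (TTR) measures lexical diversity. Types are unique words; tokens are total words; TTR is the number of types divided by the number of tokens. A TTR closer to 1.0 means more diverse vocabulary. Because common words repeat as a text grows, shorter texts tend to have a higher TTR, so TTR is best compared between texts of similar length.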
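A minimal sketch of the TTR calculation, which also shows the length effect: repeating the same text doubles the tokens but adds no new types, so the ratio halves. The function names and the tokenizer are assumptions made for illustration, not this tool's actual code.

```python
import re


def tokenize(text: str) -> list[str]:
    # Assumed tokenizer: lowercase ASCII letter runs, allowing inner apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def type_token_ratio(text: str) -> float:
    tokens = tokenize(text)
    if not tokens:
        # No words at all: report 0.0 instead of dividing by zero.
        return 0.0
    types = set(tokens)
    return len(types) / len(tokens)


if __name__ == "__main__":
    short = "the cat sat on the mat"
    print(type_token_ratio(short))                # 5 types / 6 tokens
    print(type_token_ratio(short + " " + short))  # 5 types / 12 tokens
```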
Hapax legomena (from the Greek for "said once") are words that appear only once in a text. In large corpora, hapax legomena typically make up roughly 40-60% of unique words, a long tail that is consistent with Zipf's Law.
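One way to find hapax legomena is to count every word and keep those with a count of exactly one. The sketch below does that and reports their share of unique words; `hapax_legomena` and `hapax_ratio` are hypothetical names, and the tokenizer is a simplifying assumption rather than this tool's actual implementation.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumed tokenizer: lowercase ASCII letter runs, allowing inner apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def hapax_legomena(text: str) -> list[str]:
    # Words that occur exactly once, returned in alphabetical order.
    counts = Counter(tokenize(text))
    return sorted(word for word, count in counts.items() if count == 1)


def hapax_ratio(text: str) -> float:
    # Share of unique words (types) that are hapax legomena.
    types = set(tokenize(text))
    if not types:
        return 0.0
    return len(hapax_legomena(text)) / len(types)


if __name__ == "__main__":
    sample = "the cat sat on the mat"
    print(hapax_legomena(sample))
    print(hapax_ratio(sample))
```

On a text this short almost every word is a hapax, so the ratio is far above the range seen in large corpora.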