A 2015 count of Japanese word frequency

*Slowly emerges from underground cave*

I’ve been doing computer things to large samples of Japanese text. To be more specific, I’ve been feeding the full contents of the Japanese Wikipedia to MeCab, R, Python, and several small shell scripts.

It occurred to me that, while these things are at hand, it would be simple to make a new count of frequent Japanese words. So I did. You can see what it looks like at this Wiktionary page. Full TSV tables are available for download: the count of lemmas (uninflected words), and of inflected word forms.
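For the curious, here is roughly what the counting step could look like in Python. This is only a sketch, not the actual scripts behind the tables. It assumes MeCab's default output with an IPAdic-style dictionary, where each token line is the surface form, a tab, and comma-separated features with the dictionary (base) form in the seventh field; other dictionaries lay out their fields differently. The function name `count_mecab_output` and the sample lines are made up for illustration.

```python
from collections import Counter


def count_mecab_output(lines):
    """Count lemmas and inflected forms from MeCab-style output lines.

    Assumes IPAdic-style lines: surface<TAB>f0,f1,...,f6,...
    where f6 is the base form (an assumption about the dictionary).
    """
    lemma_counts = Counter()
    surface_counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # MeCab marks sentence ends with a bare "EOS" line.
        if not line or line == "EOS":
            continue
        surface, sep, features = line.partition("\t")
        if not sep:
            continue  # not a token line; skip it
        fields = features.split(",")
        # Unknown words may carry "*" or no base form; fall back to the surface.
        if len(fields) > 6 and fields[6] != "*":
            lemma = fields[6]
        else:
            lemma = surface
        surface_counts[surface] += 1
        lemma_counts[lemma] += 1
    return lemma_counts, surface_counts


# Hand-written sample lines in the assumed format (not real corpus output).
SAMPLE = [
    "食べ\t動詞,自立,*,*,一段,連用形,食べる,タベ,タベ",
    "た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ",
    "EOS",
    "食べる\t動詞,自立,*,*,一段,基本形,食べる,タベル,タベル",
    "EOS",
]

lemmas, forms = count_mecab_output(SAMPLE)

# Print a TSV table of lemmas, most frequent first.
for word, n in lemmas.most_common():
    print(f"{word}\t{n}")
```

The two counters correspond to the two kinds of table: one keyed by lemma, one keyed by the inflected form as it appears in the text. For a full Wikipedia dump you would stream MeCab's output line by line into the function rather than build a list in memory.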

New stuff about kanji is forthcoming.

*Slowly submerges back into cave*