One of the things that interest me in Japanese is the way speech refers to written language, specifically to Chinese characters (kanji). Of course, literate speakers of most languages will sometimes refer to writing (“I meant cue, cue with a ‘c’”). It’s my subjective impression, however, that the Japanese do it more often, and the morphographic nature of kanji makes it feel… different.

Unfortunately, I have no way of backing my impression with facts, since I lack both the expertise and access to a corpus of spoken Japanese conversation. But, as an illustration of the kind of thing I’m thinking of, here’s an excerpt from Masayo Koike’s げんじつ荘 Genjitsu-sou. The main character is on the telephone with her friend Yoriko, who is bragging about having given birth naturally at her age:



Normally they do what’s called a “tēōsekkai” [テーオーセッカイ], you know, where they cut your belly. Yoriko-san told me this as if it was enough for her to be proud of.

Hmm, “tēōsekkai”? I’m not as healthy as Yoriko-san, so if it was I giving birth, surely it would have to be that way, right? While I replied to Yoriko-san thus, I converted that sound “tēō” inside my head to the kanji 帝王 ["Emperor"], and wondered why they gave it such an impressive name.

The mental process described by the narrator must be familiar to readers of all languages: you manage to catch a sequence of phonemes, but fail to parse it and retrieve the corresponding morphemes (meaning-units), so all you’re left with is a meaningless string of sounds. The narrator retains it and bluffs through the conversation, and then, mid-speech, suddenly manages to retrieve the meaning, and only then wonders about semantic overtones. What’s interesting, to me, is that she equates grasping the meaning with knowing the kanji. Even more, she describes the process of retrieving morphemes from phonemes as henkan “converting”, which is what a computer does when you feed it a string of (phonetic) kana and press a henkan “conversion” key to find the appropriate kanji. Neither idea (kanji as “meaning”, and henkan as meaning-retrieval) is unique to this story: in actual speech I often hear expressions such as henkan dekinakatta “I couldn’t convert it”, or sore no kanji wa? “What’s the kanji of that?” (= “What does that mean?”).

To my mind, this is evidence that kanji are felt as primarily morphographic, as argued by Joyce; and that, as proposed by Hansen, they correspond approximately to “ideas” in the popular conception of language. Hansen in particular has been much maligned by the “visible speech” school (DeFrancis, Unger and so on), so I should insist on the often-overlooked point that what Hansen (a philosopher) was talking about was the popular conception of language, not the “correct” conception, much less the empirically verifiable model. Insofar as we’re talking about how the average Japanese in the street conceives of kanji, I’d say that, yes, they conceive of them as fundamentally semantic. Adherents of the phonetic model (like Boltz or Matsunaga) often express perplexity at the tenacity of Chinese and Japanese attachment to the characters, given that kanji are both complex and unnecessary; and then they dismiss their continued use as motivated by “social, cultural and emotional” factors. As if cultural, social and emotional factors weren’t the most important ones.

Writing Japanese without kanji: It’s possible, but…

Of course, these references to characters in speech are occasional, and it would be absurd to say that the Japanese need kanji to speak; contrary to legend, there are in fact such things as illiterate native speakers, and they can communicate perfectly well in purely spoken Japanese. What’s more, unlike many Japanese I’ve met, I don’t even think kanji are strictly necessary for writing the language. After all, it is possible to understand a lecture or speech in spoken Japanese, so Japanese has to be communicable through sound only (it’s true that Japanese television, significantly, has an unusual tendency to attach subtitles to its own language; but, while helpful for “henkan”, these are by no means necessary).

A common objection to kanji-less writing is that an existing text, when converted to kana or Latin characters, usually becomes much harder to read, or even unintelligible. But this is easy to account for, if we remember that Japanese writing tends to use a lot more Sino-Japanese (kango) vocabulary, which is highly homophonic (too many morphemes sound the same). For example, there are five different shiyou-suru verbs; but in speech, if there was a possibility of confusion, the speaker would naturally choose an unambiguous native Japanese equivalent, like tsukau or tamesu.

Why are kango words so homophonic? Simple: kango morphemes came from written Literary Chinese (Jap. kanbun), which was itself homophonic to the point of unintelligibility. Literary Chinese was a kind of condensed abbreviation of the spoken language, relying on visual signs to distinguish morphemes. In other words, the Japanese use five different shiyou verbs because they can: as long as visual support is available, there’s no reason not to overload the sounds. But in speech one has to control the ambiguity, which often means toning down the kango loans. So the way to write kanji-less Japanese is simple: just change the kind of Japanese you write, making it closer to the spoken language (“ii-kae”).

A form of this argument is presented by Matsunaga, with psycholinguistic experiments to back her up, and it’s all perfectly sound. But it misses the point that just because the Japanese could change their writing practice to abolish kanji doesn’t mean they want to. (Unger’s argument that kanji are too hard for computers to deal with seems frankly naïve to my computer-science sensibilities: it’s just data munching, hardly an NP-hard problem! At any rate, I think time has already disproved it.) And, to my mind, the Japanese motivations to stick with kanji are as valid as any. Sure, Chinese characters aren’t the simplest writing system, but “simplicity” was never the question; the metrics were never so crudely utilitarian. Au contraire, from the beginning, it was precisely the complexity that delighted the Japanese (or they’d have long since switched to kana or bonji). And remote classical poetry isn’t the only example, either; kanji inscription-play has been a constant throughout the history of Japanese writing, from Heian court culture to Edo mass literature to modern teenagers’ manga comics (some examples of the latter being forthcoming in this blog).

The mental conception of kanji: related morphemes

Let’s go back to Hansen’s observation that the average native assigns to kanji more or less the same role that the average European assigns to “ideas” in their conception of language, i.e. that of a nexus of different sounds with the “same meaning” (Hansen was thinking of Chinese culture, but the observation extends naturally to Japanese, possibly even more so). I want to look at this from a purely linguistic angle; that is, synchronically, and phonetically.

Consider the mental lexicon of e.g. a child who’s still unschooled in kanji but already fluent in the language. Recall that the Japanese vocabulary is partitioned into four components with quite distinct phonotactics (ways of combining sounds): native words (wago), Sino-Japanese or SJ words (kango), sound-symbolic words, and Anglo-Japanese loans.

Sound-symbolic words are a special category, and Anglo-Japanese is still “green”, but it’s usually the case that SJ morphemes have native equivalents and vice-versa. So it must be the case that the Japanese lexicon is organized into sets of near-synonyms (some, like Morioka and Joyce, even consider them to be allomorphs, i.e. different forms of the same morpheme). For example, once I saw someone ask why a tea ceremony shelf was called shun’shuu, and the answer was Haru no ‘Shun’ to Aki no ‘Shuu’ no desu kara “It’s the shun of haru [spring] and the shuu of aki [autumn]”. The speaker must group together parasynonyms like {shun, haru} and {shuu, aki}, because, when asked, they know it’s “the same shun” in seishun “springtime of youth” but “not the same” in shunji “instant” (they’d say something like “no, that’s the shun of matataku, not the shun of haru”). These morpheme sets can be compared to Germanic/Latinate pairs in English, such as {freedom, liberty} or {green, verdant}, except that 1) in Japanese, having alternate forms is the rule rather than the exception; 2) both forms see frequent use; and 3) one of them often has many homophones.

What’s more, the sets can be larger than two ({kiyo, sei, shou, shin}), and each morpheme (or allomorph, in the Morioka model) must have a property specifying it as SJ (kango) or native (wago), because the speakers can tell. Harukaze means “spring wind” just like shunpuu, but any speaker knows that the two words feel different, on the pragmatic level (kango is heavier, more formal and serious).

So the Japanese lexicon has this partitioned organization, with islands of related morphemes with “the same meaning” (or almost) but different feel. Notice that nothing in this structure needs kanji at all, and the same organization exists for children and the illiterate and the blind. Nonetheless, literate Japanese often think of the morpheme-sets in terms of kanji. I believe that’s the reason why we find the spoken expressions mentioned above, “knowing the kanji” for “understanding the meaning”, and “converting to kanji” for “retrieving the meaning”. “Kanji”, here, is just a convenient shorthand for the natural relationships between Japanese morphemes.

Now if we expand our analysis from the individual to the history of the language, it’s hardly surprising that kanji are such convenient “labels” for the synonym-sets. After all, the kango words were imported through written Chinese, and the equals-sign that exists between certain SJ and native morphemes was first drawn by the traditional “explanatory readings” (kun’yomi) of kanji. For example, the morpheme san/mountain first entered the language as a “reading” of the Chinese character 山, and it became a (quasi-)synonym of native yama because that was the set kun’doku “reading” of the same character. So it’s quite natural for Japanese to think of their morphemes as on’yomi and kun’yomi “readings” of kanji, even though, historically and individually, the spoken language came first.

Kun’doku still lives

Perhaps the key issue here is the very peculiar kun’doku reading technique, which converts Chinese text to Japanese on-the-fly (it’s more mechanical than a real translation, but more complex than a word-for-word rendition). It was considered to be a valid way of writing Japanese, making it perhaps one of the most indirect writing systems devised so far. And kun’doku is hardly a thing of the past either, as I learned when I tried to find a Japanese “translation” of the Tao Te Ching and found instead “explanations” attached to the original Chinese text. Once I asked my (Japanese) teacher about how to read a Chinese calligraphic scroll, which was written in hard-to-read cursive:


He replied:

Fūjūko, Unjūryū. Kaze wa Tora ni shitagai, Kumo wa Tatsu ni shitagau.

(“Wind-follow-tiger, cloud-follow-dragon. The winds follow the Tiger, and the clouds follow the Dragon.”)

I instantly realized I had just been treated to what was technically a live rendition of kanbun ondoku and kanbun kundoku: He first “read the sounds” (in Chinese, with SJ phonetics), and then gave an “explanation-reading” (converting to yamato-kotoba in Japanese syntax). Later I’d learn this is common practice when “explaining” hanging scrolls.


I haven’t read any research on this, but I’m convinced that kanji also help provide the information that (say) European languages provide via word breaks. Consider these sentences:

1. 誰か慌たゞしく門前を馳けて行く足音がした時、代助の頭の中には、大きな俎下駄が空から、ぶら下つてゐた。
2. だれかあはたゞしくもんぜんをかけてゆくあしおとがしたとき、だいすけのあたまのなかには、おおきなまないたげたがくうから、ぶらさがつてゐた。
3. だれか あはたゞしく もんぜんを かけて ゆく あしおとがした とき、だいすけの あたまの なかには、おおきな まないたげたが くうから、ぶら さがつてゐた。

(1) isn’t hard to read as long as you know the kanji. (2) is hard to read, even though the Sino-Japanese vocabulary is minimal. (And yeah, this is partly because I’m not used to it.) But (3) is really easy to read! And you can get (3) from (1) simply by inserting a space at every kana-to-kanji transition (i.e. wherever, in a character sequence XY, X is kana and Y is kanji), and then replacing all the kanji with kana.
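The spacing half of that recipe is mechanical enough to sketch in a few lines of Python. This is only an illustration under some assumptions of mine: the function names are made up, "kana" is taken to mean anything in the Unicode Hiragana and Katakana blocks, and "kanji" anything in the basic CJK Unified Ideographs block plus the iteration mark 々. The second half (replacing the kanji with kana) needs a dictionary of readings and isn't attempted here.

```python
def is_kana(ch: str) -> bool:
    # Assumption: "kana" = the Unicode Hiragana and Katakana blocks
    # (U+3040..U+30FF), which also cover marks such as ゞ and ー.
    return "\u3040" <= ch <= "\u30ff"


def is_kanji(ch: str) -> bool:
    # Assumption: "kanji" = the basic CJK Unified Ideographs block
    # (U+4E00..U+9FFF) plus the iteration mark 々 (U+3005).
    return "\u4e00" <= ch <= "\u9fff" or ch == "\u3005"


def insert_breaks(text: str) -> str:
    """Insert a space wherever a kana character is directly followed by a kanji."""
    out = []
    prev = ""
    for ch in text:
        if prev and is_kana(prev) and is_kanji(ch):
            out.append(" ")
        out.append(ch)
        prev = ch
    return "".join(out)


sentence = "誰か慌たゞしく門前を馳けて行く足音がした時、代助の頭の中には、大きな俎下駄が空から、ぶら下つてゐた。"
print(insert_breaks(sentence))
```

Run on sentence (1), the breaks fall in the same places as the spaces in (3). Punctuation such as 、 is neither kana nor kanji under these definitions, so no space is added after a comma, which also matches (3).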

Note that (3) is basically what children’s books look like, and is arguably even divided into words, if you accept a definition of “word” that includes what other definitions call “particles”, Jesuit Missionary style.

If you accept the hypothesis that the initial kanji of a word basically marks the start of that word, the use of katakana for foreign words, animal/plant names, onomatopoeia etc. seems to fit the same pattern too: it’s not some deep-rooted xenophobia, it evolved in response to the need for a “kanji equivalent” for words that don’t actually have kanji historically associated with them.

(And even the historical experiments in this area support this idea — in the original edition of Shosetsu no Shinzui, Shoyo used an orthography like “真の小説の世に行はるゝ前に羅マンスといへる一種の…” Something like “羅マンス” is exactly what you’d expect to see if the initial character of a word is expected to be a kanji.)

And more: Why are we expected to write the noun 話 with no okurigana, unlike the verb form 話し? Because we recognize the end of (the non-particle part of) a noun by the end of the kanji string, while we recognize the end of a verb by the okurigana. (I would bet that there is a statistically significant drop in the use of the ren’yōkei for verbs like 見る and 得る as compared to consonant-stem verbs and longer vowel-stem verbs.)

None of this makes kanji indispensable, of course (if what they do is basically what word breaks do, we could just use word breaks instead). But it makes sense that, given that they evolved over time without too much in the way of top-down control until quite recently, they are embedded in the writing system in much more complicated and fundamental ways than people acknowledge. They’re much more than just a “meaning overlay” or a medium for puns.

Yeah, I don’t think anyone would disagree with you there. Even Matsunaga mentions the word-separation effect (of course, with the caveat that it’s easy to reproduce it with spacing).

There are phonological reasons to consider particles part of “words”, as the Japanese do with spaced children’s kana, and even with rōmaji when they’re unfamiliar with it (particles are part of the phonological words, including the pitch-accent contours, and normally can’t be preceded by a pause). However, there are morphosyntactical reasons to consider them separate words (unlike the suffixes for i-adjectives and verbs, the morphemes preceding particles are quite free and can be moved without trouble).

I think I must have mentioned this to you somewhere, but in my opinion typographical kana are quite bad as a writing system. Manuscript kana is pretty awesome (and exceptionally beautiful), but typographical hiragana is more or less the equivalent of breaking Spencerian cursive into individual letters, stripping away all calligraphic nuances, and piling up one on top of the other. In my personal evaluation, the modern Latin alphabet (with a full set of lowercase, uppercase, and italics) is a much more natural fit for mechanical typography (i.e. with repeated identical symbols), and it’s very good for word retrieval, so in case of script reform I’d favor rōmaji over pure kana. But I’m not an advocate for script reform.

Even folks that mention it, though, seem to assume that word spacing would work just as well. I don’t think that that’s a justified assumption in the absence of proof, though. Maybe the two types of “break” (kana->kanji and kanji->kana) are useful because of the “hanaga vs hana ga” issue. I’m sure the research is out there, but I haven’t encountered it. Basically I see this as another facet of the counterargument to “Spoken Japanese works just fine, so all-romaji written Japanese should too” that you outline above: the two are different enough that you can’t necessarily argue about one based on the other.

I agree with you on hiragana, to an extent (I think the difference between hiragana and Latin is more quantitative than qualitative, but it’s a big quantity, to be sure). Katakana would make more sense for a kana-only writing system, but as you say, any script reform that did take place might as well skip straight to romaji. If only to prevent the inevitable round 2 in a couple of decades: kana are cumbersome and isolate Japan from the rest of the world, they are an unconscionable imposition on schoolchildren, etc.

Japanese traditional grammarians call the ‘noun+clitics’ and ‘verb+auxiliary verbs’ unit a 文節 (bunsetsu). It’s pretty basic to their analysis, much more so than the word or 語 (go), which is an artificial unit (e.g., hanasanakatta is treated as something like the word hanasa- plus the joshi naka- and -tta; in fact, I’m not even sure of the split of the joshi in that case!).

I believe in traditional grammar they’d posit something like an underlying hanasa-naki-ari-ta, from which hanasa-nakari-ta > hanasa-nakatta by “euphony” (onbin, the last step specifically sokuonbin).

But notice that the first morphemes in hanas-u, tabe-ta, or oishi-i are much more bound than the first morphemes in hon da or kirei da. You can test this in many ways, like reduplication for emphasis:

This line of argument is explored in depth by Uehara, who uses it to posit a fundamental distinction between “nouny” non-inflected and “verby” inflected words in Japanese. (And there are all kinds of cool consequences: for example, notice how the “nouny” word classes are much more open, i.e. likely to accept new words, than the “verby” ones.) I like this monograph a lot, and will try to blog about it eventually.

The upshot is that there are good reasons to define a (language-specific) “morphosyntactical word” in Japanese, including things like hon and kirei but not hanasa- or -katta. Of course, there are good reasons to speak of bunsetsu too.

I’m not sure about the hanasa-naki-ari-ta bit as I haven’t got my Tokieda or Hashimoto bunpō with me to check.

The idea that the joshi and jodōshi are all equivalent, in terms of being treated as ‘clitics’ to the main stem (noun or verb), seems to go back at least to Fujitani Nariakira’s studies of Japanese grammar. I often wonder whether this approach might not owe just as much to the writing system, which doesn’t orthographically split the sentence up into ‘words’, as it does to any other element of native speaker intuition.

It might be interesting to do a study of grammatical treatments of agglutinating languages to see what commonalities and differences they have. By this I mean, how do they grammatically treat things like ‘word’, ‘bunsetsu’, ‘joshi’, ‘jodōshi’, ‘conjugation’, ‘case’, etc. I suspect that Finnish, Hungarian, Turkish, Mongolian, Korean, Japanese, and others all adopt varying approaches that reflect their particular circumstances (writing system, grammatical tradition, etc.).

For example, the Mongolian traditional script writes the joshi (but not the jodōshi) separately, giving them the appearance of semi-independent elements in the script, even though in the modern language they are morphologically and phonologically fused to the noun. I’m sure that must influence how they are perceived by speakers, although I have no knowledge of their grammars. In Mongolia, where they use the Cyrillic script and orthographically fuse the joshi to the noun, my understanding is that the grammatical tradition is based on Russian and they talk about ‘case endings’.

The point comes back to the writing system. As Matt points out, the ‘kanji + kana’ organisation of the orthography gives important information about word breaks that spaces provide in Western languages. Was it the writing system that also gave rise to the concept of ‘main word + joshi/jodōshi’ in traditional grammar — a concept that became rooted so strongly that it took a Uehara to come along and write a monograph pointing out that there is actually a difference between “nouny” and “verby” words? Just wondering…

I wandered around the Wikipedia articles on Japanese, Finnish, Turkish, etc. grammar. None of them explicitly touches on concepts like ‘word’ or ‘phonological phrase’ (bunsetsu). The degree and nature of agglutination also differ according to the language.

Our rōmaji makes it easy to see the nominals separated from joshi, as independent units—but is our intuition born from modern rōmaji, or the other way around? And, again, which came first in Japanese (which I think we can trace back to senmyōgaki, predating kana majiribun)?

Not that writing system influence isn’t powerful (as shown famously by the way they failed to see hanas- as a root and had to group things like hanas-a- (bound, non-propositional) together with hanas-u etc.). But perhaps there’s some degree of mutual reinforcement…

While Westerners can easily spot the nouns, we seem to be prone to thinking about “na-adjectives” as inflected, despite rōmaji use (grouping kirei-na hito with yasashi-i hito and kur-u hito). This is probably semantic interference; we know they’re “adjectives”, so they must be like i-adjectives. Syntactically they’re much closer to nouns though, including in freedom of the first part. It takes some time to learn to grant full, independent status to the “two kinds” of adjective. (setting aside the bi-curious cases like chiisa-na…)

(The inflectional status of “na-adjectives” (inflected keiyōdōshi or noun-like junmeishi?) is a hotly debated question in native linguistics, too, both in traditional and in modern strands such as generative. It’s probably the issue in categorization.)

Another way to consider joshi as part of the word instead of as a separate one: in traditional poetry, it’s Not Done to split a word and its joshi into separate ku — that is, it’s a bad line break.


Joseph Allen has written on Chinese hànzì in speech, in: I Will Speak, Therefore, of a Graph: A Chinese Metalanguage.

