Phonogram transcriptions of Old Japanese: Miyake’s five systems

“Writing” in Japan up until the Nara period (8c) often meant writing in Literary Chinese (漢文 kanbun), but there are a number of extant texts in Japanese which make up our earliest records of the language, most notably the songs collected in the Kojiki, the Nihon Shoki and the Man’yôshû. The Chinese writing system was designed to represent Chinese-specific morphemes and syllables, and adapting it to represent Japanese was a complex task (since neither morphemes nor sounds coincide between the languages). Contemporary scribes came up with a number of different techniques to represent Japanese using Chinese characters, some of them quite complicated (indeed, the Man’yôshû actually took delight in scriptural-level complexity and indirectness). However, for modern readers interested in the Old Japanese (OJ) language itself, the most important technique was also the simplest: borrowing Chinese characters phonologically to represent OJ syllables. In this the Japanese followed earlier Chinese and Korean phonological transcription practices, which they learned together with the writing system from their Paekche instructors.

Unfortunately, it’s not that easy for the modern reader to decipher even this “simple” phonological use of Chinese characters (hereafter “phonograms”). Not only has the phonology of Japanese changed significantly, but the Chinese family of languages also kept changing (often radically), and in any given case it’s not obvious from which variety of Chinese (or Sino-Korean) pronunciation the Japanese chose their phonograms. The textual evidence makes it clear that in different epochs they used several different, conflicting phonological systems as sources. Marc Hideo Miyake identifies five such “systems of sinographic reading” borrowed to try to write down Japanese. To put it another way, written Chinese was a moving target; every so often the Japanese would come in contact with a new cultural influx that would teach new “correct readings” for the characters, which meant they had to change which phonograms should be used to represent Japanese syllables (which of course were also moving targets).

Now this is all terribly confusing, and I found it too easy to mix up Miyake’s five “systems” with the Sino-xenic systems such as Japanese go-on, kan-on &c., & to lose sight of which languages all this stuff is referring to and how. This post is related to the previous discussion on Sino-Japanese & I believe one should complement the other (though I hope I managed to make each post independently intelligible). In the interest of clarity, allow me to once again try to bullet things:

  • First, we have the Chinese languages, a family of related tongues coming from a single branch of Sino-Tibetan, all of which are now isolating (non-inflected) and tonal—though that wasn’t always true. It should be kept in mind that, like all languages, the Chinese forms of speaking existed before and independently of the writing system, both historically and for each individual.

  • Then we have Chinese writing, which is morphosyllabic; it’s a system of phonological and semantic hints to be interpreted as Chinese syllables and morphemes (Chinese syllables are usually morphemes). For most of history, the Chinese cultural sphere was diglossic, i.e. the written language was quite distinct from the spoken varieties (“vernaculars”); the vernaculars changed and diversified tremendously, while writing, until the 20c, struggled to imitate and reproduce a more or less static, fossilized form dating from the Zhou dynasty (1st millennium BCE).

  • A system of sinographic reading is a tradition of how to pronounce Chinese writing (“sinography”). Since the classical written language differed from the vernaculars in vocabulary, grammar &c., the reading had to be in an artificial, learned system, like, say, the scholarly readings of Ancient Greek or Classical Latin. But since non-native phonemes are hard to use, speakers of newer vernaculars tended to “update” the phonetics quite freely (much like e.g. modern Japanese, who read Classical Japanese as if it sounded just like 20c Tokyo-Eastern dialect). Diglossia studies say that the phonology of a written, classical language is “parasitic”, i.e. derived from the living, spoken vernaculars. So each system of sinographic reading was based on a given historical/geographical variety of Chinese, perhaps keeping some archaic or unusual phonetics for flavor. People didn’t actually have conversations in the resulting language mixture; they simply had to assign some sound to read the texts aloud (which, in ancient times, was the usual way of reading), as well as to form the mental “acoustic image” when reading silently.

  • A phonogram transcription of Old Japanese is a technique for representing OJ syllables through the phonological values of Chinese characters. Each style of transcription would select different characters, depending on what system of sinographic reading was dominant at the time. Naturally, the basic approach was to try to equate the source readings to similar-sounding OJ syllables; the problem is, these two often enough were uncomfortably different.

  • Sino-xenic refers to the practice of massively borrowing Chinese words (or, more accurately, morphemes) into non-Sinitic languages, a development that has happened in Japanese, Korean, and Vietnamese. Specifically, it can refer to any of:

    1. The cultural practice which allowed borrowing any Chinese word into a foreign language through the medium of writing;
    2. A given foreign “system of sinographic reading”, using non-Chinese phonemes, and intended for reading and recitation by foreign speakers. These were built from existing systems in a rational, regular way, by mapping the “initials” (onsets) and “finals” (rhymes) of traditional Chinese philology to foreign phonemes; they include Jap. go-on, kan-on, tôsô-on and so on (no pun intended).
    3. Or the resulting (large) stratum of loanwords (loan-roots) which came in this way and became naturalized in the foreign language.

    Therefore, when we talk about e.g. Japanese go-on we might be talking about a system scholars devised to read Classical Chinese with Japanese phonemes, or about the many Chinese words/morphemes which entered Japanese through this system.

Please take care to distinguish the last two bullet-points:

  • A transcription of Old Japanese attempts to represent Old Japanese phonology using Chinese phonetics, with the goal of writing down Japanese;
  • while a Sino-Japanese system transforms Chinese phonology into Japanese phonology, with the goal of reading Classical Chinese. From the point of view of contemporary Japanese, this was just “how to read Chinese”.

Nonetheless, the two kinds of system are related in that usually a pair of them is based on a single Chinese variant, which had been transmitted to Japan through a given system of sinographic reading. The flow of phonetics is, schematically: 1) Source language → source system of reading → OJ transcription style; 2) Source language → source system of reading → Sino-Japanese system of reading.

With that out of the way, let’s see all five systems of reading which Miyake believes were used at some time or another for the transcription of OJ:

System A

By this term Miyake denotes the few Japanese words transcribed in the Chinese Wei Zhi (“Chronicles of Wei”, 220–265 CE), plus a couple of brief mentions in other Chinese histories (which might have been copied from the Wei Zhi anyway). Wei Zhi was the earliest history in the world to describe the islands we now call Japan. It transcribed a small number of Japanese titles and names; for example, a native position glossed as “co-chieftain” is transcribed as 卑奴母離 (Late Old Chinese (LOC) *pye noo məəq le). This is traditionally believed to represent an OJ word similar to pǐnamori (modern hinamori) “guard of a distant region”, though from a modern, skeptical point of view, this association is not totally certain.

These early transcriptions left no mark on the Japanese language, and are not very useful as linguistic documents for several reasons:

  • The Wei Zhi only transcribes proper names and titles;
  • There are no glosses for the meaning of names, and those of titles are of little use (for example, “co-chieftain” doesn’t tell us a lot about what a 卑奴母離 is);
  • The Chinese “were not known for the accuracy of their transcriptions of barbarian languages” (Kane);
  • The choice of characters such as 卑 “humble” or 奴 “slave” might reflect rhetorical ethnocentrism, more than attempts at accurate phonetics;
  • And it is theoretically possible that some of the transcriptions represent dialects, or even languages distinct from Japanese. Some of the transcriptions don’t seem to resemble anything in later Japanese.

The Wei Zhi describes a kingdom (well, a queendom) called Yamatai, governed by a shamaness named Himiko (a conventional modern reading for the System-A transcription 卑彌呼 *pye mye hoo). It tells us that Himiko’s court exchanged letters with the Chinese Wei, but not who wrote these letters. Owing to a general lack of evidence for writing in Japan in this period (3c), Miyake thinks it’s likely that they employed foreign scribes, or at best literacy was limited to a few specialists. In any case it seems that LOC was the first variety of the Chinese language that the Japanese had contact with, though minor. (There are some artifacts with Chinese inscriptions dating from the first century, but the inscriptions at this time were probably unreadable to the Japanese.)

System B

The Kojiki and Nihon Shoki both say (in somewhat different accounts) that two sages called Achiki and Wani came from the kingdom of Paekche to teach writing, and that Japanese scribal clans claimed ancestry from them (which would be a powerful source of prestige back then). These accounts might be mythified, but modern historians agree that Paekche and Yamato had close ties and frequent diplomatic missions, and that Buddhism was introduced in this way. Chinese missions proper didn’t start until 600 CE, while the Korean scholars probably arrived in the early 5c.

That means the first teachers of sinography were likely Korean, and the pronunciation tradition they taught must have been based on early Sino-Korean/Sino-Paekche phonology, and not directly on Late Old Chinese. Our knowledge of Sino-Paekche is too fragmentary to draw much phonetic insight from this; it might show influence from Koguryŏ, it might have been very close to LOC with a subtle “Korean accent”, or it might have been something radically different.

Some archeological artifacts dated around the 5–6c have inscriptions that are assumed to be the first recorded writings by the Japanese themselves: the Inariyama sword, the Eta Funayama sword, and the Suda Hachiman mirror. They are in Classical Chinese, but include some OJ proper nouns in phonograms. A substantial number of these phonograms were used in earlier Korean texts such as the Samkwuk saki, and many reappear in later Japanese texts such as the Kojiki and Man’yôshû. These facts suggest a continuous orthographic tradition that would evolve over time as it came in contact with new reading systems. Seeley also mentions other Koreanisms in the Classical Chinese language of these inscriptions, such as the fact that month names are followed by a suffix 中 “during” that doesn’t occur in standard Chinese. Bentley has more textual data showing the continuity of phonogram use from Han to Paekche to Japan.

Miyake calls “System B” the LOC readings taught by the 5c Paekche teachers to Yamato natives, assumed to be represented in these inscriptions. Its phonetics might or might not be distinctively Paekche, but in any case they gave rise to some notable distinctions from later transcriptions. For example, the character 支 is employed in the Inariyama inscription to represent a pre-OJ syllable with a k onset; it’s never used this way in 8c texts, where it represented si. This matches developments in Chinese, since the LOC initial *[k] had changed to a *[cɕ].

System C

System C underlies a phonogram orthography that starts in the Suiko period (592–628), when a larger number of inscriptions began to appear, often in a Buddhist context: the Gangôji tray inscription, the Hôryûji Yakushi Buddha halo inscription, the Tenjukoku Mandala inscription and many others. The transcription of Japanese here was still mostly limited to proper nouns in Chinese-language texts. System C shows features closer to Early Middle Chinese (EMC) than LOC; Miyake proposes it was based partly on very late Old Chinese and partly (mostly?) on early EMC, probably through Sino-Paekche. Its use in transcription corresponds to what Ôno Tôru calls “old stratum kana”.

Some distinctive features of System C–based transcriptions include:

  • Characters in the rhyme of 支 were used for OJ rhyme -a: 奇 ka, 宜 ga (*[ŋga]), 移 ya. This corresponds to EMC *-iə.

    By the end of the 6c this rhyme had changed to EMC *-i; some of the System C phonograms accordingly use this newer pronunciation to represent OJ i and ǐ, as with 支 and with 知 for ti. In other words, System C has two strata of the same Chinese rhyme.

  • Characters in the rhyme of 魚 were used for OJ ë (for example 居 and 擧).

  • Characters in the rhyme of 之 were used solely for OJ ö: 意 ö, and likewise 己, 思 &c.

    After Suiko, characters of this rhyme frequently transcribe i or ï; the older ö value survives in only three syllables in the Kojiki, and in none at all in the Nihon Shoki. This follows the change of LOC *-ə → EMC *-ɨ → LMC *-i.

  • The characters 支, 侈 and 止 were used as phonograms for syllables with plain stop onsets (侈 for ta, for instance), reflecting Old Chinese consonants before they became palatalized. Miyake thinks it unlikely that they were still unpalatalized as late as the Suiko period; rather, these must be pronunciations that retained archaic, erudite readings.

Despite such special characteristics, 71 of the 88 (80.7%) attested Suiko phonograms retain their value in post-Suiko texts. Thus, System C is the foundation of eighth-century OJ phonographic writing, with some parts of it being revised in accordance with the newer systems described below.

System D

Thanks to a desire to record native songs (waka), the three Nara-period (8c) classics mentioned at the top of this post included, for the first time, complete Japanese utterances written in phonograms. Together with some other complementary material, they give a wealth of data about Systems D and E, much richer than the earlier transcription styles. The Kojiki, Man’yô, and other texts are all based on System D, while the Shoki stands alone as the sole extant example of System E–based transcription. Before the Kojiki, there were some System D–based writings, including scattered proper nouns in censuses, the senmyô edicts, and Shintô “liturgies” (norito); later, there were also a few songs in Fudoki gazetteers and the 21 songs “carved into the stone representation of the footprint of Buddha” at Yakushi-ji (the bussokuseki-no-uta). The great thing about stone inscriptions is that, unlike most of the other data, they’re free from textual corruption (almost everything else survived only in later manuscript copies, and we can’t trust them completely; Miyake uses statistical techniques to dismiss outlier phonograms).

The senmyô deserve special mention, since they’re the first historical example of the sort of mixed morphographic-phonographic writing that’s now standard in Japanese. They were written with characters in two sizes; lexical roots had large characters used for semantic meaning (morphograms), while inflections, particles and grammar elements were written in smaller characters meant to be read phonologically (but still in the same mana script, i.e. “full”, block handwriting; not in simplified cursive like the later kana). The language was a sinified Japanese, but since the phonographic part only extends to a few common grammatical morphemes, it illustrates only a subset of OJ phonetics.

Miyake believes that System D corresponds to a later variety of EMC, also through a Sino-Paekche reading system. It’s the same as Ôno Tôru’s “middle-stratum kana”. Ôno believed that these transcriptions must have been influenced by the newer Chang’an Chinese (see below), but Miyake’s analysis suggests rather that this influence only appeared with System E.

Most go-on kanji readings correspond to the same source as later System C and System D, i.e. to EMC-based systems; only a few LOC readings survived all the way from the time of System B up to standard go-on (such as 施 se ← LOC *ɕe and 是 ze ← LOC *je). As a result, one can read phonogram transcriptions based on System D with modern-day go-on and often approximate the text’s modernized readings; for example, the first line of the first poem of the Kojiki, meaning “eight clouds rise” looks like this (this poem is mythologically said to be the first Japanese song ever composed, by the storm-god Susano-wo):

System D–based transc. | 夜 | 久 | 毛 | 多 | 都
Early Middle Chinese (basis of System D) | *yiah | *kuwq | *maw | *ta | *tɔ
Old Japanese | ya | ku | mǒ | ta | tu
Modern go-on reading | ya | ku | mou | ta | tsu
Modern Japanese rendering | ya | ku | mo | ta | tsu
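
To make the "read it with go-on" trick concrete, here is a minimal sketch in Python. It is just a character-by-character lookup; the reading table holds only the five go-on values quoted above, and the names GO_ON and approximate_reading are my own, invented for illustration.

```python
# Modern go-on readings for the five phonograms of the Kojiki line above.
# Only these five values come from the post; the table is deliberately tiny.
GO_ON = {"夜": "ya", "久": "ku", "毛": "mou", "多": "ta", "都": "tsu"}


def approximate_reading(text, readings):
    """Return one reading per character; characters not in the table become '?'."""
    return [readings.get(ch, "?") for ch in text]


line = "夜久毛多都"
print(" ".join(approximate_reading(line, GO_ON)))  # ya ku mou ta tsu
```

The output is only an approximation of the OJ syllables (mou for mǒ, tsu for tu), which is exactly the gap between go-on and System D that the next example is about.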

Nonetheless, the reader should take care to distinguish the similar but distinct OJ syllables, go-on readings, and System D readings; Miyake points to the following example:

Reconstructed EMC | *kɨq | *hɨəq | *ɡɨəq
System D? (tentative) | *kə | *hə | *ɡə
OJ, System-D transc. | | |
OJ go-on reading | | | gö *[ŋɡə]
Modern go-on | ko | ko | go

Some distinctive features of System D–based transcription (compared to System C) include:

  • 支 rhyme now used mostly for OJ i, ï, ǐ, rarely for e or ë, almost never for a (as in System C).

  • 魚 rhyme moved away from ë and now generally used for ö.

  • 之 rhyme, other than earlier ö, often used for i and ï.

The Man’yô, as one would expect, presents the least “pure” orthography. First of all it should be noted that, despite what the popular term man’yôgana suggests, phonographic writing is actually the least used technique in this complex piece of inscriptional art; and the phonographic sections also make use of the roundabout method of kungana transcription, in which one first translates a Chinese morpheme to Japanese, and then reinterprets the Japanese sounds as one or more unrelated, homophonous morphemes. What’s more, even in the sections that do use simple phonographic transcriptions (System D–based), a few archaisms are retained from System C even as one very recent innovation is shared with System E (namely, Chinese nasal-initial 泥 for voiced obstruent OJ de [*nde]; see below).

System E

The historical compendia, Kojiki and Nihon Shoki, both originated during the same period, at the Tenmu court (673–686 CE), and took some three decades to be compiled. The Man’yôshû is later, being finished after 759. Despite this, the Shoki uses a newer orthography in its transcriptions, System E, which shows evidence of being based on a later variety of Chinese. Apparently System E–based transcription didn’t catch on, and later material up to the rise of kana would be System D–based.

We already met this later Chinese when discussing kan-on: it’s the Chang’an dialect of Late Middle Chinese (CLMC), brought directly from the capital of the prestigious Sui and Tang empires by Heian envoys (the Heian 平安 capital was named in reference to Chang’an 長安, and was in fact architecturally modeled on it). In contrast to transcription practices, in the realm of Sino-Japanese the CLMC-based readings did catch on and became dominant (though not to the point of completely replacing EMC-based go-on). System E corresponds to Ôno’s “new stratum kana”.

Compare the Shoki’s System E–based transcription of the first line of the first poem with Kojiki’s above:

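System D–based transc. | 夜 | 久 | 毛 | 多 | 都
Early Middle Chinese (basis of System D) | *yiah | *kuwq | *maw | *ta | *tɔ
System E–based transcr. | 夜 | 句 | 茂 | 多 | 兔
Chang’an Late Middle Chinese (basis of System E) | *yiah | *kəw | *mbəw | *ta | *thuəh
Old Japanese | ya | ku | mǒ | ta | tu
Modern kan-on reading | ya | ku | bou | ta | to
Modern Japanese rendering | ya | ku | mo | ta | tsu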

Notice that, for the same reasons we can approximate System D–based transcriptions with go-on, we can approximate System E–based with kan-on, though in both cases this approximation remains rough.
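
As a rough illustration of the comparison above, the following Python sketch lines the two transcriptions up syllable by syllable and flags where the phonograms differ. All the data is copied from the table; the split of "ya kumǒ tatu" into one syllable per character, and the names KOJIKI, NIHON_SHOKI and compare_lines, are mine and purely illustrative.

```python
# The same line as written in the two texts, as quoted in the post.
KOJIKI = "夜久毛多都"       # System D-based transcription
NIHON_SHOKI = "夜句茂多兔"  # System E-based transcription

# OJ syllables, one per phonogram (my own split of "ya kumǒ tatu").
OJ_SYLLABLES = ["ya", "ku", "mǒ", "ta", "tu"]

# Modern Sino-Japanese readings quoted in the post, for these characters only.
GO_ON = {"夜": "ya", "久": "ku", "毛": "mou", "多": "ta", "都": "tsu"}
KAN_ON = {"夜": "ya", "句": "ku", "茂": "bou", "多": "ta", "兔": "to"}


def compare_lines():
    """Pair up the two transcriptions and mark where the phonograms differ."""
    rows = []
    for oj, d_char, e_char in zip(OJ_SYLLABLES, KOJIKI, NIHON_SHOKI):
        rows.append({
            "oj": oj,
            "system_d": (d_char, GO_ON[d_char]),
            "system_e": (e_char, KAN_ON[e_char]),
            "same_phonogram": d_char == e_char,
        })
    return rows


for row in compare_lines():
    marker = "same" if row["same_phonogram"] else "differs"
    print(row["oj"], row["system_d"], row["system_e"], marker)
```

Three of the five syllables (ku, mǒ, tu) get different phonograms in the two texts, and the kan-on reading bou for 茂 already hints at the prenasalized onsets discussed next.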

The most notable feature of System E transcription is that CLMC prenasalized obstruent onsets were used for both OJ nasals and OJ voiced obstruents:

EMC, CLMC initials | OJ onset | Shoki examples
*m, *mb | b *[mb] | 魔 *mba for /ba/
*m, *mb | m *[m] | 魔 *mba for /ma/
*n, *nd | d *[nd] | 泥 *ndyiay for /de/
*n, *nd | n *[n] | 泥 *ndyiay for /ne/
*ɲj | z *[nz] | 珥 *ɲʑiq for /zi/
*ɲj | n *[n] | 珥 *ɲʑiq for /ni/
*ŋɡ | g *[ŋɡ] | 疑 *ŋɡi for /gï/

(Note: I think there’s a typo in the second 珥 example on p. 33, but I haven’t yet checked whether my supposition is correct; please confirm this before using this data! On second thought, ignore the simplified exposition of this blog and do use the actual source!)

There are two reasons for the “double duty” of these CLMC initials:

  1. CLMC was missing certain syllables (*ma, *na…), forcing scribes to use nearest equivalents (*mba, *nda…); and
  2. The plain voiced obstruent initials of EMC had become complex “muddy” (濁) voiceless obstruents, so that what was a [b] became a [pɦ] with a breathy vowel—nasal [mb] was much closer to OJ /b/ than this mess (besides, OJ voiced obstruents were likely prenasalized anyway; EMC and earlier had lacked this kind of initial, so that OJ transcription had to make do with non-nasalized equivalents).
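
The practical consequence of this "double duty" is that a System E phonogram with a prenasalized onset is ambiguous when read in isolation. Here is a small Python sketch of that ambiguity, using only the Shoki examples from the table above; the data layout and the names DOUBLE_DUTY, PHONOGRAMS and candidate_syllables are assumptions of mine, not anything from Miyake.

```python
# CLMC prenasalized onset -> OJ onsets it could stand for (from the table above).
DOUBLE_DUTY = {
    "mb": ("b", "m"),
    "nd": ("d", "n"),
    "ɲʑ": ("z", "n"),
    "ŋɡ": ("g",),  # the table gives no nasal counterpart for this one
}

# Shoki phonogram -> (CLMC onset, vowel of the OJ syllables it is used for).
PHONOGRAMS = {
    "魔": ("mb", "a"),
    "泥": ("nd", "e"),
    "珥": ("ɲʑ", "i"),
    "疑": ("ŋɡ", "ï"),
}


def candidate_syllables(char):
    """List the OJ syllables a System E phonogram might be transcribing."""
    onset, vowel = PHONOGRAMS[char]
    return [oj_onset + vowel for oj_onset in DOUBLE_DUTY[onset]]


for char in PHONOGRAMS:
    print(char, candidate_syllables(char))
```

So 魔 comes out as either ba or ma, and a reader has to settle the choice from context (or from parallel texts such as the Kojiki).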

System E transcription had a number of other, more esoteric features, but one that’s too fascinating not to mention is that the choice of phonograms appears to be related to the accentual system, making it one of the few attempts in history to write down Japanese pitch accent. Takayama has found statistical correlations between the Middle Chinese tones of the Shoki phonograms and the Middle Japanese accent recorded in the 1081 glossary Myôgishô. Unfortunately there are a number of issues that complicate the philological usefulness of this system, such as the lack of System E phonograms for certain OJ syllable/tone combinations and the probable later manuscript corruption; but even then, other scholars who raised these issues (Mori, Martin) agree that there was very likely some sort of tone-consciousness in System E–based transcriptions.

Wrapping up

In conclusion, allow me to undo all the care in distinguishing the different ontological categories and present a table of which scriptural techniques and orthographies reflected which languages:

Source | Paekche influence? | System of sinographic reading | Phonographic transcriptions | Sino-Japanese stratum | Ôno Tôru’s terminology
Late Old Chinese | No | System A | Wei zhi | Pre–Sino-Japanese loans? |
Late Old Chinese | Probably | System B | Inariyama, Funayama, Hachiman | Very early go-on |
Early Middle Chinese (early) | Probably | System C | Suiko-period inscriptions | Early go-on | Old-stratum kana
Early Middle Chinese (late) | Probably | System D | Post-Suiko up to Nara; Kojiki, Man’yô, &c. | Later go-on | Middle-stratum kana
Chang’an Late Middle Chinese | No | System E | Nihon shoki | kan-on | New-stratum kana

Next week (probably), we’ll look at some examples of the shenanigans that those wacky nobles devised when creating that extended exercise in puns and word puzzles, the Man’yôshû.

The alphabetic conventions used in this post are:

  • OJ is transcribed as in Modified Mathias-Miller notation, with inverted circumflexes marking series-A (甲類) rhymes.
  • OJ reconstructions are Miyake’s, always preceded by an asterisk.
  • LOC and MC are reconstructions based on Starostin & Pulleyblank’s, as cited by Miyake (who likes them a lot). Keep in mind that this is still a debated area & there are a number of competing theories.


  • Marc Hideo Miyake, Old Japanese: A phonetic reconstruction, chapter 2, and the following works as cited there:

    • Mori Hiromichi, Kodai no on’in, 1991.
    • Samuel Martin, The Japanese Language through Time, 1987.
    • Takayama Michiaki, Gen’on seichô kara mita Nihon Shoki ongana hyôki shiron, 1981 (and subsequent papers).
    • Ôno Tôru, Man’yôgana no kenkyû, 1962.

  • John R. Bentley, The Origin of Manʾyôgana.
  • Christopher Seeley, A history of writing in Japan.

