Alphabetic transcriptions of Old Japanese

(This is about modern transcriptions using the Latin alphabet; if you’re looking for historical Old Japanese transcription techniques, might I interest you in this other post?)

Because I’m quoting material from different works in this blog, I can end up citing various transcriptions and romanizations, which can be confusing. In fact I am confused. This post is to attempt to set things straight about how people represent Old Japanese (OJ) words in modern texts.

First of all, I find it’s desirable to keep in mind a distinction between transcription and reconstruction; transcription of text is much more certain and stable than phonetic reconstructions. Old Japanese was written in a number of ways, but for linguistic purposes the most interesting technique is the use of Chinese characters for their phonetic values, or “phonograms”. Contrary to popular opinion, this wasn’t invented in Man’yôshû times as man’yôgana; when Chinese writing came to Japan, phonogram writing was already common in China (for transcription of foreign words etc.) and in Korea (for native words), and the set of characters chosen as phonograms shows that Japanese usage was a continuation of that tradition.

Now the puzzle (in Kuhn’s sense) that has long fascinated OJ scholars is the fact that OJ phonogram writing preserves some distinctions that were lost in later Japanese texts. For example, at the time of the 17th-century priest Keichû, the phonograms 乎 and 於 were both pronounced [wo] (in modern Japanese, they’re both [o]). But he noticed that, in past texts, the accusative particle wo could be written as 乎, but never 於; and precisely the opposite held for the first syllable of the verb “to think” (which for him was womou). In fact there was a group of phonograms interchangeable with 乎, and another with 於, but these groups never mixed; “to think” was always written with a phonogram from the second group, and the direct-object particle with one from the first group.
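To make Keichû's observation concrete, here is a minimal sketch of that kind of distributional grouping: phonograms that show up in the same slot get merged into one group, and the groups that never merge are the interesting ones. The attestation list is a toy I made up for illustration, not corpus data; only 乎 and 於 come from the discussion above, while 袁 and 意 are my own illustrative additions, and the function name `phonogram_groups` is hypothetical.

```python
from collections import defaultdict

# Toy attestations: (morpheme slot, phonogram seen in that slot).
# ASSUMPTION: this list is illustrative only, not taken from a corpus.
# 乎 and 於 are the post's examples; 袁 and 意 are added for illustration.
ATTESTATIONS = [
    ("accusative particle", "乎"),
    ("accusative particle", "袁"),
    ("first syllable of 'to think'", "於"),
    ("first syllable of 'to think'", "意"),
]


def phonogram_groups(attestations):
    """Merge phonograms that are interchangeable in at least one slot."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Collect every phonogram attested for each slot.
    by_slot = defaultdict(list)
    for slot, graph in attestations:
        by_slot[slot].append(graph)

    # Phonograms sharing a slot belong to the same group.
    for graphs in by_slot.values():
        for graph in graphs:
            union(graphs[0], graph)

    groups = defaultdict(set)
    for graph in parent:
        groups[find(graph)].add(graph)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


groups = phonogram_groups(ATTESTATIONS)
for group in groups:
    print(" ".join(sorted(group)))
```

With real data the input would be every phonogram spelling of every word, and the point is simply that 乎 and 於 end up in different groups even though a later reader pronounces them the same.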

Early kokugaku scholars like Keichû, Norinaga and Ishizuka thought of these distinctions mostly in terms of orthography: the Ancients had a “proper way” of writing that was now lost. But Ishizuka did start to notice these distinctions could be explained by phonetics (he discovered there was a “muddy/clear” distinction, i.e. voiced/unvoiced), which paved the way for the modern linguist Hashimoto Shinkichi to attempt the first OJ reconstructions. Hashimoto found out that the distinct phonogram-sets could be predicted from some morphological environments: 賣 and 米 both were characters for me, but the first occurred only in imperative verbal forms (meireikei), while the second only in realis/conditional forms (izenkei). Hashimoto called these groups kou 甲 and otsu 乙 (“A” and “B”) types, respectively. There were, though, some merged syllables that didn’t occur in verbal inflections; Hashimoto classified those into A or B using more complex criteria, based on alternative readings for the characters. He didn’t give a name to the remaining majority of syllables which show no A/B distinction; following Miyake, we can call those the C-type (hei 丙). A- and B-types only occur for syllables which later merged for rhymes e, i, o; syllables corresponding to rhymes a and u have no such distinctions (i.e. they’re like C-types). Therefore, the following OJ rhymes are attested:

a (=aC)
i (=iC) iA iB
u (=uC)
e (=eC) eA eB
o (=oC) oA oB

Based on comparative data from nearby languages, Hashimoto believed these distinctions were not just orthographical but reflected ancient phonological distinctions. OJ had 13 possible consonantal onsets plus the zero-onset; multiplied by these 11 rhyme classes, there would be 154 possible syllables; but, as usual, only a fraction of those actually occurs: for example, iA and iB never appear after t, while iC never appears after p. The full set of syllables found in texts by Hashimoto amounts to 88:

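Rhyme | ∅ | p | b | m | w | t | n | d | r | s | z | y | k | g
a (=aC) | a | pa | ba | ma | wa | ta | na | da | ra | sa | za | ya | ka | ga
iA | - | piA | biA | miA | - | - | - | - | - | - | - | - | kiA | giA
iB | - | piB | biB | miB | - | - | - | - | - | - | - | - | kiB | giB
i (=iC) | i | - | - | - | wi | ti | ni | di | ri | si | zi | - | - | -
u (=uC) | u | pu | bu | mu | - | tu | nu | du | ru | su | zu | yu | ku | gu
eA | - | peA | beA | meA | - | - | - | - | - | - | - | - | keA | geA
eB | - | peB | beB | meB | - | - | - | - | - | - | - | - | keB | geB
e (=eC) | e | - | - | - | we | te | ne | de | re | se | ze | ye | - | -
oA | - | - | - | moA | - | toA | noA | doA | roA | soA | zoA | yoA | koA | goA
oB | - | - | - | moB | - | toB | noB | doB | roB | soB | zoB | yoB | koB | goB
o (=oC) | o | po | bo | - | wo | - | - | - | - | - | - | - | - | -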
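As a sanity check on the table, here is a small sketch that parses each attested syllable into an onset and a rhyme class and counts what turns up. The syllable strings are copied from the table above; the names `ROWS` and `parse` are my own, and the parsing rule (a trailing A or B marks the type, anything else is C-type) is just the notation used in this post, not anybody's official scheme.

```python
# Attested syllables per rhyme class, copied from the table in the post.
ROWS = {
    "aC": "a pa ba ma wa ta na da ra sa za ya ka ga",
    "iA": "piA biA miA kiA giA",
    "iB": "piB biB miB kiB giB",
    "iC": "i wi ti ni di ri si zi",
    "uC": "u pu bu mu tu nu du ru su zu yu ku gu",
    "eA": "peA beA meA keA geA",
    "eB": "peB beB meB keB geB",
    "eC": "e we te ne de re se ze ye",
    "oA": "moA toA noA doA roA soA zoA yoA koA goA",
    "oB": "moB toB noB doB roB soB zoB yoB koB goB",
    "oC": "o po bo wo",
}


def parse(syllable):
    """Split 'kiB' into ('k', 'iB'); an unmarked 'ti' becomes ('t', 'iC')."""
    if syllable[-1] in "AB":
        body, kind = syllable[:-1], syllable[-1]
    else:
        body, kind = syllable, "C"
    onset, vowel = body[:-1], body[-1]
    if vowel not in "aiueo":
        raise ValueError(f"unexpected syllable: {syllable!r}")
    return onset, vowel + kind


attested = {parse(s) for row in ROWS.values() for s in row.split()}
onsets = {onset for onset, _ in attested}    # includes "" for the zero onset
rhymes = {rhyme for _, rhyme in attested}

print(len(attested))   # 88 attested syllables
print(len(onsets))     # 14 onsets in the table (13 consonants plus zero)
print(len(rhymes))     # 11 rhyme classes

# The gaps mentioned in the text:
print(("t", "iA") in attested, ("t", "iB") in attested)  # False False
print(("p", "iC") in attested)                           # False
```

Nothing deep here, but it confirms that the table really does add up to 88 and that the gaps mentioned in the text (no iA/iB after t, no iC after p) are visible in the data.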

Our discussion so far depends on four claims by Hashimoto: 1) that there are extra syllabic distinctions in the writing of Old Japanese; 2) that they reflect lost phonological distinctions; 3) further, that these distinctions happen in the rhyme (i.e. in the syllable final, not in the initial consonant); and 4) that the attested phonograms can be sorted into the above 88 syllables. This is what Miyake calls “the consensus view” (p. 50). As far as I know (which is not much), today almost everyone agrees that 1 and 2 must be correct, and probably 3 too; a few scholars have minor disagreements about 4, but the general scheme is accepted (Yoshitake Saburô posited 86 syllables, Lange 82, Nagata/Mabuchi propose 91; Tôru, Igarashi, Inukai and Bentley accept 89 by adding poA/poB).

The problem is that scholars have used notations that embody assumptions not only about attested phonogram usage, which is more or less certain, but also about reconstructions, which come and go. For a long time, it was assumed without proof that type C would be phonetically equivalent to type A, reducing the 11 attested classes to 8; what’s more, many scholars assumed that the syllables were all of the CV form, meaning the A/B distinction must be purely vocalic. These assumptions framed the famous “eight-vowel system” of OJ, and were embodied in a number of notations. While an 8-vowel reconstruction is in principle possible, there is no strong evidence for it, and many recent scholars dispute it. Kiyose has a list of “pro-eight vowels” and “anti-eight vowels” reconstructions; most of the anti-8 camp posit that some of the distinctions were realized as glides ([y] or [w]). C-type rhymes could in principle have an entirely distinct phonetic realization, though Miyake’s analysis shows that the textual evidence makes it unlikely (they employ phonograms that fall in the same classes as A and B; see p. 263–264). Still, one should not simply assume that C-type equals A- or B-type without arguments.

With these caveats in mind, I can finally try to sort out the various OJ romanizations I’ve stumbled upon:

Superscript, Latin | Superscript, Japanese | Yale (Martin) | Japanese-style | Miller; Ohno | Modified Mathias-Miller | Frellesvig & Whitman
a, aC | a, a丙 | a | a | a | a | a
iA | i甲 | yi | i | i | î | i
iB | i乙 | iy | ï | ï | ï | wi
i, iC | i, i丙 | i | i | i | i | i
u, uC | u, u丙 | u | u | u | u | u
eA | e甲 | ye | e | e | ê | ye
eB | e乙 | ey | ɜ | ë | ë | e
e, eC | e, e丙 | e | e | e | e | e
oA | o甲 | wo | o | o | ô | wo
oB | o乙 | o̱ | ö | ö | ö | o
o, oC | o, o丙 | o | o | o | o | o

Miller’s system is given on p. 180 of his book as a minor revision of Japanese-style, changing ɜ to ë; both assume C-type = A-type. The umlaut originally denoted vowel centrality, but central/peripheral reconstructions are more or less discredited by now (I think?). Miyake prefers Yale romanization for its neutrality, but personally I dislike it for being misleading: one has to keep in mind at all times that the ey, ye, iy, yi, wo and underlined o don’t mean or suggest a phonetic realization like [ey], [ye], etc., but are purely abstract, algebraic symbols. I also find it typographically hideous, what with the underlined o and all the extra ys. Modified Mathias-Miller, which adds a circumflex to the A-type vowels, seems a more reasonable neutral transcription (as long as you know that the circumflex doesn’t denote a long vowel). Frellesvig & Whitman’s is listed for convenience, but it’s not a pure transcription; by my criteria, it steps into reconstruction territory (the authors call it a “phonemic interpretation”). F&W propose iC=iA, eC=eB, and oC=oB; their “y” and “w” represent actual glides in pronunciation and are not just an abstract notation as in Yale.
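Since the table is really just a lookup from rhyme class to spelling, here is a minimal sketch of a converter from the neutral "superscript, Latin" notation (kiA, toB, etc.) into the other systems. The mappings are read off the table above; the names `SYSTEMS` and `transcribe` are hypothetical, and rendering Yale's underlined o as o plus U+0331 (combining macron below) is my own assumption about how to type it. The sample input is an arbitrary sequence of syllables from the inventory, not a claimed OJ word.

```python
# Per-system spelling of the A/B rhymes, read off the table in the post.
# Anything not listed is written with the plain vowel; C-type syllables are
# written plainly in every system.
SYSTEMS = {
    # ASSUMPTION: Yale's underlined o is rendered as "o" + U+0331.
    "yale": {"iA": "yi", "iB": "iy", "eA": "ye", "eB": "ey",
             "oA": "wo", "oB": "o\u0331"},
    "japanese-style": {"iB": "ï", "eB": "ɜ", "oB": "ö"},
    "miller": {"iB": "ï", "eB": "ë", "oB": "ö"},
    "modified-mathias-miller": {"iA": "î", "iB": "ï", "eA": "ê",
                                "eB": "ë", "oA": "ô", "oB": "ö"},
    "frellesvig-whitman": {"iB": "wi", "eA": "ye", "oA": "wo"},
}


def transcribe(syllables, system):
    """Convert syllables like ['miA', 'keB'] into the chosen romanization."""
    table = SYSTEMS[system]
    out = []
    for syl in syllables:
        if syl[-1] in "AB":
            onset, rhyme = syl[:-2], syl[-2:]
            # Fall back to the bare vowel when the system marks nothing.
            out.append(onset + table.get(rhyme, rhyme[0]))
        else:
            out.append(syl)  # C-type: identical in all systems
    return "".join(out)


sample = ["miA", "keB", "toB"]  # arbitrary syllables, not a real word
for name in SYSTEMS:
    print(f"{name:24} {transcribe(sample, name)}")
```

Note that the conversion only goes one way for most systems: Japanese-style, Miller and F&W each collapse some distinction with the C-type, so you can't in general recover the neutral notation from them without knowing the onset.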


  • Marc Hideo Miyake, Old Japanese: A phonetic reconstruction.
  • Roy Andrew Miller, The Japanese Language.
  • John R. Bentley, The Origin of Man’yôgana.
  • Kiyose, Gisaburô. Japanese Linguistics and Altaic Linguistics. Tokyo: Meiji shoin. (apud Miyake).
  • Frellesvig & Whitman. The Oxford Corpus of Old Japanese.

5 thoughts on “Alphabetic transcriptions of Old Japanese”

  1. This is great. What is the source for “Japanese-Style”? Kiyose? I was wondering what to post about today; I’ll try to add some extra info from the books I have.

    Speaking of this… sparked by the recent mention on my blog, I was looking through the ol’ Frellesvig 2010 again and wondering if maybe I should reassess how I represent premodern text on my blog. Up until now I’ve gone with the theory that if I include the Japanese text that should be enough for anyone who cares and so I can transcribe it the same as modern Japanese (no distinction between ko/otsu, etc.) — but I’ve not been consistent with this; I change between [F], [f], [h], [p] based on context with no clear rules, and I don’t even use the same Romanization system consistently. It’s usually Hepburn, but sometimes I find it easier to use a phonemic transcription (/hu:ziru/ vs fūjiru)… in summary, it’s a huge mess, and any attempt to rationalize it would be a weak excuse for the fact that the history of the phenomenon represents my journey from “no idea what I’m talking about” to “vague idea what I’m talking about”.

    So I was thinking about picking a relatively recent system (probably Frellesvig again, as he’s both recent and fairly comprehensive in terms of OJ, MJ, NJ variations) and using that instead.
    – Upsides would be: Make me + readers think about these things more clearly. Represent sounds of premodern poetry in form (theoretically) closer to actual contemporary usage.
    – Downsides would be: Frellesvig could be wrong. Connection to modern Japanese would be more obscure, especially to people unfamiliar with the background. Meta-connection to historical Japanese philology also becomes obscure (i.e. it would, in some sense, be a self-segregation from the thousand-year tradition of pretending that Japanese is a language that only evolves a bit around the edges of vocabulary and verb endings, and the interesting and valuable writing that this has enabled).

    What do you think?

  2. I got “Japanese-style” from Miller, who uses it as the basis for his; he only says that it’s what “Japanese scholars today generally use”, “today” being 1963. Having not read the papers of the Japanese scholars, I don’t know which ones he’s talking about, nor how widespread it was; but Miyake has a table (p. 62) with no less than 26 reconstructions, from 1932 (Nagata) to his own at 1995/2003, so there’s a lot to research.

    I kept trying to recall what Philippi had used in his Kojiki translation; I think it had umlauts and A=C, so it was probably related. I’ll check at the library later.

    Yeah, I think Frellesvig would be as close to “mainstream” as it gets, but I feel your pain. I started reading on this topic because I wanted to know what poetry sounded like in the original (which, btw, is the motivation that Miyake gives in the preface for his book); but the more I learn, the less I trust reconstructions. I know I’m being unfair, but there’s just so many of them, and they proliferate and die like, dunno, bubbles on the still water; Miyake criticizes most past reconstructions for being based on Karlgren, who is now considered incorrect, making the whole thing fall like a house of cards. The tradition of reading premodern text as if it were in the current standard dialect (or Classical Chinese as Mandarin) doesn’t sound so dumb anymore: at least this way it’s consistently wrong. I kept scouring JSTOR perversely to see if someone has already proved that Miyake got it all wrong; the best I could find was a footnote by Frellesvig saying it was “a different interpretation on some points”, so it seems I got to read it while it’s still considered plausible.

    Sorry, I’m being overly bitter. I do think there’s real progress in the field (i.e. that uncertainty is decreasing), and in any case the situation seems much better for Heian Japanese onwards (which is after all the majority of texts). And sure, I agree it’s a great idea to adopt an explicit convention for transcriptions, and that raising awareness of the diachronicity of Japanese would be a Good Thing. Perhaps you could cheat and include both the Frellesvig reconstruction and the modern “translation” in Hepburn? It’s more work, but doesn’t lose any audiences. I’d just make sure to add a note somewhere (in the sidebar, perhaps?) briefly noting whose theory (Frellesvig’s) you’re using; this way, when citizens of the year 3000 read your blog in the HyperInternet Archive, they’ll know which updates to make to the phonetic transcriptions.

  3. Only after writing this post did I find out that the Oxford Old Japanese Corpus already had a table like this. Oh well, at least I learned a lot. I included extra transcriptions from their table in mine.

    In the first version of the post I wrote this:

    I wish they would just use a simple scheme with diacritics; say ǐ, ï, i, ě, ë, e, ǒ, ö, o or something.

    I guess Mathias must have thought the same, since “Modified Mathias-Miller” does precisely that; except it uses the circumflex instead of the caron (which I find potentially confusing, since the circumflex is used as an alternative to the macron for long vowels in Middle/Modern Japanese by some people—including your humble blogger). Unfortunately I don’t know where the Oxford Corpus got this transcription from.

  4. Choice quote from Unger’s review of Frellesvig:

    [Frellesvig’s] presentation of Old Japanese (OJ) phonology (26–50) is admirably clear, but I think his preference for ‘phonetic reconstruction and phonemic interpretation’ (30 and elsewhere) inverts the proper order. One can identify phonemically distinct syllables in OJ that merged in the course of time through an analysis of phonogram distributions alone, prior to interpreting other evidence to discover the phonetic differences underlying those distinctions (Lange 1973, Unger 2008). By starting with phonetics, F undermines his own interesting claims about OJ allophonic variations (34–39), which are not in general reflected in Middle Chinese syllables associated with characters used as ongana.⁷
