This is a summary of Kōji Kawamoto’s theory of Japanese poetic metre. I am writing it from memory (quite distant memory, in fact), so it probably contains errors or half-remembered things. As such, it’s not really a reliable source for your thesis. But you’ve seen just how hard The Poetics of Japanese Verse is to get hold of these days; it’s not on the usual online sources, it was nowhere in the libraries of Dublin or Bochum, & in bookstores its price has reached the triple digits, in the manner of academic books under late-stage capitalism. By now you can finally access a copy, like some fabled treasure, at the same place I originally found it: in the charming little library of the Japan Foundation, São Paulo chapter. But I fear that, at this point, it might be too late. Kawamoto is dense reading; I had to make two or three separate attempts, each one exhaustingly intensive, until I felt like I had digested the gist of it. And of course at this point you can’t afford to muck around in the tarpits of one-more-source. That’s why I’m writing this summary, however lacking my memory is. You can’t of course cite something like this; but maybe you can use it to guide a brief look through Kawamoto’s book, pick out some choice quotations, and use this theory in an academically responsible way.
(I will drag a bit to get to the main point—well, you know me. Skim near the bottom for the meat of the thing, i.e. how to find rhythms in Japanese poetry.)
The nice thing about Kōji Kawamoto’s theory of Japanese poetic metre is its inspiration. In many ways it’s a condensation, a distillation of the work of several other native Japanese metre theorists [I’ll never be able to recall the names of everyone; that’s one thing to hunt for in the actual book]. Like his predecessors, Kawamoto understands that poetry is, at its roots, an oral art, not a written one; it’s meant to be set to voice, not read silently; and, moreover, it’s meant to be sung, not simply spoken in a prosaic manner. This is true not just of Japanese verse but of poetry in general; at the root, cross-culturally, poetry and music are one and the same. [For more on this topic, see Foley, How to Read an Oral Poem; and Ingold, Lines: A Brief History. You may find both at the Florestan Fernandes, & they’re quick, fascinating reads]. The problem is, we have no ancient system of notation for poetic melodies (we have stuff for Buddhist chants and Noh pieces, but not for waka); so it’s hard to know what manner of rhythm Japanese verse was built for. The best we can do is to look at what seem to be ongoing, traditional modes of recitation, combine that with native Japanese intuitions, and base our rhythmic theories on that. Kawamoto uses as his main source the uta-garuta [I can’t recall whether he tries some biwa-hōshi, too; might be interesting]. We are assuming then that the old-fashioned way that they sing the Hyakunin Isshu verses in New Year’s karuta games is at least somewhat related to the verses’ original metre. And the system that Kawamoto derives from these recitations, following previous theories, is a binary rhythm of strong-weak feet, spread over verses of 8 beats each, with variable rests inserted at key points so as to keep word starts on the on-beats.
Let’s go over this part by part. As you know, English verse is based, in its most elemental forms, on the alternation of stressed and unstressed syllables:
Who will search
For holy grail
Past the edge
Beyond the veil
Syllables are grouped in units called “feet”, which get (confusing) names according to their stress patterns; e.g. stressed-unstressed, TA-dum, as in “WHO-will”, is called a “trochee”, and unstressed-stressed, dum-TA, as in “be-YOND”, is called an “iamb”. The tribal-drum–like phonology of English, with all its impossible monosyllables (n.b. all verse-final words above are actually 1 syllable!!), plus the strong distinction between unstressed and stressed syllables, makes those patterns very natural. It’s super easy to concoct some alternating beat on a whim, almost at random; say, the woman swims without a hitch along the stream, &c.
Recall that the stressed syllables are i) louder, ii) longer, iii) higher-pitched, and iv) more clearly articulated (no reduced vowels), &c. This is also true of Portuguese, but to a lesser extent. With the respect we have for each and every syllable and the bureaucratic Latinitās of our language, it’s much harder to sustain these binary drumbeats. So we don’t talk of “feet” very much, and worry more about syllable counts. Nonetheless, stress plays a significant role in our poetry, too, and we can use feet like the classical anapaest (dum-dum-TA):
Por que brilham teus olhos ardentes
E gemidos nos lábios frementes
Vertem fogo do teu coração?
Going back to Latin we get a separation of stress and quantity—phonemic long vowels, like in Japanese—and then we find feet again, but this time based on quantity, not stress:
Arma virumque canō, Troi-æ quī prīmus ab ō-rīs
The basic feet here are the dactyl (TA-dum-dum) and the spondee (TA-TA), but never mind the details; the important point for us here is how the “strong” syllables aren’t necessarily stressed (cánō is stress-initial); instead, they’re long, either because they have long vowels or diphthongs, or because they end in consonants (≅they’d be written with two kana in Japanese: アル-マ, カノー, &c.; there are some additional complications, but never mind those too). If we went by stress, it would be the more monotone árma virúmque cáno, Tróiæ quí prímus ab óris; but that’s not how Latin metre goes. So in this tradition the “poetic beats”, the “strong/weak”, are less about loudness or prominence and more about phonological duration.
As you know, Tokyo Japanese has a restricted tone system, also called a “pitch accent”, which is somewhat comparable to our stress system in that there can be at most one “highlighted” (=accented) syllable per word (though it differs from stress in that words can have no accent at all, and in fact most Japanese words have none). All moræ have either a low or a high tone, but nowadays there are only a few possible melodies; the only thing that may vary is the position of the pitch drop that marks the “accent”. Heian-era Japanese had more possible melodies, so it was closer to Chinese in feel (they still had only “low” and “high” tones, but they occurred in more combinations; think Navajo, or Láadan). However, the interesting thing about Japanese is that, as far as we can tell, they’ve never used these pitch patterns for poetry at all. Ancient Chinese verse had intricate patterns of tonal melodies, and the Japanese knew how it worked because they composed formal Chinese poetry, and quite well, too; what’s more, since we have the Ruiju Myōgishō dictionary from Heian as well as a number of tonally-annotated manuscripts (including a Kokinshū), we know that they could analyse the tones of their own language. They nonetheless never made poetic use of that. After Heian, Japanese got the long/short distinction that you know well, comparable to Latin; but they’ve never shown interest in Latin-style quantitative poetry, either. Moreover, after the tone system reduced to accent-like melodies, they didn’t try to use the position of the accent, like we use stress. To this we must add that Japanese poetry has no set requirement for rhyme or alliteration, and the fact that it is so short (for tanka, five verses in three lines); & it starts to feel like it’s dreadfully amusical. Boring. Did they just drone on and on over 31 moræ? Was there no musical device that a poet (or reciter) could use to make their verses sonorous and interesting?
Kawamoto finds the answer in the spaces between the syllables. When one listens to uta-garuta singing, one interesting trait that pops to the ear immediately is that they stretch some vowels, as in Aki no~o Tano~o~o. More careful listening shows that:
- The stretches are longer for 5-mora verses than 7-mora verses. In fact, they’re about thrice as long, equalising the verse lengths; both kinds of verse end up equal, with about 8 beats in a musical tempo (5-verses stretch 3 more beats, and 7-verses add 1).
- The stretches can be realised as vowel lengthening, or as silent pauses, or a combination of both. We’ll call them “rests”, regardless of how they’re pronounced.
- There are certain patterns to where the rests may be inserted, and variations within the patterns (the lifeblood of poetry—rules plus liberties).
Now we start to glimpse something akin to the notion of “cæsura” in European poetry. You know, like English alliterative verse has a clean break between the two halves of the alliteration:
Frozen and futile but far enough
From vile civilities vouched for by
Statisticians, this stupid world where
Gadgets are gods and we go on talking
Many about much, but remain alone,
Alive but alone, belonging—where?—
Unattached as tumbleweed. Time flies.
Can you intuit where the breaks go inside each verse? After that, try to find the alliterations—there will be at least one on each side of the cæsura, often 2 on the left side (frozen, futile || far). Hints: alliterations go on stressed syllables (“lifts”), not necessarily at the start of words; and /st/ counts as a unit.
With haiku, there’s often a big cæsura made explicit in the form of kireji. Write it in one row as the Japanese do, and still the break should be clear:
Shizukasa ya Iwa ni shimiiru Semi no Koe
Ame no nai Hi ga Hatsu-Zora zo Asu mo Tabi
Put a line break after the kireji (only) & you’ll have a typographical indication of lines (gyō) rather than verses (ku), which to my mind is how they should be printed in transcriptions & translations. Something like this:
Furu Ike ya
Kawazu tobikomu Mizu no Oto.
But in tanka we don’t always have something as obvious as kireji, and even in haiku, there are no explicit markers that tell you where you can put the smaller rests inside the verses (ku). Japanese poets of course do it by ear, but how can we analyse it? How can we make the intuitive rules explicit?
This is where Kawamoto’s theory comes in (this is where the promised meat of the post starts). He builds on the fact that other Japanese theorists have long insisted that Japanese poems, and indeed the Japanese language itself, somehow have a binary trochaic rhythm. That would be a strong-weak/strong-weak/strong-weak, TA-dum TA-dum TA-dum pattern, like a heartbeat. I think the idea should make intuitive sense to you: words like “Yokohama”, “Minamoto”, “Kashikiri” &c. somehow feel that way, YO-ko-HA-ma, KA-shi-KI-ri, O-o-SA-ka, &c. The problem is, these patterns do not match Japanese pitch accent at all (in Tokyo dialect all of these words would be LHHH, start low then rise, the “unaccented” melody). The proposed rhythm doesn’t match loudness, either (which seems to depend on vowel quality more than anything). In fact, there’s no detectable acoustic property whatsoever that would justify the theory of strong-weak binary feet. And yet half the people who tried to study metre felt compelled to propose just that, even as the other half denounced them for baseless fantasies.
A solution to this dilemma was provided by an interesting psychoacoustics experiment [I forgot the name of the researcher, sorry; you’d have to hunt this citation, too]. What they did was to play a steady beep to Japanese speakers—just a completely identical, non-linguistic sound at evenly spaced intervals, “dum-dum-dum-dum…”—and ask them to describe the rhythm. They overwhelmingly said they heard a binary trochaic pattern, TA-dum;TA-dum;TA-dum. That is, they project such a pattern over even a perfectly monotone signal. The strong-weak alternation isn’t phonetic, it’s psychological.
This theory would still feel kind of useless if it couldn’t explain anything; but Kawamoto claims it can explain precisely that mystery pattern, the allowed position of the verse rests. The idea, interestingly, is very much a parallel to standard European music theory. Maybe some of this will feel familiar: In music theory, the first beat of a bar is called an “on” beat or “downbeat”, and is said to be “accented”. The 2nd beat is an “off” beat, and it’s said to be unaccented or “weaker”. This has nothing to do with the loudness of the notes or other acoustic properties (we’re not saying pianists hit every other note with softer finger presses); it’s a psychological notion of “accent”, based on the position of the note within the larger rhythmic pattern. The next beat (#3) is, again, “on”/strong, the 4th is “off”, and so forth; so if a bar has 4/4 time, you get one-two-three-four, with the odd positions (1, 3…) being “strong” and the even ones (2, 4…) “weak”. If the rhythm is binary, 2/2, as postulated for Japanese, every odd position in the verse will be the first one of a bar/foot, the “downbeat”—the most accented beat of all, while every even position will be the last one in a bar or an “upbeat”, said to be the most unaccented, to “anticipate” the following downbeat.
Now the rule is, you cannot start a word off-beat. That’s it. That’s the iron rule of Japanese poetry. (Recall that particles do not count as words; phonologically they’re part of the words, and included in pitch accent melodies—indeed pauses are not allowed before particles in normal speech). According to Kawamoto, when Japanese people say Yare-Yare they say it all in a row, but when they say Mukashi Mukashi they feel compelled to add a small (1-beat) rest between the words. If they didn’t, they’d start the second word on position 4, which is even, which is off-beat, and that sounds, well, off. (I think you can feel it yourself if you try; it feels like syncopation or “hurrying”).
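To make the rule concrete, here is a minimal Python sketch (mine, not Kawamoto’s). It assumes you have already split the phrase into prosodic words (each word together with its particles) and counted the moræ by hand; the names start_beats and obeys_iron_rule are made up for illustration.

```python
def start_beats(mora_counts, rests_before=None):
    """Return the 1-based beat on which each word starts.

    mora_counts  -- morae per prosodic word (word plus its particles),
                    counted by hand (assumption: no automatic mora counting)
    rests_before -- rest beats placed before each word; defaults to none
    """
    if rests_before is None:
        rests_before = [0] * len(mora_counts)
    starts = []
    beat = 1
    for morae, rest in zip(mora_counts, rests_before):
        beat += rest            # a rest takes up beats but starts no word
        starts.append(beat)
        beat += morae           # each mora fills one beat
    return starts


def obeys_iron_rule(mora_counts, rests_before=None):
    """True if every word starts on an odd beat (on-beat)."""
    return all(beat % 2 == 1 for beat in start_beats(mora_counts, rests_before))


print(start_beats([2, 2]))          # Yare-Yare
print(start_beats([3, 3]))          # Mukashi Mukashi, said in a row
print(start_beats([3, 3], [0, 1]))  # Mukashi, 1-beat rest, Mukashi
```

The three calls give start beats [1, 3], [1, 4] and [1, 5]: Yare-Yare is fine as it stands, Mukashi Mukashi in a row puts the second word on beat 4, and the 1-beat rest moves it to beat 5.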
Now within this restriction, there may still be several places to insert rests in 5-verses, and this allows some freedom of performance. Furu Ike ya can be (I’d like to use a musical symbol for rests, like 𝄽 , but the astral plane Unicode is breaking this software so I’ll use instead a tilde, ∼ ) Furu∼∼ Ike-ya∼, or Furu Ike-ya∼∼∼; but it can’t be *Furu∼ Ike-ya∼∼, because that would push Ike to the 4th beat, i.e. the off-beat of the 2nd binary foot, and you can’t start words off-beat. For other verses, there’s only one possible position: Hi-ga Hatsu-zora zo has to insert its single rest at the end. This is called a “running verse”. By contrast, Iwa-ni shimiiru has to put its rest in the middle, after ni (if this is getting unclear, try the alternatives and see where words end up starting). This is called a halted or paused verse [I can’t recall the precise word in the translation].
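If you’d rather not try the alternatives by hand, the following Python sketch lists every allowed placement for a verse. It is only my illustration, under a few assumptions: mora counts are entered by hand, a verse always fills 8 beats, and rests go after a word (never before the first one). The names rest_placements and render are hypothetical.

```python
from itertools import product

BEATS_PER_VERSE = 8  # assumption: every verse is stretched to 8 beats


def rest_placements(mora_counts, total_beats=BEATS_PER_VERSE):
    """List every way to distribute the spare beats as rests after words
    such that each word still starts on an odd beat."""
    spare = total_beats - sum(mora_counts)
    valid = []
    # Brute force: try every tuple of rest lengths, keep those that fit.
    for rests_after in product(range(spare + 1), repeat=len(mora_counts)):
        if sum(rests_after) != spare:
            continue
        beat = 1
        on_beat = True
        for morae, rest in zip(mora_counts, rests_after):
            if beat % 2 == 0:       # word would start off-beat
                on_beat = False
                break
            beat += morae + rest
        if on_beat:
            valid.append(rests_after)
    return valid


def render(words, rests_after, rest_mark="∼"):
    """Write the verse out with one rest mark per rest beat."""
    return " ".join(w + rest_mark * r for w, r in zip(words, rests_after))


for words, morae in [(["Furu", "Ike-ya"], [2, 3]),
                     (["Hi-ga", "Hatsu-zora-zo"], [2, 5]),
                     (["Iwa-ni", "shimiiru"], [3, 4])]:
    for placement in rest_placements(morae):
        print(render(words, placement))
```

For Furu Ike-ya this finds exactly the two readings given above; for the two 7-verses it finds one each, with the rest at the end (running) and after ni (halted) respectively.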
In classical tanka running verses are often used in the middle, and halted verses at the end, like this [you know how much I love the older stages of pronunciation, but for the sake of focus I’ll quote these in modern Tokyo Japanese phonology]:
Kari-wo no Iwo-no∼
Tsuyu ni∼ nure-tsutsu.
The 5-verses can have variations, but pay attention to the 7-verses. Can you see how the running/halted alternation flows along with the poem?
(In verse #3 we have a seemingly extra beat, ji-amari; but in classical verse ji-amari only ever occurs when there are vowel encounters, like here in Toma-wo+arami. This fact, along with phonetic transformations like -te ari > -tari &c., shows us that they were pronounced as a single syllable, either as a diphthong or by eliding vowels, like wo+a>wa. By Yosano’s era, that was no longer the case; modern tanka did away with this ji-amari vowel restriction, because they just pronounce the extra syllables at a quick rhythmic pace, fitting 3 syllables in the space of 2. Perhaps you could mention this detail in your discussion re: tradition vs. innovation in her poetry.)
Natsu ki ni kerashi∼
Koromo∼ hosu chō
Ama no∼ Kaguyama.
Tago no͡Ura ni∼∼∼
Fuji no∼ Takane ni
Yuki wa∼ furi-tsutsu.
This pattern where the end-verses are halted seems common. It makes one think of the long-long (“spondaic”) endings of Greek and Latin verse.
To sum it up, here’s how you find the verse rhythm:
- Number the beats.
- If any word starts at an even beat (=off-beat), add a rest before it, so that it starts on-beat. This is an obligatory rest.
- Take note of which verses turn out to be running or halted.
- Continue adding free rests until all verses are 8-beat. You can add them anywhere as long as it doesn’t violate the iron rule: words must start on odd beats (on-beat).
- Try reading it aloud this way! (You don’t have to sing like uta-garuta if you feel that’s too alien by now; you can read them more or less like normal speech, but respecting the rests as pause lengths).
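The steps above can be sketched in Python roughly as follows. This is my own reading of the procedure, not anything from the book: it assumes hand-counted moræ, labels a verse “halted” whenever an obligatory rest lands inside it and “running” otherwise, and simply parks the leftover free rests at the end (one allowed choice among several). The function name scan_verse is made up.

```python
def scan_verse(words, total_beats=8, rest_mark="∼"):
    """Scan one verse given as a list of (text, morae) prosodic words.

    Returns (rendered_verse, kind), where kind is "running" or "halted".
    """
    if not words:
        raise ValueError("empty verse")
    rests_after = [0] * len(words)
    beat = 1                              # step 1: number the beats
    for i, (_, morae) in enumerate(words):
        if beat % 2 == 0:                 # step 2: word would start off-beat
            rests_after[i - 1] += 1       # obligatory rest before it
            beat += 1
        beat += morae
    used = beat - 1
    if used > total_beats:
        # Assumption: the summary does not say what happens here.
        raise ValueError("verse does not fit in %d beats" % total_beats)
    # Step 3: classify before any free rests are added.
    kind = "halted" if any(rests_after) else "running"
    # Step 4: free rests; trailing rests never move a word start.
    rests_after[-1] += total_beats - used
    text = " ".join(w + rest_mark * r
                    for (w, _), r in zip(words, rests_after))
    return text, kind


print(scan_verse([("Iwa-ni", 3), ("shimiiru", 4)]))
print(scan_verse([("Hi-ga", 2), ("Hatsu-zora-zo", 5)]))
print(scan_verse([("Tsuyu-ni", 3), ("nure-tsutsu", 4)]))
```

Putting the free rests at the end keeps the sketch short; for a 5-verse like Furu Ike-ya you could equally move a pair of them into the middle, since shifting a word by two beats keeps it on-beat.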
I hope this helps with your metrical studies somehow, and helps you locate the most relevant parts of Kawamoto’s book if you feel it to be worth the while.
With love & yours, || always.