Glyph of the word 'noala'.


  • (v.) to sing (archaic meaning to make a noise)
  • (n.) song
  • (n.) singing
  • (adj.) singing
  • (n.) sound (technical)

Noala ei tie kepo iviki.
“I sing the body electric.”

Notes: This is the first line of one of the sections of Walt Whitman’s Leaves of Grass. The syntax in English is a bit odd, and doesn’t make immediate sense unless you recognize its ancestry in Latin verse (cf. the first line of Virgil’s Aeneid, Arma virumque cano, “Arms and the man I sing”, where both “arms” and “man” are in the accusative). In Latin, certainly, this meant something (the subject of one’s song could serve as the direct object); in English, that’s not as certain. Perhaps if it were translated for sense, it would come out to something like “I sing about the body electric”, but that doesn’t really have the same ring…

Anyway, in Kamakawi, I figured that since it’s a matter of an extra letter (not even an extra syllable), ti would make more sense here than the direct object marker i. It’s an instrumental, true, but it also marks sources frequently, and that’s what I think of when I read this: the cause of the poem is the body, the source of its inspiration. Ti, then, would seem most appropriate.

But back to the point of this entry: a commenter asked yesterday what the overall letter frequency of Kamakawi was. I didn’t have this information immediately available, as Jim Henry, who did the initial analysis, focused on syllables rather than letters. This makes sense for Kamakawi, whose base unit is really the syllable (there aren’t even any codas), but it meant I had to work a bit to get the letter frequency. Nevertheless, I have done so, and I’ve presented the results below (which, I have to admit, took me quite by surprise!):

Rank  Phoneme  % of Letters
1     A        14.88%
2     I/Y      13.76%
3     E        12.41%
4     O         9.26%
5     U/W       9.10%
6     L         7.82%
7     K         7.75%
8     M         5.75%
9     T         4.47%
10    N         3.86%
11    H/’       3.80%
12    P         3.74%
13    F/V       3.40%

Okay, before I comment, a couple of explanations first. In order for the count to work, Jim redid the intervocalic sound changes (so all instances of V became F and all instances of ’ became H). He kept, however, both Y and W as phonemes. I think this is probably something one ought to do, since there are minimal pairs, but even those minimal pairs aren’t straightforward (the ones that don’t have the glide are often morphologically complex). In the overall syllable count statistics he came up with, then, I counted all occurrences of Y and W as I and U respectively. That, I think, gives a more accurate count.
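For anyone who wants to run this kind of count on their own text, here is a minimal sketch of the merging described above. It is not the script Jim or I used: the function name, the MERGE table, and the choice to count straight from running text (rather than from syllable statistics) are all assumptions made for illustration.

```python
from collections import Counter

# Assumed merges, taken from the explanation above: V counts as F,
# the glottal stop (’ or ') as H, and the glides Y and W as I and U.
MERGE = {"v": "f", "’": "h", "'": "h", "y": "i", "w": "u"}

def letter_frequencies(text):
    """Return {letter: percentage of all counted letters}, most common first."""
    counts = Counter()
    for ch in text.lower():
        ch = MERGE.get(ch, ch)   # fold merged letters together
        if ch.isalpha():         # skip spaces and punctuation
            counts[ch] += 1
    total = sum(counts.values())
    return {letter: 100 * n / total for letter, n in counts.most_common()}

# Sample: the example sentence from the top of this entry.
sample = "Noala ei tie kepo iviki."
for letter, pct in letter_frequencies(sample).items():
    print(f"{letter.upper()} {pct:.2f}%")
```

One sentence is far too small a sample to say anything about the language; it only shows the mechanics of the merge.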

Also, when I was doing the stats yesterday, the numbers didn’t add up to 100%. I had to do a little tweaking to get it to work, so I thought that, to avoid that today, I’d take the numbers out to two decimal places. And wouldn’t you know it! The count added up to 100.58%! How does that happen?! So what I did was I subtracted six hundredths from the top six, and then subtracted four hundredths from each row. That probably won’t affect the percentages too much; it’s still fairly accurate…
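The arithmetic can be checked with a short sketch. It assumes one particular reading of the adjustment: four hundredths off each of the thirteen rows (0.52), plus six hundredths spread over the top six rows at one hundredth apiece (0.06), for 0.58 in total. The "restored" values it computes are a reconstruction under that assumption, not figures taken from the original analysis.

```python
# Published percentages from the table, stored as integer hundredths
# of a percent so the sums are exact (no floating-point drift).
published = {
    "A": 1488, "I/Y": 1376, "E": 1241, "O": 926, "U/W": 910,
    "L": 782, "K": 775, "M": 575, "T": 447, "N": 386,
    "H/’": 380, "P": 374, "F/V": 340,
}

def undo_adjustment(table):
    """Add back the assumed adjustment: 0.04 on every row, plus an
    extra 0.01 on each of the top six rows (hypothetical reading)."""
    restored = {}
    for rank, (phoneme, value) in enumerate(table.items(), start=1):
        restored[phoneme] = value + 4 + (1 if rank <= 6 else 0)
    return restored

restored = undo_adjustment(published)
print(sum(published.values()) / 100)  # 100.0
print(sum(restored.values()) / 100)   # 100.58
```

Under this reading the published column sums to exactly 100.00% and the restored one to 100.58%, which matches the overshoot mentioned above.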

Okay, now for the results! First, no one should be surprised that the first five spots are taken by the five vowels. Kamakawi is a language that allows one onset consonant maximum, and no codas. This means that every syllable in Kamakawi will have at least one vowel but at most one consonant. Since there are tons of syllables that have no consonants at all, it’s no shock that the vowels dominate the count. I’m also not surprised to see A on top of the list. Even with the prevalence of words beginning with the i- prefix, if one looks at all environments, I think there’s no question that A would win out.
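The syllable shape described here (at most one onset consonant, exactly one vowel, no coda) can be sketched as a simple pattern. This is an illustration only: the consonant list is read off the frequency table, each vowel letter is treated as its own syllable, and the function name is made up, so it may not match how Kamakawi syllables are actually analyzed.

```python
import re

VOWELS = "aeiou"
# Assumed consonant inventory, read off the frequency table above.
CONSONANTS = "fhklmnptvwy’"

# One syllable: an optional single onset consonant, then one vowel.
SYLLABLE = re.compile(f"[{CONSONANTS}]?[{VOWELS}]")

def syllabify(word):
    """Split a word into (C)V syllables; raise if it does not fit."""
    word = word.lower()
    syllables = SYLLABLE.findall(word)
    # If anything was skipped (a coda, a cluster, an unknown letter),
    # the pieces will not join back into the original word.
    if "".join(syllables) != word:
        raise ValueError(f"{word!r} does not fit the (C)V pattern")
    return syllables

for word in "noala ei tie kepo iviki".split():
    print(word, syllabify(word))
```

Because every syllable in this pattern carries a vowel and at most one consonant, a word can never contain more consonants than vowels, which is why the vowels crowd the top of the table.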

I have to tell you, though, what absolutely shocked me was that L beat out K as the top consonant. If you had asked me yesterday what the most common consonant was in Kamakawi, I would’ve responded K, and would have done so immediately. I can’t believe L won out! It must be all the causative suffixes…

What I found next most surprising is how little there is of T and N. Those are two of the most common consonants in English, and with good reason (the alveolar ridge is the richest place of articulation in the human mouth, and its consonants are favored the world over, especially in inflectional morphology). Oh well. Perhaps it’s the Hawaiian influence creeping in (for T, of course, not for N).

I’m not surprised to see F/V at the bottom. It’s certainly the least common sound in Kamakawi. But I am surprised that H/’ wasn’t higher, and that P is so low. If I were to put it ahead of anything, I would’ve put it ahead of M, but take a look at M, up there as the third most common consonant! That I never would have guessed.

So, there you have it! The phoneme frequencies of Kamakawi. Again, this is based on an older form of the lexicon, so it’s not 100% up-to-date, but I expect that, for the most part, this pattern will hold.

2 Responses to “Noala”

  1. Ka kavaka Ember Nickel ti:

    Thank you!

    My interest in letter frequencies derives from my interest in lipograms, texts written (in English/French, anyway) without a particular letter of the alphabet–we need letter frequencies to figure out what the real challenges are. Upon looking at the relevant Wikipedia article, however, I see that different languages do it differently; Japanese lipograms are done on the syllabic level. Is that how Kamakawi would do it? ;)

    By the way, I haven’t computed any statistics for my language, but looking at the verb conjugations I’m guessing the hardest lipogram would be without the letter “o” (which all but forces a second-person story). However, I think I use the letter “o” to represent the same sound you and Rejistania mean by “a”, so phonetically we might all have the same hardest lipogram deep down!

  2. Ka kavaka David J. Peterson ti:

    Ah… For a lipogram, I’m fairly certain Kamakawi would use the syllable. If so, i would be quite complicated, if you did it by the sound (not the iku), since it’s used commonly as a syllable, but also the direct object marker. That’s a lot of passives!

