Glyph of the word 'fatu'.


  • (n.) number
  • (v.) to count
  • (v.) to number
  • (adj.) orderly, in order
  • (adj.) obedient

A fatu ue!
“Let’s count!”

Notes: A while back, a commenter posted a short comment about how conlangs ought not to have the same number of words beginning with each letter/phoneme in their inventory. This was when I had pointed out that there weren’t enough l words in the Word of the Day. I pointed out then that, since I select which words to do, the Word of the Day words are not a random sampling of Kamakawi words, but that got me thinking: Just how close are the counts?

Counting today’s f word, here’s the percentage breakdown for the initial phonemes of the Words of the Day so far:

Rank  Initial Phoneme  # of Posts  % of Posts
1     T                25          12%
2     H                24          12%
3     K                21          10%
4     I                19          9%
5     P                17          8%
6     O                16          8%
7     F                14          7%
7     M                14          7%
9     N                13          6%
10    L                12          6%
11    E                11          5%
12    A                10          5%
12    U                10          5%
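
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the percentage column from the post counts in the table. The variable and function names are just illustrative choices, and the rounding to whole percents is an assumption about how the column was produced.

```python
# Post counts per initial phoneme, copied from the table above.
post_counts = {
    "T": 25, "H": 24, "K": 21, "I": 19, "P": 17, "O": 16, "F": 14,
    "M": 14, "N": 13, "L": 12, "E": 11, "A": 10, "U": 10,
}

total_posts = sum(post_counts.values())


def percent_of_posts(count, total):
    """Share of all posts, rounded to the nearest whole percent (assumed rounding)."""
    return round(100 * count / total)


post_percents = {
    phoneme: percent_of_posts(count, total_posts)
    for phoneme, count in post_counts.items()
}

# Print the breakdown from most to least frequent.
for phoneme, count in sorted(post_counts.items(), key=lambda kv: -kv[1]):
    print(f"{phoneme}: {count} posts, {post_percents[phoneme]}%")
```

The counts add up to 206 posts, and the rounded shares agree with the percentages listed in the table.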

That’s the Word of the Day breakdown. Now let’s compare that to the actual breakdown in Kamakawi.

To do that, I’m going to make use of a statistical analysis conducted by the great conlanger Jim Henry a year or so ago (two years? Can’t remember). Jim created a Perl script which he ran on a modified version of my Kamakawi dictionary (he stripped out all the definitions, leaving just the words). The script separated the entire list into syllables and counted initial, medial, final and total syllables. Though the lexicon has since expanded, I think it’s a fair representation of just how frequently a given syllable is used in Kamakawi, and in which position.
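
To give a rough idea of what such a count involves, here is a minimal Python sketch. It is not Jim's Perl script, which isn't reproduced here. It assumes every syllable is an optional consonant followed by a single vowel, which is a simplification and may not cover everything in the real lexicon, and the short word list is only there for illustration.

```python
from collections import Counter

VOWELS = set("aeiou")


def split_syllables(word):
    """Split a word into syllables, ending each syllable at a vowel.

    Assumption: every syllable is (C)V, an optional consonant plus one vowel.
    """
    syllables, current = [], ""
    for letter in word.lower():
        current += letter
        if letter in VOWELS:
            syllables.append(current)
            current = ""
    if current:  # leftover consonants; should not occur under the (C)V assumption
        syllables.append(current)
    return syllables


def count_positions(words):
    """Count syllables by position: initial, medial, final and total."""
    counts = {name: Counter() for name in ("initial", "medial", "final", "total")}
    for word in words:
        syllables = split_syllables(word)
        if not syllables:
            continue
        counts["total"].update(syllables)
        counts["initial"][syllables[0]] += 1
        # Design choice: a one-syllable word counts as both initial and final.
        counts["final"][syllables[-1]] += 1
        counts["medial"].update(syllables[1:-1])
    return counts


# Tiny illustrative word list (not the real dictionary).
words = ["fatu", "ape", "iku", "ue", "kamakawi"]
counts = count_positions(words)
print(counts["initial"])
print(counts["total"])
```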

In order to get the initial phonemes, I took his count of the initial syllables of Kamakawi and added all the like CV forms together. Then I did a little math and came up with the percentages for initial phonemes in Kamakawi. Here are the results:

Rank  Initial Phoneme  % of Words
1     I                16%
2     K                11%
3     H                9%
4     T                8.6%
5     N                8%
6     M                7.6%
7     F                7%
8     P                7%
9     L                6.6%
10    E                5%
10    O                5%
10    U                5%
13    A                4%

Quick note on the above: F and P have pretty much the same percentage, but there are two more F words than P words, so I didn’t list them as tied. Oddly enough, though, there are exactly the same number of words starting with E, O and U (or, rather, there were at the time that Jim ran these statistics. I’m sure that’s no longer the case).
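
The step of adding the like forms together and turning the totals into percentages can be sketched in Python as follows. The function names are hypothetical, and the sample counts are placeholder numbers invented purely for illustration; they are not Jim's figures.

```python
from collections import Counter


def initial_phoneme_counts(initial_syllable_counts):
    """Add together all initial syllables that share a first phoneme.

    For example, ka, ke and ki all count toward K; a bare vowel counts toward itself.
    """
    totals = Counter()
    for syllable, count in initial_syllable_counts.items():
        totals[syllable[0].upper()] += count
    return totals


def as_percentages(totals):
    """Convert raw counts to percentages of the grand total."""
    grand_total = sum(totals.values())
    return {phoneme: 100 * count / grand_total for phoneme, count in totals.items()}


# Placeholder numbers, invented for illustration only (NOT the real data).
sample = {"ka": 30, "ke": 10, "ki": 20, "i": 25, "fa": 15}

totals = initial_phoneme_counts(sample)
percents = as_percentages(totals)

for phoneme, pct in sorted(percents.items(), key=lambda kv: -kv[1]):
    print(f"{phoneme}: {pct:.1f}%")
```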

As you can see, the percentages are sometimes close (F, E and U match), but the Word of the Day posts are not an accurate sample overall: T and H are overrepresented, and I is badly underrepresented. Also, you can see by the real count that I words blow all the rest out of the water. That’s due in large part to the i- prefix, which enjoys a lot of use. If you stripped those out, K would be the winner, which isn’t surprising (or, at least, not to me, the one who coins the words).
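
To see the gaps side by side, here is a small Python sketch that subtracts the lexicon percentage from the Word of the Day percentage for each initial phoneme. The figures are the rounded ones from the two tables, so the differences are approximate; the variable names are just illustrative.

```python
# Word of the Day percentages, from the first table.
wotd = {
    "T": 12, "H": 12, "K": 10, "I": 9, "P": 8, "O": 8, "F": 7,
    "M": 7, "N": 6, "L": 6, "E": 5, "A": 5, "U": 5,
}

# Lexicon percentages, from the second table.
lexicon = {
    "I": 16, "K": 11, "H": 9, "T": 8.6, "N": 8, "M": 7.6, "F": 7,
    "P": 7, "L": 6.6, "E": 5, "O": 5, "U": 5, "A": 4,
}

# Positive gap: overrepresented in the posts; negative: underrepresented.
gaps = {phoneme: round(wotd[phoneme] - lexicon[phoneme], 1) for phoneme in wotd}

# List the phonemes from the largest mismatch to the smallest.
for phoneme, gap in sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{phoneme}: {gap:+.1f} percentage points")
```

With these figures the largest mismatch is I, which sits 7 points below its share of the lexicon.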

Setting I aside, though, I realized that it shouldn’t be surprising that the vowel-initial words come in last. Vowel-initial words can be thought of as, essentially, beginning with an empty consonant. If you added them all together, then, you’d get a count much like the other consonants, where, with K, for example, you get every word that starts with ka, ke, ki, ko and ku all together.
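
As a rough illustration of adding the vowel-initial words together, this Python sketch sums the lexicon percentages for A, E, I, O and U as though they shared one empty initial consonant. It uses the rounded percentages from the lexicon table, and the I figure still includes the i- prefix words, so the total is only approximate.

```python
# Lexicon percentages by initial phoneme, from the second table.
lexicon = {
    "I": 16, "K": 11, "H": 9, "T": 8.6, "N": 8, "M": 7.6, "F": 7,
    "P": 7, "L": 6.6, "E": 5, "O": 5, "U": 5, "A": 4,
}

VOWELS = "AEIOU"

# Treat all vowel-initial words as beginning with one "empty consonant".
vowel_initial = sum(lexicon[v] for v in VOWELS)
consonant_initial = sum(pct for phoneme, pct in lexicon.items() if phoneme not in VOWELS)

print(f"Vowel-initial (empty consonant): {vowel_initial}%")
print(f"Consonant-initial: {consonant_initial:.1f}%")
```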

A small note about the iku here. This is essentially the Kamakawi equivalent of the pound (#) sign. It just means “number”. You may recognize this iku from the entry for ape, “one”. All the number glyphs are shapes traced from the original number system, which was just a series of dots. Since one dot is too small for a character, a short stroke (or dot above, originally) was added to the glyph for fatu, and that’s what became the iku for “one”. Basically, it reads as if it were “number one”.

All right, now to start on something exciting for the next week or so!


4 Responses to “Fatu”

  1. Ka kavaka Rejistania ti:

    That is interesting… I should check the same for Rejistanian stems.

  2. Ka kavaka Ember Nickel ti:

    I posted this on Rejistania’s blog, but I’m curious about Kamakawi too. Do you happen to know what is the most common letter, not just to begin words but in all of them? (Or, alternatively, which letter appears in the highest percentage of words?) English letter frequencies interest me, but I don’t think my conlang is developed enough to get any good data yet.

  3. Ka kavaka David J. Peterson ti:

    Since Kamakawi’s base unit is the syllable, Jim did an analysis on syllables, not letters. However, I used his overall syllable analysis to do a letter analysis. The results surprised me! But, like the true tease I am, I will post those results tomorrow… ;)

  4. Ka kavaka David J. Peterson ti:

    Here’s a link to a similar count for Rejistanian. :)
