User:Soap/triplets

Here I list words with three of the same consonant in a row, like mamamahayag and pirarara. Many languages have geminate consonants, which in many ways behave like single consonants. I will consider these to be single instances, and only consider consonants to be separate instances if they are separated by the physical tongue and lip movement of an intervening vowel. Therefore Finnish totta counts as two ts, while Finnish totota counts as three.
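The counting rule above can be roughed out in code. This is only a sketch under loud assumptions: it works on spelling, so it assumes one letter per phoneme and that a doubled letter marks a geminate (close enough for Finnish, not for English), and the function name and default vowel set are my own inventions for illustration.

```python
# Hypothetical helper: a spelling-based approximation of the counting rule.
# Assumes one letter = one phoneme and a doubled letter = one geminate.
DEFAULT_VOWELS = set("aeiouyäöõü")  # assumed vowel letters; adjust per language

def longest_consonant_run(word, vowels=DEFAULT_VOWELS):
    """Return (consonant, count) for the longest run of one consonant
    whose instances are each separated by at least one vowel."""
    best = ("", 0)
    current, count = "", 0
    prev = ""
    for ch in word.lower():
        if ch in vowels:
            prev = ch
            continue
        if ch == current:
            # Same consonant again: it is a new instance only if a vowel
            # intervened; otherwise it is the second half of a geminate.
            if prev in vowels:
                count += 1
        else:
            # A different consonant (or a space) starts a new run.
            current, count = ch, 1
        if count > best[1]:
            best = (current, count)
        prev = ch
    return best

print(longest_consonant_run("totta"))        # ('t', 2)
print(longest_consonant_run("totota"))       # ('t', 3)
print(longest_consonant_run("mamamahayag"))  # ('m', 3)
print(longest_consonant_run("pirarara"))     # ('r', 3)
```

Spaces are treated like any other non-vowel, so they break a run; phrases would need their spaces stripped first. The sketch knows nothing about reduplication or etymology, so it can only suggest candidates, not grade them.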

I expect languages with small consonant inventories to dominate, with Finnish taking a clear lead over everything else. Finnish also has particles -ko and -pa, which could potentially make its listed words even longer. However, Rotokas might be the greatest performer of all, since it has quite a few entries on this list despite having a much smaller corpus than the other languages.

Scope
Strict rules prevail: reduplication is not allowed, nor are contrived compounds with no practical use. For example, popup is certainly a word in English even though it's a compound, but *poppy plant is not commonly used and does not count as an instance of 3 /p/'s.

B
The word bibimbap comes close to qualifying for this list, and without its /m/ it would arguably have four /b/'s in a row, since there are no final voiced stops in Korean and the final voiceless stops most closely resemble the voiced (tenuis) stops in pronunciation. (This sound is sometimes spelled p in phonetic transcriptions because the voicing is allophonic.) It is possible that 밥 "rice" is actually a direct parallel of English pap meaning baby food, but even if this is discounted the initial consonant would still count as a /b/. The stem of the verb appears to just be bibi-, but I don't think there's any natural Korean expression that would get all the consonants together.

Greek

 * μπίμπιμπαπ, meaning bibimbap, if we accept /μπ/ as a single consonant rather than a cluster.

Translingual

 * baobab, no reduplication involved

Korean

 * 비법, three /b/'s or three /p/'s, depending on analysis.

Spanish

 * ababábite

Tagalog

 * bababa, an inflected form of bumaba. This can then be extended with other inflections to produce forms such as pagpapabababaan and pinagpapabababaan.

D

English
This would have more entries were I not so reluctant to include triplets based on morphology. But here are some examples:
 * deeded
 * did do is etymologically reduplication, but surely not understood as such by modern speakers.
 * A sentence like Eddy'd edited it already for some speakers of English contains five alveolar flaps in a row, then an R and then another flap.

Spanish

 * dedada

G

Algonquian
The Loup language, possibly Nipmuc, provides us the name Chargoggagoggmanchoggagogg (and sometimes even longer forms are seen). The language is extinct, and it is a matter of debate whether these /g/'s should actually be analyzed as /k/'s. There is no /r/ in the Wikipedia writeup for the language, just as most Algonquian languages don't have it, so that is probably a nonrhotic representation of /a/. Debate remains about the authenticity of the name, but the spelling above was in use even in the late 1700s before tourism provided an incentive to further exaggerate the length of the name.

Rotokas

 * gogagare, a sinkhole; a hole in the ground

K

English

 * cook a cookie. The two words are not cognates, and this is at least a believable phrase, although it is much more common to hear people say bake a cookie instead. *cook a cake will not work because this time the words really are cognates.
 * cookie cutter if the above is not tightly bound enough
 * cachexia, although this contains the root κακός which could be ultimately onomatopoeia

Finnish

 * kaukokatseinen, possibly shorter forms
 * kaukokäsi
 * kaukokaipuu
 * kukikas
 * jälkikaiku, likely an ideophone, but at a very deep level. Even so, this may disqualify any word with kaiku.
 * keskikoko
 * silmukkakoukku
 * kiekkokurpitsa

A translative affix -ksi can add a fourth consecutive /k/ to some of these words.

An oft-repeated Finnish sentence is Kokoa koko kokko kokoon. It translates to "Put together the whole bonfire" and is usually seen as part of a longer tongue-twister. At first blush this sentence seems far superior to everything on the list. Nonetheless, it may still need to share its prize, as it appears that three of the four words in the sentence are derived from the same root koko, and the strictest possible rules would dictate that only koko kokko is valid as a set phrase. To this, I could add the question particle -ko and arrive at five /k/'s in a row with no other intervening consonants: Koko kokkoko? This sentence would be "A full bonfire?" There are most likely other nouns with two /k/'s, but the tongue twister focused on also getting the vowels to line up.

Hawaiian
Possibly kokoke unless there is reduplication

Japanese
Many Japanese words with /k/, especially those with /k/ in consecutive syllables, are derived from Chinese, where at one point various Chinese dorsal sounds were all represented as /k/ in Japanese. Thus one hears jiang ge in modern Mandarin, but the word is pronounced /kōkaku/ in modern Japanese.


 * 降格 (kōkaku); this word seems to have two additional homonyms
 * 計画 (keikaku "plan"); well known from a particular anime translation
 * 国益 (kokueki)

国語学 has been used in text to mean Japanese language studies; it is pronounced /kokugogaku/, with five velars in a row, but there is most likely no stage of the Japanese language in which they would have all been /k/ without also having other consonants. Two Japanese linguists (father and son) used this word as an example of how they disliked the sound of their native language, and the word kokugogaku (in Roman letters) appears in Haruhiko's Wikipedia article today.

Rotokas

 * kaakaoko, a type of beetle.
 * kokokoru, flowerbud. Possible reduplication, since kokoru also means flowerbud and koru by itself might mean unripe fruit.
 * kukuku, headwaters of a river.

L

Common Southern Bantu

 * lalela
 * halalela, possibly related to above

English

 * calla lily

Finnish
The above words show that the Finnish words are not just dependent on a single root. /lello/ might be a word for "game, toy" in Votic, which would then take case markers beginning with /l/. However, the transcription seems to mix Cyrillic and Roman letters, so there may be two phonemes.
 * joululaululla
 * laulelolla
 * luolalentely
 * leluliike

Hawaiian
ʻiole liʻiliʻi though it deserves two cautions, one for having reduplication and one for having glottal stops. The glottal stop can contrast with hiatus.

Nahuatl

 * tlazōlolōlōlōni

Semai

 * lllaal "is sticking out their tongue", per w:Aslian_languages. Appears to be three phonemic /l/'s in a row, though I'm not sure how it's pronounced as a surface realization

M

English

 * mimeme

Greek

 * μιμέομαι, the same root as mimeme

Hebrew
I have read that ממון can take a prefix mi-, which would mean three /m/'s in a row

Tagalog

 * mamamahayag

N

English

 * Nenana (Native American placename, Alaska). This word is from the Lower Tanana language, related to Navajo, which has a large consonant inventory but where n, the only nasal consonant, is much more common than the many stops in the inventory.
 * Nonantum (Native American placename, Massachusetts)
 * nonanone, a chemical name. I found this while checking to see if nonanonymous was a word. This could also count, but I exclude it because it begins with two negative prefixes that resemble each other because they are cognate to each other. The little-used word onymous also exists.
 * nonunion. None of the morphemes are cognate to each other.

Finnish

 * enennen, the 1st person potential mood of the verb for "increase". A tongue twister page spells it as enenenen, adding yet another syllable, and it is possible that they are right; their word could be a different form of the verb that merely isn't in common enough use to get its own listing.
 * banaaneineen, and possibly banaaninen "banana-like" if that type of word formation is standard

German

 * Bananen

Gilbertese

 * teinainano

Malagasy

 * fanonanana, meaning pronunciation. There is also fanononana, meaning unclear but presumably related

P
Many terms in this section are likely sound symbolism (consider the bouba/kiki effect), but I still place sound symbolism on a higher tier than reduplication so long as they represent single sounds and not repeated sounds. Maori has pēpepe as one word for butterfly, and I don't know if it uses reduplication or not.

Amis

 * papipacora'en, possibly with a verbal infix since this is an Austronesian language. There is no simpler form listed here as of yet

English

 * pupiparous
 * paper plate (nonrhotic)
 * copy paper
 * lipopeptide
 * honorable mention for popup hamper (four in a row but an /m/ comes between; three if deducting for sound symbolism)

Esperanto

 * opipapavo, the opium poppy. There is a variant spelling, opiopapavo, which is permissible according to the rules I've read, and is in fact apparently more common.

Finnish

 * papupata, papupurkki, hyppypapu, and siipipapu all involving beans. The redlink means "can of beans"


 * Honorable mention for pilppupeli since Finnish sonorant codas in some ways resemble the liquid diphthongs of Slavic
 * Possibly pippaa päässä

Hawaiian

 * pāpapa, "bean", possibly a loanword related to fava.

Italian

 * capopopolo
 * appoppare, which has only 2 /p/ per my criteria but produces inflected forms that end in a vowel and could be followed by another /p/.

Khmer
The name of the World Health Organization in Khmer is អង្គការសុខភាពពិភពលោក, which is pronounced /ɑŋ kaː so.kʰaʔ.pʰiəp pi.pʰup loːk/. Arguably this contains 4 or even 5 consecutive /p/'s, but Khmer /ph/ is usually analyzed as a consonant cluster, and if this analysis is rejected, it usually means that /p/ and /pʰ/ are being analyzed as different phonemes.

Korean
Likewise, does not contain more than 2 of any one consonant, but  it does have 4 syllables in a row beginning with bilabial stops.

Rotokas

 * papapa, that which flies. From papa + -pa.
 * Pipipaia, a placename.
 * pupupu, cotton and similar plant fibers.
 * upiapiepaiveira; a shorter word may be available.

Telugu

 * . Unrelated to Finnish.

Thai

 * พิภพ, pronounced /pʰípʰóp/. A direct cognate of the Khmer word above.  Since there are no word-final aspirated stops, I would consider this to be three of the same phoneme in a row.

Tupi

 * Old Tupi (and possibly modern) implies three /p/ in a row for one form of the verb pererek, unless the language is pro-drop.

Wauja
The phrase kapaipiyapai ipitsi (4 /p/ in a row if treating  as part of a diphthong) appears on the meyeixapai page.

Unknown language
This document claims that pepapa ~ pepapo ~ pepape, along with some longer forms like pampapombe, are all verbal forms in some language of SE Asia, possibly w:Pamona language.

R
This section is hard to search for because of frequent use of /r/ in morphology.

English

 * roarer. While there are other words, like rarer, rearer, and perhaps longer ones, roarer has lexicalized beyond just being a suffix attached to a base word.

French

 * serrurerie

Italian

 * fuoriorario


 * /ririri/ occurs as the form for ridere in some southern Italian dialects

Japanese

 * 変えられる is pronounced kaerareru, and since -reru is a verb ending, there are words with 4 /r/'s in a row such as 降りられる, /orirareru/; the passive of this verb is perhaps about as common as expressions like "paper plate" in English.

Portuguese

 * pirarara, a large catfish.

Rotokas

 * urauraaro

Spanish

 * tararira, another fish

S

Finnish

 * sisäsiisti
 * kahdeksasosa, one eighth
 * varallisuusasema (possibly shorter words exist)
 * seisoisi
 * luksusasunto
 * kurpitsasose

Spanish

 * desasociar. This list is intended to be by sound and not by spelling, so any /s/ counts. It can have inflected forms like desasociases with 5 consecutive /s/'s.
 * desasosegar, from des- + sosegar with a linking /a/; also inflects as above
 * desasieses

T
This section is difficult to search for. Japanese may have compounds involving /tatsu/ or /tachi/, while English words are dominated by morphology.

English
Examples abound because of morphological affixes: substitute is just one of many such words, and statutory is another. Content-only words seem hard to come by. teetotal is reduplication.


 * tater tot (nonrhotic)

Finnish

 * tähtitieteitten, with 4 /t/-onset syllables in a row, plus a fifth behind an /h/ and a geminate at the end. Probably unbeatable
 * totota
 * tartuntatauti, three /t/'s in a row and two more in earlier onsets

Greek

 * ᾰ̓ᾱᾰτώτᾰτος three /a/'s and three /t/'s

Italian

 * zozzezza, with three /ts/ clusters in a row. I would say this should count as a unitary phoneme

V

English

 * ovoviviparous

Finnish

 * tavuviiva

Rotokas

 * vavavu, having a harsh taste. Likely related to vavuvira.
 * vovavae, five (cardinal).
 * vuvuva, firelight.

Glottal stop
See also below.

Hawaiian

 * ʻōʻōʻāʻā, a bird. The English entry is all we have right now. Apparently this combines both reduplication and sound symbolism, but I still think it deserves mention.

Hiatus
In theory, the null consonant Ø could be considered here, as it counts for alliteration purposes in English poetry. Just as geminate consonants are common, so too are long vowels. It is difficult sometimes to draw the line between a vowel sequence, a long vowel, and a diphthong. For example, Finnish long vowels are almost never considered to be sequences of two short vowels, yet Finnish diphthongs often are, even though many of them arose from historical long vowels. Thus riiuuyöaie could be counted as having five, seven, or nine vowels. Moreover, some languages have triple length vowels. I will leave this section ungraded as to number.
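To make the five/seven/nine arithmetic for riiuuyöaie concrete, here is a small sketch. It assumes the standard list of eighteen Finnish diphthongs and a naive left-to-right pairing of vowels; the function name and the greedy pairing are my own simplifications for illustration, not a real syllabifier.

```python
import re
import unicodedata

# Assumed inventory: the eighteen diphthongs usually listed for Finnish.
FINNISH_DIPHTHONGS = {
    "ai", "ei", "oi", "ui", "yi", "äi", "öi",
    "au", "eu", "iu", "ou",
    "ey", "iy", "äy", "öy",
    "ie", "uo", "yö",
}

def count_vowels(word):
    """Return three counts for the vowels of a Finnish word:
    (every letter, long vowels merged, long vowels and diphthongs merged)."""
    word = unicodedata.normalize("NFC", word.lower())
    runs = re.findall(r"[aeiouyäö]+", word)  # maximal vowel sequences
    letters = merged_long = nuclei = 0
    for run in runs:
        letters += len(run)
        # A repeated letter continues a long vowel instead of starting a new one.
        merged_long += sum(1 for i, c in enumerate(run) if i == 0 or c != run[i - 1])
        # Greedy pairing: consume a long vowel or a diphthong as one unit.
        i = 0
        while i < len(run):
            pair = run[i:i + 2]
            if len(pair) == 2 and (pair[0] == pair[1] or pair in FINNISH_DIPHTHONGS):
                i += 2
            else:
                i += 1
            nuclei += 1
    return letters, merged_long, nuclei

print(count_vowels("riiuuyöaie"))  # (9, 7, 5)
```

The greedy pairing happens to give the intended split ii-uu-yö-ai-e here, but a different word could pair up differently under a real syllabification, which is part of why this section stays ungraded.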

Note that many Polynesian words that appear to have long vowel sequences may actually have glottal stops. For example, the town of Kaaawa, Hawaii is actually Kaʻaʻawa and Kalaeoio is Kalaeʻōʻio. It is possible that Rotokas' vowel sequences are real, both in the sense that they have no glottal stops and that the /i/ and /u/ do not reduce to glides.

Blackfoot

 * aaápan though it is possible that /aa/ would be considered a single long vowel rather than a sequence
 * aaattsistaotsipiiis, rabbitbrush plant. This language may have many such words.

Dharug

 * guuu-wi, source of English cooee

Estonian

 * õueaiaäär, the edge of a fence surrounding a yard. From õu + aia + äär. Not a likely word to come up in conversation, but I wanted to post it here since it appears in a print book with incorrect spelling and no diacritics. There is also a grammatically and semantically valid sentence ao äia õe uue oaõieaia õueaua ööau.

Finnish

 * riiuuyöaie
 * hääyöaie. Someone has taken this even further with hääyöaieuutinen

Ancient Greek

 * τῷ οἰηΐῳ  possibly as many as seven vowels in a row (note the subscript iotas).
 * Αἰαία, source of English Aeaea

Italian
Some people might analyze /j/ and /w/ as consonants.
 * cuoiaio
 * ghiaiaiuolo

Japanese

 * -をお送りします begins with three short /o/'s in a row, and most often follows a word ending in a vowel, which may provide yet another /o/.
 * 鳳凰を追おう, pronounced /hōō o oō/, meaning "let's chase the fenghuang". This is due partly to the many loans from Chinese which merge various rimes into /ō/, and partly to loss of /p/ in native Japanese words.
 * 東欧を覆おう, pronounced /tōō o ōō/, meaning "let's cover Eastern Europe".

Ojibwe
Honorable mention for wiijiiw, a palindrome with five dotted letters in a row, though it is a /dʒ/ in the middle, so this is not hiatus by even the loosest definition of the term (some people would accept /i:ji:/ as a sequence of two /i:/, for example).

Rotokas

 * oaaoa, family

Tupí

 * 'ybaaîa, any citrus fruit

Yoruba

 * possibly analyzable as a long-short vowel sequence

Repeated vowels across consonants
This is just a loose list without strict criteria. Some languages have vowel harmony, for example.


 * Gk Nephelegeretes, six /e/'s in a row, used in poetry

English consonant shells

 * The consonant shell /k_k/ may have the most possible values for the medial vowel among all English words. Along with /p_p/ it can claim to have all of the possible vowel slots filled, but the /k/ setup has more total vowels because English has words ending in IPA /ʊk/ but not in /ʊp/. Thus there is a contrast between cook and kook, but poop stands alone.
 * As many of the /k/ words are quite rude, I shall type up the /p/ words instead:
 * Short vowels: pap, pep, pip, pop, pup. Long vowels: pape, peep, pipe, pope, poop, pupe. A counterpart to each of these words with /p/ replaced by /k/ exists, plus /k/ also has the word pair cook~kook, as above.

Check to see if there exist shells with higher numbers due to allowing diphthongs, e.g. town. No words end in /auk/ or /aup/ so I consider the /p/ and /k/ vowel slots filled.

Note that none of these words distinguish between COT and CAUGHT, because that is not a robust distinction and is merged in many dialects.

t_n:
 * tan, ten, tin, tawn, ton.  tain, teen, tine, tone, toon, tune.   others: town, *toin.  So t_n actually has more than p_p and is tied with k_k, but still does not count as "full" because there is no -oin word, and -oin is permissible so it can't be exempted.

k_n:
 * can, ken, kin, con, cun.  cane, keen, kine, cone, coon.  others: coin.  Tied with /p_p/ but still has gaps.

p_t:
 * pat, pet, pit, pot, putt.  pate, peat, pight, pote, poot, put.  others: pout.

m_s:
 * mass, mess, miss, moss, muss.  mace, *meese, mice, mos, moose, muce.  others: mouse
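The tallies for the shells listed so far can be checked mechanically. The sketch below just counts the words as typed above, leaving out the starred forms (*toin, *meese); the /k_k/ figure is not counted from a word list but derived from the statement that every /p_p/ word has a /k/ counterpart plus the extra cook~kook slot. The variable names are mine.

```python
# Word lists copied from the entries above; starred (nonexistent) forms omitted.
shells = {
    "p_p": ["pap", "pep", "pip", "pop", "pup",
            "pape", "peep", "pipe", "pope", "poop", "pupe"],
    "t_n": ["tan", "ten", "tin", "tawn", "ton",
            "tain", "teen", "tine", "tone", "toon", "tune", "town"],
    "k_n": ["can", "ken", "kin", "con", "cun",
            "cane", "keen", "kine", "cone", "coon", "coin"],
    "p_t": ["pat", "pet", "pit", "pot", "putt",
            "pate", "peat", "pight", "pote", "poot", "put", "pout"],
    "m_s": ["mass", "mess", "miss", "moss", "muss",
            "mace", "mice", "mos", "moose", "muce", "mouse"],
}

counts = {shell: len(words) for shell, words in shells.items()}
# Assumption taken from the text: /k_k/ matches /p_p/ slot for slot,
# plus one more vowel because of the cook~kook contrast.
counts["k_k"] = counts["p_p"] + 1

for shell, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
    print(shell, n)
```

Counting this way gives 12 for t_n, p_t, and k_k, and 11 for p_p, k_n, and m_s, which agrees with the ties described above. It says nothing about whether a shell is "full", since that depends on which vowel slots are permissible in the first place.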

One reason for the dominance of /p_p/ and /k_k/ may be sound symbolism. In fact it is possible that every single one of the /p/ words derives from either onomatopoeia or expressive language (e.g. baby talk) rather than having a conventional PIE etymology. Similarly, w_p has almost as full of a set as these others, but more than half are onomatopoeia.

However,  toit and tout are words, so maybe /t/ is the winner:
 * tat, tet, tit, tot (and taut), tut. Tate, teat, tight, tote, toot.  Consider also toot(sie).  Then toit and tout which are unreachable for /p/ and /k/.