Wiktionary talk:Votes/2014-01/Treatment of repeating letters and syllables

If a string happens to be both a valid, attested, non-repetitive/elongated word in one language and an attested, repetitive/elongated word in another language, I presume the entry will not be a hard redirect but will instead have two L2s with the second one being a soft-redirect. Is it worth spelling this out, or is it obvious? (Nothing is obvious to literal-minded readers of CFI.) It is not implausible that pairs of words like that exist; e.g. Estonian has and Elfdalian has, which could be expanded to  (although judging by a Google Books search, it hasn't been yet). - -sche (discuss) 21:12, 22 January 2014 (UTC)
 * I think it's obvious that if there is a word in another language that interferes with the hard redirect, then you can't have a hard redirect. --WikiTiki89 21:26, 22 January 2014 (UTC)
 * Yes, I agree that this is an obvious case. The proposal is intended to deal with the question of whether an attested string that is otherwise unused should exist at all. I suppose that with respect to the elongated form, you would also have a line for that sense. I think once we get past, say, four consecutive instances of the same vowel or syllable, the chances of such a coincidence become exceedingly remote. bd2412 T 23:13, 22 January 2014 (UTC)
 * Just to make it clear, I have added "no other meaning in any language" to the proposal. Cheers! bd2412 T 02:26, 23 January 2014 (UTC)

Maximum number of repetitions
As I said in the RFD discussion, I don't think this vote should set a hard maximum for the number of repetitions. For some words, it may make sense to have four repetitions, while for others it may not even make sense to have three. For example, for, four la's is very common and should probably be included. Whereas for "ooow", three o's is no different from two, "oow". The maximum should be decided on a case-by-case basis. --WikiTiki89 02:47, 23 January 2014 (UTC)
 * In languages that allow double letters, three letters might not be as common as four because it stands out more. I don't know how this is in Dutch, but it's worth looking into. 02:49, 23 January 2014 (UTC)
 * I did a quick Google search for "jaaaren", "jaaaaren", "jaaaaaren" (from ). Four a's is most common. 02:51, 23 January 2014 (UTC)
 * I would suggest that "oow" is far more likely to coincide with either an actual word in another language, or at least an abbreviation in English (OOW, as it happens, is naval parlance for "Officer On Watch"). I think that any instance of two repetitions of a vowel is more likely to exhibit this problem than three repetitions, which is an unusual sequence in many languages. As for tralalalala, that could actually be considered a slightly different word from tralalala, if the stresses would be grouped differently. If not, then it adds nothing to the definition already existing for tralala. bd2412 T 04:03, 23 January 2014 (UTC)
 * I didn't mean we should redirect "oow", but that we should redirect "ooow" to "oow". Either way, I think you should remove the word "three" from the vote. I don't want to vote on a number. --WikiTiki89 04:05, 23 January 2014 (UTC)
 * I would be fine with increasing the number from three to four, but if it didn't set some number, the verbiage that is proposed for addition to CFI would just be an effectless philosophical musing — an observation that sometimes words are repetitively elongated and that we have the faculty of deciding whether or not to have entries for them. In the past I've supported the removal of effectless philosophical musings from CFI (e.g. the section on "the slippery slope"), and I wouldn't support the addition of one. If there isn't consensus for a specific number, we can continue doing what we've been doing, deciding in each case by RFD how many repetitions to keep. (In any case, I see no harm in having a soft redirect from to / rather than a hard one.) - -sche (discuss) 04:37, 23 January 2014 (UTC)
 * Well, this would basically justify what we've been doing. Some people were asking whether CFI allows redirecting hahahahahahahahahahaha to hahaha, since it was unclear. Now we are making it explicit that doing so is allowed. The number itself should not have to be fixed. --WikiTiki89 04:48, 23 January 2014 (UTC)
 * Strictly speaking, CFI says if hahahahahahahahahahaha is attested, then it can have an entry. I would also take it as a given that if there is an unusual situation wherein something like "tralalalala" is worth keeping as a standalone entry, then we can deviate from the rule in that instance. However, the rule should not be "hahahahahahahahahahaha and pleeeeeease and gooooooooooo, being attested, are entitled to have entries until a discussion results in some change to that status"; it should be that such constructions redirect unless a discussion says they should not. bd2412 T 05:00, 23 January 2014 (UTC)
 * If you put what you just said explicitly into the vote, I will be satisfied. --WikiTiki89 05:06, 23 January 2014 (UTC)
 * That seems reasonable. Please feel free to adjust the wording. Cheers! bd2412 T 13:25, 23 January 2014 (UTC)

Other language entries with repetitions include jajaja and wwwww. Just putting it here as food for thought. TeleComNasSprVen (talk) 04:34, 26 January 2014 (UTC)
 * For jajaja, at least, jajajaja and jajajajaja are attested, as are other additions up to jajajajajajajajajajajajajaja. bd2412 T 04:51, 26 January 2014 (UTC)
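The default treatment discussed in this section (a form with more than three repetitions redirects to the form with three) can be sketched in Python. This is only an illustration, not any existing Wiktionary tooling: `collapse_repetitions` and `max_reps` are hypothetical names, and the regular expression treats any repeated substring as the unit, which is broader than "letters and syllables". It also ignores attestation and the case-by-case exceptions raised above.

```python
import re


def collapse_repetitions(word: str, max_reps: int = 3) -> str:
    """Cap every run of an immediately repeated unit at max_reps copies.

    Hypothetical helper: the unit is the shortest substring that repeats,
    so it covers single letters ("ooooo") and syllables ("hahahaha").
    """
    # (.+?) captures the shortest unit; \1{max_reps,} requires at least
    # max_reps further copies, i.e. more than max_reps copies in total.
    pattern = re.compile(r"(.+?)\1{%d,}" % max_reps)
    return pattern.sub(lambda m: m.group(1) * max_reps, word)


# Candidate redirect targets under the proposed default of three repetitions.
for form in ["hahahahahahahahahahaha", "gooooooooooo", "ooow"]:
    print(form, "->", collapse_repetitions(form))
```

A form that already has three or fewer repetitions, such as "ooow", comes back unchanged, so it would not be redirected by this sketch.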

My edits to the vote
I have edited the vote to suit my taste for simplicity, with the intent to leave the meaning intact. Per User talk:BD2412, the creator of the vote has no objection to these edits. --Dan Polansky (talk) 16:30, 25 January 2014 (UTC)
 * Your edit made me realize that there's another minor problem. If a form with more than three repetitions is attested, but the form with three repetitions is not, we cannot redirect the former to the latter. --WikiTiki89 16:35, 25 January 2014 (UTC)
 * I would hope that this hardly ever happens and that the clause enabling consensus to override the regulation on an ad-hoc basis should deal with such cases. I mean this clause: "The above treatment may be overridden by consensus, for example where a variation having four repetitions is more common, or where an additional repetition would cause the word to shift to a different pronunciation or intonation." The clause does not say when exactly the treatment can be overridden; it merely lists two cases when such an overriding may be desirable. --Dan Polansky (talk) 16:40, 25 January 2014 (UTC)
 * Ok, so there is no default action in that case. --WikiTiki89 16:42, 25 January 2014 (UTC)
 * I am not sure what you mean. The regulation applies by default, unless overridden by a case-by-case consensus to grant an exception. --Dan Polansky (talk) 16:44, 25 January 2014 (UTC)
 * There is no regulation for that case. Unless you mean the default action is to redirect to a redlink. --WikiTiki89 16:45, 25 January 2014 (UTC)
 * Oh, I see what you mean. I think the sound thing to do would be to redirect to a bluelinked repetitive form that is a high-frequency one. I don't think we need to amend the vote for that. Let's keep the regulation simple and wait for experience before we complicate it further, no? --Dan Polansky (talk) 16:52, 25 January 2014 (UTC)
 * Which is why I'm ok with assuming that there is no default action in this case and these cases should always be discussed. --WikiTiki89 16:57, 25 January 2014 (UTC)
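The fallback discussed in this thread can be sketched as follows. Everything here is an assumption made for illustration: `choose_redirect_target` is a hypothetical helper, the counts are invented numbers, and "exists" simply means the page is a bluelink. Returning `None` stands for "no default action; discuss the case".

```python
from typing import Dict, Optional, Set


def choose_redirect_target(
    default_target: str,
    hit_counts: Dict[str, int],
    existing_entries: Set[str],
) -> Optional[str]:
    """Pick a redirect target for an elongated form (illustrative only)."""
    # Normal case: the three-repetition form exists, so redirect there.
    if default_target in existing_entries:
        return default_target
    # Otherwise prefer the most frequent repetitive form that is a bluelink.
    candidates = [form for form in hit_counts if form in existing_entries]
    if not candidates:
        return None  # nothing to redirect to; needs discussion
    return max(candidates, key=lambda form: hit_counts[form])


# Invented example: the three-repetition form is a redlink.
counts = {"xaaaa": 9, "xaaaaa": 2}
print(choose_redirect_target("xaaa", counts, {"xaaaa", "xaaaaa"}))
```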

Letters / syllables
I would support redirects in the case of repeated letters but not in the case of repeated syllables (or less happily, anyway). Am I the only one that feels they are rather different things? Ƿidsiþ 16:44, 25 January 2014 (UTC)
 * So you would not support redirecting hahahahahahahahahahaha to hahaha? --WikiTiki89 16:46, 25 January 2014 (UTC)
 * More to the point, since hahahahahahahahahahaha is attested, you would default to having an entry for that in accordance with the current CFI? bd2412 T 00:11, 26 January 2014 (UTC)
 * "Letters and syllables" cannot cover all the different forms that a language can take, such as Japanese "moras", so I suggest something like "attested series of repeating characters in an entry formed from a base lemma..." where emphasis is the only semantic meaning derived from such a repetition. I also think that the last sentence is quite superfluous and merely reaffirms the current state of Wiktionary: "The above treatment may be overridden by consensus, for example where a variation having four repetitions is more common, or where an additional repetition would cause the word to shift to a different pronunciation or intonation." You could just say something like: if it has any semantic meaning that would separate it from the base lemma, the sense for the entry should obviously come first. TeleComNasSprVen (talk) 04:31, 26 January 2014 (UTC)
 * Also it's possible that it's not only a single syllable that's repeated, such as kapowkapowkapow. --WikiTiki89 04:38, 26 January 2014 (UTC)
 * My preferred term, "characters", has a much wider and vaguer interpretation that can encompass most of these. TeleComNasSprVen (talk) 04:41, 26 January 2014 (UTC)
 * I only get a single Google Books hit for kapowkapowkapow. I am not keen on going beyond letters and syllables until there's a case showing the need for anything more than that. bd2412 T 04:53, 26 January 2014 (UTC)
 * Well, in English these sorts of things are usually written as separate words, like "Kapow! Kapow! Kapow!", but they don't have to be, and I don't know how other languages usually do it. --WikiTiki89 05:25, 26 January 2014 (UTC)
 * Then I say we limit this proposal to repetition of letters and syllables as used in languages that have letters and syllables, and consider multi-syllable or other multi-character repetitions on an individual basis as they come up in attestable words, if they ever do come up. bd2412 T 15:37, 26 January 2014 (UTC)

Shortcut
Why "pumping"? --WikiTiki89 16:27, 27 March 2014 (UTC)
 * User:Wikitiki89: This is a reference to the pumping lemmas, which are kind of related to this issue and make a nice pun on "lemma". — Keφr 16:54, 27 March 2014 (UTC)
 * And I tried so hard to suppress all memories of my Theory of Computation course... --WikiTiki89 17:00, 27 March 2014 (UTC)
 * Pages can have multiple shortcuts; we could always give it another, more literal shortcut, like WT:REPEATING or WT:HAHAHA or WT:GOOOAL. Either of the latter two would be great to cite in discussions, wouldn't it? - -sche (discuss) 18:37, 27 March 2014 (UTC)
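For readers who did not take that course, the "pumping" idea behind the pun can be illustrated loosely: a pumping lemma splits a string into parts x, y, z and observes that the middle part y can be repeated any number of times. The sketch below is only an analogy to elongated spellings, not a statement of the lemma itself; `pump` is a hypothetical helper name.

```python
def pump(x: str, y: str, z: str, i: int) -> str:
    """Return x followed by i copies of y followed by z."""
    return x + y * i + z


# "goal" with the vowel pumped, and "ha" pumped on its own.
for i in (1, 3, 5):
    print(pump("g", "o", "al", i), pump("", "ha", "", i))
```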