Wiktionary talk:About Karelian

Choosing a canonical representation
Karelian is a rather varied language and there are several isoglosses that span its territory. There is no real standard spelling so that means that the same word can be spelled in widely different ways. This is a problem for Wiktionary because we try to include all spellings, but it would quickly become an inconsistent mess if we did that. So we should probably pick one particular spelling variety (preferably one that's widely used and recognised) and stick with it. Others would then be linked back using and. I've left some questions on the main page about particular things to be addressed. —CodeCat 18:19, 3 December 2014 (UTC)
 * I'd recommend primarily following North Karelian standards (so *s > š, no voicing, no *A-raising), and secondarily Olonets Karelian (Livvi), which does all of these in the opposite way (and also differs in inflection). It might even be worth considering if Livvi deserves treatment as a separate language entirely? It has its own ISO 639-3 code and everything.
 * ‹c č y› are the current standard in both for /ts tʃ y/, so that should probably be enforced as the primary orthography at least. --Tropylium (talk) 00:04, 4 December 2014 (UTC)
 * … we do seem to already have Category:Livvi language, so that solves some issues. --Tropylium (talk) 00:12, 4 December 2014 (UTC)
 * With a-raising I assume you mean for ? That was what I intended in any case. I see you added -oa vs -ua but I don't know if that's a separate issue or connected. In any case, so far I've mostly gone by karjalan kielen sanakirja, which is conservative in its lemmas. The lemmas don't show voicing or loss of final -n, and reflect post-Finnic *-aa as -oa, *-oa as -uo. They also have s rather than š. It also uses y and strangely tš. The usage examples in each entry are given in a much wider variety of spellings, presumably from different dialects.
 * Regarding palatalisation, I think we might want to look at that more widely than just Karelian. Other Finnic languages have similar issues. —CodeCat 01:44, 4 December 2014 (UTC)
 * with final *-a > -u is a very typical Livvi form, yes.
 * Being a dialect dictionary, KKS is better considered as presenting transcriptions than orthographic variants (hence tš, ttš, ü, in accordance with Uralic Phonetic Alphabet transcription). I guess its lemmas could be considered some kind of abstract "proto-Karelian" forms? — a kind of Proto-Finnic rewritten to include various widespread Karelian innovations. Which can also end up being a bit anachronistic, as occasional divergent forms found in Livvi typically descend from its Old Veps substrate, not from the historical Old Karelian.
 * Maybe the biggest problem is how much South Karelian should be acknowledged. Where it systematically differs from North, we could just add words like and mark them as dialectal (and, since there is no single Standard Karelian, probably vice versa for specifically Northern forms like ). But like you mention, the various isoglosses tend to fall along different lines, and this produces plenty of transitional forms.
 * Really I think Wiktionary's M.O. to separate all words into a pre-decided discrete set of languages is not too well-suited for dialect continua for which no standardized form exists. To claim that e.g. in each of Finnish, North Karelian and Ingrian there exists a word aika does not seem as accurate as that there exists a single word aika shared (in at least some case forms) by all three. Separating them strikes me as a bit akin to hypothetically splitting a lemma such as into separate entries for British/American/Australian/etc. English — in that it only duplicates content and makes it poorly manageable. And yet, any proposal for merging the languages entirely into a single entity called "North Finnic" or something would probably completely tank, since plenty of words do sometimes diverge even quite strongly (e.g. concepts loaned from Russian vs. Swedish). --Tropylium (talk) 21:33, 5 December 2014 (UTC)
 * For historical languages, we often choose some kind of norm. For Old English for example we generally use Late West Saxon as the norm, while the Middle Dutch spellings (which vary at least as much as those in Karelian) are all concentrated onto a single lemma form, which may or may not be actually attested. For Old Norse we even use a normalised form that does not actually correspond to any written form of the time, but which is "modernised" to correspond more closely to modern Icelandic. So there is some leeway for us in choosing whatever representation suits us, as long as we don't deviate too far from what is actually used (Old Norse is an exception because the modernised form is widely used by sources, whereas the original spellings are rarely covered).
 * I think it would be best if we decided on some kind of standard written form, meaning that we choose a representation for a particular set of isoglosses (preferably one corresponding to a real spoken dialect). We don't necessarily have to say that every single form is dialectal, because people may not write in the representation of their own dialect anyway. It's quite conceivable that an Olonets Karelian may nonetheless follow the more northern varieties orthographically and write -a even though their own dialect has . So in effect, in the absence of a standard, we would create one for Wiktionary. —CodeCat 23:19, 5 December 2014 (UTC)
 * Historical (and reconstructed) languages are one thing, living ones are another, with rather different user bases. I do not think it's Wiktionary's business to start endorsing one dialect area's standard over another as "unmarked Karelian". Minor dialect differences can and should be smoothed over, yes, but when actual competing standards are being developed (in this case mainly North Karelian vs Livvi, plus I think there is even a Tver-based standard South Karelian in the works too), they are going to need some kind of acknowledgement.
 * But what probably solves quite a few things (or perhaps sets us up for quite a bit of RFV troubles down the road…) is that we document written usage, not linguistic field research, even when the research notes have been edited into nice dictionaries. --Tropylium (talk) 19:01, 6 December 2014 (UTC)

Continuation
Okay, so, after eight years this issue has become pressing, because at the moment our Karelian entries are a complete mess. As I see it there are three options:
 * 1) Standardising on one literary standard (for lack of other options, either literary Tver or Viena Karelian).
 * 2) Duplicating entries, just like Serbo-Croatian does with the Ijekavian/Ikavian etc.
 * 3) Splitting the language into at least two (North and South Karelian) and perhaps three (North, South and Tver Karelian) languages.

Personally, I prefer the first and third options, since either would spare us from maintaining all the potential dialectal forms. I understand, however, how the first might be problematic due to political issues (why would we prefer one standard over the other?) and the third due to entry duplication across languages (e.g. šana having three near-identical entries).

What are your thoughts? Thadh (talk) 14:06, 27 May 2022 (UTC)


 * As I have stated before, I would prefer option 2. Option 3 would be my secondary choice, but I wouldn't have a separate Tver Karelian entry.
 * I don't like the first idea, since there are many words which are only attested in South Karelian (and possibly Livvi) but not in Viena Karelian, e.g. perzehattara. By that logic it should be in an unattested, possibly nonexistent form *peršehattara, which doesn't seem very sensible. Kapulakone (talk) 08:26, 30 May 2022 (UTC)
 * Well, you could always use a label for such words, i.e. . The fact that we prioritise one literary standard over another doesn't mean we won't be able to give regional forms. Thadh (talk) 11:58, 30 May 2022 (UTC)