Wiktionary talk:About Proto-Samic

Transcription
The first issue is probably closeness of transcription. Lehtiranta (1989–2001) uses a relatively broad phonemic transcription:
 * long close-mid vowels *ē, *ō
 * long open-mid vowels *ɛ̄, *ɔ̄ (for the latter actually the Uralistic equivalent ᴒ̄, but I think we can agree that this is less than optimal)
 * long mid stem vowels *ē, *ɔ̄
 * palatal consonants *ć, *ń (though, for some reason, still *š)
A more narrow transcription also exists:
 * long close-mid vowels *ie, *uo
 * long open-mid vowels *ea, *oa
 * long stem vowels *ē, *ɔ̄
 * the sibilants /tʃ ~ tɕ/, /ʃ ~ ɕ/ marked as postalveolar *č, *š; the palatal nasal /nʲ ~ ɲ/ marked as *ń
I am aware that my working index table at User:Tropylium/Proto-Samic uses a non-standard compromise between these two, and it probably won't be the best choice. I'll convert it to a different system once I'm done with the initial dumping and we have an agreement on which one. I don't have a strong opinion which of the two mainstream choices is more accurate or neutral (I believe neither of them is optimal), but simply in terms of convenience of typing the diphthong transcription is certainly better than the monophthong transcription.
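The correspondence between the broad and the narrow transcription listed above is regular enough to be sketched as a small conversion routine. This is only an illustration, not an existing Wiktionary module: the function name `broad_to_narrow`, the set of letters treated as vowels, and the decision to leave non-initial syllables untouched are all assumptions of the sketch.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, marks length

# Broad -> narrow correspondences for long first-syllable vowels, taken from
# the two lists above (keys are base letters; the macron is checked separately).
FIRST_SYLLABLE = {"e": "ie", "o": "uo", "ɛ": "ea", "ɔ": "oa"}

# Base letters treated as vowels when locating the first syllable
# (an assumption made for this sketch).
VOWEL_BASES = set("aeiouɛɔ")


def broad_to_narrow(word: str) -> str:
    """Convert a broad-transcription form to the narrow (diphthong) one."""
    s = unicodedata.normalize("NFD", word)
    for i, ch in enumerate(s):
        if ch in VOWEL_BASES:
            # Only a long (macron-bearing) first-syllable vowel changes;
            # stem vowels further right are left as they are.
            if s[i + 1 : i + 2] == MACRON and ch in FIRST_SYLLABLE:
                s = s[:i] + FIRST_SYLLABLE[ch] + s[i + 2 :]
            break
    # *ć (c + acute) is written *č (c + haček); *ń and *š are unchanged.
    s = s.replace("c\u0301", "c\u030c")
    return unicodedata.normalize("NFC", s)
```

For instance, `broad_to_narrow("*kōlē")` returns `*kuolē`, while a form with a short first-syllable vowel such as `*kolmë` comes back unchanged.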

Also for /ə ~ ʌ/ the traditional Uralistic transcription *e̮ has been used in most sources, though I've seen several recent papers use the typographic simplification *ë and we're probably safe with it (esp. since we're also using *ï ~ *ë for Proto-Uralic).

There is also a standard phonetic transcription, used in many works such as Sammallahti (1998). This explicitly marks consonant gradation and other such phenomena and is probably best left for possible Pronunciation sections.
 * I favour the narrower transcription of the vowels, as this more clearly reflects their real phonetic value and is less abstract (meaning easier to understand for laypersons). But for non-initial syllables I think *ē and *ō should suffice, unless there is a reason not to do it that way.
 * Suits me. Another point here is that ‹ē› and ‹ë› are visually fairly similar and at risk of being confused. No particularly good way to fix this in unstressed syllables though (of course we could e.g. use *ə for *ë but that would be getting into idiosyncratic transcription).
 * It would be also possible to dispense with length marking entirely, if *ë and *o were reconstructed as *ɪ and *ʊ. This is a relatively non-standard approach though. --Tropylium (talk) 18:42, 24 October 2014 (UTC)
 * We could even just use plain *e. But the standard practice in Uralic seems to be to denote "inverted frontness" (fronting of a back vowel, backing of a front vowel) with a diaeresis diacritic. At least I'm assuming that based on the use of *ë and *ï in Proto-Uralic reconstructions. —CodeCat 18:59, 24 October 2014 (UTC)
 * It's a "standard practice of the typewriter era". The official standard has always been that fronting (front ü ö ä, central u̇ ȯ ȧ) is transcribed distinctly from backing (i̮ e̮ ə̑). OTOH, this concerns mainly phonetic transcription. --Tropylium (talk) 20:26, 24 October 2014 (UTC)
 * For the palatal consonants I'm guessing that the difference between ´ and ˇ is meant to reflect the Uralic origins. But if there is no contrast between the types in Samic, then I don't think it matters that much which one we use as long as we're consistent. But it is probably more sensible to use what the Sami languages themselves use. I know that Northern Sami uses č and š rather than ć and ś so that's a good argument for favouring č and š. I don't know what NS uses for ń though.
 * Some of the more eastern Sami languages do contrast /tʃ/ with /tɕ/, but this is written as e.g. Kildin ч vs чь. If compatibility with modern languages is the argument in favor of diphthong transcription, then sure, hačeks should be also used here. There is a slight compatibility issue backwards, since PU *ć goes to *ć/*č, while PU *č goes to *c, but I guess we can ignore the needs of the linguistically uneducated Proto-Uralic speakers ;)
 * Most Sami languages use nj for *ń, which does not seem to provide a useful alternative (while there is no cluster *nj in Proto-Samic, *lj *rj *vj are still found). --Tropylium (talk) 18:42, 24 October 2014 (UTC)
 * Proto-Germanic merged inherited *Xʷ (labial-velar obstruent) and *Xw (cluster of velar obstruent + labiovelar approximant) into a single sound which was realised as a labial-velar obstruent. But we denote it as *Xw (*hw, *kw, *gw) because there was no contrast, even though other sequences like *sw and *dw were true clusters rather than single labialised phonemes. So if there is no cluster *nj in Proto-Samic then using *nj for *ń seems fine. On the other hand, this may be too unorthodox? —CodeCat 18:59, 24 October 2014 (UTC)
 * It would be workable I'm sure, but yes, it would also be an unorthodox choice. Perhaps alike to using v instead of β in Proto-Germanic. --Tropylium (talk) 20:26, 24 October 2014 (UTC)
 * Then let's just use *ń. It's not really all that inconsistent, at least not compared to combining *ć and *š. Do those two phonemes have different articulation in any Sami language? —CodeCat 20:39, 24 October 2014 (UTC)
 * I don't recall having seen any difference in the articulation of *č and *š reported, at least. --Tropylium (talk) 00:40, 25 October 2014 (UTC)
 * One more thing to consider is whether to reflect consonant gradation, and if so, how. —CodeCat 15:52, 24 October 2014 (UTC)
 * We'd need some way of notating it at least for hypothetical declension tables in the future. --Tropylium (talk) 18:42, 24 October 2014 (UTC)
 * But do we notate it in entry names? Or do we only show it by adding extra diacritics to the headword and links, like we do for Latin? —CodeCat 18:59, 24 October 2014 (UTC)
 * Entry names could get amazingly ugly if shown with all diacritics. The standard transcription requires symbols up to ᴣ̌́̀, i.e. "small capital ezh with haček, grave, and acute accents" which stands for [d̥ʑ̥ˑ]; the allophone of *č in coda position before an open unstressed syllable beginning with a voiced consonant. I definitely support not doing that & I am not sure if there is any need to add this level of detail to the headword and links either. The catch, after all, is that consonant gradation in PS was not even phonemic. --Tropylium (talk) 20:26, 24 October 2014 (UTC)
 * But we do show it for Proto-Finnic. On the other hand, Proto-Finnic only had three levels of length, with the fourth represented as voicing. We probably can't use that trick for Samic. We could use some kind of diacritic (inverted breve?) below the letter to indicate weak grade. Below is more convenient because then it doesn't clash with the haček or acute of any of the other symbols. So we'd have strong t, tt and weak t̯, t̯t̯. How is that? It's not too intrusive but still clear. Whether we include that in entry names is still open, but I think it's doable. —CodeCat 20:37, 24 October 2014 (UTC)
 * Gradation in Proto-Finnic has the convenient feature of being qualitative for single consonants yeah, where it most often comes up. OTOH in Proto-Samic it is normally reconstructed as entirely prosodic, and handbooks etc. usually transcribe gradation by marking the strong grade as half-long (or overlong). This would mean adding half-long marks to essentially all lemmas, if we mark it in them.
 * Now in most dictionaries of Sami languages, strong clusters and geminates are marked by ˈ (the IPA stress mark… don't ask me why). So we could do *kuokˈtë 'two', *kolˈmë 'three' etc. Or perhaps use a plain apostrophe to avoid the impression of final stress. But this might look a bit weird in CVCV roots.
{| class="wikitable"
|+ Some ways of transcribing strong grades
! Superscript line !! Apostrophe !! UPA half-length diacritic !! IPA half-length diacritic
|-
| *kuolˈē || *kuol'ē || *kuol̀ē || *kuolˑē
|-
| *kolˈmë || *kol'më || *kol̀më || *kolˑmë
|}
 * It's a distinct issue from this if we should note stop lenition/preaspiration (*p = [p]- ~ -[b̥]-, *pp = [ʰpː]). I still think that at least for these details, the pronunciation section would probably be a better place. Using ‹b d g› would, sure, again bring the transcription closer to the modern-day languages, but that seems misleading since these aren't voiced stops. OK, they're not (phonemically) voiced in e.g. Northern Sami either, but are in e.g. Southern and Skolt, not to mention IPA and most non-Samic languages written in the Latin alphabet. --Tropylium (talk) 00:40, 25 October 2014 (UTC)
 * That's why I proposed using a diacritic to mark the weak grade, so that it would only show up in a minority of cases. And I think the inverted breve below looks better visually than any of your proposals, to be honest. But if we have to choose one of them, I think the UPA diacritic is best because it doesn't visually break up the word. —CodeCat 00:58, 25 October 2014 (UTC)
 * If we were to not add the diacritics to the entry itself, it really does not matter which grade we mark. Plus probably the majority of inflected forms have at least one weak grade. --Tropylium (talk) 03:35, 25 October 2014 (UTC)
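The proposal in this thread to mark the weak grade with an inverted breve below (strong t, tt against weak t̯, t̯t̯), while keeping entry names free of the diacritic, can be sketched roughly as follows. The helper names `mark_weak` and `entry_name` are hypothetical, and the sketch assumes the mark is U+032F COMBINING INVERTED BREVE BELOW placed on every consonant letter of the gradating consonant or cluster.

```python
import unicodedata

WEAK = "\u032f"  # COMBINING INVERTED BREVE BELOW


def mark_weak(consonants: str) -> str:
    """Add the weak-grade mark to every letter of a consonant or cluster."""
    out = []
    for ch in unicodedata.normalize("NFD", consonants):
        out.append(ch)
        # Attach the mark to base letters only, not to existing diacritics.
        if not unicodedata.combining(ch):
            out.append(WEAK)
    # NFC reorders the marks, so a haček or acute above is recomposed.
    return unicodedata.normalize("NFC", "".join(out))


def entry_name(display_form: str) -> str:
    """Strip the weak-grade mark to obtain a plain entry name."""
    s = unicodedata.normalize("NFD", display_form)
    return unicodedata.normalize("NFC", s.replace(WEAK, ""))
```

Because the mark sits below the letter, it does not clash with the haček or acute of letters such as č or ń, and `entry_name(mark_weak(x))` gives back `x` in its composed form.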

Order of descendants
It's probably best to keep these in a specific order. Currently I've been adding them in alphabetical order, grouped by East followed by West. But it seems more common in various sources to list them in geographical order, from westernmost (Southern) to easternmost (Ter). This ordering does have the advantage that languages that are closer geographically are also listed closer together in the list. —CodeCat 20:01, 24 October 2014 (UTC)
 * I do find SW-to-NE order better; it also has the benefit that it's easier to see at a glance which descendants have not been listed (or attested). If there is no overarching policy about this, I say let's go with the scholarly standard. --Tropylium (talk) 20:36, 24 October 2014 (UTC)
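A fixed southwest-to-northeast order is easy to enforce mechanically, which also makes gaps in a descendants list visible. The following is a minimal sketch: the endpoints (Southern first, Ter last) come from the discussion above, whereas the exact sequence in between and the language names used as keys are assumptions, as are the function names.

```python
# Assumed SW-to-NE sequence; only the endpoints are stated in the discussion.
SW_TO_NE = [
    "Southern Sami",
    "Ume Sami",
    "Pite Sami",
    "Lule Sami",
    "Northern Sami",
    "Inari Sami",
    "Skolt Sami",
    "Akkala Sami",
    "Kildin Sami",
    "Ter Sami",
]
_RANK = {name: i for i, name in enumerate(SW_TO_NE)}


def sort_descendants(entries):
    """Sort (language, form) pairs SW to NE; unknown languages go last."""
    return sorted(entries, key=lambda e: _RANK.get(e[0], len(SW_TO_NE)))


def missing_languages(entries):
    """List the languages for which no descendant has been entered."""
    present = {lang for lang, _form in entries}
    return [lang for lang in SW_TO_NE if lang not in present]
```

Calling `missing_languages` on an entry's descendants gives the "not listed (or attested)" languages at a glance, in the same order as the list itself.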

Reference templates
I've made the extremely basic Template:R:YSS recently. A handy thing to add would be a page parameter, or rather a double page parameter: each entry line spans two pages, with PS to Lule Sami on the left and North to Ter Sami on the right. I could look into adding this myself as well, though.

The YSS's successor, the Álgu database, would also be handy to cite, but I don't want to have to write hundreds of references by hand, so a parametered template would again be handy. Individual entries have URLs such as http://kaino.kotus.fi/algu/index.php?t=sanue&sanue_id=49810 with reference only to the lexeme ID, not the headword itself, and a template approach might have to look something like. --Tropylium (talk) 20:51, 24 October 2014 (UTC)
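Since Álgu URLs depend only on the lexeme ID, generating references from a list of IDs is straightforward. The sketch below rebuilds the URL pattern quoted above; the function names and the citation wording are hypothetical, and the headword has to be supplied separately because it is not part of the URL.

```python
from urllib.parse import urlencode

# Base address taken from the example URL quoted above.
ALGU_BASE = "http://kaino.kotus.fi/algu/index.php"


def algu_url(sanue_id: int) -> str:
    """Build the entry URL for a given lexeme (sanue) ID."""
    query = urlencode({"t": "sanue", "sanue_id": sanue_id})
    return f"{ALGU_BASE}?{query}"


def algu_reference(headword: str, sanue_id: int) -> str:
    """Format an external-link reference; the wording is only a placeholder."""
    return f"[{algu_url(sanue_id)} {headword}] in the Álgu database"
```

A reference template would presumably take the same two parameters, the lexeme ID for the link and the headword for the link text.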

Pre-vowel shift Sami
A couple of experts on Sami history (e.g. Pekka Sammallahti) have considered it possible that most of the Sami vowel rotation only took place during the dialectal Sami period. I'm considering the possibility of providing intermediate pre-vowel shift forms, whenever they are certain to have existed. E.g. for, "From earlier *amta-, from PU *ëmta". --Tropylium (talk)