User:Joonas07/et-pron

Hello Thadh (and any other interested users)!

The following is an attempt at a guide to perfecting the Estonian pronunciation module. I have linked my sources below, but a lot of this is my own prescriptivism, as (1) academic publications about Estonian don't usually use or even mention IPA, which is used on this platform, and instead use a transcription system that is more intuitive for native speakers, and (2) I believe Wiktionary should handle things a little differently than normal publications.

Note: I'm describing my ideal, so I have not considered how hard or time-consuming this would be to code. Let me know about any complications in that respect.

Template
There is no reason Estonian should be inferior to Finnish, much less the nearly extinct Ingrian, so I believe Estonian deserves much of the same infrastructure. Namely, it should be possible to automatically generate rhymes and hyphenations, such as in and. Since you are the creator of the latter, I'm sure you know its exact implications, and any other features that could be added, better than I do.

I'm not an expert on Ingrian, but I feel that Estonian pronunciation has more in common with Ingrian than with Finnish, in that syllable length, for example, cannot always be identified from the orthography. However, Estonian does not really have dialectal pronunciations, and the Estonian template should, like the Finnish one, also support audio files (I don't believe any exist for Ingrian?). If there are any other features that could be added, they should be.

Rhymes
Rhymes should be generated from the last stressed vowel of the term, regardless of whether the stress is primary or secondary. Rhymes should be generated from the phonemic transcription and stripped of any extra markers, such as diphthong markers or voicelessness markers. If that's not possible to code, that's fine, but in that case we need to replace most, if not all, rhymes of words with diphthongs or semivoiced consonants.
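To make the rule above concrete, here is a minimal Python sketch of the rhyme extraction (the real module would presumably be written in Lua, so this is only an illustration of the logic). The function name `rhyme`, the vowel inventory, and the exact set of stripped diacritics are my assumptions, not a description of any existing module, and the inputs in the usage note are illustrative strings rather than verified transcriptions.

```python
import unicodedata

# Assumed vowel symbols for Estonian IPA; illustrative, possibly incomplete.
VOWELS = set("aeiouɑøyɤæ")
# Assumed "extra markers": non-syllabic (diphthong) mark, voiceless ring
# below and above, and syllable dots.
STRIP = {"\u032F", "\u0325", "\u030A", "."}
STRESS = {"ˈ", "ˌ"}


def rhyme(ipa: str) -> str:
    """Return the rhyme, starting at the last stressed vowel of the term."""
    # Drop surrounding slashes/brackets and split off combining diacritics.
    ipa = unicodedata.normalize("NFD", ipa.strip("/[]"))
    # Index of the last stress mark, primary or secondary (-1 if there is none,
    # in which case the whole word is scanned, i.e. initial stress is assumed).
    last = max(ipa.rfind(mark) for mark in STRESS)
    tail = ipa[last + 1:]
    # Skip the onset consonants: the rhyme starts at the first vowel.
    for i, ch in enumerate(tail):
        if ch in VOWELS:
            kept = (c for c in tail[i:] if c not in STRIP and c not in STRESS)
            return "-" + "".join(kept)
    return ""  # no vowel found after the last stress mark
```

With this sketch, `rhyme("/ˈlehm/")` returns `"-ehm"`, and a term with a later secondary stress, such as the placeholder `"/ˈvɑnɑˌemɑ/"`, yields `"-emɑ"`, because the last stress mark wins regardless of its type.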

Hyphenation
Here's a link to an automatic Estonian hyphenator. The code is not open-source though. It shouldn't be too different from Finnish. There is a section about hyphenation in Eesti grammatika, so let me know if you want me to translate that.

Phonemic
ng/nk

Phonetic

 * lehm -
 * lehma (gen.sg) -
 * lehma (par.sg) - -- since the two forms are not orthographically distinguished, the half-length marker should also appear in the phonemic transcription, otherwise not (cf. lehm above)


 * õhkama -
 * viht -
 * viht -

Multiword terms
Currently, the template does not support multiword terms. Primary stress should be generated in front of every word.

There should also be a way to mark words within a multiword term that do not carry primary stress.
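One possible way to handle both points, sketched in Python: every space-separated word gets a primary stress mark by default, unless the editor has already written a stress mark at the start of that word. The override convention (typing `ˌ` manually) and the function name `add_word_stress` are hypothetical; they are just one option, and the inputs in the usage note are placeholder strings, not verified transcriptions.

```python
PRIMARY, SECONDARY = "ˈ", "ˌ"


def add_word_stress(transcription: str) -> str:
    """Prefix each word with primary stress unless a stress mark is given."""
    out = []
    for word in transcription.split():
        if word.startswith((PRIMARY, SECONDARY)):
            # The editor specified the stress explicitly; keep it as written.
            out.append(word)
        else:
            out.append(PRIMARY + word)
    return " ".join(out)
```

For example, `add_word_stress("tere hommikust")` gives `"ˈtere ˈhommikust"`, while `add_word_stress("tere ˌhommikust")` leaves the manually marked second word alone and gives `"ˈtere ˌhommikust"`.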

Optional palatalization and stress
Quite a lot of words in Estonian have optional palatalization.

I've been transcribing them as. Hence, there should also be two rhymes generated:.
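As a rough illustration of how two variants could be produced from one input, here is a small Python sketch. It assumes, purely hypothetically, that optional palatalization is written as `(ʲ)` in the input; that notation and the function name `expand_optional` are my own placeholders for the example, not necessarily the notation used elsewhere on this page. Each resulting variant could then be passed separately to whatever generates the rhyme.

```python
def expand_optional(ipa: str, marker: str = "(ʲ)") -> list[str]:
    """Expand an optional marker into the variant with it and without it."""
    if marker not in ipa:
        return [ipa]  # nothing optional: a single transcription
    inner = marker[1:-1]  # the marker without its parentheses
    with_marker = ipa.replace(marker, inner)
    without_marker = ipa.replace(marker, "")
    return [with_marker, without_marker]
```

With the placeholder input `"ˈkas(ʲ)t"`, the function returns `["ˈkasʲt", "ˈkast"]`; an input without the marker comes back unchanged as a one-element list.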

Some foreign words can be stressed either on the first syllable or the last syllable.

These words usually reached Estonian via German and have therefore traditionally retained their oxytone stress. However, as a result of nativization attempts and, more recently, the influence of English (cf. : ), in my personal experience you would be very hard-pressed to find any speakers who stress the final syllable. Since we are transcribing Modern Standard Estonian, perhaps the final-syllable variant should simply be ignored and the stress placed on the first syllable, as the majority of speakers pronounce it?

Example words
Feel free to use these (and any other words mentioned under the rules above) as test cases.