Module talk:fr-pron

chaîner conjugation chart mistakes and testing the module
(moved from WT:Tea room/2016/September)

On the conjugation chart for the French verb chaîner, for any form that is pronounced /ʃɛn/, for some reason the module is adding an extra n and producing /ʃɛnn/ instead. Does anyone know how to fix this? 2WR1 (talk) 01:46, 2 September 2016 (UTC)
 * Fixed. This was a bug in Module:fr-pron. I also created a set of testcases here: Module:fr-pron/testcases. It compares the pronunciation generated for a set of words to the expected pronunciation. I can't guarantee that all the "correct" pronunciations listed are actually correct; they are what was on the Template:fr-IPA page. Please fix up the entries appropriately and add new ones; that way I can fix the module to produce the correct pronunciations per the test cases. Benwing2 (talk) 02:54, 2 September 2016 (UTC)
 * Thanks so much! What do you mean by fix up the entries and add new ones? Where? 2WR1 (talk) 05:51, 3 September 2016 (UTC)
 * Take a look at Module:fr-pron/testcases. Please take a look at the existing pronunciations (I suspect some of them are wrong) and edit that file to fix them up or add new ones. It's a module but it should be pretty clear by looking at the code how to add new test cases, and there's documentation in the module describing how to do so; let me know if it's not clear. Thanks! Benwing2 (talk) 13:34, 3 September 2016 (UTC)
 * Oh, okay, I'm looking at it now, I think I understand. What exactly does this module do? Is it for creating automatic pronunciations for verbs? I'm just not sure what kind of things I'm meant to be adding, but I'll go through and make sure all the given pronunciations are correct. I don't quite understand the "expected-actual-differs at"; I clicked on the link for the documentation and it just brought up the same page. 2WR1 (talk) 17:55, 3 September 2016 (UTC)
 * The documentation, such as it is, is in comments inside the module code. Edit the module code and look around lines 17-30. The purpose of the module is to test the pronunciations output by Module:fr-pron, which is supposed to automatically generate the pronunciation of French words given their spelling. The "Expected" column is what the pronunciation should be, and corresponds to the second argument to  (the one I've labeled as "PRONUNCIATION"). The "Actual" column is what Module:fr-pron actually generates. What Module:fr-pron/testcases does is compare the two; it puts a green check mark in the left column if they're the same, and a red X otherwise. Currently all tests have green check marks, which is why it says "All tests passed" at the top; but if you change the expected pronunciation of a word, it will cause that test case to fail, and the "All tests passed" will change to "N tests failed" for some number N, indicating how many test cases have failed. The idea is to add as many tests as possible, especially for hard aspects of French pronunciation that should be handled correctly (e.g. I recently added the word accueil, which I wasn't sure the module would get correct, but it did). This way, I know where the module is broken, and if I change the module, I can make sure that it doesn't introduce any new bugs. Benwing2 (talk) 18:32, 3 September 2016 (UTC)
 * OK, I added a new test case that caused a failure: jeûne. Benwing2 (talk) 18:35, 3 September 2016 (UTC)
 * Oh, okay, I think I get it now, thanks! sorry about that. So I should be adding pronunciations that aren't quite intuitive. 2WR1 (talk) 18:51, 3 September 2016 (UTC)
 * Right. In some cases the pronunciations are simply exceptional, like seconde pronounced as if written segonde or femme pronounced as if written famme or fils pronounced as if written fisse, and there are no rules that can cover such situations short of just hard-coding the exceptions; in those cases you need to use the respelling feature (see the current entry for femme for example). Another such case would be fille vs. ville; I'm not sure how the module handles them currently but I imagine you would have to respell ville as vile. Benwing2 (talk) 18:55, 3 September 2016 (UTC)
 * Oh, okay, let me go back and use the respelling feature on some of those. What is this module used for encoding? Does it help with the automatic pronunciations on conjugation charts?, because I noticed some of that for some of the tests that failed, like emmuseler. Should I be adding more verbs or just anything? 2WR1 (talk) 19:31, 3 September 2016 (UTC)
 * You can add anything you want. It should ideally handle all types of words correctly. Currently it's definitely used for verb forms in conjugation charts and maybe elsewhere as well; ideally it should be used everywhere in place of manually specifying pronunciations, similar to what's done with Module:ru-pron (which is more mature and has a much larger set of test cases in Module:ru-pron/testcases). Benwing2 (talk) 19:50, 3 September 2016 (UTC)
 * BTW I think the noschwa=y flag is quite broken. For example, parlerons always needs a schwa in it while donnée should never have a schwa (except maybe in singing) and I think racheter shouldn't have a schwa either (although I'm not completely sure). I think the module should be smart enough to figure out in most cases whether a schwa is mandatory, optional or prohibited although I don't exactly know what the rules are; maybe you can add some test cases covering this (no need to use the noschwa flag for all cases you think should be handled correctly by default). Benwing2 (talk) 19:54, 3 September 2016 (UTC)
 * Also, one way to test changes to the module before saving is to enter Module:fr-pron/testcases (without any quotes) in the text box underneath "Preview page with this template" and click "Show preview". Benwing2 (talk) 20:03, 3 September 2016 (UTC)
 * Okay, thank you for all the information! Ya, the thing with the schwa can be kinda confusing, sometimes it just depends on the person or how fast you're speaking or whether pronunciation is harder or easier with or without it. I've done a lot of work on the charts on the French orthography wikipedia page, so I know a lot of minor and exceptional values for a lot of letters and combinations of letters, I'll use that as a reference. 2WR1 (talk) 23:28, 3 September 2016 (UTC)
 * Do you think it would be better to continue discussion about the module on your talk page instead of the tea room? 2WR1 (talk) 23:37, 3 September 2016 (UTC)
 * I've moved the discussion here, which seems the best place. Benwing2 (talk) 03:46, 4 September 2016 (UTC)
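
For illustration, the expected-versus-actual comparison described in this thread can be sketched roughly as follows. This is a Python sketch, not the Lua code of Module:fr-pron/testcases; `run_tests`, `before_fix` and `after_fix` are hypothetical names, and the two generators are plain lookup tables standing in for Module:fr-pron before and after the chaîner fix.

```python
def run_tests(cases, generate):
    """Compare expected and generated pronunciations for (word, expected) pairs."""
    rows = []
    for word, expected in cases:
        actual = generate(word)
        # The last field plays the role of the green check mark / red X.
        rows.append((word, expected, actual, expected == actual))
    failed = sum(1 for row in rows if not row[3])
    summary = "All tests passed" if failed == 0 else f"{failed} tests failed"
    return rows, summary


# Hypothetical stand-ins for Module:fr-pron before and after the fix;
# they are lookup tables, not real pronunciation rules.
def before_fix(word):
    return {"chaîne": "ʃɛnn"}.get(word, "")


def after_fix(word):
    return {"chaîne": "ʃɛn"}.get(word, "")


cases = [("chaîne", "ʃɛn")]
print(run_tests(cases, before_fix)[1])
print(run_tests(cases, after_fix)[1])
```

Changing the expected value of a case flips its row to a failure, which is the behaviour described above for the "N tests failed" banner.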

French testcases module
(moved from User talk:Benwing)

Hi, so I'm trying to work on the testcases module to fix the IPA and I need a little help. I was changing the code to have the correct IPA and then I saved the page to see if what I was doing was correct, and it said that a bunch of the tests had failed. In the expected-actual categories the expected had been changed when I was trying to change the actual, but I can't find where in the code one can change just the actual pronunciation. 2WR1 (talk) 18:16, 3 September 2016 (UTC)
 * Hi. What you did is correct. The "actual" is what the module Module:fr-pron actually produces. The "expected" is what it should produce, which should always be the correct pronunciation. If the "actual" is wrong, it means Module:fr-pron needs to be fixed to match the expected pronunciation. Benwing2 (talk) 19:05, 3 September 2016 (UTC)

-ille
(moved from User talk:Benwing)

There seems to be a big problem with how the module treats -ille/-ill- when it follows a consonant. That is, it always treats it as /j/ as in travaille, but never /ij/ as in fille. I noticed this with the testcases and recently when I made a page for the French verb biller and the auto-conjugation chart is giving pronunciations like /bj/ for "bille" and /bje/ for "billez". Also, would you mind pinging me when you reply? It's kind of hard to get back to your discussion page to check if you've replied, thanks! 2WR1 (talk) 05:20, 4 September 2016 (UTC)
 * I fixed this, mostly. There's still a problem with forms like bille in biller, I think because the conjugation module passes 'bill' rather than 'bille' to the pronunciation module. I'll fix this tomorrow, have to go to sleep now. Benwing2 (talk) 07:37, 4 September 2016 (UTC)
 * Haha, okay, same here. Thanks for all the work you've done on this! Goodnight! 2WR1 (talk) 07:47, 4 September 2016 (UTC)
 * Fixed. Benwing2 (talk) 14:34, 4 September 2016 (UTC)

Questions about French pronunciation and request for more test cases
I am trying to fix up Module:fr-pron to reflect the test cases. Some requests: Thanks! Benwing2 (talk) 02:27, 5 September 2016 (UTC)
 * 1) Can we add more test cases for gn, including initial gn-? Should initial gn- be pronounced /gn/ or /ɲ/ by default?
 * 2) I am thinking of adding a way to force sounds to be interpreted as-is, maybe an apostrophe or hyphen between consonants that would otherwise be processed specially, e.g. gn, or maybe respelling with one of the letters doubled. What do you think?
 * 3) Can we add more test cases for ng? Currently I have things set up so that final Cing for C = consonant is pronounced /Ciŋ/ whereas other final ng are equivalent to final n. With a few more additions this should even make shampooing be pronounced correctly as champoin. (Can you add this as a test case?)
 * 4) Can we add test cases for gu before a vowel other than e?
 * 5) /a/ vs. /ɑ/: I am thinking of making a before s or z (including when silent) be pronounced as /ɑ/. Is this correct? This would require that words like chasser (where /a/ apparently occurs) be respelled, e.g. chacer.
 * 6) Pronunciation of mn: Words like damner and automne are pronounced with silent m but in hymne and gymnaste the m is pronounced. Is there a rule here? Is the rule specific to ymn? What about imn (hypolimnion, limnologie)?
 * 7) Pronunciation of dinosaure (respelled dinosore): The first o is pronounced /o/, apparently, not /ɔ/. Is there a rule here? Is it that o before /z/ is always /o/? Does this apply when it's not in the last syllable? Apparently o before final /s/ is sometimes /o/ (grosse but not rosse?), but if this is not consistent then we'll need to respell grosse as grausse.
 * 8) européen is pronounced with final /ɛ̃/ not /ɑ̃/. Is there a rule here? Does it apply specifically in éen and ien, or is it all vowels + en?
 * 9) monsieur respelled messieu: I think this won't quite work because e before ss is sometimes a schwa, sometimes not (cf. blesser). Either this needs to be spelled mecieu or mǝssieu with explicit schwa, or I need to add some other way of forcing a schwa, I think.
 * 10) Final -um: parfum vs. album, maximum, aquarium. Should the default be /ɔm/ or /œ̃/?
 * 11) Final c, pronounced or not: I'm thinking that final c should be pronounced as /k/ by default, with words like tabac, estomac, blanc where it's silent requiring respelling. The alternative is to require donc etc. to use respelling as donque etc. Or we could make final vowel+c pronounced but final nc silent. What do you think?
 * 12) Final ct: respect, instinct vs. correct, direct. What should be the default? Or should I treat -rect(s) one way and other final -ct(s) another way? BTW can you add test cases involving various final consonant+s clusters?
 * 13) bœufs, œufs, œil, œils? Are there any other words that behave like these?
 * Benwing2 (talk) 02:28, 5 September 2016 (UTC)

So I'll add some more testcases where you requested and I'm also making my way slowly through the vowels and vowel combinations. Also, I noticed that the module needs to have ph as /f/ for default. And the "careful" consonants (c, r, f, l) should be pronounced by default when final (excluding, of course, -er and -il), because final -c and -f especially are usually pronounced; words in which they are silent should be considered exceptional. 2WR1 (talk) 02:55, 5 September 2016 (UTC)
 * 1) Yes, I'll add more for gn, I don't know of any native French words beginning with gn, so that's hard to say, the only word I know is gnou /gnu/ and that's not native.
 * 2) Ya, it would probably be a good thing to be able to make it not recognise a certain digraph.
 * 3) Ya, I can definitely add more for ng, it's a hard one though, because natively it's always treated as a normal n followed by a g. There are a few loans from English in which the final ng is pronounced /ŋ/ like shopping. In the case of shampooing, that should work as long as the module can recognise the loanword spellings of sh and oo.
 * 4) I can do that.
 * 5) This rule generally works, but not always. I guess we are going by the sort of "standard" and not the modern French French accent in which /ɑ/ has merged with /a/ and /œ̃/ with /ɛ̃/, I think that's the best way to do it, because if you don't use that accent it doesn't matter as long as you're aware of the mergers. I think I can add some exceptions to this rule as well. This brings up a good question though, what do I do if a word has two legitimate pronunciations, like pays for instance? There's no way to encode that in the testcases.
 * 6) I would say that the words with silent m's are exceptions, the rule should be /mn/.
 * 7) I would say the /o/ in dinosaure is exceptional. The general rule for o is that when it is phonologically final or before /z/ it is /o/. Before /s/ it is generally /ɔ/, though there are some exceptions. This rule also applies to eu/œu where it is /ø/ when phonologically final or before /z/. Also, eu should always be /œ/ before a pronounced consonant, situations where this is /ø/ (like neutre or déjeuner) are exceptional.
 * 8) The rule is final ien and éen are /jɛ̃/ and /e.ɛ̃/, so I would say it is a rule. And -yen also counts for this rule. I think there needs to be something in the code that says that i and y are treated the same, because that's generally how it is in French. I saw that the testcases were sometimes producing /j/ for y when it was being used as a vowel.
 * 9) Yes, I think "mecieu" works better as a respelling. But it's true, sometimes it's hard to know with an e before a double consonant whether it's /e/ or /ə/. I feel like more often than not it's /e/, but I feel like if I could somehow include an explicit schwa in the respelling, that might be helpful.
 * 10) For -um, I would say the default should be /ɔm/, parfum is sort of an exception. But this should exclude un because that's always /œ̃/ except for in skunks and punch.
 * 11) Final c should be default pronounced. Though it could be argued that after a nasal vowel it's default silent.
 * 12) It's really hard to say with -ct, if there has to be a rule I would say default pronounced and default silent after a nasal vowel. And yes, I'll add some more final consonant+s clusters.
 * 13) œuf and bœuf are the only two words that when they are plural the f is suddenly silent. And œil is the only native French word (non-Greek) that uses œ not before u. I think that the default for œ should be /e/ (not including œu).
 * I think that the best thing to do with final -c might be to say default pronounced and default after nasal vowel silent. 2WR1 (talk) 03:09, 5 September 2016 (UTC)
 * Thanks for all your comments. In response:


 * 1) I think I'll keep gn- as /ɲ/, and have you write g'n to get /gn/ (not yet implemented).
 * 2) The current pronunciations seem to distinguish /ɑ/ vs. /a/ and /œ̃/ vs. /ɛ̃/ (although not /ɛː/ vs. /ɛ/, which is still preserved by Quebec French), so I've tried to do the same.
 * 3) As for cases with two pronunciations, this isn't a problem. You can add test cases for both using appropriate respelling.
 * 4) The issues with y should be fixed. It's a little more tricky than just mapping y -> i because they differ when following a vowel.
 * 5) I made -um default to /ɔm/.
 * 6) I made -c silent in -nc but pronounced elsewhere, similarly for -ct.
 * 7) I'll make œ be equivalent to e except in œu, does that sound right?
 * Also, what about x? As currently implemented, x is /gz/ word-initially (xylophone) and between e and a vowel (various words in ex-) but not elsewhere (maximum, exciter, etc.). Does this sound right? Benwing2 (talk) 22:48, 5 September 2016 (UTC)
 * Benwing2 (talk) 22:49, 5 September 2016 (UTC)

As for x, I think the best way to explain it is initial=/gz/, before a voiceless consonant=/ks/, medially=/gz/, phonologically finally=/ks/ (as in sexe and ixe), and finally=silent. Anything else should be considered exceptional, and I put a lot of the exceptions in the test cases with appropriate respellings. 2WR1 (talk) 00:08, 6 September 2016 (UTC)
 * 1) Yes, I think that's the best thing to do, it seems that there are a good bit of French colloquialisms that begin with gn- and those are pronounced /ɲ/, the /gn/ is pretty exceptional.
 * 2) I agree, that's the best way to keep it.
 * 3) Okay, that works for most I think, but there was one situation that comes up a lot that I was thinking of and I'll mention that in a separate comment.
 * 4) Oh, ya, that's true, I see.
 * 5) Good!
 * 6) I think that's the best general rule.
 * 7) Yes, I think that's the best way to handle œ.
 * OK, this is slightly simpler than what I implemented, and it means that maximum will need respelling as macsimum because otherwise it would have /gz/. Is this correct? Benwing2 (talk) 00:30, 6 September 2016 (UTC)
 * Yes, exactly 2WR1 (talk) 00:39, 6 September 2016 (UTC)
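
For illustration, the default treatment of x agreed on in this thread (initial /gz/, /ks/ before a voiceless consonant, /gz/ otherwise medially, /ks/ when phonologically final, silent when word-final) can be sketched as below. This is a Python sketch, not the Lua code of Module:fr-pron; the function name `x_sound`, the set of voiceless consonant letters, and the reading of "phonologically final" as "before final -e or -es" are all assumptions made for the example.

```python
# Letters assumed to spell voiceless consonants (illustrative, not exhaustive).
VOICELESS = set("cfkpqst")


def x_sound(word, i):
    """Return the default sound of the x at index i, before any respelling."""
    word = word.lower()
    assert word[i] == "x"
    rest = word[i + 1:]
    if i == 0:
        return "gz"            # initial: xylophone
    if rest == "":
        return ""              # word-final x is silent
    if rest in ("e", "es"):
        return "ks"            # phonologically final: sexe, ixe
    if rest[0] in VOICELESS:
        return "ks"            # before a voiceless consonant: exciter
    return "gz"                # medial default


print(x_sound("maximum", 2))
```

Under these defaults maximum comes out with /gz/, which is why it needs the respelling macsimum mentioned above.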

h-aspiré
Would there be any way to mark whether a word is pronounced with an h aspiré when spelled with an h at the beginning? And what about words like oui, onze and yaourt that have h aspiré despite not being written with an h? Or does this not even really matter for the transcriptions, in the same way silent consonants that can liaison aren't marked? 2WR1 (talk) 22:05, 5 September 2016 (UTC)
 * I could add support for this. If I recall, French dictionaries tend to put a * before words with h aspiré. I could implement this, i.e. you could put a * before a word and the pronunciation module would preserve it and maybe somehow add a footnote indicating what the * stands for (although I'm not yet sure where to put the footnote). Benwing2 (talk) 22:39, 5 September 2016 (UTC)
 * Ah, that's a good idea! I think that should work well! 2WR1 (talk) 00:11, 6 September 2016 (UTC)

Some module additions
Some new things: Benwing2 (talk) 23:25, 5 September 2016 (UTC)
 * There's a pos= flag (for "part of speech") that can be given in Module:fr-pron/testcases. The only recognized value currently is "v" for "verb"; this changes the way that final -ai and -ent are rendered.
 * You can put a ' to separate sounds that should be pronounced distinctly. This is intended especially for gn but should work in other cases; if not, let me know and I'll fix it.
 * I added respelling with "short" ă ĕ ŏ eŭ (the symbol is a breve, which is conventionally used in phonetics to indicate short vowels) to force the "short" versions of these sounds (/a ə ɔ œ/, as opposed to the "long" versions you get with â ê ô eû); this is useful e.g. before /z/, where the long versions of a o eu are selected by default.
 * Benwing2 (talk) 23:26, 5 September 2016 (UTC)
 * These are all really good changes! I wonder, would the apostrophe thing work for nasals? Like with English loanwords like week-end would respelling it "wike'nde" make the en be rendered as /ɛn/ instead of /ɑ̃/? Also, are these spellings (the apostrophe and breve) to be used in the actual spelling of the word or in the respelling? 2WR1 (talk) 00:15, 6 September 2016 (UTC)
 * The apostrophes and breves are to be used in the respelling; the actual spelling of the word should stay what it normally is. The idea is to support respelling in verbs also, so that e.g. the conjugation template for damner could be written or similar and it would generate the correct pronunciation for the whole paradigm, although it doesn't quite work this way yet. As for the apostrophe, it should definitely work for nasals, and also to force the u to be pronounced in gu + vowel, although I haven't tested these yet and they might be broken. Benwing2 (talk) 00:26, 6 September 2016 (UTC)
 * Oh, okay, that makes sense. I can add some testcases for these and try the new respellings on some as well. 2WR1 (talk) 00:41, 6 September 2016 (UTC)
 * So it seems that initial emm is being taken as default /ɑ̃.m/, but I think the default should be /e.m/ and then the apostrophe can be used for words like emmener to respell them as "em'mener". 2WR1 (talk) 00:56, 6 September 2016 (UTC)
 * OK. I added the change to spell emm- as /ɑ̃.m/ (and enn- as /ɑ̃.n/) but I'll change it. I guess that not all words in emm- or enn- work this way? Benwing2 (talk) 01:07, 6 September 2016 (UTC)
 * No, I think these are special cases. You would expect /ɛ.m/-/ɛ.n/ or /e.m/-/e.n/ from these. You'll notice there's a problem that came up with the word henné because of this. 2WR1 (talk) 01:32, 6 September 2016 (UTC)
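
For illustration, the respelling devices described in this thread (an apostrophe to keep two letters from being read as one unit, and breve vowels to force the "short" sounds) can be sketched as below. This is a Python sketch, not the Lua code of Module:fr-pron; `split_units`, `DIGRAPHS` and `BREVE_VOWELS` are hypothetical names, only gn is treated as a digraph, and the real module handles far more cases.

```python
import unicodedata

# "Short" vowels forced by a breve, as listed in the thread.
_RAW_BREVE = {"ă": "a", "ĕ": "ə", "ŏ": "ɔ", "eŭ": "œ"}
BREVE_VOWELS = {unicodedata.normalize("NFC", k): v for k, v in _RAW_BREVE.items()}

# Illustrative only: letter pairs normally read as one unit.
DIGRAPHS = ("gn",)


def split_units(respelling):
    """Split a respelling into units, letting an apostrophe block a digraph."""
    text = unicodedata.normalize("NFC", respelling)
    units = []
    i = 0
    while i < len(text):
        if text[i] == "'":
            i += 1              # separator: consumed, never emitted
            continue
        pair = text[i:i + 2]
        if pair in DIGRAPHS or pair in BREVE_VOWELS:
            units.append(pair)
            i += 2
        else:
            units.append(text[i])
            i += 1
    return units


print(split_units("gnou"))
print(split_units("g'nou"))
```

The same splitting would let a respelling such as em'mener keep the first m from combining with the preceding e, provided a corresponding unit were added to the table.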

-emment
I think a good use of the pos= would be to mark an adverb so that -emment is correctly transcribed as /a.mɑ̃/. 2WR1 (talk) 00:17, 6 September 2016 (UTC)
 * OK. Is this needed, however? Can I make -emment be /a.mɑ̃/ by default, or are there words where it isn't pronounced this way? Benwing2 (talk) 00:29, 6 September 2016 (UTC)
 * Hm, that's a good point, I can't really think of any words in which it is different. If there was a verb whose infinitive ended in -emmer then the 3rd person plural present form would cause a problem, but I can't think of any like that. 2WR1 (talk) 00:43, 6 September 2016 (UTC)
 * So the verbs in -emmer would have pos=v, and in this case the -ent gets chopped off very early, well before the rule that applies to adverbial -emment (in fact the special verbal handling of -ai and -ent is the very first thing the module does). Benwing2 (talk) 01:22, 6 September 2016 (UTC)
 * Okay, then that shouldn't be a problem, I don't think. The only other thing would be if there were any words that weren't adverbs that ended in -emment and were pronounced like /ɛ.mɑ̃/, but I can't think of any and I don't think it likely. 2WR1 (talk) 01:28, 6 September 2016 (UTC)
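
For illustration, the ordering described in this thread (verbal -ent is removed first when pos=v, so the adverbial -emment rule never sees it) can be sketched as below. This is a Python sketch, not the Lua code of Module:fr-pron; `apply_endings` is a hypothetical name, the return shape is invented for the example, and "tremment" is a made-up verb form used only as a placeholder.

```python
def apply_endings(word, pos=None):
    """Return (remaining spelling, fixed pronunciation of the ending)."""
    # Step 1: verbs only. The silent -ent is dropped before anything else runs.
    if pos == "v" and word.endswith("ent"):
        return word[:-3], ""
    # Step 2: everything else. Final -emment defaults to /a.mɑ̃/.
    if word.endswith("emment"):
        return word[:-6], "a.mɑ̃"
    return word, ""


print(apply_endings("évidemment"))
print(apply_endings("tremment", pos="v"))  # invented form, for illustration
```

Because step 1 returns early, a verb form ending in -emment would never reach the adverb rule, which is the point made above.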

ai and ê
I mentioned earlier a situation in which two pronunciations are often common. This is with verbs which have an ai or ê in them. For instance, aimer, baisser, mêler, and pêcher can be /e.me/ or /ɛ.me/, /be.se/ or /bɛ.se/, /me.le/ or /mɛ.le/, and /pe.ʃe/ or /pɛ.ʃe/. So I don't know how this situation should be handled. Also, the verb payer sort of fits into this category as well. 2WR1 (talk) 00:37, 6 September 2016 (UTC)
 * OK. There are two things I can do here. If it's consistent enough, I can make the module generate both pronunciations by default. But maybe this should be handled explicitly? That way you'd write e.g. and it would (eventually) display both pronunciations. If I do this then I'll probably need to add a rule that disallows /e/ in closed syllables, maybe only final closed syllables, since almost certainly words like baisse, aime, mêle are always pronounced with /ɛ/. Is it ever the case that /e/ occurs in closed syllables? Looking through the list of existing lemmas I see words like déstresser, prégnant, tchétchène but arguably these are all syllabified with /e.str/, /e.tʃ/ etc. In fact the page for prégnant gives two pronunciations, /pʁɛɡ.nɑ̃/ and /pʁe.ɲɑ̃/, where the former has /ɛ/ despite the spelling (cf. also former spelling événement, corrected to évènement). Benwing2 (talk) 01:19, 6 September 2016 (UTC)
 * Yes, I think your idea sounds good in the cases of ai and ê, I'm not too familiar with code, so I don't know how you're doing this, but if it's easier to say that these are default /ɛ/ and specify the /e/ pronunciation with a second respelling, then maybe that's better. But I think é should always be default /e/, things like événement and vénerie and céderai are exceptions. 2WR1 (talk) 01:26, 6 September 2016 (UTC)
 * The reason I want to make é be /ɛ/ in some cases is that otherwise it gets tricky to specify the pronunciation of baisser, with the alternation between baisse = /bɛs/ and baisser = either /bɛ.se/ or /be.se/. But this depends on the rule regarding é -> /ɛ/ being totally consistent in closed syllables. Benwing2 (talk) 01:44, 6 September 2016 (UTC)
 * Hmmm, but the instances in which é is pronounced /ɛ/ are so seldom that they can optionally be rewritten with è instead, and even in these cases é wasn't in a closed syllable in the first place. I think it's better to treat ai and ê as default /ɛ/ and the verbs in which they are sometimes pronounced as /e/ could be specified some other way. 2WR1 (talk) 03:31, 6 September 2016 (UTC)
 * Right. é pronounced as /ɛ/ is rare in natural words but would be common in respellings generated by the verb module. What I was hoping to do was to allow you to specify and have it work properly. Without any special rules, that spec would generate two respellings for the 1sg/3sg present indicative form, baisse /bɛs/ which is right and bésse /bes/ which is wrong. To fix this, I need to do one of 3 things:
 * make a more complex pron= specification, so you'd have to write something like to indicate that one possible pronunciation has /ɛ/ throughout while the other has /e/ alternating with /ɛ/;
 * make a special rule in the conjugation code to convert generated forms like bésse to bèsse before passing to the pronunciation module;
 * make a special rule in the pronunciation code to pronounce written bésse as /bɛs/.
 * Benwing2 (talk) 03:44, 6 September 2016 (UTC)
 * I feel like either of the first 2 make the most sense. I'm looking through some French sources to make sure that all the conjugated forms of these such verbs (excluding the single syllable ones) can be pronounced these two ways. I'll tell you once I've confirmed. 2WR1 (talk) 05:49, 6 September 2016 (UTC)
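
For illustration, the second option above (rewriting generated forms like bésse to bèsse before they reach the pronunciation module) could look roughly like this. It is a Python sketch, not code from either Wiktionary module; `fix_closed_e` is a hypothetical name, and the pattern only covers é followed by consonant letters and a final silent -e or -es, which is an assumption about what "closed syllable" needs to mean for generated verb forms.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxz"

# é followed by one or more consonant letters and a final silent -e or -es.
_CLOSED_E = re.compile(rf"é(?=[{CONSONANTS}]+es?$)")


def fix_closed_e(form):
    """Turn é into è in a final closed syllable of a generated verb form."""
    return _CLOSED_E.sub("è", form)


print(fix_closed_e("bésse"))
print(fix_closed_e("bésser"))
```

With this in place, a respelled stem with é would still give /e/ in baisser but /ɛ/ in baisse, without a more complex pron= specification.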

sh
Should loanwords from English spelt with an sh be respelt as "ch" or should the module facilitate that spelling? 2WR1 (talk) 01:04, 6 September 2016 (UTC)
 * I'll make sh be pronounced as ch, no reason not to. Benwing2 (talk) 01:08, 6 September 2016 (UTC)

ouïr
For the word ouïr and its derivatives/conjugated forms, the module seems to be having trouble, do you think it's best to just respell the words without the diæresis in these cases? 2WR1 (talk) 01:22, 6 September 2016 (UTC)
 * No, the module should handle ï correctly. I see it has all sorts of issues with it, I'll fix them. Benwing2 (talk) 01:27, 6 September 2016 (UTC)
 * Okay, sure thing! Thanks 2WR1 (talk) 01:30, 6 September 2016 (UTC)
 * Can you summarize the rules regarding ë, ï and ü? It looks like ï prevents a preceding u from turning into a glide in amuïr but doesn't prevent ou from turning into a glide in ouïr, is that because in the former case there's a preceding consonant and in the latter there isn't? Benwing2 (talk) 01:40, 6 September 2016 (UTC)
 * ï is basically treated as a normal i after u and ou. Only in the infinitive of amuïr does it do anything, in its conjugated forms and derivatives like amuïssement it's as if the verb is "amuir". I think the best thing would be to have ï treated like normal i after u and ou and to write amuïr as "amu'ir" in the respelling. For ë, it's kind of hard, it can be either /e/ or /ɛ/ and I wouldn't say there's any clear rule. Maybe finally=/e/ and elsewhere=/ɛ/? That makes the most sense because it covers from Israël to canoë (the only words I can think of where medial ë is pronounced /e/ are the two alternative (and somewhat archaic) forms of spelling for goémon (and its derivative goémonier) and goéland). But then a completely different way it is used is when after a gu- it marks the u as being pronounced in words such as aiguë and ciguë and ambiguë. Also, ï can act the same way, but only in the word ambiguïté, which I think should be seen as an exception and rewritten "ambigüité" for the testcase respellings. Then ü after a vowel is treated basically the same as ï in the same position, but it's much rarer, it can be seen in names like Ésaü and Saül. But then it can also be used after g to mark a pronounced u. This is a product of the 1990 spelling reform (which I'm not a fan of, but should still be seen as a legitimate alternative) and produces alternatives like aigüe, cigüe, and ambigüe. 2WR1 (talk) 02:17, 6 September 2016 (UTC)
 * Are there any words besides aïe/haïe where aï is not followed by a spoken vowel and is pronounced as /aj/ and not /a.i/? I'm asking partly because I at first thought that aï + cons in non-final syllables was regularly /aj/, which means I could respell slicer as slaïcer, but that doesn't seem to apply in e.g. haïrai and haïssais, so this respelling won't work. What should the respelling be? slaillcer won't currently work because the ill -> /j/ change applies currently only between written vowels (although that could be fixed; I doubt there are very many other words with a sequence like llc). slaillecer won't currently work either because the internal e generates a schwa which isn't yet removed (but should be). Benwing2 (talk) 03:03, 6 September 2016 (UTC)
 * aïe and haïe (the interjection) are the only words I know of like this where it would be a final /j/. Maybe you could make it like "slail'cer" or something, or possibly use the letter ÿ in respellings to force a /j/. I think this could be good because only rare proper nouns use ÿ and when dealing with those it's probably best to manually transcribe the pronunciation, it shouldn't be something dealt with by the module normally, so maybe it's best to repurpose it for this? 2WR1 (talk) 03:37, 6 September 2016 (UTC)
 * It looks like slailcer will work fine currently, no need for an apostrophe. I'd rather do this than repurpose ÿ because I think it will be more generally understood than ÿ, which for most French speakers will be the same as ï. Benwing2 (talk) 16:36, 6 September 2016 (UTC)
 * Okay, that makes sense, as long as il works! 2WR1 (talk) 17:45, 6 September 2016 (UTC)

a -> â implemented
I implemented this only before single s and z when either final or followed by silent -e or -es. I originally implemented it before all s and z but it broke lots of test cases, e.g. astéroïde, assez, asien, asthme, etc. Currently it doesn't fire on double ss e.g. basse (with /ɑ/), casse (with /a/). What do you think? Should I leave it as-is, widen it (e.g. to include -sse), narrow it further (so that e.g. it only affects words with absolutely final s, such as cas/bas/pas, or only affects words with silent final /s/ or pronounced final /z/), or scrap it (requiring all words where a is pronounced /ɑ/ to be respelled with â)? Benwing2 (talk) 04:00, 6 September 2016 (UTC)
 * This is a hard one, especially because in a lot of French French these have merged, sometimes it's hard to know when to use which. Generally I would say a before /z/ would be /ɑ/ (with exceptions of course), then word-final -as is more often /ɑ/. But it's hard to say with -asse, if these are a feminine form of a masculine adjective spelt -as, then yes. When looking through a lot of words that end this way, I think one could say that more of them are /ɑ/ than /a/, so maybe it would be best to default to /ɑ/ before double s and specify (maybe with the apostrophe?) that it's /a/ in the respellings. 2WR1 (talk) 06:02, 6 September 2016 (UTC)
 * Good idea to use an apostrophe. I didn't think of it but it will work totally fine. I can make -asse default to /ɑ/ if you want. The only tricky thing about all this is that verbs like casser will by default end up with /a/ in cassons but /ɑ/ in casse(s), which seems wrong; so I'm thinking we should maybe make a default to /ɑ/ before all /z/, final or not, and before absolutely final -s, and this will avoid problems with verbs. Benwing2 (talk) 20:47, 6 September 2016 (UTC)
 * So then words like passer and basse would have to be respelt "pâsser" and "bâsse"? 2WR1 (talk) 22:31, 6 September 2016 (UTC)
 * Yeah. Either we need to respell passer as pâsser and basse as bâsse, or we need to respell chasser as cha'sser or chăsser and chasse as cha'sse or chăsse. No way around that. What I'd like to avoid is having to spell both passer and chasser using respelling, which would be required if I made passe have /ɑ/ by default but passer have /a/ by default. Benwing2 (talk) 23:07, 6 September 2016 (UTC)
 * Ya, I see that, that's less complicated. 2WR1 (talk) 01:04, 7 September 2016 (UTC)

A few more test cases
Benwing2 (talk) 23:54, 6 September 2016 (UTC)
 * 1) In syndrome and expansion, the module and the test case divide the syllables differently. Where should the syllable division be?
 * 2) In Gamay, should final -ay be treated like -ai? More generally, how should y not followed by a vowel be treated? Same as i? How many words are there like this, besides Gamay and pays (which is exceptional in any case)?
 * 3) In aoûtien, should there be a rule that treats -tien and -tienne as if spelled -cien and -cienne? This would run into problems with chrétien, I think.
 * 4) In pied, should we treat final -ed as if written -é? (Currently we treat it as if written -è, consistent with -et.) Are there any other words in -ed?
 * Benwing2 (talk) 23:55, 6 September 2016 (UTC)

2WR1 (talk) 07:02, 7 September 2016 (UTC)
 * 1) Let me think about this and research it a little more and get back to you.
 * 2) Final -ay should be treated as -ai. I'm trying to think of examples other than pays for this, I can't come up with anything that isn't like a plural of a loanword. I think I would call pays an exception and respell it "péi" and "pèi".
 * 3) This is something important that I was going to add some test cases for soon. A medial -ti- should usually be treated as /sj/: nation, initiation, croatien, etc. But there needs to be a way of stopping this when it is a verb form of a verb whose stem ends in -ti (or just t when -ions and -iez are being added), in words like portions, which is pronounced two different ways depending on whether it's a noun or a verb, also things like pitié, augmentions, and partiez, and any verb which has -tenir at the end, like soutenir, where forms like soutiens would be produced. Perhaps the pos=v tag could be implemented towards this?
 * 4) I would say in this case that a single-syllable word whose only vowel is e (not counting glides) before a silent consonant should have the vowel /e/: nez, les, ces, et. pied would fit into this group, but then there needs to be a rule that cancels this when the word ends in -ies, like vies or ries. But then words like muet break this. I don't think that any other single-syllable word ends in -ed like pied though, and -ez is always e already in the module, so those few words are taken care of. How does the module handle les, ces, des? If these words are all covered, then maybe it's best to view pied as a bit of an exception and respell it "pié".
 * OK. There's currently a rule making -tion pronounced as /sj/. If you want I'll expand it to apply to all -ti- between vowels, except when pos=v. This can be stopped by inserting an apostrophe between the t and i. As for les, ces, etc., there's a rule that applies to single-syllable words ending in -es following zero or more consonants, and as you noted, another rule covering -ez. This handles all but pied and et. I can add rules for these words or we can use respelling. BTW, as for syllabification, the module currently keeps together [bcdfgpstv] + l or r (that is, bl, br, cl, cr, etc.) as well as /dʒ/ and /kw/ when syllabifying, and breaks up CCC as C.CC (where C = any consonant), and similarly CCCC is broken up as C.CCC, except that CsC is syllabified as Cs.C. The handling of 3+-letter clusters like this probably needs work. Benwing2 (talk) 08:41, 7 September 2016 (UTC)
 * Turns out "croatien" isn't an actual word, so ignore that example. 2WR1 (talk) 21:17, 9 September 2016 (UTC)
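
As a rough illustration of the syllabification behaviour described above (obstruent + l/r kept together, CCC split as C.CC, CCCC as C.CCC, CsC as Cs.C), here is a minimal Python sketch. It is not the module's Lua code; the function name `split_cluster` and the idea of passing the intervocalic cluster as a plain string are assumptions made for the example.

```python
# Hypothetical sketch, not the actual Module:fr-pron code.
ONSET_OBSTRUENTS = set("bcdfgpstv")   # may be followed by l or r in one onset
LIQUIDS = set("lr")
KEPT_TOGETHER = {"dʒ", "kw"}          # also never split, per the description

def split_cluster(cluster):
    """Split an intervocalic consonant cluster into (coda, onset)."""
    n = len(cluster)
    if n <= 1:
        return "", cluster
    if n == 2:
        keep = cluster in KEPT_TOGETHER or (
            cluster[0] in ONSET_OBSTRUENTS and cluster[1] in LIQUIDS
        )
        return ("", cluster) if keep else (cluster[0], cluster[1])
    if n == 3 and cluster[1] == "s":
        return cluster[:2], cluster[2]    # CsC -> Cs.C
    return cluster[0], cluster[1:]        # CCC -> C.CC, CCCC -> C.CCC

print(split_cluster("ndr"))   # cluster of "syndrome"
print(split_cluster("ksp"))   # cluster of "expansion" (x = /ks/)
```

Under these rules the /ndr/ of syndrome comes out as n.dr and the /ksp/ of expansion as ks.p, which is the kind of output the open question about those two words concerns.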

some questions about verbal pronunciations
How are the following pronounced?

Benwing2 (talk) 23:31, 8 September 2016 (UTC)
 * étudions vs. étudiions
 * étudierons vs. étudierons
 * aimerons vs. aimerions
 * céderons vs. céderions
 * citerons vs. citerions

2WR1 (talk) 06:46, 9 September 2016 (UTC)
 * /e.ty.djɔ̃/ - /e.ty.di.jɔ̃/
 * /e.ty.di.ʁɔ̃/ - /e.ty.di.ʁjɔ̃/ (I'm assuming you meant étudierions for the second)
 * /ɛ.m(ə).ʁɔ̃/ - /ɛ.mə.ʁjɔ̃/ (schwa not optional in the second form because it's needed for ease of pronunciation)
 * /sɛ.d(ə).ʁɔ̃/ - /sɛ.də.ʁjɔ̃/
 * /si.t(ə).ʁɔ̃/ - /si.tə.ʁjɔ̃/
 * Also, I'm pretty sure that étudierons and étudierions can be pronounced as /e.ty.djə.ʁɔ̃/ and /e.ty.djə.ʁjɔ̃/ as well. Basically the e can always be omitted in these forms; the only reason I didn't include it for the forms ending in -/ʁjɔ̃/ is because conjugation charts I've seen don't show dropping the schwa as an option for these forms. I'm sure in actual speech you could do so and it would sound perfectly fine (that's how schwas usually function in French), but I think the idea is that it's too hard to pronounce such a complicated cluster without the schwa. 2WR1 (talk) 07:09, 9 September 2016 (UTC)

help with pronunciation of internal schwa
Can any of you help with the pronunciation of non-final schwa? I know the rules are somewhat involved as to whether pronouncing such a schwa is mandatory, optional or disallowed, but I don't know what those rules are exactly. So far what I've implemented is that an internal written schwa is deleted when following a vowel (e.g. in jouerons, étudierons, agréerons), and also in the sequence VCəCV (i.e. at most one consonant on either side, with a vowel on the outside of both consonants); but the schwa is not deleted when the two consonants are the same when voicing is ignored (i.e. in sequences /təd/, /dət/, /tət/, /dəd/, /pəb/, /bəp/, /pəp/, /bəb/, etc.) and not deleted in the sequence /pəz/ (since the schwa seems mandatory in empeser, repeser, soupeser). However, I imagine at least some of these deleted schwas are actually optional rather than forbidden. Can you help with the rules, and/or point me to the appropriate references (note, I don't have access to JSTOR any more)? If you could also add test cases to Module:fr-pron/testcases that would be great, but not required. Thanks! Benwing2 (talk) 17:07, 11 September 2016 (UTC)
 * I'll look into this and see if I can find some definite rules and get back to you. 2WR1 (talk) 20:31, 11 September 2016 (UTC)
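
To make the deletion rule described in this section easier to discuss, here is a minimal Python sketch of it. It is only an illustration under stated assumptions, not the module's Lua code: the word is passed as a list of phonemes, the names are made up, and the vowels on either side of VCəCV are assumed to be full vowels (not schwa).

```python
# Hypothetical sketch of the internal-schwa rule; not the actual module code.
FULL_VOWELS = set("aɑeɛiouyøœɔ")
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ʒ": "ʃ"}

def is_full_vowel(p):
    # Nasal vowels are written base vowel + combining tilde, so test the base.
    return p != "ə" and p[0] in FULL_VOWELS

def is_consonant(p):
    return p != "ə" and p[0] not in FULL_VOWELS

def same_ignoring_voicing(c1, c2):
    return DEVOICE.get(c1, c1) == DEVOICE.get(c2, c2)

def drop_internal_schwas(phonemes):
    """Return a copy of the phoneme list with deletable internal schwas removed."""
    out = []
    last = len(phonemes) - 1
    for i, p in enumerate(phonemes):
        if p == "ə" and 0 < i < last:
            prev, nxt = phonemes[i - 1], phonemes[i + 1]
            if is_full_vowel(prev):
                continue                      # schwa after a vowel: deleted
            in_vcecv = (
                i >= 2 and i + 2 <= last
                and is_consonant(prev) and is_consonant(nxt)
                and is_full_vowel(phonemes[i - 2])
                and is_full_vowel(phonemes[i + 2])
            )
            blocked = same_ignoring_voicing(prev, nxt) or (prev, nxt) == ("p", "z")
            if in_vcecv and not blocked:
                continue                      # VCəCV: deleted
        out.append(p)
    return out

print(drop_internal_schwas(["ʒ", "u", "ə", "ʁ", "ɔ\u0303"]))   # jouerons
print(drop_internal_schwas(["s", "u", "p", "ə", "z", "e"]))    # soupeser
```

Note that the sketch deletes outright; whether some of these schwas should instead be shown as optional is exactly the open question here.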

é + C + schwa
How are the words dépecez, clamecez, blesserez, blesseriez, aimerez, aimeriez pronounced (including all possible variants, esp. variants between /e/ and /ɛ/ and between schwa and no schwa)? I thought there was a rule disallowing /e/ when there was a schwa in the next syllable (hence céderez is pronounced as if written cèderez), but that doesn't seem to apply to dépecez. Thanks! Benwing2 (talk) 20:14, 11 September 2016 (UTC)
 * /de.pə.se/; /klam.se/; /blɛ.sə.ʁe/, /blɛ.sə.ʁje/, /ɛ.m(ə).ʁe/, /ɛ.mə.ʁje/. With blesser and aimer the e/ai can be /e/. I see what you're saying about disallowing /e/ before a schwa. I don't know if there is exactly a rule for that (at least for the verbs with e/ɛ), but dépecer is definitely /de.pə.se/; maybe this has something to do with the dé being a fixed prefix? Let me look into the schwa rules some more and I'll try to get you some more definitive answers. 2WR1 (talk) 05:19, 12 September 2016 (UTC)
 * So presumably /kla.mə.se/ is also allowed? Is /blɛs.ʁe/ also allowed? (What about épicerie, can that be either /e.pis.ʁi/ or /e.pi.sə.ʁi/?) This schwa stuff is very confusing. Benwing2 (talk) 06:11, 12 September 2016 (UTC)
 * Yes, /blɛs.ʁe/ is also allowed. Not /kla.mə.se/, but this is an exception: clamecer is an alternative spelling of clamser, and the intermediate e should therefore be forgotten. You'll never find sentences such as il clamèce. Lmaltier (talk) 16:58, 15 September 2016 (UTC)

ti as /sj/
I implemented ti as /sj/ only for -tial, -tien and -tion, when followed by a sound other than /s/. There seem to be a lot of exceptions for other cases of ti, e.g. moitié, pitié, various words in -tier. Benwing2 (talk) 20:12, 13 September 2016 (UTC)
 * Ya, that makes sense, there aren't many words like Croatie. Did you also cover the feminine forms -tienne and -tiale? And what about the plural forms for all of these? The only thing that could get tricky is -tions, which can be a verbal ending for verbs whose stem ends in t, like partir and its form partions, so maybe the pos=v could be implemented for this? 2WR1 (talk) 04:03, 14 September 2016 (UTC)
 * It should cover -tien and -tial anywhere in a word. We will have problems with retiens and retient, so maybe I'll disable all these changes when pos=v, as you suggest. Benwing2 (talk) 05:05, 14 September 2016 (UTC)
 * What about verbs like conditionner, initier, initialiser, is there a way that these internal instances could be preserved? And I guess chrétien would probably have to be rewritten as "crét'ien" or something? 2WR1 (talk) 05:50, 14 September 2016 (UTC)
 * I just checked and it seems to be working for both conditionner and initialiser, but not initier, so maybe initier just needs to be rewritten? But it's important that pos=v doesn't remove these. 2WR1 (talk) 05:52, 14 September 2016 (UTC)

"I implemented ti as /sj/ only for -tial, -tien and -tion, when followed by a sound other than /s/." What does this mean? There are some rules but, nonetheless, the pronunciation of a French word should never be derived from its spelling by software, except for suffixes in conjugation bots, etc. The result would be catastrophic. An example: the ti in maintien is pronounced tj, not sj. Lmaltier (talk) 17:05, 15 September 2016 (UTC)
 * The idea is to provide default rules for rendering the pronunciation of words. If the rules don't work, the word should be provided with a pronunciation respelling. So femme would be respelled famme, and second would be respelled segond, etc. For the case of -tien, either I would make it pronounced as /sjɛ̃/ by default, in which case you'd have to respell maintien something like maint'ien (where the apostrophe is used to force the literal pronunciation of a cluster that would otherwise be interpreted specially), or we'd make it pronounced as /tjɛ̃/ by default, and we'd have to respell a word like haïtien as haïcien. Benwing2 (talk) 23:56, 15 September 2016 (UTC)
 * BTW the primary use of this module currently is in verbs, where the large majority can have their pronunciation derived automatically but some will require respelling. This is a lot easier on editors than requiring every verb to have its full pronunciation written in IPA. Benwing2 (talk) 23:59, 15 September 2016 (UTC)
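
The "default rule plus respelling" idea from this section can be sketched as follows. This is a hedged Python illustration, not the module's code: the function name `ti_to_sj` is invented, the rule is applied to spelling rather than to phonemes, the condition about a neighbouring /s/ is left out, and switching the rule off entirely for pos=v is only the suggestion under discussion above.

```python
import re

# Hypothetical sketch; assumes the rule fires on -tial, -tien, -tion anywhere
# in the word, and that an apostrophe between t and i blocks it.
TI_BEFORE_AL_EN_ON = re.compile(r"ti(?=al|en|on)")

def ti_to_sj(respelling, pos=None):
    """Rewrite ti as sj in -tial/-tien/-tion unless blocked."""
    if pos != "v":
        # "t'i" does not match "ti", so the apostrophe blocks the rule.
        respelling = TI_BEFORE_AL_EN_ON.sub("sj", respelling)
    return respelling.replace("'", "")

print(ti_to_sj("nation"))
print(ti_to_sj("maint'ien"))
print(ti_to_sj("partions", pos="v"))
```

With these assumptions initier is left untouched (its ti is followed by -er), while conditionner and initialiser are rewritten, which matches the behaviour reported above. It also shows the concern raised there: a blanket pos=v switch would stop conditionner from being rewritten.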

désh-, ress-, transV-
I added special hacks for désh- /dez/ (otherwise it becomes /deʃ/), ress- (to make it have a schwa), and trans- + vowel (to make it have /z/). Sound good? Benwing2 (talk) 18:54, 14 September 2016 (UTC)

eCC
It's tricky to know how to handle eCC, i.e. e + double consonant. Sometimes it's /e/, sometimes /ɛ/, sometimes it can be either. What I currently implement is that initial (h)eCC- is /e/ (effacer, emmental, errer, henné, etc.), and non-initial -eCC- is /ɛ/ (mett(r)ons, etc.). Any ideas for this? Benwing2 (talk) 03:30, 15 September 2016 (UTC)
 * Yes, I think that's a very good way to handle it. Your CC here is the same consonant and not a consonant cluster, right? 2WR1 (talk) 08:03, 20 September 2016 (UTC)
 * Yes. e+Cl or e+Cr is normally a schwa (except maybe word-initially, not sure what happens there), and e+two other consonants is always /ɛ/. Benwing2 (talk) 14:31, 20 September 2016 (UTC)
 * Okay, that's good. I can't think of that ever happening word-initially anyway. 2WR1 (talk) 07:12, 21 September 2016 (UTC)
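
For reference, the eCC default described in this section (initial (h)eCC- gives /e/, non-initial -eCC- gives /ɛ/, where CC is a doubled consonant) can be sketched in Python as below. The function name and the return format are assumptions for the example; the real module is written in Lua and works differently in detail.

```python
import re

# Hypothetical sketch: find e + doubled consonant and pick the default vowel.
E_DOUBLE_CONSONANT = re.compile(r"e([bcdfghjklmnpqrstvwxz])\1")

def ecc_vowels(word):
    """Return (index of e, default vowel) for every e + doubled consonant."""
    result = []
    for m in E_DOUBLE_CONSONANT.finditer(word):
        # "Initial" means word-initial e, or e right after a word-initial h.
        initial = m.start() == 0 or (m.start() == 1 and word[0] == "h")
        result.append((m.start(), "e" if initial else "ɛ"))
    return result

print(ecc_vowels("effacer"))
print(ecc_vowels("mettons"))
```

Words that go the other way would still need a respelling, as with the other defaults discussed on this page.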

recevais, décevais, concevais -- optional schwa or not?
In recevais, décevais and concevais, is the schwa optional or not, i.e. are /ʁǝ.s(ǝ).vɛ/, /de.s(ǝ).vɛ/ and /kɔ̃.s(ǝ).vɛ/ correct or is the schwa required? Benwing2 (talk) 05:12, 15 September 2016 (UTC)
 * Reping in case the last one didn't work. Benwing2 (talk) 05:17, 15 September 2016 (UTC)
 * Sorry, that's beyond my French-speaking abilities. Maybe one of knows. —Aɴɢʀ (talk) 08:07, 15 September 2016 (UTC)

fr.wikt indicates ʁǝ.sǝ.vɛ, de.sǝ.vɛ and kɔ̃.sǝ.vɛ knowing that ǝ are optional in actual pronunciation (provided the word can be pronounced, of course). Pronunciations such as ʁsǝ.vɛ and kɔ̃s.vɛ are possible. Lmaltier (talk) 16:52, 15 September 2016 (UTC)
 * Yes, in some regional accents, some "ǝ" get skipped. However, I'd definitely put the complete pronunciation as Lmaltier indicated at first, because when you utter it syllable by syllable, you do pronounce the "ǝ", even if your accent would have you skip it within speech. When you teach the word to a foreigner or even to a native child, that's how it's meant to be pronounced (and in the south, they pronounce every syllable, to the amusement or the impatience of the others). French aims at being much more absolute than, say, literature; following that example, we would have "/ʁ(ǝ).sǝ.vɛ/, /ʁǝ.s(ǝ).vɛ/", and that gets a bit complicated. Finally, I think that that is precisely the idea of the schwa; if it's there, it tells all by itself, and parentheses are redundant. --Jerome Potts (talk) 18:41, 15 September 2016 (UTC)
 * The problem with writing every schwa is that some schwas must be pronounced, some are optional and some are almost always omitted. For example, the schwa is apparently optional in aimerons but not aimerions, and not apparently in recevrons either, and almost definitely not in pamplemousse (the first schwa), while in aime the schwa is nearly always dropped (except perhaps in some singing styles), and it would be highly misleading to write /ɛ.mǝ/. So IMO we have to compromise in how to handle the schwa. Benwing2 (talk) 23:49, 15 September 2016 (UTC)
 * Ya, I totally agree, it gets really hard to decide which should be included and which shouldn't. Maybe you should ping those other users so they see your comment. It seems like they have some good input. 2WR1 (talk) 08:01, 20 September 2016 (UTC)
 * At the end of words, it's better to omit it, as it's always omitted (when the word is pronounced alone, but the pronunciation given is for the word used alone). There are cases where it's not optional, such as above-mentioned cases (first e of pamplemousse, etc.) but it's because the word cannot be pronounced otherwise (you cannot pronounce [plm] inside a word). Lmaltier (talk) 12:45, 21 September 2016 (UTC)

Pronunciation of ai, ê in aimer, blesser, emmêler, empêcher, etc.
Some questions:
 * 1) In emmêler, empêcher, the TLFi says for example "[ɑ̃pɛʃe] ou p. harmonis. vocalique et malgré l'infl. de l'accent circonflexe [ɑ̃peʃe]". Does the "vocalic harmonization" to /e/ apply only for the endings -er, -ez, -ai, -é, or in all unstressed forms?
 * 2) In aimer, the TLFi says "[eme], j'aime [ʒ ε:m]. On trouve également [εme], seul ou à côté de [eme]". For blesser, it similarly says "[blεse] ou [ble-], (je) blesse [blεs]. [ε] ouvert pour l'inf." Are these two variants found only in the infinitive (and maybe in the endings -ez, -ai, -é) or in all unstressed forms?
 * 3) Does the [e]/[ε] variation apply to all verbs with -ai- and -eCC- (i.e. e + double consonant) in the root? Some other verbs to consider: encaisser, laisser, souhaiter, traiter, trainer, baiser, niaiser, aider, plaider, clairer, aiguilleter etc.; dresser, presser, seller, nieller, regretter, pirouetter, errer, enterrer, empierrer, moyenner, greffer, cheffer, etc. Benwing2 (talk) 06:26, 23 September 2016 (UTC)
 * Benwing2 (talk) 06:27, 23 September 2016 (UTC)
 * The sound [e] is normally always changed to [ɛ] before a schwa. When not before a schwa, the sound [ɛ] may tend to become [e] in some cases, as mentioned above, but there are regional and personal variations. For aimer, I would have written [εme] as the first pronunciation. Variants such as [blεse] or [blese] are not limited to the infinitive, but I don't understand why you mention "unstressed forms". Stress is normally very weak in French, you can omit it, and its possible position (when used) is very free; it may depend on what you want to express. Some TV programmes have their own characteristic "stress style". Do you mean that, if there is a clear stress on the first syllable, aimer is always pronounced [εme]? Possibly, but I'm not sure at all, this might be a quite original statement. I don't know if studies exist about this issue. For the 3rd question: I think you cannot generalize, but it's a difficult question. And this variation e/ε is not limited to verbs. As mentioned above, it's more a regional or personal issue (but, in many cases, [e] or [ɛ] is always used by everybody). And it may happen that people tend to use [e] while they are convinced that they use [ε]. Lmaltier (talk) 07:35, 23 September 2016 (UTC)
 * What I mean by "unstressed forms" is where the sound in question occurs in syllables other than the last one. In blesse/blesses/blessent, aime/aimes/aiment, only [ɛ] occurs. You make the same point that [e] becomes [ɛ] before a schwa, although that doesn't always apply, e.g. dépecer has [e] before a schwa I think. But it does always apply in the last syllable. Benwing2 (talk) 08:53, 23 September 2016 (UTC)


 * In theory, it's [ɛCe] ; in practice, it tends heavily toward [eCe], per the "vocalic harmonization" : in essence, an [e̞]. In the south, it's pretty much a straight [e]. Lmaltier's "[some] people tend to use [e] while they are convinced that they use [ε]" is right on. How about an unconventional middle-of-the-road [e̞Ce] ? --Jerome Potts (talk) 10:43, 23 September 2016 (UTC)

Yes, it's right, in dépecer, é is pronounced [e]. I should have written "the sound [e] is normally always changed to [ɛ] before a schwa when the schwa is not pronounced". But there are exceptions: in dépecer, e is normally pronounced, but if omitted, the sound [e] would still be kept. Lmaltier (talk) 19:25, 23 September 2016 (UTC)
 * What happens in verb forms like payerons and payerions? The French Wiktionary has /pej(ə).ʁɔ̃/ [missing a dot] and /pe.jə.ʁjɔ̃/ but I wonder if it shouldn't be /pɛ.j(ə).ʁɔ̃/ and /pɛ.jə.ʁjɔ̃/, in keeping with the general rule that /ɛ/ not /e/ occurs before a schwa (cf. céderons /sɛ.d(ə).ʁɔ̃/ céderions /sɛ.də.ʁjɔ̃/). Benwing2 (talk) 17:24, 24 September 2016 (UTC)
 * Also, is paierons pronounced /pe.ʁɔ̃/, /pɛ.ʁɔ̃/ or both? Benwing2 (talk) 17:46, 24 September 2016 (UTC)
 * BTW thank you all of you for all your comments! Benwing2 (talk) 17:47, 24 September 2016 (UTC)
 * In fr:payerions and fr:paierons, I mention \pɛ.jə.ʁjɔ̃\ and \pɛ.ʁɔ̃\ (the fr.wikt \\ convention means that variations are possible). Lmaltier (talk) 18:46, 25 September 2016 (UTC)
 * Thanks. I take it that \pe.jə.ʁjɔ̃\ and \pe.ʁɔ̃\ are not possible? This is what I've implemented. Benwing2 (talk) 20:26, 25 September 2016 (UTC)
 * pe.jə.ʁjɔ̃ does not seem normal. pe.ʁɔ̃ is possible. But you should not try to mention all possible personal or regional variations. Lmaltier (talk) 05:55, 26 September 2016 (UTC)

liaison test cases
Can you add test cases for liaison? I've added some support for liaison; you need to explicitly indicate places with liaison using the liaison marker ‿. See the existing test cases for vous avez, s'en aller and dort-il for examples. Thanks! Benwing2 (talk) 05:59, 30 September 2016 (UTC)
 * Is that pronunciation right on grand arbre? I think it is... — JohnC5

schwa mid-word between two consonants
One source I read said that schwas cannot be pronounced in the middle of a word between two consonants, esp. at a morpheme boundary. Hence rétablissement must be /ʁe.ta.blis.mɑ̃/ and not */ʁe.ta.bli.s(ə).mɑ̃/, and fruiterie must be /fʁɥi.tʁi/ and not */fʁɥi.t(ə).ʁi/, and gâterons must be /gɑ.tʁɔ̃/ and not */gɑ.t(ə).ʁɔ̃/, and presumably also aimerons must be /ɛm.ʁɔ̃/ not */ɛ.m(ə).ʁɔ̃/, and deuxièmement must be /dø.zjɛm.mɑ̃/ not */dø.zjɛ.m(ə).mɑ̃/, etc. Is this true? Benwing2 (talk) 04:03, 8 October 2016 (UTC)
 * I would rather write /fʁɥit.ʁi/ and /gɑt.ʁɔ̃/. This affirmation is excessive. The schwa is generally omitted in such cases, but it can be pronounced, especially in some regions, and when reading poems. Lmaltier (talk) 05:12, 8 October 2016 (UTC)
 * At a certain point we have to make judgment calls about when to write schwas and when not. I decided to leave out the schwa in the sequence VCəCV within a single word, except when one of the vowels is itself a schwa. I may make an exception for schwas following dé-, since the schwa seems more likely to be pronounced in dépecer, décevais, etc. Benwing2 (talk)

à double tranchant
(moved from Talk:à double tranchant)

Is it correct that the schwa is optional? This is what frwikt says, but that would produce a cluster /bltr/ that seems unpronounceable. (I'm not referring to the colloquial pronunciation that would reduce double to /dub/.) Benwing2 (talk) 20:24, 5 October 2016 (UTC)
 * It's optional, except when it would be impossible to pronounce the word without it. Here, the schwa may be omitted (a dubl tʁɑ̃ʃɑ̃) (inside a word, I agree that /bltʁ/ would be impossible) but I feel it's usually pronounced because of the bl tʁ. Lmaltier (talk) 20:58, 5 October 2016 (UTC)
 * Well, I'm not sure. Let's rather say that both are possible. Lmaltier (talk) 05:14, 8 October 2016 (UTC)
 * I would say that this should be represented without the schwa, because each word on its own does not have a schwa. A French person would know that if a cluster seems too difficult, adding in a schwa is always an option, so one may very well pronounce it that way, but it doesn't have to be reflected in the IPA transcription. Also, I don't think that that pronunciation is impossible; difficult perhaps, but with the rest of the word on either side, doable. 2WR1 (talk) 06:06, 11 October 2016 (UTC)
 * The thing is, our pronunciations aren't meant for French speakers but for English speakers, who won't know if a cluster is too difficult or not. That's exactly the purpose of indicating whether a schwa is mandatory (or at least usual), optional or omitted. If a French speaker includes a schwa most of the time, we should write the schwa; if they feel that the schwa can equally as well either be pronounced or omitted, we should write an optional schwa; if they feel that the schwa would usually be left out, we should leave it out. In this case, it sounds like we need at least an optional schwa, because leaving it out is "difficult". Benwing2 (talk) 20:58, 11 October 2016 (UTC)
 * I guess so, I just think that's sort of hard to quantify because it could vary so much from region to region, speaker to speaker. 2WR1 (talk) 03:28, 12 October 2016 (UTC)

à force de
My current algorithm produces /a fɔʁ.s(ə) də/ with optional schwa. The same algorithm produces parles-tu /paʁ.l(ə) ty/, parlent-ils /paʁ.l(ə)‿.til/ but montres-tu /mɔ̃.tʁə ty/. The rule here is that final schwa after two consonants is optional when the next word begins with a consonant, except when the schwa follows certain clusters like /tʁ/, /bl/, /tm/ etc. (basically, /p b t d k g s z f v ʃ ʒ/ followed by /ʁ l m n/). Does this sound right to you? Would it be better to render /a fɔʁs də/ (as frwikt does), /paʁl ty/, /paʁl‿.til/ etc. with no schwa? Benwing2 (talk) 19:47, 8 October 2016 (UTC)
 * I don't think that it would be better. C'est une question de choix. JackPotte (talk) 20:51, 8 October 2016 (UTC)
 * I don't think that this is necessary, in either case. A schwa is always optional at the end of a French word, at the very least for emphasis. Like I mentioned previously, a French speaker knows this and can insert a schwa if they want to, I think it's a bit excessive to include it in the transcription. 2WR1 (talk) 06:10, 11 October 2016 (UTC)
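
The rule stated at the top of this section (final schwa after two consonants before a consonant-initial word is optional, unless the cluster is an obstruent followed by /ʁ l m n/) is small enough to write out. The following Python sketch is only an illustration with made-up names; it covers just that one context and says nothing about other environments.

```python
# Hypothetical sketch of the rule as stated in this section.
OBSTRUENTS = set("pbtdkgszfvʃʒ")
SONORANTS = set("ʁlmn")

def final_schwa(c1, c2):
    """Schwa to write after the word-final cluster c1 + c2 when the next
    word begins with a consonant: mandatory "ə" or optional "(ə)"."""
    if c1 in OBSTRUENTS and c2 in SONORANTS:
        return "ə"       # e.g. /tʁ/ in montres-tu
    return "(ə)"         # e.g. /ʁs/ in à force de, /ʁl/ in parles-tu

print("a fɔʁ.s" + final_schwa("ʁ", "s") + " də")
print("mɔ\u0303.tʁ" + final_schwa("t", "ʁ") + " ty")
```

The alternative raised in the replies, writing no schwa at all in the optional case, would amount to returning an empty string instead of "(ə)".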

h-aspiré with asterisk
I notice that one of the testcases that is still failing is the one for haïr that I put the asterisk before, as you suggested. Did you want me to just remove that, or were you thinking about implementing something with that? 2WR1 (talk) 04:59, 17 October 2016 (UTC)
 * I am still deciding what to do with that so leave it for now. I will probably use it in reflexive verbs, for example. Benwing2 (talk) 05:47, 17 October 2016 (UTC)

geignais
The pronunciation of geignais is given as /ʒi.ɲɛ/ in the table at geindre. I would have expected the first vowel to be /ɛ/, as in peignais (see peindre). Is this a mistake in the module? Hftf (talk) 03:21, 21 October 2016 (UTC)
 * Yes, this is a mistake, thanks for pointing it out. Benwing2 (talk) 15:56, 21 October 2016 (UTC)

intransitif/-ive
, could you change this to predict in intransitif/-ive? I suspect line 197 should be more general. — justin(r)leung { (t...) 05:05, 19 February 2017 (UTC)
 * Pinging as well. — justin(r)leung { (t...) 05:43, 19 February 2017 (UTC)

Geminate consonants in conjugations of courir, mourir
It is widely mentioned in French linguistics books that the future and conditional conjugations of courir, mourir have a long /ʁ/, e.g. elle courra /ɛlkuʁʁa/. This is specific to these two verbs, as pouvoir pronounces just one /ʁ/ even with two r's spelt: elle pourra /ɛlpuʁa/. The reason is that for courir, mourir, their future and conditional stems end in /ʁ/, while the conjugation endings begin with another one. Any fix?

A reference: https://books.google.co.uk/books?id=17jiE4jm-XUC&pg=PA149&lpg=PA149&dq=french+long+consonant+courra&source=bl&ots=vDrWDVufeo&sig=Un4jjwIzV5SrXGge83x4Kvl3d0E&hl=en&sa=X&ved=2ahUKEwj4-OK_kqnfAhXLfFAKHVzXDQ8Q6AEwEnoECAIQAQ#v=onepage&q=french%20long%20consonant%20courra&f=false Rethliopuks (talk) 10:20, 18 December 2018 (UTC)
 * That's correct. The conjugation tables need fixing. Per utramque cavernam 10:37, 18 December 2018 (UTC)
 * Fixed. Benwing2 (talk) 04:38, 20 December 2018 (UTC)

-psych-, -chr-
Hi. I changed the module to handle -psych- like -psik-. Apparently that is wrong in psychique? Are there very many such words? They could be respelled using 'psysh' but if there are very many, maybe I should undo this change. I also changed the module to handle -chr- like -cr- and initial neur- as /nøʁ/, are there any exceptions to that? Benwing2 (talk) 18:53, 21 May 2022 (UTC)
 * BTW The ones I can find through some basic searching are psychique along with related psychiquement, psychisme; psyché, maybe also bradypsychie, tachypsychie. Currently there are 118 total French words in -psych- in Wiktionary. Benwing2 (talk) 19:00, 21 May 2022 (UTC)
 * Not many words like that, no, so I think making /k/ the default as you did is the right call.
 * About and, these are specialized terms I've never heard before, but probably /ʃ/ (compare , , ).
 * About -chr- and neur-: sounds like the right call too, I can't think of any exception off the top of my head. PUC – 20:51, 21 May 2022 (UTC)
 * I also changed -chl- and -chn- to have hard /k/ by default (-chlor-, chlamydie, cochléaire, etc.; -techn-, -arachn-, etc.). I didn't change -chm- because there are few Greek-derived French words in -chm- and several in -chm- with /ʃ/. I fixed up all the words with -chr-, -chl- and -chn- with soft /ʃ/ in them. I am now looking into changing -aur- to be /ɔʁ/ by default instead of /oʁ/. BTW based on your statement below about -isme I will probably fix the module to generate two pronunciations for words in -isme. Benwing2 (talk) 08:50, 22 May 2022 (UTC)

-isme
A specific IP has been adding pronunciations for a while for words in -isme to give them the pronunciation /izm/. I have recently been removing them but I'd like to know if a pronunciation in /izm/ is possible along with /ism/ (which appears standard). Benwing2 (talk) 19:04, 21 May 2022 (UTC)
 * It's possible and in fact very common, in my experience. /-ism/ might be becoming somewhat of a purism. PUC – 20:54, 21 May 2022 (UTC)

rhymes
One more ping about rhymes ... I am thinking of implementing, similar to which automatically generates rhymes and hyphenation as well as pronunciation, given respelling. One thing strange is that in Wiktionary, words with glides /j/, /w/, /ɥ/ before the rhyming syllable have the glide included in the rhyme. E.g. annuel has the rhyme given as /-ɥɛl/ rather than /-ɛl/ as I'd expect; similarly hier has rhyme /-jɛʁ/ and joindre has rhyme /-wɛ̃dʁ/. To make it even more confusing, fois has rhyme /-a/ not /-wa/. Are these glide-initial rhymes correct? If so, is there a rule indicating when to include the glide and when not to include it? Benwing2 (talk) 19:13, 21 May 2022 (UTC)

alternative pronunciations
Hi. I am in the process of making the module automatically add alternative pronunciations in certain cases: This can be disabled using 1.
 * 1) Any time there is an /ɛ/ in an unstressed open syllable, the module will generate a second pronunciation with /e/, e.g. aimer will automatically get /ɛ.me/ and /e.me/.
 * 2) Conversely, any time there is an /e/ in an unstressed closed syllable, the module will generate a second pronunciation with /ɛ/, e.g. prévenant will automatically get /pʁev.nɑ̃/ and /pʁɛv.nɑ̃/.
 * 3) Any time there is an oral /ɑ/, the module will generate two pronunciations, one with /a/ (given first) and another with /ɑ/ (given second).
 * 4) Any time there is a word ending in /ism/ or /is.mə/, the module will generate a second pronunciation with /izm/ or /iz.mə/, respectively.

The module detects if such alternative pronunciations are given explicitly and will remove duplicates, e.g. if aimer is specified with respellings aimer and émer, the module will generate /e.me/ only once.

Hope this is OK with you, if not let me know. Note also that maybe we should do something similar with /ɔ/. I've noticed that in prefixes like géo-, micro-, neuro-, etc., frwikt is very inconsistent in using /ɔ/ or /o/, which suggests that they have merged in normal speech. (TLFi generally recommends /ɔ/ in all such cases.) But I'm a bit reluctant to do this because I feel intuitively that there's a difference between e.g. and. What do you think?

I have written the code to do this but Module:fr-verb, which uses Module:fr-pron, still needs some fixes to handle the multiple pronunciations, which I'm currently working on. I'll let you know when I'm done. Benwing2 (talk) 20:27, 29 May 2022 (UTC)
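
To illustrate the four rules listed in this section and the duplicate removal, here is a small Python sketch. It is not the module's Lua implementation. Assumptions made for the example: pronunciations are dot-syllabified strings without parenthesised schwas, "unstressed" means any syllable but the last, each rule is applied to every variant produced so far, and all function names are invented.

```python
import re

TILDE = "\u0303"                              # combining tilde marks nasal vowels
ORAL_BACK_A = re.compile("ɑ(?!" + TILDE + ")")

def _in_unstressed(pron, pattern, repl):
    """Apply a substitution in every syllable except the last one."""
    syls = pron.split(".")
    return ".".join([re.sub(pattern, repl, s) for s in syls[:-1]] + syls[-1:])

def _open_eh(p):      # rule 1: syllable-final oral /ɛ/ -> also /e/
    return [p, _in_unstressed(p, "ɛ$", "e")]

def _closed_e(p):     # rule 2: /e/ followed by a coda -> also /ɛ/
    return [p, _in_unstressed(p, "e(?=[^" + TILDE + "])", "ɛ")]

def _back_a(p):       # rule 3: oral /ɑ/ -> /a/ first, /ɑ/ second
    return [ORAL_BACK_A.sub("a", p), p] if ORAL_BACK_A.search(p) else [p]

def _isme(p):         # rule 4: /ism/ -> also /izm/
    for old, new in (("ism", "izm"), ("is.mə", "iz.mə")):
        if p.endswith(old):
            return [p, p[: -len(old)] + new]
    return [p]

def alternatives(pron):
    """Expand one pronunciation into its automatic variants, without duplicates."""
    results = [pron]
    for rule in (_open_eh, _closed_e, _back_a, _isme):
        expanded = []
        for p in results:
            for variant in rule(p):
                if variant not in expanded:
                    expanded.append(variant)
        results = expanded
    return results

def all_pronunciations(prons):
    """Merge the variants of several respellings, keeping first occurrences."""
    merged = []
    for p in prons:
        for variant in alternatives(p):
            if variant not in merged:
                merged.append(variant)
    return merged

print(alternatives("ɛ.me"))
print(all_pronunciations(["ɛ.me", "e.me"]))
```

The last call mirrors the aimer/émer case above: the second respelling contributes nothing new, so /e.me/ appears only once.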

3 questions
@User:Benwing2: Hello! I am trying to reuse this module (with attribution) on some other wiktionaries. Problems I have spotted so far: Thanks. Taylor 49 (talk) 13:21, 23 July 2022 (UTC)
 * The module is currently linked to Module:User:Benwing2/fr-pron, apparently for debugging purposes. Is this temporary or permanent? Wouldn't it be better vice versa, i.e. placing the debug and comparison stuff into "User:Benwing2/fr-pron", and keeping the stable version "clean"?
 * There is currently no documentation about the module (ie exported functions), still there is about the template (except exported functions of the module), and currently all tests do fail. Is this a work-in-progress? I have seen the section above "alternative pronunciations", and the recently introduced line  that probably caused this breakage.
 * I can see 152 pronunciation modules for various languages ( excluding English :-D ), but there seems to be a lot of redundancy, ie every single module has its private way to tackle the common tasks (merging automatic pronunciations with the manual ones, brewing the section, ...). Is there some effort to optimize and streamline this?

In response to your questions:
 * 1) The link to Module:User:Benwing2/fr-pron is for testing purposes. Specifically, it works like this: (1) I create a new version of the module in Module:User:Benwing2/fr-pron. (2) I set   in the production module. (3) The MediaWiki software notices the change to the production module and (over time) regenerates the pronunciation of all pages using the module. As the production module generates the pronunciation of individual pages, it also executes the code in Module:User:Benwing2/fr-pron to generate the new version's pronunciation, compares them, and if they're different, adds the page to Template:tracking/fr-pron/different-pron using the tracking mechanism in Module:debug; otherwise it adds the page to Template:tracking/fr-pron/same-pron. The comparison code can't be placed entirely in the sandbox module because it relies on MediaWiki's generation of pronunciation for all pages that use fr-IPA, which only happens with the production module. The alternative is for me to use a bot to generate the pronunciation of all pages, calling both the production and sandbox modules and comparing them in the bot script. I have in fact done something like this, especially when rewriting inflection modules, but it is a relatively slow process (it takes 1-2 seconds to call both modules from a bot and there may be around 100,000 pages to process).
 * 2) My apologies for the broken test cases; I forgot about the existing unit test mechanism when I made some recent changes to the module for use in Module:fr-verb. As for documentation, I don't always manage to document everything, but I try to do so, and generally I do so in the code itself rather than in a place like Module:fr-pron/documentation, because the code comments are more likely to get updated correctly when the code is changed. As an example, see Module:inflection utilities, which has quite a lot of documentation in the code but almost none in Module:inflection utilities/documentation. If you have specific questions about how anything works, please let me know and I'll try to answer them.
 * 3) There is not actually so much redundancy across the various pronunciation modules. Nearly everything in this particular module, for example, is specific to French, and the actual formatting of the pronunciation(s) is in Module:IPA, which is shared across the pronunciation modules. You also have to keep in mind that many of the modules in general are created by amateur programmers who don't know much about code reuse and refactoring, so they tend to copy modules and modify them rather than factor out the common code. For example, almost all of the inflection modules that I've written make use of Module:inflection utilities, but mostly it's only my modules that use this (along with a few other modules that people have created by copying one of my modules). That said, there is a lot more code reuse than you might expect given the situation; for example, you will find things like Module:links and Module:parameters used by hundreds or thousands of other modules. Benwing2 (talk) 04:06, 25 July 2022 (UTC)
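The comparison described in point 1 can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Lua code in Module:fr-pron: the function `compare_pronunciations` and the two toy generators are hypothetical stand-ins, and only the bucket names `same-pron` and `different-pron` are taken from the tracking pages mentioned above.

```python
from typing import Callable, Dict, List


def compare_pronunciations(
    words: List[str],
    production: Callable[[str], str],
    sandbox: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Run both generators on each word and bucket the word by whether they agree."""
    buckets: Dict[str, List[str]] = {"same-pron": [], "different-pron": []}
    for word in words:
        key = "same-pron" if production(word) == sandbox(word) else "different-pron"
        buckets[key].append(word)
    return buckets


# Hypothetical stand-ins for the production and sandbox modules.
def toy_production(word: str) -> str:
    return "/" + word + "/"


def toy_sandbox(word: str) -> str:
    # Pretend the new version collapses a doubled "n".
    return "/" + word.replace("nn", "n") + "/"


result = compare_pronunciations(["abc", "annee"], toy_production, toy_sandbox)
print(result)
```

For scale, at 1 to 2 seconds per page, a bot run over around 100,000 pages works out to roughly 28 to 56 hours, which is why letting the production module do the comparison as pages are regenerated is attractive.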


 * @User:Benwing2: Thanks.
 * 1. I see.
 * 2. I will check the code again (ASAP or maybe when the tests are fixed) and ask if I fail to adjust it correctly or have more specific questions. ;-)
 * 3. I have already worked with Module:parameters before and succeeded in simplifying it drastically. :-D But there does not seem to be an overall solution for, for example, merging autogenerated pronunciations with manual ones, regression tests (you seem to be very pedantic in this respect), or merging the pagename with template parameters. Well, I have looked at several languages (eo, fi, fr) and they solve the same tasks differently.
 * Taylor 49 (talk) 13:31, 25 July 2022 (UTC)
 * When you say "pedantic" what are you referring to in this case? (Usually "pedantic" has negative connotations.) Also, Module:parameters has many features that are used by various modules; you can of course simplify it by taking out features, but you will find soon enough that some modules need these features. And we do have a unit test module in Module:UnitTests. Finally, I'm not really sure what you mean e.g. by having a common module for features like "merging autogenerated pronunciations with manual ones", or at least I don't understand what this feature would involve; maybe you can explain more? Manual pronunciations are specified using IPA directly, while autogenerated pronunciations come from the module. As for the broken unit tests, let me see if I can fix them soon. Benwing2 (talk) 01:32, 26 July 2022 (UTC)


 * @User:Benwing2 I meant "pedantic" without negative connotations of course ... since you test all use cases of the module before approving a new version. Taylor 49 (talk) 21:53, 26 July 2022 (UTC)

The use of +
Hi. I've just added the IPA for. I wasn't sure if I needed + to make it /-kt/ but it seems to have worked correctly without it, according to the note next to "suspect": "final -ct after a vowel is pronounced by default".

Could the documentation explain better how to use +, please? Anatoli T. (обсудить/вклад) 23:04, 20 December 2022 (UTC)