Thread:User talk:CodeCat/Characters to strip/reply (13)

Would people actually write a Serbo-Croatian ā with separate diacritics? I just tried it out, and the software normalises diacritics itself, so a + combining macron can never actually appear in an entry. That means there will only be 6 replacement pairs and not 60. For Serbo-Croatian, the following would work:
 * Replace "[áàȁȃā]" with "a". These are all normalised by the software, so unless we decompose them ourselves they will always be precomposed and this pattern will match.
 * and so on for e, i, o, u.
 * Replace the combining forms of the above diacritics with "" (the empty string), which will take care of the Cyrillic variants (for which no precomposed forms exist in Unicode).
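
As a rough sketch of the six pairs listed above, here is how they might look in Python. The names `REPLACEMENT_PAIRS` and `strip_accents` are made up for illustration, and the `unicodedata.normalize("NFC", ...)` call only stands in for the normalisation the software already does on its own; the real entry-name code may well look different.

```python
import re
import unicodedata

# Hypothetical table of the six replacement pairs: five vowel classes
# (acute, grave, double grave, inverted breve, macron on each vowel)
# plus one class for the bare combining marks.
REPLACEMENT_PAIRS = [
    ("[\u00E1\u00E0\u0201\u0203\u0101]", "a"),  # á à ȁ ȃ ā
    ("[\u00E9\u00E8\u0205\u0207\u0113]", "e"),  # é è ȅ ȇ ē
    ("[\u00ED\u00EC\u0209\u020B\u012B]", "i"),  # í ì ȉ ȋ ī
    ("[\u00F3\u00F2\u020D\u020F\u014D]", "o"),  # ó ò ȍ ȏ ō
    ("[\u00FA\u00F9\u0215\u0217\u016B]", "u"),  # ú ù ȕ ȗ ū
    # Combining acute, grave, double grave, inverted breve, macron.
    ("[\u0301\u0300\u030F\u0311\u0304]", ""),
]


def strip_accents(text):
    """Apply the six replacement pairs to an NFC-normalised string."""
    # Assumption: this mimics the software composing diacritics itself.
    text = unicodedata.normalize("NFC", text)
    for pattern, replacement in REPLACEMENT_PAIRS:
        text = re.sub(pattern, replacement, text)
    return text
```

With this, a Latin a + combining macron is first composed to ā and then caught by the first pair, while a Cyrillic а + combining acute stays decomposed and is handled by the last pair.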