Thread:User talk:Kephir/Module:ar-translit, Module:ko-translit/reply

As for Korean, I noticed that the romanisation scheme was very simplistic and did not match WT:AKO, so I rewrote it based on the Revised Romanization of Korean article on Wikipedia. I was pleased that the rewrite passes all the testcases but two. I was wondering whether the current output for these two is acceptable, how to change the algorithm if not, and whom to ask about it. So, good thing you approached me about this. I do not think this algorithm can be made any simpler, given all the consonant combinations it has to take into account.
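To illustrate why the consonant combinations matter, here is a rough Python sketch (not the module's actual Lua code) that decomposes precomposed Hangul syllables arithmetically and romanises each syllable in isolation. The table contents follow the Revised Romanization letter values as I understand them; the function names `decompose` and `naive_romanise` are made up for this example.

```python
# Sketch only: per-syllable romanisation WITHOUT any assimilation rules.
# Names and structure are illustrative, not taken from Module:ko-translit.

HANGUL_BASE = 0xAC00   # first precomposed syllable, U+AC00
HANGUL_LAST = 0xD7A3   # last precomposed syllable, U+D7A3

INITIALS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
            "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
MEDIALS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa",
           "wae", "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
# Simplified syllable-final values; clusters are reduced to one sound.
FINALS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k",
          "m", "l", "l", "l", "p", "l", "m", "p", "p", "t",
          "t", "ng", "t", "t", "k", "t", "p", "t"]


def decompose(ch):
    """Return (initial, medial, final) indices, or None if ch is not a syllable."""
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        return None
    offset = code - HANGUL_BASE
    # Each initial covers 21 * 28 syllables; each medial covers 28.
    return offset // (21 * 28), (offset // 28) % 21, offset % 28


def naive_romanise(text):
    """Romanise syllable by syllable, ignoring interactions across boundaries."""
    out = []
    for ch in text:
        parts = decompose(ch)
        if parts is None:
            out.append(ch)  # pass non-Hangul characters through unchanged
        else:
            ini, med, fin = parts
            out.append(INITIALS[ini] + MEDIALS[med] + FINALS[fin])
    return "".join(out)
```

With this sketch, 한글 comes out as "hangeul", but 한국어 comes out as "hangukeo" rather than the expected "hangugeo", because the final consonant of 국 is not re-evaluated before the following vowel. Handling such cases is what makes the real algorithm longer.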

As for Arabic, I only changed the string literals to use numeric escapes, because editing mixed-directional text is always a mess. (By the way, I spotted two U+200E LEFT-TO-RIGHT MARK characters which were probably put there by accident, so I removed them.) Perhaps we could define some character constants and use them later in the code so that the intent becomes clearer. Some unit tests would also be helpful here.
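The idea of named constants could look roughly like the following Python sketch (the actual module is Lua, and these constant names, the `SHORT_VOWELS` table, and the `strip_lrm` helper are only suggestions of mine, not existing code). Each constant is written as a numeric escape, so the source file itself contains no right-to-left text.

```python
# Sketch only: named constants for Arabic marks, written as numeric escapes.
# The names are hypothetical; they are not taken from Module:ar-translit.

FATHA = "\u064E"   # ARABIC FATHA
DAMMA = "\u064F"   # ARABIC DAMMA
KASRA = "\u0650"   # ARABIC KASRA
SHADDA = "\u0651"  # ARABIC SHADDA
SUKUN = "\u0652"   # ARABIC SUKUN
LRM = "\u200E"     # LEFT-TO-RIGHT MARK, invisible and easy to paste by accident

# Using the names makes a mapping table readable without decoding escapes.
SHORT_VOWELS = {
    FATHA: "a",
    DAMMA: "u",
    KASRA: "i",
}


def strip_lrm(text):
    """Remove stray LEFT-TO-RIGHT MARK characters from a string."""
    return text.replace(LRM, "")
```

A unit test can then check, for instance, that `strip_lrm` leaves visible text untouched and that every key in `SHORT_VOWELS` is exactly one code point long.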

I also try to go over a string only once, instead of several times, and to use a simple iteration instead of a (not really) full regular expression engine, for performance. Imagine a requests page full of strings to transliterate. (I would rewrite Module:ru-translit the same way. But testcases first. We have very few unit tests. We only notice errors when pages break with the oh-so-helpful Script errors.)
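To make the single-pass idea concrete, here is a toy Python sketch (again, the real modules are Lua, and the tiny table below is invented for the example and is nowhere near a complete Russian scheme). `multi_pass` runs one substitution over the whole string per table entry, while `single_pass` walks the string once and looks each character up in a table.

```python
# Sketch only: a toy table, NOT the mapping used by Module:ru-translit.
TABLE = {
    "\u0430": "a",    # CYRILLIC SMALL LETTER A
    "\u0431": "b",    # CYRILLIC SMALL LETTER BE
    "\u0434": "d",    # CYRILLIC SMALL LETTER DE
    "\u0443": "u",    # CYRILLIC SMALL LETTER U
    "\u0448": "\u0161",  # CYRILLIC SMALL LETTER SHA -> s with caron
}


def multi_pass(text, table=TABLE):
    """One full scan of the string per table entry."""
    for src, dst in table.items():
        text = text.replace(src, dst)
    return text


def single_pass(text, table=TABLE):
    """One scan of the string in total; unknown characters pass through."""
    out = []
    for ch in text:
        out.append(table.get(ch, ch))
    return "".join(out)
```

Both functions give the same result for this table, but the multi-pass version does work proportional to the number of table entries times the string length, which adds up on a page with many strings to transliterate.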