User:OrenBochman/bots/ipa

IPA-BOT

 * 1) A bot to automate the generation of IPA entries.
 * 2) input: the spelling.
 * 3) input: a phonemic model.
 * 4) input: all the existing IPA data.

Features

 * 1) knowledge-based version (rule-based).
 * 2) start with languages that have simple spelling-to-sound mappings, such as Hungarian and Swahili.
 * 3) add phonemic adjustment
 * 4) assimilation
 * 5) elision
 * 6) data-based version (statistical).
 * 7) HMM trained on input/output data (spelling paired with existing IPA).
 * 8) use existing text to do.
 * 9) per language on/off flag
 * 10) check flag - add a template for human checking (for proper nouns).
 * 11) hybrid version.
 * 12) use both models, plus a discriminator that chooses between their outputs.
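
A minimal sketch of the rule-based version for Hungarian, assuming a small hand-written grapheme table and only one phonemic adjustment (regressive voicing assimilation). The names GRAPHEMES, to_phonemes, assimilate_voicing and ipa are hypothetical and not part of any existing bot; the table is illustrative and leaves out elision, stress, long consonants and the special behaviour of "h".

```python
import unicodedata

# Illustrative grapheme-to-phoneme table for Hungarian (assumption: simplified,
# no tie bars, no long consonants). Multi-letter graphemes must be matched first.
GRAPHEMES = {
    "dzs": "dʒ",
    "cs": "tʃ", "dz": "dz", "gy": "ɟ", "ly": "j", "ny": "ɲ",
    "sz": "s", "ty": "c", "zs": "ʒ",
    "a": "ɒ", "á": "aː", "e": "ɛ", "é": "eː", "i": "i", "í": "iː",
    "o": "o", "ó": "oː", "ö": "ø", "ő": "øː", "u": "u", "ú": "uː",
    "ü": "y", "ű": "yː",
    "b": "b", "c": "ts", "d": "d", "f": "f", "g": "ɡ", "h": "h",
    "j": "j", "k": "k", "l": "l", "m": "m", "n": "n", "p": "p",
    "r": "r", "s": "ʃ", "t": "t", "v": "v", "z": "z",
}

# Voiceless obstruent -> voiced counterpart, and the reverse mapping.
TO_VOICED = {"p": "b", "t": "d", "k": "ɡ", "f": "v", "s": "z",
             "ʃ": "ʒ", "ts": "dz", "tʃ": "dʒ", "c": "ɟ"}
TO_VOICELESS = {voiced: voiceless for voiceless, voiced in TO_VOICED.items()}


def to_phonemes(word):
    """Greedy longest-match conversion of a spelling to a list of phonemes."""
    word = unicodedata.normalize("NFC", word.lower())
    phonemes = []
    i = 0
    while i < len(word):
        for size in (3, 2, 1):  # try trigraphs, then digraphs, then single letters
            chunk = word[i:i + size]
            if chunk in GRAPHEMES:
                phonemes.append(GRAPHEMES[chunk])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phonemes


def assimilate_voicing(phonemes):
    """An obstruent takes the voicing of the obstruent that follows it.

    Scanning right to left lets a whole cluster agree with its last member.
    Simplification: "v" undergoes assimilation but does not trigger voicing.
    """
    result = list(phonemes)
    for i in range(len(result) - 2, -1, -1):
        cur, nxt = result[i], result[i + 1]
        if nxt in TO_VOICELESS and nxt != "v" and cur in TO_VOICED:
            result[i] = TO_VOICED[cur]        # voiceless before voiced
        elif nxt in TO_VOICED and cur in TO_VOICELESS:
            result[i] = TO_VOICELESS[cur]     # voiced before voiceless
    return result


def ipa(word):
    """Spelling -> broad IPA string under the simplified rules above."""
    return "".join(assimilate_voicing(to_phonemes(word)))
```

With this table, ipa("népdal") gives "neːbdɒl" and ipa("rabság") gives "rɒpʃaːɡ". Greedy matching will misread letter sequences that straddle a morpheme boundary (for example z + s in a compound), which is one reason a check flag for human review is still useful.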

Issues

 * 1) Q.A.: train and test on a 95% / 5% split of the existing annotations, per language.
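
The 95% / 5% check could be sketched as below, assuming the existing annotations for one language are available as (spelling, IPA) pairs and that a model is any callable from spelling to IPA. The names split_entries and word_accuracy are hypothetical, and whole-word exact match is only one possible score.

```python
import random


def split_entries(entries, test_fraction=0.05, seed=0):
    """Shuffle one language's (spelling, ipa) pairs and hold out a test set."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def word_accuracy(model, test_entries):
    """Fraction of held-out words whose predicted IPA matches the annotation exactly."""
    correct = sum(1 for spelling, ipa in test_entries if model(spelling) == ipa)
    return correct / len(test_entries)
```

The split would be run once per language, so that a language with many annotations does not hide poor results on a language with few.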

Other Features

 * 1) poll:
 * 2) is there interest in generating TTS voice files for entries?
 * 3) is there interest in generating hyphenation as well?

Resources

 * 1) open-source TTS projects with language models and scripts for TTS.
 * 2) Mbrola
 * 3) Sphinx
 * 4) Hspell
 * 5) CMU dict for English.
 * 6) MALLET for graphical models.