Appendix:Bulgarian hyphenation

Hyphenation and syllabification of Bulgarian on Wiktionary
Both are handled by the template, which outputs hyphenation and syllabification separately, or only "Hyphenation" when the two coincide. The sections below explain how the two differ.

Hyphenation
Hyphenation is a system of rules that determines where a word may be broken across two lines with a hyphen. Word processors, for example, use it to find the points at which a word can be split onto a new line and still look pleasing. Syllabification, by contrast, divides a word into the spoken syllables that make it up.

For Bulgarian, the rules of hyphenation are published by the in their orthographic dictionary, and codify the precepts by which a valid hyphenation must abide.

The rules are as follows:


 * 1) A consonant between two vowels links with the second vowel. For example,.
 * 2) In a sequence of two or more consonants between two vowels, at least one consonant stays with the first vowel and at least one with the second vowel. For example,  and.
 * 3) Two equal consonants are separated. For example,.
 * 4) In a sequence of two or more vowels, the first vowel stays before the hyphen. For example  and.
 * 5) In a sequence of three or more vowels, the last vowel stays after the hyphen. For example,, but not.
 * 6) The letter  between a vowel and a consonant stays with the vowel. For example,.
 * 7) When a sequence of two or more consonants follows, at least one consonant links with . For example,  (not ).
 * 8) The letter  between two vowels links with the second vowel. For example.
 * 9) No hyphenation before or after.
 * 10) When the letters  denote a single consonant, then they are not separated. For example,  (not ), but.
 * 11) There must be at least one vowel before and after the hyphen.
 * 12) One letter does not stay alone.
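
As a rough illustration, the rules that do not depend on a specific letter (1, 2, 4, 5, 11 and 12) can be turned into a check that lists the positions where a break is allowed. This is a minimal Python sketch under stated assumptions, not the code used on Wiktionary: the names `valid_breaks` and `apply_rule_5` are hypothetical, the vowel set is the one given in the sonority scale on this page, every other character is treated as a consonant, and rules 3 and 6 to 10 are left out. The words маоизъм and сестра are only illustrative inputs.

```python
# Assumption: orthographic vowels, taken from the sonority scale on this page.
VOWELS = set("аъоуеиюя")


def valid_breaks(word, apply_rule_5=True):
    """Return every index i such that word[:i] + "-" + word[i:] passes
    rules 1, 2, 4, 5, 11 and 12 (rules 3 and 6-10 are NOT checked)."""
    w = word.lower()
    n = len(w)
    result = []
    # Rule 12: one letter does not stay alone, so 2 <= i <= n - 2.
    for i in range(2, n - 1):
        # Rule 11: at least one vowel on each side of the hyphen.
        if not any(c in VOWELS for c in w[:i]):
            continue
        if not any(c in VOWELS for c in w[i:]):
            continue
        prev_is_vowel = w[i - 1] in VOWELS
        next_is_vowel = w[i] in VOWELS
        # Rules 1, 2 and 4: a break directly before a vowel that follows a
        # consonant would leave every consonant of the cluster on the left.
        if not prev_is_vowel and next_is_vowel:
            continue
        if prev_is_vowel and not next_is_vowel:
            # Rule 2: a cluster of two or more consonants must leave at
            # least one consonant with the first vowel.
            j = i
            while j < n and w[j] not in VOWELS:
                j += 1
            if j - i >= 2:
                continue
            # Rule 5 (1983 only): in a run of three or more vowels, the
            # last vowel stays after the hyphen.
            if apply_rule_5:
                k = i
                while k > 0 and w[k - 1] in VOWELS:
                    k -= 1
                if i - k >= 3:
                    continue
        result.append(i)
    return result


print(valid_breaks("маоизъм"))                      # [2, 3]
print(valid_breaks("маоизъм", apply_rule_5=False))  # [2, 3, 4]
print(valid_breaks("сестра"))                       # [3, 4]
```

Returning all permitted positions, rather than a single hyphenated string, reflects the fact that a word can have several valid hyphenations.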

These are adapted and reproduced from this article by the University of Sofia. The rules above apply to the 1983 specification of the hyphenation standard, but they are also forward-compatible with the latest 2012 standard, which introduces the following two changes:
 * 1) Rule 5 is rescinded.
 * 2) A hyphenation that violates the above rules, but is more morphologically consistent (i.e. better separates the word on its  boundaries) is allowed.

Because both changes only permit additional hyphenations, and forbid none that were valid before, our algorithm still produces valid results under the 2012 standard.

As the University of Sofia notes, the 1983 hyphenation rules do not require a break to make morphological sense, which can make some hyphenations look strange. A word may also have several valid hyphenations, but the algorithm used on Wiktionary only ever outputs one.

Syllabification
Syllabification is the process of breaking a word down into its spoken syllables, each of which has (in Bulgarian) a vowel at its core, optionally with consonants before and after. For example, can be syllabified as. Unlike hyphenation, which concerns where a word can be split orthographically, syllabification concerns the phonetic structure of the word, and so follows the general phonetic rules below:


 * 1) Each syllable must have exactly one vowel.
 * 2) A new syllable is formed when the sonority of sounds stops decreasing.
 * 3) The sonority scale for Bulgarian is defined as follows:
   * Fricatives (в, ф, ж, ш, з, с, х): 1
   * Stops (plosives; б, п, г, к, д, т) and affricates (ч, ц): 2
   * Sonorants (л, м, н, р, й, ў): 3
   * Vowels (а, ъ, о, у, е, и, ю, я): 4
   * Anything else (not sounds, e.g. punctuation): 0
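
The sonority rules above can be sketched in Python as follows. This is one possible reading, not the algorithm actually used on Wiktionary: it assumes that "sonority stops decreasing" means the next syllable starts at the longest run of consonants whose sonority rises strictly towards the following vowel, and that the first syllable keeps all word-initial consonants. The names `sonority` and `syllabify` are hypothetical, letters not listed in the scale fall into class 0 as a simplification, and the input words are only illustrative.

```python
# Sonority classes exactly as listed in the scale above.
SONORITY_CLASSES = {
    1: "вфжшзсх",    # fricatives
    2: "бпгкдтчц",   # stops and affricates
    3: "лмнрйў",     # sonorants
    4: "аъоуеиюя",   # vowels
}
SONORITY = {ch: level
            for level, letters in SONORITY_CLASSES.items()
            for ch in letters}


def sonority(ch):
    """Sonority level of a character; anything unlisted is 0."""
    return SONORITY.get(ch.lower(), 0)


def syllabify(word):
    """Split `word` into syllables, one vowel per syllable."""
    levels = [sonority(c) for c in word]
    vowels = [i for i, s in enumerate(levels) if s == 4]
    if len(vowels) < 2:
        return [word]
    cuts = []
    for prev_vowel, vowel in zip(vowels, vowels[1:]):
        start = vowel
        # Walk left from the vowel while sonority keeps rising strictly
        # towards it; stop at the previous vowel or at a fall/plateau.
        while start - 1 > prev_vowel and levels[start - 1] < levels[start]:
            start -= 1
        cuts.append(start)
    pieces, last = [], 0
    for cut in cuts:
        pieces.append(word[last:cut])
        last = cut
    pieces.append(word[last:])
    return pieces


print(syllabify("сестра"))   # ['се', 'стра']
print(syllabify("картина"))  # ['кар', 'ти', 'на']
```

In сестра the levels are 1, 4, 1, 2, 3, 4, so the rising run с, т, р before the final vowel becomes the onset of the second syllable.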

The rules above form the core of syllabification. We also make some smaller adjustments, because this general process does not always give the right result:
 * 1) Certain prefixes, such as, , and , would be incorrectly handled by this algorithm, so we treat them specially to ensure they always appear in their correct form at the beginning of a word.
 * 2) There are also three sequences that the sonority rules would break up, but that Bulgarian keeps together. These are ств, св, and вс.
 * 3) Certain consonant clusters, e.g. км, цн, тн, згн, adhere to the rising sonority principle, but are unnatural as the onset of a syllable (in Bulgarian, at least). We check each consonant cluster against these, and if it is one of them, we break it up differently to make the syllabification more natural.
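
Adjustments 2 and 3 can be pictured as a correction applied to the onset length that the sonority rules produce for a consonant cluster between two vowels. The Python sketch below is hypothetical throughout: the function name `adjust_onset`, the choice to apply the keep-together rule only when the cluster ends in one of the three sequences, and the choice to repair an unnatural onset by moving its first consonant to the previous syllable are all assumptions, since the text above does not say how the cluster is re-split. The cluster lists contain only the examples named above, and the prefix handling from adjustment 1 is not covered.

```python
# Sequences that sonority would split but that are kept together.
KEEP_TOGETHER = ("ств", "св", "вс")
# Rising-sonority clusters named above as unnatural onsets (examples only).
UNNATURAL_ONSETS = {"км", "цн", "тн", "згн"}


def adjust_onset(cluster, onset_len):
    """Return the adjusted number of trailing consonants of `cluster`
    that begin the next syllable.

    `cluster` is the consonant run between two vowels; `onset_len` is
    the onset length suggested by the sonority rules.
    """
    # Adjustment 2: do not split a keep-together sequence at the end of
    # the cluster (assumption: only the end of the cluster is checked).
    for seq in KEEP_TOGETHER:
        if cluster.endswith(seq) and onset_len < len(seq):
            return len(seq)
    # Adjustment 3: an unnatural onset gives up its first consonant to
    # the previous syllable (assumption: the source gives no details).
    onset = cluster[len(cluster) - onset_len:]
    if onset in UNNATURAL_ONSETS:
        return onset_len - 1
    return onset_len


print(adjust_onset("йств", 1))  # 3: onset becomes "ств"
print(adjust_onset("тн", 2))    # 1: onset becomes "н"
print(adjust_onset("стр", 3))   # 3: unchanged
```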