Wiktionary:About Pali

Background
Pali is a Middle Indo-Aryan language that is similar to Vedic Sanskrit. It is also the language of the earliest Buddhist scriptures.

Scripts
Pali uses the following scripts:


 * Roman
 * Devanagari
 * Burmese
 * Thai
 * Khmer
 * Sinhala
 * Tai Tham
 * Lao
 * Bengali (but seemingly nothing durably archived)
 * Chakma
 * Brahmi

Wiktionary uses the template to show forms of words in different scripts. See for an example.

Roman
The Roman script as used on Wiktionary uses the 33 consonant letters k, kh, g, gh, ṅ (velars); c, ch, j, jh, ñ (palatals); ṭ, ṭh, ḍ, ḍh, ṇ (retroflexes); t, th, d, dh, n (dentals); p, ph, b, bh, m (labials); y, r, l, v (semivowels); s, h, ḷ and ṃ (niggahita); and the 8 vowels a, ā, i, ī, u, ū, e and o. This arose as an application of IAST. Variant systems may be encountered with 'ŋ' for 'ṃ' (Pali Text Society (PTS)) or 'ṁ' for 'ṃ' (ISO 15919), and with á, í and ú for the contrastively long vowels. Very occasionally, the palatals and retroflexes may be denoted by italicising the letters for the velars and dentals. Finally, as the difference between 'ṅ' and 'n' is not phonemic, the PTS texts write 'n' for both.
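The letter inventory and the variant spellings described above can be sketched in code. The following is a minimal illustration, not anything Wiktionary itself runs: the names `normalise` and `letters` are hypothetical, and it assumes that a stop followed by 'h' is always to be read as a single aspirate letter.

```python
import unicodedata

# The 33 consonant letters and 8 vowels of the Roman script as listed above.
CONSONANTS = [unicodedata.normalize("NFC", c) for c in [
    "k", "kh", "g", "gh", "ṅ",
    "c", "ch", "j", "jh", "ñ",
    "ṭ", "ṭh", "ḍ", "ḍh", "ṇ",
    "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m",
    "y", "r", "l", "v", "s", "h", "ḷ", "ṃ",
]]
VOWELS = [unicodedata.normalize("NFC", v) for v in
          ["a", "ā", "i", "ī", "u", "ū", "e", "o"]]
LETTERS = set(CONSONANTS) | set(VOWELS)

# Variant spellings mentioned above, folded to the form used here.
VARIANTS = str.maketrans({
    "\u014B": "\u1E43",  # ŋ (PTS) -> ṃ
    "\u1E41": "\u1E43",  # ṁ (ISO 15919) -> ṃ
    "\u00E1": "\u0101",  # á -> ā
    "\u00ED": "\u012B",  # í -> ī
    "\u00FA": "\u016B",  # ú -> ū
})


def normalise(word):
    """Lower-case, compose to NFC and fold the variant spellings."""
    return unicodedata.normalize("NFC", word.lower()).translate(VARIANTS)


def letters(word):
    """Split a Roman-script word into letters, preferring two-character aspirates."""
    word = normalise(word)
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in LETTERS:
            out.append(pair)
            i += 2
        elif word[i] in LETTERS:
            out.append(word[i])
            i += 1
        else:
            raise ValueError(f"not a Pali letter: {word[i]!r}")
    return out
```

For instance, `letters("buddha")` yields the five letters b, u, d, dh, a. The PTS practice of writing 'n' for 'ṅ' cannot be undone by a simple character mapping, so the sketch leaves 'n' alone.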

Devanagari
Devanagari uses the same 33 consonant letters (except that niggahita is a combining mark, commonly known as anusvara) and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). There is no indication that anusvara is used as an abbreviation for a homorganic nasal.

Burmese
In principle, the Burmese script uses the same 33 consonant letters (except that niggahita is a combining mark, commonly known as anusvara) and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). However, the dependent vowels corresponding to 'ā' and 'o' each have two different forms, which are encoded separately. The choice is context-sensitive, and the rules for Burmese itself are known to have varied over time. An additional encoding complication is that the subscript forms corresponding to 'y', 'r', 'v' and 'h' are encoded in Unicode as combining marks known as 'medial consonants'. Finally, Unicode has separate letters for the geminates 'ññ' and 'ss'.

It is entirely possible that a Mon tradition may use a different letter for 'jh' to the Burmese tradition.

Thai
As an abugida, the Thai script uses the same 33 consonant letters (except that niggahita is a combining mark) and 8 vowel letters. Instead of independent vowels, the dependent vowel uses a dummy consonant, which in Thai represents an initial glottal stop. The vowel 'a' is the default vowel, and the absence of a vowel is marked by the 'diacritic' called phinthu. There is variation in the positioning of a preposed vowel in a consonant cluster - 'yho' may be found both as โยฺห and ยฺโห.
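Because both orderings of a preposed vowel occur, a search or comparison may want to fold them to a single form. The following is a minimal sketch of such a fold, using the two spellings of 'yho' given above. The function names are hypothetical, and the code moves the vowel mechanically for any phinthu-marked cluster without judging which clusters actually show the variation.

```python
import re

PHINTHU = "\u0E3A"              # THAI CHARACTER PHINTHU, marks a vowelless consonant
CONSONANT = "[\u0E01-\u0E2E]"   # Thai consonant letters, ko kai to ho nokhuk
PREPOSED = "[\u0E40\u0E42]"     # sara e and sara o, written before their consonant

# Vowel written before the whole cluster, e.g. o + y + phinthu + h.
_BEFORE_CLUSTER = re.compile(
    f"({PREPOSED})((?:{CONSONANT}{PHINTHU})+)({CONSONANT})")
# Vowel written before the last consonant only, e.g. y + phinthu + o + h.
_BEFORE_LAST = re.compile(
    f"((?:{CONSONANT}{PHINTHU})+)({PREPOSED})({CONSONANT})")


def vowel_before_last_consonant(text):
    """Rewrite vowel + cluster as cluster-with-vowel-before-its-last-consonant."""
    return _BEFORE_CLUSTER.sub(r"\2\1\3", text)


def vowel_before_cluster(text):
    """Rewrite the opposite way: move the vowel to the front of the cluster."""
    return _BEFORE_LAST.sub(r"\2\1\3", text)
```

Applying either function to both members of a pair of spellings gives identical strings, which is enough for matching; which of the two orderings to display is a separate editorial choice.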

The Thai script may also be used as an alphabet. The vowel has two, separately encoded forms, one for open syllables and one for closed syllables. The niggahita is written with the letter ngo ngu, making it formally indistinguishable from 'ṅ'. Context resolves the difference between 'ṃ' and 'ṅ'.

Khmer
The Khmer script uses the same 33 consonant letters (except that niggahita is a combining mark, commonly known as anusvara) and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). Early in the 20th century, subscript 't' and 'ṭ' came to be written the same. It is not known how subscript 'ṭ' is normally encoded nowadays.

Sinhala
The Sinhala script uses the same 33 consonant letters and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). Consonant clusters are written as conjunct or touching letters. Conjunction and touching are encoded differently from each other, and both differ from the normal cluster notation of the Sinhalese language, which uses the sign al-lakuna; that sign also serves to mark various vowels as long.

Sinhala Rendering
The rendering of conjuncts and touching letters may be poor, with systems falling back to a glyph or glyph modification and, usually, vowels written on the left appearing within the cluster. For Windows 7, the font Iskoola Pota works well, but it may have problems with later operating systems (for which it is probably unlicensed anyway). There is a freeish font LKLUG_T which works adequately for Pali with browsers; it does not work well with Word 2016. (That appears to be a problem with the renderer, not the font.) There have been adequate fonts for Macs, but their current status is unknown.

Tai Tham
The Tai Tham script uses the same 33 consonant letters and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). However, western practice (Burma and Northern Thailand) writes 'p' using U+1A37 TAI THAM LETTER BA while eastern practice (north-eastern Thailand and Laos) uses U+1A38 TAI THAM LETTER HIGH PA. As with Burmese, the dependent vowels corresponding to 'ā' and 'o' each have two different forms, which are encoded separately. The rules for which one to use vary from place to place. There is also variation in the writing of subscript 'ṭh', 'b', 'm' and 's', and this is captured in the encoding. Additionally, the obvious way of encoding subscript 'l' and 'r' is used for the vernaculars, not for Pali. Pali uses 'medial' consonants for the subscripts of these two letters. Finally, as in the Burmese script, there is a special letter for 'ss'.
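The western and eastern encodings of 'p' can be converted into one another with a plain character substitution. The sketch below is only an illustration with hypothetical function names. It assumes the text is purely Pali, so that U+1A37 never stands for a vernacular sound, and it does not touch the other regional variations mentioned above.

```python
WESTERN_P = "\u1A37"  # TAI THAM LETTER BA, used for 'p' in western practice
EASTERN_P = "\u1A38"  # TAI THAM LETTER HIGH PA, used for 'p' in eastern practice


def to_eastern_p(text):
    """Replace the western letter for 'p' with the eastern one."""
    return text.replace(WESTERN_P, EASTERN_P)


def to_western_p(text):
    """Replace the eastern letter for 'p' with the western one."""
    return text.replace(EASTERN_P, WESTERN_P)
```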

Lao
With the addition of the Buddhist Institute letters for Pali, the Lao script can be used in the same way as the Thai script.

However, Pali is also written using the limited repertoire available to Lao. These schemes are alphabets, not abugidas. The missing consonants are occasionally supplied by the use of nuktas, which may be encoded using U+0EBA LAO SIGN PALI VIRAMA, but normally a consonant that would be pronounced the same in Lao borrowings is used. The distinction between ຍ and ຢ may be used to represent the distinction between Pali 'ñ' and 'y', but often they are both written with ຍ. Pali 'ññ' may be written ນຍ.

Bengali
The Bengali script may use the same 33 consonant letters (except that niggahita is a combining mark, commonly known as anusvara) and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). However, there are complications arising from the ancient loss of the distinction between 'b' and 'v' in this script. Combining Pali and Sanskrit, the following combinations for 'r' and 'v' are known:
 * 1) ৰ and ৱ (Assamese)
 * 2) র and ৰ
 * 3) র and ব (Preserving Bengali merger of 'b' and 'v')
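The practical consequence of the three combinations is that a letter cannot be read without knowing which scheme a text follows. The sketch below records the three pairs from the list and looks up the possible readings of a letter. The scheme names and the function name are labels invented for this illustration.

```python
# Letter pairs for 'r' and 'v' taken from the list above.
# The scheme names are hypothetical labels, not established terms.
BENGALI_RV_SCHEMES = {
    "assamese": {"r": "\u09F0", "v": "\u09F1"},           # ৰ and ৱ
    "middle_diagonal_v": {"r": "\u09B0", "v": "\u09F0"},  # র and ৰ
    "merged_b_v": {"r": "\u09B0", "v": "\u09AC"},         # র and ব
}


def possible_readings(letter):
    """Return the (scheme, roman) pairs under which `letter` means 'r' or 'v'."""
    return {
        (scheme, roman)
        for scheme, pairs in BENGALI_RV_SCHEMES.items()
        for roman, bengali in pairs.items()
        if bengali == letter
    }
```

Looking up ৰ (U+09F0) returns a reading as 'r' under the first scheme and as 'v' under the second, which is exactly the ambiguity an editor has to resolve from context.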

Brahmi
Brahmi uses the same 33 consonant letters and 8 vowel letters, except that the vowels occur both as independent vowels and as dependent vowels (following the rules of an abugida). The relevant inscriptions are not early Brahmi, so the spelling is probably as expected.

Nouns
Nouns use the headword template, with a parameter for gender (1). In action the template looks like this:



Nouns are in their stem form, which varies based on gender and declension systems, but is generally the main headword in most Pali dictionaries. As some dictionaries instead list words by the nominative singular, it is considered appropriate to create an entry for the nominative singular when it differs from the stem. Similarly, an entry for the 'alternative citation form' in -u may be created for nouns whose stem ends in -'ar'.

Declensions are handled with the template, or shorter for auto-detection. The basic table can handle only one word of one script and one gender, but can accept manual overrides and can incorporate multiple spelling systems.

Tasks

 * Add links to feminine forms where applicable; for example, sīha.