Nepali Phonetics and Character Mapping for Building Speech Technology
Script to Phoneme Mapping
Script | Phoneme (IPA) | Script | Phoneme (IPA) |
---|---|---|---|
अ | /ʌ/ | आ | /aː/ |
इ | /i/ | ई | /iː/ |
उ | /u/ | ऊ | /uː/ |
ए | /e/ | ऐ | /ʌi/ |
ओ | /o/ | औ | /au/ |
अं | /ʌ̃/ | अँ | /ʌ̃/ |
क | /k/ | ख | /kʰ/ |
ग | /ɡ/ | घ | /ɡʱ/ |
ङ | /ŋ/ | च | /t͡s/ |
छ | /t͡sʰ/ | ज | /d͡z/ |
झ | /d͡zʱ/ | ञ | /ɲ/ |
ट | /ʈ/ | ठ | /ʈʰ/ |
ड | /ɖ/ | ढ | /ɖʱ/ |
ण | /ɳ/ | त | /t̪/ |
थ | /t̪ʰ/ | द | /d̪/ |
ध | /d̪ʱ/ | न | /n/ |
प | /p/ | फ | /pʰ/ |
ब | /b/ | भ | /bʱ/ |
म | /m/ | य | /j/ |
र | /r/ | ल | /l/ |
व | /w/ | श | /ʃ/ |
ष | /ʂ/ | स | /s/ |
ह | /ɦ/ | श्र | /ʃr/ |
ज्ञ | /d͡zɲ/ | क्ष | /kʃ/ |
JSON Representation
{
  "vowels": { "अ": "/ʌ/", "आ": "/aː/", "इ": "/i/", "ई": "/iː/", "उ": "/u/", "ऊ": "/uː/", "ऋ": "/r̩/", "ॠ": "/r̩ː/", "ए": "/e/", "ऐ": "/ʌi/", "ओ": "/o/", "औ": "/au/", "अं": "/ʌ̃/", "अँ": "/ʌ̃/", "अ:": "/ʌh/" },
  "diacritics": { "ा": "/aː/", "ि": "/i/", "ी": "/iː/", "ु": "/u/", "ू": "/uː/", "ृ": "/r̩/", "ॆ": "/e/", "े": "/e/", "ै": "/ʌi/", "ॉ": "/o/", "ो": "/o/", "ौ": "/au/", "्": "virama", "ः": "/h/" },
  "consonants": { "क": "/k/", "ख": "/kʰ/", "ग": "/ɡ/", "घ": "/ɡʱ/", "ङ": "/ŋ/", "च": "/t͡s/", "छ": "/t͡sʰ/", "ज": "/d͡z/", "झ": "/d͡zʱ/", "ञ": "/ɲ/", "ट": "/ʈ/", "ठ": "/ʈʰ/", "ड": "/ɖ/", "ढ": "/ɖʱ/", "ड़": "/ɽ/", "ढ़": "/ɽʱ/", "ण": "/ɳ/", "त": "/t̪/", "थ": "/t̪ʰ/", "द": "/d̪/", "ध": "/d̪ʱ/", "न": "/n/", "प": "/p/", "फ": "/pʰ/", "ब": "/b/", "भ": "/bʱ/", "म": "/m/", "य": "/j/", "र": "/r/", "ल": "/l/", "व": "/w/", "श": "/ʃ/", "ष": "/ʂ/", "स": "/s/", "ह": "/ɦ/", "श्र": "/ʃr/", "ज्ञ": "/d͡zɲ/", "क्ष": "/kʃ/" },
  "digits": { "०": "0", "१": "1", "२": "2", "३": "3", "४": "4", "५": "5", "६": "6", "७": "7", "८": "8", "९": "9" },
  "punctuation": { "!": "exclamation", "\"": "quotation", "'": "apostrophe", ",": "comma", ".": "period", ":": "colon", "?": "question_mark", "।": "danda", "_": "underscore" }
}
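As a rough illustration of how a mapping like this might be consumed, the sketch below flattens the categories into a single lookup table and strips the surrounding slashes from phoneme values. `MAPPING_JSON` is only a small excerpt of the JSON above, and `build_lookup` is a hypothetical helper name, not something defined elsewhere in this post.

```python
import json

# Small excerpt of the mapping above; a real script would load the full JSON.
MAPPING_JSON = """
{
  "vowels": {"अ": "/ʌ/", "आ": "/aː/"},
  "diacritics": {"ा": "/aː/", "ि": "/i/", "्": "virama"},
  "consonants": {"क": "/k/", "म": "/m/", "क्ष": "/kʃ/"},
  "digits": {"०": "0", "१": "1"},
  "punctuation": {"।": "danda"}
}
"""


def build_lookup(mapping):
    """Flatten the categories into one dict: symbol -> (category, value)."""
    lookup = {}
    for category, table in mapping.items():
        for symbol, value in table.items():
            # Phonemes are stored as "/x/"; drop the slashes for downstream use.
            lookup[symbol] = (category, value.strip("/"))
    return lookup


lookup = build_lookup(json.loads(MAPPING_JSON))
print(lookup["क"])    # ('consonants', 'k')
print(lookup["क्ष"])  # ('consonants', 'kʃ')
print(lookup["।"])    # ('punctuation', 'danda')
```

Keeping the category next to each value lets later stages treat digits and punctuation differently from phonemes without a second lookup.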
What were the rules?
The rules were inspired by the Wikipedia article on Nepali phonology: https://en.wikipedia.org/wiki/Nepali_phonology
Consonants
Spoken Nepali has 30 consonants in its native system, though some have tried to limit the number to 27.
Manner | Voicing | Aspiration | Bilabial | Dental | Alveolar | Retroflex | Dorsal | Glottal |
---|---|---|---|---|---|---|---|---|
Nasal | | | m (म) | | n (न/ञ) | (ɳ (ण)) | ŋ (ङ) | |
Plosive/Affricate | Voiceless | Unaspirated | p (प) | t̪ (त) | t͡s (च) | ʈ (ट) | k (क) | |
Plosive/Affricate | Voiceless | Aspirated | pʰ (फ) | t̪ʰ (थ) | t͡sʰ (छ) | ʈʰ (ठ) | kʰ (ख) | |
Plosive/Affricate | Voiced | Unaspirated | b (ब) | d̪ (द) | d͡z (ज) | ɖ (ड) | ɡ (ग) | |
Plosive/Affricate | Voiced | Aspirated | bʱ (भ) | d̪ʱ (ध) | d͡zʱ (झ) | ɖʱ (ढ) | ɡʱ (घ) | |
Fricative | | | | | s (स/श/ष) | | | ɦ (ह) |
Trill | | | | | r (र) | | | |
Approximant | | | (w (व)) | | l (ल) | | (j (य)) | |
Vowels
Nepali has 11 phonologically distinctive vowels, including 6 oral vowels and 5 nasal vowels. In some contexts, intervocalic "h" leads to breathy-voiced vowels.
Height | Front oral | Front nasal | Central oral | Central nasal | Back oral | Back nasal |
---|---|---|---|---|---|---|
Close | i (इ) | ĩ (इँ) | | | u (उ) | ũ (उँ) |
Close-mid | e (ए) | ẽ (एँ) | | | o (ओ) | |
Open-mid | | | | | ʌ (अ) | ʌ̃ (अँ) |
Open | | | ä (आ) | ã (आँ) | | |
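In IPA, each nasal vowel is written as its oral counterpart plus a tilde, so a nasal phoneme can be derived rather than stored separately. The sketch below assumes nasalization is written with chandrabindu (ँ), as in अँ and आँ in the table. `ORAL_VOWELS` is a deliberately partial table and `vowel_phoneme` is a hypothetical helper name.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # IPA nasalization diacritic
CHANDRABINDU = "\u0901"     # Devanagari sign ँ

# Partial table of independent oral vowels, for illustration only.
ORAL_VOWELS = {"अ": "ʌ", "इ": "i", "उ": "u", "ए": "e"}


def vowel_phoneme(grapheme):
    """Return the IPA symbol for a vowel, nasalized if chandrabindu follows it."""
    nasal = grapheme.endswith(CHANDRABINDU)
    base = grapheme.rstrip(CHANDRABINDU)
    ipa = ORAL_VOWELS[base]
    if nasal:
        # NFC folds base + tilde into a single code point where one exists.
        ipa = unicodedata.normalize("NFC", ipa + COMBINING_TILDE)
    return ipa


print(vowel_phoneme("इ"))
print(vowel_phoneme("इँ"))
```

The table lists no nasal counterpart for o, so a fuller version would probably need to decide how to treat ओँ explicitly.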
Normalization Rules
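- Handling Halanta (्): A halanta after a consonant suppresses that consonant's inherent vowel "अ", leaving the bare consonant.
- Combining Matras: A matra after a consonant replaces the inherent vowel "अ" with the vowel corresponding to that matra.
- Clusters and Special Cases: Map conjuncts such as क्ष, ज्ञ, and श्र as single units using their dedicated entries (/kʃ/, /d͡zɲ/, /ʃr/) before falling back to per-character mapping.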
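The three rules above can be sketched as a single left-to-right pass over the code points of a word. This is a minimal sketch under stated assumptions, not the exact implementation behind this post: the tables are small excerpts of the mapping, `to_phonemes` is a hypothetical name, unknown characters are simply skipped, and word-final schwa deletion is not attempted because the rules do not cover it.

```python
# Excerpts of the mapping, with slashes already stripped (illustrative only).
CONSONANTS = {"क": "k", "म": "m", "न": "n", "र": "r", "ल": "l", "त": "t̪"}
MATRAS = {"ा": "aː", "ि": "i", "ी": "iː", "ु": "u", "ू": "uː", "े": "e", "ो": "o"}
VOWELS = {"अ": "ʌ", "आ": "aː", "इ": "i", "उ": "u", "ए": "e", "ओ": "o"}
CLUSTERS = {"क्ष": "kʃ", "ज्ञ": "d͡zɲ", "श्र": "ʃr"}  # each is 3 code points
HALANTA = "्"
INHERENT = "ʌ"


def to_phonemes(word):
    """Convert a Devanagari word to a list of phoneme strings."""
    phonemes = []
    i = 0
    while i < len(word):
        chunk = word[i:i + 3]
        if chunk in CLUSTERS:
            # Clusters: match the whole conjunct before single consonants.
            base, i = CLUSTERS[chunk], i + 3
        elif word[i] in CONSONANTS:
            base, i = CONSONANTS[word[i]], i + 1
        elif word[i] in VOWELS:
            phonemes.append(VOWELS[word[i]])
            i += 1
            continue
        else:
            i += 1  # unknown symbol: skipped in this sketch
            continue
        phonemes.append(base)
        nxt = word[i] if i < len(word) else ""
        if nxt == HALANTA:
            i += 1  # Halanta: drop the inherent vowel
        elif nxt in MATRAS:
            phonemes.append(MATRAS[nxt])  # Matra: replaces the inherent vowel
            i += 1
        else:
            phonemes.append(INHERENT)  # no sign follows: keep inherent "अ"
    return phonemes


print(to_phonemes("काम"))    # ['k', 'aː', 'm', 'ʌ']
print(to_phonemes("नम्र"))   # ['n', 'ʌ', 'm', 'r', 'ʌ']
print(to_phonemes("क्षमा"))  # ['kʃ', 'ʌ', 'm', 'aː']
```

Checking clusters first matters because क्ष would otherwise be read as क plus halanta plus ष, yielding /k/ followed by /ʂ/ rather than the dedicated /kʃ/ entry.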