SC-Voice: Raveena meets Slow Amy


While we are discussing Sinhala pronunciations, is the change of an “a” at the end of a word into what sounds to me like an “er” another Sinhala innovation, or is that in the Pali?
E.g. Sambuddhasa -> Sam-bud-dha-ser


I am afraid I have no idea about that. My conjecture would be that it probably is, since there is no /a/ reduction in other positions, but don’t quote me on that.


Some nice replies on this thread, thanks all!

Indeed; it certainly doesn’t stop you from becoming a Buddhist professor, for I have heard several pronounce Theravada with an opening fricative!

Excellent, thanks, that’s really handy.

On a tangentially related note, the group originally behind the Mahasangiti Tipitaka have always wanted to reform the Thai pronunciation of Pali. They have a website and app promoting it.

I haven’t looked into it in detail, but I see that they are using a “corrected” phonetic Thai spelling for Pali:

Again, just to remember, Thai, Sinhala, Roman, or any other system, records Pali quite accurately, and they are fine to use, so long as you remember that the pronunciation of letters when used in Pali is not always the same as the native language. It’s easy to learn these standards if you take a little time.


I think we can agree to disagree; what is important is that there is no significant alteration to the meaning, despite a subtle shift in the sound.

with metta,


By the way, given the vast amount of experience and deep knowledge appearing in this thread, what I need help with the most is assembling a few key words that highlight correct pronunciation. For example, I am using ananda (unhappy) vs. ānanda (extremely happy) as a contrasting voice test pair. What other words should we use for testing voices?
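One way to hunt for more pairs like ananda/ānanda is to group words that become identical once their diacritics are removed. The following is only a rough sketch: the function names and the sample word list are made up for illustration, and it only catches contrasts that are written with diacritics (long vowels, retroflex dots and so on), not aspiration pairs.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove combining marks, so that 'ānanda' becomes 'ananda'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contrast_pairs(words):
    """Group words that differ only by diacritics (hypothetical helper)."""
    groups = defaultdict(set)
    for word in words:
        groups[strip_diacritics(word.lower())].add(word)
    # Keep only the groups that really contain a contrast.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


# Sample list for illustration only; a real run would use a full word list.
SAMPLE_WORDS = ["ananda", "ānanda", "sati", "satthi"]
pairs = contrast_pairs(SAMPLE_WORDS)
```

With the sample list, only ananda and ānanda are grouped together, since sati and satthi differ in their plain letters as well.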


It is more like, “I don’t hear what you are hearing.”

In my experience both learning to speak Sinhala and teaching Sinhala speakers to speak English, the t/th/ṭ/ṭh thing is quite maddening, because each language has sounds that the other doesn’t, and Sinhala speakers have been taught from an early age that certain sounds are equivalent when in fact they are not.

As a native English speaker, I cannot hear the difference between a dental and retroflex t. So much so that I had to work out a hand signal for my teachers and me to use to distinguish between them. I can make the sounds (because of training myself), I just can’t hear them. What was most frustrating was when my teachers would get frustrated and start saying things like, “No no, t as in ‘think, think, think!’” when they were actually saying “tink, tink, tink”!

It is such a fascinating situation as long as you don’t get frustrated. For example, in the Sinhala language aspirated consonants have not been aspirated for almost a thousand years. So they still have all the letters, but much to the consternation of students, the sound is the same. So for example k and kh are pronounced exactly the same in Sinhala. And (using the academic way of transliteration) ṭ ට and ṭh ඨ are pronounced exactly the same.

The only time I am able to get a native Sinhala speaker to understand why we can’t use th to spell metta is by asking them, “So if you use ‘th’ to spell metta, then how would you write the mahaprana (aspirated) letter ථ ?” And they don’t have any answer because in Sinhala they don’t have aspirated consonants as a sound. It is at that point that I walk them over to the book case and show them the transliteration table in the front of every Buddha Jayanti Tipitaka volume so they can see how you represent mahaprana letters in English.

On the Sinhala speaker learning English side, it is equally frustrating. Ask any Sinhala speaker (or Indian language speaker of any kind) to say “thinking thoughts” and they will say “tinking tots.” Only after about 30 minutes of coaching can I get them to blow air between their upper teeth and tongue. Then it takes another 30 minutes to get the buzzing sound in “this, these, those.” I can count on one hand the students I have had who take the time to actually train themselves to make these sounds correctly and use them when speaking.

This all leads to some bizarre situations, like when the American-born child of Sinhala parents pronounces his own name incorrectly. Poor Meth. In Sinhala, his name means loving-kindness, a shortened form of the Pali word metta. But of course his parents would spell metta as meththa. So his name has to be Meth. The parents pronounce his name like the past tense of “meet.” But every other person the child meets pronounces his name like the shortened form of the street drug methamphetamine. And because he spends 8 hours a day at school with these people, he also pronounces it that way. But because his parents can’t hear the English “th” sound, they never correct him.


This is all making me a bit dizzy. I need clear direction here!

Do we use IPA meθθa or metta for “t”? I have absolutely no grounds for choosing for myself since I am going by my ear and what it can hear. I seek a pronunciation that eliminates confusion. Are there words with “t” and “th” that can be confused? The AWS Polly voices give us very little room for pronunciation finesse. What we need is to eliminate confusion when hearing two words that can almost sound alike. This is ALL we can do with current technology.


I think Mat and I would agree on using /meθθa/ as the closest phonological approximation given the current technological limitations, but other people may have other opinions on the matter.

Another famous example of this phonological confusion is how native speakers of Japanese cannot hear the difference between /r/ and /l/.

There are other curious cases. For example, for speakers of Russian and German the closest approximation of /ð/ in “this” would be /z/, so they tend to pronounce this word as /zɪs/. My Romanian wife and all of her fellow native speakers of Romanian use /d/ instead, so they tend to pronounce it as /dɪs/. Earlier in the history of the Slavic languages the /θ/ sound was perceived as similar to /f/, much like “I think” is pronounced as “I fink” in some colloquial varieties of British English. So, the Greek Θεόδωρος, which became Theodore in English, became Fyodor in Russian. On the other hand, contemporary Slavic speakers do not associate these sounds with each other, resorting to /t/ and /s/ instead. Mindblowing.


:heart: I sink I must fank you. :pray:

Note: I take it there are no confusing “t”/“th” word pairs as there are for “a” with ananda vs. ānanda?


If you read ‘th’ as /θ/, then no, Pali doesn’t have them as it doesn’t have the sound.


What about Satthi vs. Sati? Do they sound different or the same?

Here is Raveena.


They do sound different. tth in satthi is an aspirated (breathy) dental geminate. If we ignore the gemination (doubling) of the consonant, it is pronounced as /tʰ/, whereas the t in sati is pronounced as /t/. I am not sure it would be possible to represent this phonological difference with the British English since the English t is almost always aspirated, but you could try using /tʰ/.

The second pronunciation variant reflects the gemination but sounds a bit weird, because there is no gemination in English. I don’t know, maybe other people have an opinion on whether they like the second or fourth variant better.


Yes. The en-US and en-GB voices are terrible with aspirated h. For this reason, we will need to use different IPA for these voices.

In the Raveena sound clip, I think we can use (3,4) for en-IN, but (1,2) for en-US and en-GB. Both pairs generate different sound files, so the differences will be audible to those who can hear.

 Raveena says 
  1. <phoneme alphabet="ipa" ph="sɐtɪ"/>.
  2. <phoneme alphabet="ipa" ph="sɐtthɪ "/>.
  3. <phoneme alphabet="ipa" ph="sɐθɪ"/>.
  4. <phoneme alphabet="ipa" ph="sɐθhɪ "/>.
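For anyone who wants to try the same four variants with another voice, here is a minimal sketch of building the SSML in code. The helper names are made up for illustration, and the pairing of each IPA string with “sati” or “satthi” is my reading of the list above, not something taken from sc-voice itself.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_tag(ipa, text=""):
    """Return one SSML <phoneme> element for an IPA string (hypothetical helper)."""
    # strip() removes stray spaces around the IPA; quoteattr() adds the quotes.
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa.strip())}>{escape(text)}</phoneme>'


def speak(variants):
    """Wrap (text, ipa) pairs in a single <speak> document."""
    body = " ".join(phoneme_tag(ipa, text) for text, ipa in variants)
    return f"<speak>{body}</speak>"


# The four variants from the list above; the text labels are an assumption.
VARIANTS = [
    ("sati", "sɐtɪ"),
    ("satthi", "sɐtthɪ"),
    ("sati", "sɐθɪ"),
    ("satthi", "sɐθhɪ"),
]
ssml = speak(VARIANTS)
```

The resulting string can be sent to any voice that accepts SSML, which makes it easy to compare how each voice renders the same phonemes.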

Here is Amy saying the exact same phonemes.


I’ve changed sc-voice to have a Pali IPA lexicon (i.e., “ipa.pli”) per voice. To some extent, this will allow us to customize the Pali pronunciation of each voice within the dialectal constraints of that voice. This is a requirement because for TTS services implemented by AWS Polly (and presumably for other TTS services in general), IPA phoneme vocalization is voice-dependent.

Raveena’s “θ” does NOT sound like Amy’s “θ”. And Raveena’s “t” does NOT sound like Amy’s “t”. Notably, none of these is the correct Pali “t”. So we will need to transcribe Pali “t” to “θ” for Raveena and “t” for Amy. Although this learning is wonderful, it applies to just one letter. Similar subtleties no doubt lie in wait for many other Pali letters.
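To make the per-voice idea concrete, here is a rough sketch of a base table with voice-specific overrides. It is not the actual “ipa.pli” format, which is not shown in this thread; the table names and the tiny four-letter alphabet are illustrative assumptions, with the vowel symbols borrowed from the phoneme strings earlier in the thread.

```python
# Shared letter-to-IPA table (illustrative subset only).
BASE_LEXICON = {"a": "ɐ", "i": "ɪ", "s": "s", "t": "t"}

# Per-voice overrides: Raveena needs "θ" for Pali "t"; Amy keeps "t".
VOICE_OVERRIDES = {
    "Raveena": {"t": "θ"},
    "Amy": {},
}


def lexicon_for(voice):
    """Merge the base table with the overrides for one voice."""
    merged = dict(BASE_LEXICON)
    merged.update(VOICE_OVERRIDES.get(voice, {}))
    return merged


def to_ipa(word, voice):
    """Transcribe a romanized Pali word, trying the longest keys first."""
    lexicon = lexicon_for(voice)
    keys = sorted(lexicon, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(lexicon[key])
                i += len(key)
                break
        else:
            # No key matched at this position: the table is incomplete.
            raise KeyError(f"no IPA mapping for {word[i]!r} in voice {voice}")
    return "".join(out)
```

Here `to_ipa("sati", "Raveena")` uses “θ” where `to_ipa("sati", "Amy")` uses “t”, and a letter missing from the table raises an error instead of being silently dropped.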

Since Raveena’s voice has been chosen as closest to Pali, I will freeze the current implementation of Amy and Salli voices until we have more time to work on each of their IPA lexicons (yes, Amy and Salli will differ between themselves and each from Raveena). My head is a bit maxed out already trying to learn Pali, speak Pali and teach what I know to Raveena alone. This means that I will be focusing on Raveena alone as the single supported voice for V1. We will of course have navigate and recite variants of Raveena. But Amy and Salli will just be “beta voices” until we have time to work on them.

I think it’s important to get one voice working well so that we can explore the entire sc-voice function. Amy and Salli are still with us, but just as they are, almost there, but not quite. Apologies for the delay on Raveena’s sisters. I had not realized Pali TTS could be so tricky.


LOL I can’t relate to that at all :wink: - “speedy gonzalez” and “road runner” are more like it. Amazing progress and lots of hard work :star_struck:


Here’s the Wikipedia phonetic correspondences page:


Thank you. This is indeed one of the resources I have been using. Sadly, AWS Polly has its own undocumented understanding and does not recognize “bʱ” for Amy and Salli. We therefore have to flail and guess at what works and have empirically found that “bh” works somewhat (the “h” is a bit too separated and long). I found a Pali pronunciation page also quite helpful.
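The kind of workaround described here can be kept in a small fallback table, so that the reference IPA stays intact and only the string sent to a given voice is adjusted. This is a sketch under assumptions: the table name and the sample string are hypothetical, and the only substitution taken from our experiments is “bʱ” to “bh” for Amy and Salli. Voices that are not listed are passed through unchanged.

```python
# Symbols a voice does not recognize, mapped to what we found works "somewhat".
FALLBACKS = {
    "Amy": {"bʱ": "bh"},
    "Salli": {"bʱ": "bh"},
}


def apply_fallbacks(ipa, voice):
    """Replace unsupported IPA symbols for one voice (hypothetical helper)."""
    for symbol, replacement in FALLBACKS.get(voice, {}).items():
        ipa = ipa.replace(symbol, replacement)
    return ipa
```

Keeping the substitutions in one place should make them easy to remove again if a service later accepts the full symbol.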

Since each TTS service has its own set of voices, we will be able to set up IPA lexicons for each voice/service combination. Hopefully TTS services will evolve to support the full IPA alphabet so that we can use the reference you linked.


Today Google announced general availability for WaveNet voices. This is a step forward, but is not immediately useful to us. Our own research has shown that en-IN voices are best suited for the demands of Pali. Google’s WaveNet voices are still all EU/USA, and we will need to wait for the en-IN WaveNet voice(s) if that is on their roadmap. This means that we will continue with Raveena and AWS Polly for the immediate future.


I would like to say something about ‘c’. In Roman script, ca is short, while cha takes more pressure when we pronounce it. For example, anicca is pronounced with ca. On the other hand, miccha is pronounced with cha.


Although I find Raveena’s pronunciation of Pali to be pretty good comparatively, her English is kind of hard to follow, for me, personally. Maybe a different voice for English and then Raveena steps in when some Pali pronunciation is needed? Might be strange to have two voices though, I guess.