SuttaCentral Voice Assistant



She is interested in Buddhism, she likes the ideas of those qualities developed on the Buddhist path, but she has never been a systematic reader of suttas. Maybe because they simply are not very much available in German? That might be one reason. Anyway, while discussing this matter I just started wondering why I’ve never provided any suttas to her (except for the mettasutta in Saarländisch… :wink:)—and I guess the answer probably is: for the most part I don’t like the existing German translations. Ever since I discovered Bhikkhu Bodhi’s translations I prefer reading suttas in English; and even more so now with Bhante Sujato’s translations.

Okay, maybe I’ll just ask her if she is interested in engaging in this project and test-listening to a few suttas, and then I can think about a choice of suttas to start with. Bhikkhu Mettiko’s are certainly the best translations we have, so this limits the scope to the Majjhima Nikaya.

And I will also ask her this question. :grin:


Invite sent to Aminah-SC (?)

Yes. Thanks for volunteering to lead the search!

Yes. I’m not doing anything in German now. You and Sabbamitta would be investigating what such needs might be for next year. If Sabbamitta’s friend is interested in Buddhism, she’ll need to be interviewed once or twice to help explain what her needs and interests are. Then you and Sabbamitta will talk to agree on possibilities, at some point someone will ask me about feasibility, and then someone will tell me to do something according to the Release Plan, which I gladly cede to you.

Questions to focus on for the interviews would be the relative value of MP3 alone vs. the entire SCV functionality. For example, if MP3 alone is quite valuable, then the effort to support foreign languages could rely entirely on existing unsupported, unsegmented translations. I speak solely of feasibility. It is up to SC management to decide how translations should be handled moving forward. Are we only supporting segmented foreign translations? Or are the existing unsegmented translations renderable in SCV? I would argue for the latter solely as offline MP3, because we really should have segmented texts for SCV search.

This is actually the reason for recommending an early investigation with Sabbamitta’s friend. If the friend is content with existing screen readers for German suttas, then there is zero need for any SCV work on foreign-language MP3s at this time. However, if there is a preference for SCV German MP3s, then that would be good to know.


Yep. :slight_smile:


Last night I listened to MN 8 with Amy and Aditi, and I noticed some inconsistencies in Aditi’s pronunciation of the Pali.

  • In the recurrent word “purisapuggalassa” the “a” at the end of the first component “purisa” is sometimes pronounced as if almost nonexistent, and it sounds like “puris-pugalassa”. In other instances it is pronounced as I would expect it. The same inconsistency occurs with other words that have a short “a” at the end.

  • In section 7, segment 49: “Iti kho, cunda, desito mayā sallekhapariyāyo, desito cittuppādapariyāyo, desito parikkamanapariyāyo, desito uparibhāgapariyāyo, desito parinibbānapariyāyo.” the word “desito” is sometimes pronounced with a short, sometimes with a long “e”.

  • In section 2, segment 3: “Atha kho āyasmā mahācundo sāyanhasamayaṃ paṭisallānā vuṭṭhito yena bhagavā tenupasaṅkami; upasaṅkamitvā bhagavantaṃ abhivādetvā ekamantaṃ nisīdi. Ekamantaṃ nisinno kho āyasmā mahācundo bhagavantaṃ etadavoca:” the word nisīdi is pronounced with a short “i” where it actually should be long.

I think there were still a few more inconsistencies, but I didn’t take note of them as my primary intention was to listen to the sutta in order to listen to the sutta… :grin:


I’ve also given a try to Russell, and he has the same problem with pronouncing “Jeta” as mentioned above.

I am certainly not biased against male voices—but I simply love Amy so much that I will give preference to her!


A few suggestions:
Metta sutta SuttaCentral
The first two paragraphs of… SuttaCentral
Kalama sutta SuttaCentral


Tell me about it! That is exactly how pretty much each of my visits go (except I do get lost in making notes and then eventually Karl adds me to his GitHub repo! :upside_down_face:)


Case in point: (and Karl this is a casual question out of curiosity not something to pause on at all) As promised I switched on the option to read both Pali and English. Just now listening to a sutta, I noticed that Raveena’s voice seemed noticeably smoother than Aditi’s. Is this because of the speed setting, or is it just that the AMZ Polly voices have different textural qualities, or something else?


  • The pronunciation of “marriages”; Raveena (example: DN1, Bodhi, 6/21, 61/94).
  • Hilariously, :) (as per: (He recalls:) ‘Then I… ) is read as “smile”; Raveena (example: DN1, Bodhi, 7/21, 7/20).
  • The pronunciation of “lives” in certain cases; Raveena (example: DN1, Bodhi, 7/21, 7/20 - I guess this is a really tricky one to get correct for).
  • The pronunciation of “Almighty”; Raveena (example: DN1, Bodhi, 8/21, 7/25).
  • The pronunciation of “unsatisfactoriness”; Raveena (example: DN1, Bodhi, 8/21, 24/25).
  • The pronunciation of “rationalist”, and “investigations” (I think Raveena, might be keen for us to start non-English support with French :wink: ); Raveena (example: DN1, Bodhi, 9/21, 12/14).


All AWS Polly voices rely on machine learning to infer cadence, inflection and pronunciation from context. These AI voices do an amazing job for their target languages, but still struggle with homographs that they have not been trained on. One classic example in the suttas is “lives” as in “my past lives” and “he lives now”. The AI voices understand the patterns of grammar usually but not always. Like us, they fail in a quirky, individual way. Unlike us, they cannot be fixed without retraining. Humans can correct their pronunciation adaptively. The AI voices would need to be entirely retrained to handle new subtleties. The AI voices would literally have to relearn everything for which they have already been trained along with whatever additional corrections are needed. Amazon is in charge of training the voices and would no doubt hesitate at the rather large expense and risk of retraining any existing voice. The risk is of negative impact to all existing voice applications globally. Not just our risk, but risk to everybody in the world using AWS voices.

Because the training of ML voices involves phrases and/or sentences (i.e., not just individual words), the pronunciation of an individual word is affected by surrounding words. This leads to inconsistencies in pronunciation of individual words such as desito. These inconsistencies have been quite vexing. I don’t know how to fix them. At best, the only control we have is over letter combinations. I have managed to nudge some words back into a believable universe by changing them slightly–e.g., “kkh” becomes “k.k\u02b0”.
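To make the letter-combination idea concrete, here is a minimal sketch in Python. It is not how SCV actually does it (I don't know that); it only illustrates the technique. The only rule taken from this thread is “kkh” becoming “k.k\u02b0”. The names `IPA_RULES`, `to_ipa` and `phoneme_tag` are made up for illustration, and wrapping the result in an SSML `<phoneme alphabet="ipa">` tag is an assumption about how the corrected spelling would be handed to Polly.

```python
from xml.sax.saxutils import escape, quoteattr

# Spelling fragment -> IPA fragment.
# Only the "kkh" rule comes from this thread; the table itself is hypothetical.
IPA_RULES = {
    "kkh": "k.k\u02b0",
}


def to_ipa(word, rules=IPA_RULES):
    """Replace known letter combinations, trying longer combinations first."""
    keys = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matched here: pass the letter through unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


def phoneme_tag(word, rules=IPA_RULES):
    """Wrap a word in an SSML phoneme tag carrying the nudged pronunciation."""
    ipa = to_ipa(word, rules)
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(word)}</phoneme>"
```

For example, `phoneme_tag("dukkha")` wraps the word with `ph="duk.kʰa"`. Letters without a rule are passed through as they are, which is only meaningful when the rest of the word is already acceptable to the voice, so a real table would need many more entries, each found by listening.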

  • The same inconsistency occurs with other words that have a short “a” at the end.

Fortunately, consistent mispronunciations are tractable. We’ll need to know the rules for recognizing a short “a” at the end. Is it every ending “a” or only some? I need the letter combinations for ending “a” that need to be corrected. Once we determine those, Aminah can add them to the Release Plan as bugs.

Aminah, this can perhaps go in the Release Plan as a bug, since it seems to be an “always wrong everywhere” thing. I am really glad we have Sabbamitta’s trained ears–I have no idea what Pali should sound like and have been clumsily relying on the internet, not real people speaking around me.

Thank you very much for listening for these. I personally have found the awareness of inconsistencies arises subtly with repeated listening. I listen to DN33 and MN44 regularly, but with more people listening, our coverage will improve.

I’m afraid we may need to grant Russell mercy because of his status as a tourist (like Amy and Raveena) in Pali land. Raveena with her Hindi phonemes tends to be closer if Russell and Amy prove too annoying.

Raveena was trained on and is speaking English with the odd Pali hiccough thrown in. This is why Raveena is generally smoother. Aditi has been trained to speak Hindi or English and is speaking a foreign language (Pali) entirely. This is why Aditi’s Pali is rough and feels like words stitched together, because that is what it is.

All the voices have trouble speaking Pali-only. Raveena sounds even worse than Aditi. And Amy speaking Pali was truly, well, unspeakable–I wouldn’t want to inflict that on anyone. Yes, I share your sad dismay at Aditi’s choppiness–it is not correctable. What we need is an AWS Polly Pali voice trained with human voice recordings of many Pali phrases. For such to happen, we would need to approach Amazon and request a Pali voice, which would no doubt require a lot of money (millions?), time and effort. Such a request would be feasible with a much larger user base and might happen in Aminah’s lifetime, but perhaps not mine. Do not give up hope for such, but be prepared to wait for the right time.

Consistent mispronunciations due to letter combinations and/or occurring at the end or beginning of words are fixable and can be included in the Release Plan as bugs. Each mispronunciation typically takes a day to work out. Yes, they are maddening, but that estimate can guide your judgement about the value of each correction. Homographs (e.g., “lives”) should be documented in Mispronunciations and will not be fixed.


Many thanks for all the fascinating explanation.

Gosh! Good to know. The cost/benefit assessment is so important, but it’s impossible to have a sense of these things without the relevant background. Certainly, in my view these things don’t even come close to being worth fixing. Even with some of the more adventurous pronunciations there’s never any doubt about meaning… actually, I take a pretty similar attitude to my spelling; most people seem to mostly understand what I type despite sometimes accidentally using some entirely wild spellings. :laughing:


The things that are dangerous are the “sound-alikes”. The classic example is “ananda” and “ānanda”, which have opposite meanings. Let’s be on the lookout for these landmines and correct them. A misunderstanding here would be very bad.


Yes, absolutely agree. In fact, after posting I had a second thought remembering the catch Ang. Sabbamitta made. I think generally speaking getting the Pali as good as it can be (within reason) has real value, in so far as most people will probably be using it as an aid to get a sense of Pali because they don’t know Pali. With English (or whatever other native/high proficiency language) it’s very easy to ‘self-correct’ in one’s mind.


Thank you very much for the detailed background!

And unfortunately, it is about inconsistency here. Sometimes the end-a is pronounced properly, sometimes it kind of disappears.

What I know about Pali pronunciation is from some chanting in various monasteries where I have been staying. Rules for pronunciation can for example be found in this chanting book (should be on page 78):

As far as I can see, every end-a that has no horizontal line on top (like “ā”) should be short.
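If it helps with collecting the letter combinations Karl asked for, here is a small sketch in Python that lists which letters come right before a word-final short “a” in a piece of Pali text. It assumes the rule exactly as stated above (a final “a” without a macron is short) and nothing more. The function name `final_short_a_contexts` and the `width` parameter are made up for illustration.

```python
import re
import unicodedata
from collections import Counter


def final_short_a_contexts(text, width=2):
    """Count the letters that precede a word-final short 'a'.

    Assumes a final 'a' without a macron is short, as described in this thread.
    """
    # Compose characters so that "a" + combining macron becomes a single "ā".
    text = unicodedata.normalize("NFC", text).lower()
    counts = Counter()
    for word in re.findall(r"[^\W\d_]+", text):
        if len(word) > 1 and word.endswith("a"):
            # Take up to `width` letters immediately before the final "a".
            counts[word[-(width + 1):-1]] += 1
    return counts
```

Running it over a sutta's Pali text gives a tally such as “ss” for “purisapuggalassa” and “is” for “purisa”, which could serve as a starting list of endings to listen for. It says nothing about which of those endings Aditi actually gets wrong; that still needs ears.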

What I have found is only one instance of nisīdi, and I don’t know if there are any counterexamples. So not sure if it is “always wrong everywhere”.

?? Trained ears? I am really no Pali expert. As I said, what I know about pronunciation is from some chanting I have heard and done. What I am doing here is listening to the suttas and reading along with the text, so I can see the spelling and compare it to what I hear.

I’m getting used to it when listening a bit longer, so I don’t mind so much.

Actually, I’d like to add, @karl_lew, having started to listen to entire suttas a few times now, I find it really beneficial for me! I especially appreciate the Pali - English combination (even if this can make the suttas very long). I feel it helps my general feeling for Pali grow, which started out delicately with my work on the Vinaya translation. I still haven’t done any systematic study, and in particular I know nothing of the grammar.

Another potential bug: “saṅkhāra” sounds like “sakhāra”; I can’t hear the ṅ at all.


:heart_eyes: :sparkler: :tada:

Aminah, this is exactly the kind of mispronunciation that we should fix. The ṅ was in fact not voiced since it is not part of the supported IPA alphabet for Aditi. I’ve updated the Release Plan with the fix. I got lucky–most of these fixes involve endless trial and error listening.

Sabbamitta, the fix also conflates with , which may be a future disaster. As it is, these sounds are so similar to “ṁ”, that the Western ear will struggle a lot. I have elected to conflate the n’s in the hopes recognizing two out of three may be enough. However, there may be the dreaded sound-alikes with different meanings for vs. .


My goodness—usually, the Buddha doesn’t quite praise “faultfinding”, and here it’s even celebrated!

I am most happy to lend my—trained or otherwise—ear to SCV on a somewhat regular basis, now I’ve found out that it’s not only time consuming, but extremely beneficial to me. It gives a completely different experience of the suttas!

I’m not sure if I am able to distinguish between ṅ and ṁ by listening. Sometimes these two spellings are used alternatively, even inside the same text, as for example in Snp 2.1 (Mahāsaṅgīti) for the word saṅgha:

#SC 6 Idampi saghe ratanaṁ paṇītaṁ
#SC 7 Idampi saghe ratanaṁ paṇītaṁ
#SC 8 Idampi saghe ratanaṁ paṇītaṁ
#SC 9 Idampi saghe ratanaṁ paṇītaṁ

Will pay special attention to what Aditi makes of that!


Here is Aditi speaking:


With effort, I can hear the “m-ness” as “back in the throat” as in “chunk sunk bunk”
With effort, I can hear the “dotted n-ness” as “thunked through the nose” as in “wanton wonton”
Without effort, I can hear plain “n” as in “man tan”
Lacking first-hand knowledge of Pali, I cannot say if these are correct, but I can move my tongue in these three ways, as can Aditi.


I don’t hear the “ṁ” at all—just “sa-ghe”; the other two I can distinguish.


Keep listening and the differences will slowly appear. Being somewhat immersed helps a lot to hear the differences. The sounds are provably different using software comparison tools (e.g., cmp). The languages of our early childhood inform our phonetic forms. Yet diligent practice does work for adults. I was able to pronounce the “sju” of Swedish after a year of classes. This gives me hope for recognizing all the m/n sounds.
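For anyone without `cmp` at hand, the same kind of check can be sketched in Python. This is only an illustration under assumptions: the function name `first_difference` is made up, and the file names in the usage note are hypothetical.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the byte offset where two files first differ, or None if identical."""
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                # Find the exact position inside the differing chunks.
                for i, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        return offset + i
                # One file is a prefix of the other.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended without any difference.
                return None
            offset += len(chunk_a)
```

Calling `first_difference("sangha-n-dot-above.mp3", "sangha-m-dot-above.mp3")` on two recordings returns a byte offset if the files differ at all. A byte difference only shows that the voice produced different audio for the two spellings; whether the difference is audible is still a matter for listening.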


Can you make her say “sa-gha” for comparison? I do hear a clear difference between the three above, but the ṁ just seems to “disappear” for my ear.


MN 10, section 11, segment 3:

Aditi seems really a bit confused. She stumbles almost on each letter, changes pitch all the time, it’s hardly understandable; she also reads out “opening parenthesis” and “closing parenthesis” (not the numbers between the parentheses). For the last two or three items towards the end of the segment it is getting better and clearer again, and she doesn’t read out the parentheses thing.

It is a somewhat longer segment; maybe that overstrains her?