SuttaCentral Voice Assistant

Yep. :slight_smile:

1 Like

Last night I listened to MN 8 with Amy and Aditi, and I noticed some inconsistencies in Aditi’s pronunciation of the Pali.

  • In the recurrent word “purisapuggalassa” the “a” at the end of the first component “purisa” is sometimes pronounced as if it were almost nonexistent, so that it sounds like “puris-pugalassa”. In other instances it is pronounced as I would expect. The same inconsistency occurs with other words that have a short “a” at the end.

  • In section 7, segment 49: “Iti kho, cunda, desito mayā sallekhapariyāyo, desito cittuppādapariyāyo, desito parikkamanapariyāyo, desito uparibhāgapariyāyo, desito parinibbānapariyāyo.” the word “desito” is sometimes pronounced with a short, sometimes with a long “e”.

  • In section 2, segment 3: “Atha kho āyasmā mahācundo sāyanhasamayaṃ paṭisallānā vuṭṭhito yena bhagavā tenupasaṅkami; upasaṅkamitvā bhagavantaṃ abhivādetvā ekamantaṃ nisīdi. Ekamantaṃ nisinno kho āyasmā mahācundo bhagavantaṃ etadavoca:” the word nisīdi is pronounced with a short “i” where it actually should be long.

I think there were still a few more inconsistencies, but I didn’t take note of them as my primary intention was to listen to the sutta in order to listen to the sutta… :grin:

2 Likes

I’ve also given a try to Russell, and he has the same problem with pronouncing “Jeta” as mentioned above.

I am certainly not biased against male voices—but I simply love Amy so much that I will give preference to her!

2 Likes

A few suggestions:
Metta sutta SuttaCentral
The first two paragraphs of… SuttaCentral
Kalama sutta SuttaCentral

2 Likes

Tell me about it! That is exactly how pretty much each of my visits go (except I do get lost in making notes and then eventually Karl adds me to his GitHub repo! :upside_down_face:)

2 Likes

Case in point: (and Karl this is a casual question out of curiosity not something to pause on at all) As promised I switched on the option to read both Pali and English. Just now listening to a sutta, I noticed that Raveena’s voice seemed noticeably smoother than Aditi’s. Is this because of the speed setting, or is it just that the AMZ Polly voices have different textural qualities, or something else?

Also:

  • The pronunciation of “marriages”; Raveena (example: DN1, Bodhi, 6/21, 61/94).
  • Hilariously, :) (as per: (He recalls:) ‘Then I… ) is read as “smile”; Raveena (example: DN1, Bodhi, 7/21, 7/20).
  • The pronunciation of “lives” in certain cases; Raveena (example: DN1, Bodhi, 7/21, 7/20 - I guess this is a really tricky one to get right).
  • The pronunciation of “Almighty”; Raveena (example: DN1, Bodhi, 8/21, 7/25).
  • The pronunciation of “unsatisfactoriness”; Raveena (example: DN1, Bodhi, 8/21, 24/25).
  • The pronunciation of “rationalist”, and “investigations” (I think Raveena, might be keen for us to start non-English support with French :wink: ); Raveena (example: DN1, Bodhi, 9/21, 12/14).
1 Like

All AWS Polly voices rely on machine learning to infer cadence, inflection and pronunciation from context. These AI voices do an amazing job for their target languages, but still struggle with homographs that they have not been trained on. One classic example in the suttas is “lives” as in “my past lives” and “he lives now”. The AI voices understand the patterns of grammar usually but not always. Like us, they fail in a quirky, individual way. Unlike us, they cannot be fixed without retraining. Humans can correct their pronunciation adaptively. The AI voices would need to be entirely retrained to handle new subtleties. The AI voices would literally have to relearn everything for which they have already been trained along with whatever additional corrections are needed. Amazon is in charge of training the voices and would no doubt hesitate at the rather large expense and risk of retraining any existing voice. The risk is of negative impact to all existing voice applications globally. Not just our risk, but risk to everybody in the world using AWS voices.

Because the training of ML voices involves phrases and/or sentences (i.e., not just individual words), the pronunciation of an individual word is affected by surrounding words. This leads to inconsistencies in pronunciation of individual words such as desito. These inconsistencies have been quite vexing. I don’t know how to fix them. At best, the only control we have is over letter combinations. I have managed to nudge some words back into a believable universe by changing them slightly–e.g., “kkh” becomes “k.k\u02b0”.
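To make the letter-combination nudge concrete, here is a minimal Python sketch, assuming a small hand-maintained rule table. Only the “kkh” rule comes from the post above; the table layout and the function name `apply_rules` are hypothetical and are not taken from the actual SuttaCentral Voice code.

```python
# Hypothetical rule table: letter combination -> replacement sent to the voice.
# Only the "kkh" rule is from the discussion; others would be added as they
# are worked out by listening.
PALI_RULES = {
    "kkh": "k.k\u02b0",  # \u02b0 is the IPA aspiration mark
}


def apply_rules(word, rules=PALI_RULES):
    """Rewrite a Pali word by applying each letter-combination rule.

    Longer combinations are applied first so that a short rule cannot
    break up a longer match.
    """
    for combo in sorted(rules, key=len, reverse=True):
        word = word.replace(combo, rules[combo])
    return word


print(apply_rules("dukkha"))  # the "kkh" rule applies
print(apply_rules("desito"))  # no rule matches, so the word is unchanged
```

A table like this only helps with consistent mispronunciations. It does nothing for a word such as desito, whose pronunciation varies with the surrounding words.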

  • The same inconsistency occurs with other words that have a short “a” at the end.

Fortunately, consistent mispronunciations are tractable. We’ll need to know the rules for recognizing a short “a” at the end. Is it every ending “a” or only some? I need the letter combinations for ending “a” that need to be corrected. Once we determine those, Aminah can add them to the Release Plan as bugs.

Aminah, this can perhaps go in the Release Plan as a bug, since it seems to be an “always wrong everywhere” thing. I am really glad we have Sabbamitta’s trained ears–I have no idea what Pali should sound like and have been clumsily relying on the internet, not real people speaking around me.

Thank you very much for listening for these. I personally have found the awareness of inconsistencies arises subtly with repeated listening. I listen to DN33 and MN44 regularly, but with more people listening, our coverage will improve.

I’m afraid we may need to grant Russell mercy because of his status as a tourist (like Amy and Raveena) in Pali land. Raveena with her Hindi phonemes tends to be closer if Russell and Amy prove too annoying.

Raveena was trained on and is speaking English with the odd Pali hiccough thrown in. This is why Raveena is generally smoother. Aditi has been trained to speak Hindi or English and is speaking a foreign language (Pali) entirely. This is why Aditi Pali is rough and feels like words stitched together because that is what they are.

All the voices have trouble speaking Pali-only. Raveena sounds even worse than Aditi. And Amy speaking Pali was truly, well, unspeakable–I wouldn’t want to inflict that on anyone. Yes, I share your sad dismay at Aditi’s choppiness–it is not correctable. What we need is an AWS Polly Pali voice trained with human voice recordings of many Pali phrases. For that to happen, we would need to approach Amazon and request a Pali voice, which would no doubt require a lot of money (millions?), time and effort. Such a request would be feasible with a much larger user base and might happen in Aminah’s lifetime, but perhaps not mine. Do not give up hope, but be prepared to wait for the right time.

Consistent mispronunciations due to letter combinations and/or occurring at the end or beginning of words are fixable and can be included in the Release Plan as bugs. Each mispronunciation typically takes a day to work out. Yes, they are maddening, but that estimate can guide your judgement about the value of each correction. Homographs (e.g., “lives”) should be documented in Mispronunciations and will not be fixed.

3 Likes

Many thanks for all the fascinating explanation.

Gosh! Good to know. The cost/benefit assessment is so important, but it’s impossible to have a sense of these things without the relevant background. Certainly, in my view these things don’t even come close to being worth fixing. Even with some of the more adventurous pronunciations there’s never any doubt about meaning… actually, I take a pretty similar attitude to my spelling; most people seem to mostly understand what I type despite sometimes accidentally using some entirely wild spellings. :laughing:

2 Likes

The things that are dangerous are the “sound-alikes”. The classic example is “ananda” and “ānanda”, which have opposite meanings. Let’s be on the lookout for these landmines and correct them. A misunderstanding here would be very bad.

2 Likes

Yes, absolutely agree. In fact, after posting I had a second thought remembering the catch Ang. Sabbamitta made. I think generally speaking getting the Pali as good as it can be (within reason) has real value, in so far as most people will probably be using it as an aid to get a sense of Pali because they don’t know Pali. With English (or whatever other native/high proficiency language) it’s very easy to ‘self-correct’ in one’s mind.

1 Like

Thank you very much for the detailed background!

And unfortunately, it is about inconsistency here. Sometimes the end-a is pronounced properly, sometimes it kind of disappears.

What I know about Pali pronunciation is from some chanting in various monasteries where I have been staying. Rules for pronunciation can for example be found in this chanting book (should be on page 78):

https://www.abhayagiri.org/books/424-abhayagiri-chanting-book

As far as I can see, every end-a that has no horizontal line on top (like “ā”) should be short.
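As a starting point for collecting the affected words, the rule above can be sketched in Python. This is only an illustration under stated assumptions: the function name `words_ending_in_short_a` is made up, and the sketch only looks at the last letter of each written word.

```python
import re
import unicodedata


def words_ending_in_short_a(text):
    """Return the words in `text` that end in a plain (short) "a".

    A word ending in "ā" (a with a macron) is long and is not returned.
    """
    # NFC keeps "ā" as a single code point, so it never looks like a bare "a".
    text = unicodedata.normalize("NFC", text)
    # [^\W\d_]+ matches runs of letters, including letters with diacritics.
    words = re.findall(r"[^\W\d_]+", text)
    return [w for w in words if w.endswith("a")]


segment = "Iti kho, cunda, desito mayā sallekhapariyāyo"
print(words_ending_in_short_a(segment))
```

Note that this does not catch a short “a” at the end of a component inside a compound, such as “purisa” in “purisapuggalassa”. Finding those would need a list of known components.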

What I have found is only one instance of nisīdi, and I don’t know if there are any counterexamples. So not sure if it is “always wrong everywhere”.

?? Trained ears? I am really no Pali expert. As I said, what I know about pronunciation is from some chanting I have heard and done. What I am doing here is listening to the suttas and reading the text along, so I can see the spelling and compare it to what I hear.

I’m getting used to it when listening a bit longer, so I don’t mind so much.

Actually, I’d like to add, @karl_lew, having started to listen to entire suttas a few times now I find it really beneficial for me! I especially appreciate the Pali - English combination (even if this can make the suttas very long). I feel it helps my general feeling for Pali grow, which started out delicately with my work on the Vinaya translation; I still haven’t done any systematic study, and in particular I know nothing of the grammar.


Another potential bug: “saṅkhāra” sounds like “sakhāra”; I can’t hear the “ṅ” at all.

2 Likes

:heart_eyes: :sparkler: :tada:

Aminah, this is exactly the kind of mispronunciation that we should fix. The “ṅ” was in fact not voiced since it is not part of the supported IPA alphabet for Aditi. I’ve updated the Release Plan with the fix. I got lucky–most of these fixes involve endless trial and error listening.

Sabbamitta, the fix also conflates with , which may be a future disaster. As it is, these sounds are so similar to “ṁ”, that the Western ear will struggle a lot. I have elected to conflate the n’s in the hopes recognizing two out of three may be enough. However, there may be the dreaded sound-alikes with different meanings for vs. .

3 Likes

My goodness—usually, the Buddha doesn’t quite praise “faultfinding”, and here it’s even celebrated!

I am most happy to lend my—trained or otherwise—ear to SCV on a somewhat regular basis, now I’ve found out that it’s not only time consuming, but extremely beneficial to me. It gives a completely different experience of the suttas!

I’m not sure if I am able to distinguish between ṅ and ṁ by listening. Sometimes these two spellings are used alternatively, even inside the same text, as for example in Snp 2.1 (Mahāsaṅgīti) for the word saṅgha:

#SC 6 Idampi saghe ratanaṁ paṇītaṁ
#SC 7 Idampi saghe ratanaṁ paṇītaṁ
#SC 8 Idampi saghe ratanaṁ paṇītaṁ
#SC 9 Idampi saghe ratanaṁ paṇītaṁ

Will pay special attention to what Aditi makes of that!

1 Like

Here is Aditi speaking:

saghe
saghe
sanghe

https://scdd.sfo2.cdn.digitaloceanspaces.com/uploads/original/2X/1/1affd94ff16535eb008a9749a01ffb19e39a4dcf.mp3

With effort, I can hear the “m-ness” as “back in the throat” as in “chunk sunk bunk”
With effort, I can hear the “dotted n-ness” as “thunked through the nose” as in “wanton wonton”
Without effort, I can hear plain “n” as in “man tan”
Lacking first-hand knowledge of Pali, I cannot say if these are correct, but I can move my tongue in these three ways, as can Aditi.

2 Likes

I don’t hear the “ṁ” at all—just “sa-ghe”; the other two I can distinguish.

1 Like

Keep listening and the differences will slowly appear. Being somewhat immersed helps a lot to hear the differences. The sounds are provably different using software comparison tools (e.g., cmp). The languages of our early childhood inform our phonetic forms. Yet diligent practice does work for adults. I was able to pronounce the “sju” of Swedish after a year of classes. This gives me hope for recognizing all the m/n sounds.
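For anyone who wants to repeat that check without `cmp`, a byte-for-byte comparison can be sketched in Python with the standard library. The helper name `same_bytes` and the file names are hypothetical. The stand-in bytes are there only so the sketch runs anywhere; real use would point at two downloaded audio clips.

```python
import filecmp
import tempfile
from pathlib import Path


def same_bytes(path_a, path_b):
    """Return True if the two files are byte-for-byte identical (like `cmp`)."""
    return filecmp.cmp(path_a, path_b, shallow=False)


# Stand-in data instead of real audio clips (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    clip_a = Path(tmp) / "clip_a.mp3"
    clip_b = Path(tmp) / "clip_b.mp3"
    clip_a.write_bytes(b"\x00\x01\x02")
    clip_b.write_bytes(b"\x00\x01\x03")
    print(same_bytes(clip_a, clip_b))  # the stand-in bytes differ
```

A byte difference only shows that the voice produced different audio. It says nothing about how audible the difference is, which is why the listening still matters.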

2 Likes

Can you make her say “sa-gha” for comparison? I do hear a clear difference between the three above, but the ṁ just seems to “disappear” for my ear.

1 Like

MN 10, section 11, segment 3:

Aditi seems really a bit confused. She stumbles almost on each letter, changes pitch all the time, it’s hardly understandable; she also reads out “opening parenthesis” and “closing parenthesis” (not the numbers between the parentheses). For the last two or three items towards the end of the segment it is getting better and clearer again, and she doesn’t read out the parentheses thing.

It is a somewhat longer segment; maybe that overstrains her?

1 Like

:heart_eyes::heart_eyes::heart_eyes: winner!

Wow. You may have found another bug. There are 82467 ṃ and 6 ṁ in the suttas. Maybe the 6 are typos. In any event, I have now made the two dotted m’s sound the same. Here are the revised sounds. This is listening torture. The first two are soooo close but not.

saghe (not a real word)
sa ghe
sa ghe (not a real word)
sa n ghe (not a real word)

That’s worrisome. Added to Release-Plan to investigate. :white_check_mark:

4 Likes

Oh… please don’t be too enthusiastic about this one. ṃ and ṁ are not different sounds, they are just variants in spelling and mean exactly the same. Some scholars prefer one way, some the other. On SC you can toggle between the two by typing ALT+M, see Easter Egg 🥚 1 :sweat_smile:
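Since the two spellings mean the same thing, counting them and mapping both to one form before the text reaches the voice is straightforward. The following Python sketch shows the idea. The function names are hypothetical, and the sample string deliberately mixes both spellings for illustration; it is not a quotation from the suttas.

```python
import unicodedata
from collections import Counter

DOT_BELOW_M = "\u1e43"  # m with dot below
DOT_ABOVE_M = "\u1e41"  # m with dot above


def count_dotted_m(text):
    """Count how often each spelling variant of the dotted m occurs."""
    counts = Counter(unicodedata.normalize("NFC", text))
    return {DOT_BELOW_M: counts[DOT_BELOW_M], DOT_ABOVE_M: counts[DOT_ABOVE_M]}


def unify_dotted_m(text, target=DOT_ABOVE_M):
    """Rewrite both variants to `target` so the voice sees a single spelling."""
    text = unicodedata.normalize("NFC", text)
    return text.replace(DOT_BELOW_M, target).replace(DOT_ABOVE_M, target)


sample = "ratana\u1e41 pa\u1e47\u012bta\u1e43"  # illustrative mix of both variants
print(count_dotted_m(sample))
print(unify_dotted_m(sample))
```

Which variant to keep is a matter of preference; the `target` parameter leaves that open.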

I simply use ṁ because I have an easy keyboard shortcut for that one. :wink:

And please compare the “saṁghe” in your first voice example to the “saghe”, “saṁghe”, and “saṅghe” in the second—there’s a clear difference! I mean in the previous example I still can’t detect any “ng” sound at all while in all the latter examples there is one (slightly different for each), even in “saghe”. So I’m not sure what Aditi is doing here… :face_with_raised_eyebrow:

2 Likes