Voice 2.3 Aditi pronunciation fixes

So it seems I’ve voted for the sheep now, but I can live with the laundry peg just as well! :laughing:

1 Like

Oh! I found another way!

See doubled diphthong

1 Like

:+1: :clap:

Much better!!!

1 Like

Looks like we both like it. What is funny is that we have violated Pali pronunciation by making it a long “e”. :man_shrugging:

Well, I’ll implement that for now unless others outvote our choice. Thank you. :pray:

I hope it doesn’t pees-off anybody…

1 Like

“e” is always long in Pali—very few exceptions!

2 Likes

I just learned that “aṃ” at the end of Pāli words should be a nasal vowel (not [əŋ]):

Perhaps Aditi could be updated to reflect this?

3 Likes

I should imagine this would be extremely difficult to do, for one would need to distinguish the places where ṃ is pronounced as a nasalized vowel and the places where it isn’t.

For example, in Mahāsaṅgīti’s Vinaya Piṭaka alone there are 206 words where ṃ is followed by either k, kh, g or gh and so would be pronounced ṅ, 52 places where it’s followed by a labial consonant and so would need to be pronounced as m, 57 places where it’s followed by a palatal and so would be pronounced ñ, etc. etc.

3 Likes

if i + 1 < len(word) and word[i + 1] in "kg": return "ṅ"

Aditi is a robot and (I assume) loves long lists of clear rules. @karl_lew?
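To make the "long list of clear rules" idea concrete, here is a minimal Python sketch of the assimilation described above. It is only an illustration, not how Voice works: the names `resolve_niggahita` and `NASAL_BEFORE` are made up, it covers only the three classes mentioned in this thread (velar, palatal, labial), and it only looks inside a single word.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so that ṃ, ṅ and ñ are single code points."""
    return unicodedata.normalize("NFC", text)


NIGGAHITA = nfc("ṃ")

# Hypothetical table: following letter -> nasal of the same class.
# Aspirates (kh, gh, ch, jh, ph, bh) are caught by their first letter.
# Other consonant classes are deliberately left out of this sketch.
NASAL_BEFORE = {}
for letters, nasal in (("kg", "ṅ"), ("cjñ", "ñ"), ("pbm", "m")):
    for letter in nfc(letters):
        NASAL_BEFORE[letter] = nfc(nasal)


def resolve_niggahita(word: str) -> str:
    """Replace each ṃ by the nasal matching the next consonant, if listed."""
    word = nfc(word)
    out = []
    for i, ch in enumerate(word):
        if ch == NIGGAHITA and i + 1 < len(word):
            # Unlisted followers leave ṃ unchanged.
            out.append(NASAL_BEFORE.get(word[i + 1], ch))
        else:
            # Ordinary letters and word-final ṃ pass through untouched.
            out.append(ch)
    return "".join(out)
```

A word-final ṃ is left alone here, so it stays available for whatever separate treatment (nasal vowel or otherwise) is chosen for it.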

3 Likes

Pali has a LOT of nasals. Unfortunately, each of the AWS Polly voices only honors a subset of the IPA alphabet. Aditi is bilingual and handles both English and Hindi phonemes. But even with Aditi, it’s quite tricky to map the IPA we have to what we need.

Let’s take a look at some of the nasals that Voice uses:

A good example of the complexity is ñāṇaṃ

<phoneme alphabet="ipa" ph="ɲɑː ɳəŋ">ñāṇaṃ</phoneme>.

We have given up trying to match Pali exactly. Instead, we’ve tried to hammer out a mapping that is intelligible and distinguishable.
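For anyone who wants to experiment, a tag like the one above can be generated with a few lines of Python. This is a rough sketch using only the standard library; the helper name `phoneme` is hypothetical and is not part of Voice.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(text: str, ipa: str) -> str:
    """Wrap text in an SSML phoneme tag carrying the given IPA string."""
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape protects &, < and > in the spoken text.
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(text)}</phoneme>"


print(phoneme("ñāṇaṃ", "ɲɑː ɳəŋ"))
```

Swapping in a different IPA string is then a one-line change, which makes it easier to compare candidates by ear.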

How do you think we should proceed?

Robots don’t program themselves or listen to themselves. It’s people painstakingly listening to audio over and over again trying to find and guess the best fit. The best way to help is to set up an AWS account and try out the various phonemes for various cases to make sure we have handled them all. Voice allows for phoneme customization as well as custom IPA for individual words.

Here is the Voice customization file. It’s easy to change, but quite difficult to understand how it should be changed. The robots themselves are inconsistent in pronunciation (witness the struggle with “pe” above: same IPA, different sounds. :see_no_evil:).

3 Likes

The documentation page you linked to claims that the nasal vowels (ə̃, etc) are supported.

So I wonder what <phoneme alphabet="ipa" ph="ɲɑː ɳə̃">ñāṇaṃ</phoneme> would sound like?

Nice! If the above works, a simple "aṃ$": "ə̃" might also?
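As a rough illustration of what such an end-anchored rule would do, here is a small Python sketch. I am assuming the rule is read as a regular expression on the romanized word; the `RULES` table and `apply_rules` function are made up for this example and do not describe the real Voice customization file.

```python
import re
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so that ṃ matches however it was typed."""
    return unicodedata.normalize("NFC", text)


# Hypothetical rule table: regex on romanized Pali -> IPA replacement.
RULES = {"aṃ$": "ə̃"}


def apply_rules(word: str, rules: dict = RULES) -> str:
    """Apply each regex rule in turn to a single word."""
    word = nfc(word)
    for pattern, ipa in rules.items():
        # A function replacement keeps the IPA text literal (no escapes).
        word = re.sub(nfc(pattern), lambda _m, ipa=ipa: ipa, word)
    return word


print(apply_rules("ñāṇaṃ"))   # only the final "aṃ" is rewritten
print(apply_rules("saṃgha"))  # "$" anchors the rule, so this is unchanged
```

The `$` is what keeps the rule away from word-internal ṃ, which would still need the consonant-dependent handling discussed earlier.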

2 Likes

it sounds like a Hindi Hiccup.

The problem is that the AI strategically looks beyond the IPA at surrounding context and makes assumptions about tonality, inflection and pace based on a Hindi/English context. We’re abusing an inappropriate tool.

The better way to approach this is to record a human Pali speaker and craft a Pali AWS Polly voice from those samples, from the ground up. Amazon could do that. I can’t. :cry:

Yes. Well, in theory at least. Then you have to listen to a lot of suttas to make sure that such a broad change sounds good all around.

3 Likes

:rofl: It sounds like a teenager’s voice cracking into falsetto :joy:

Which, experience shows, would also be no guarantee of correct pronunciation :joy:

Okay okay, point taken. Just wanted to bring you this (new to me) information. Do with it what you like. :smile:

3 Likes

The sound’s not bad but the second syllable is too short. The ṃ is supposed to double a syllable’s mora. Could you try putting a triangular colon at the end?

ɲɑː ɳə̃ː

Or maybe give this a try:

ɲɑː ɳɑ̃ː

Or this:

ɲɑː ɳɑ̃

4 Likes

3 Likes

Venerables @Dhammanando, @Khemarato.bhikkhu, I see we have a few new sound engineers. Welcome to the Voice team!! :grin:

4 Likes

I feel like the last one is actually pretty good (long, and sounds nasal to me), but considering the Voice team’s stated goal to “hammer out a mapping that is intelligible and distinguishable,” perhaps such a subtle distinction (between the nasal and oral /a/) would actually make Aditi less “intelligible and distinguishable,” even if it would make her more “intelligent and distinguished.” Poor Aditi might even have to suffer the same kind of “no! that pronunciation is wrong!” abuse that we meat-bag Pāli dispensers have to, and nobody (robot or not) needs that.

What do you think, Ven @Dhammanando?

2 Likes

The biggest problem is that if you change the pronunciation of a sound that appears many times in the canon, you don’t know what you are setting off. We could implement it on staging, but we would then have to do a lot of test listening across many suttas to see whether it is better overall than what we have so far.

In any case, if we are going to make any further changes to Pali pronunciation this should be done before I start building definite Pali VSMs—which is intended to happen as soon as the SC team has completed a full revision of the Mahasangiti text (no one knows how long that takes).

2 Likes

I’m not sure.

Maybe I’m just imagining it, but it seems like she’s speaking twice as fast as the lady in Karl’s recording. Is it possible to slow her down a bit?

2 Likes

We can slow down a voice, but only globally, not for one individual sound or word. That would have to be done by IPA changes, but as you are experiencing, that is not trivial.

2 Likes

1.429x as fast, to be precise, and yes, it’s possible; I just don’t know how :joy:

I made those recordings strictly so you can hear how she pronounces the IPA. Please ignore her speaking rate (and pitch, which is “10% higher” than Voice’s production version of Aditi). :pray:

3 Likes