SuttaCentral Voice Assistant

I don’t hear the “ṁ” at all—just “sa-ghe”; the other two I can distinguish.

1 Like

Keep listening and the differences will slowly appear. Being somewhat immersed helps a lot to hear the differences. The sounds are provably different using software comparison tools (e.g., cmp). The languages of our early childhood inform our phonetic forms. Yet diligent practice does work for adults. I was able to pronounce the “sju” of Swedish after a year of classes. This gives me hope for recognizing all the m/n sounds.
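For anyone who wants to repeat the byte-level check without `cmp`, here is a rough Python equivalent. The function name and file names are made up for illustration. Note that it only shows that two files differ, not that the difference is audible.

```python
import filecmp


def sounds_differ(path_a: str, path_b: str) -> bool:
    """Return True when the two audio files differ byte-for-byte (like `cmp`)."""
    # shallow=False forces a real content comparison instead of
    # trusting file size and modification time.
    return not filecmp.cmp(path_a, path_b, shallow=False)
```

Usage would be something like `sounds_differ("clip1.mp3", "clip2.mp3")`, with both paths being hypothetical clips saved from the player.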

2 Likes

Can you make her say “sa-gha” for comparison? I do hear a clear difference between the three above, but the ṁ just seems to “disappear” for my ear.

1 Like

MN 10, section 11, segment 3:

Aditi really seems a bit confused. She stumbles on almost every letter and changes pitch all the time, so it’s hardly understandable; she also reads out “opening parenthesis” and “closing parenthesis” (but not the numbers between the parentheses). For the last two or three items towards the end of the segment it gets better and clearer again, and she doesn’t read out the parentheses.

It is a somewhat longer segment; maybe that overstrains her?

1 Like

:heart_eyes::heart_eyes::heart_eyes: winner!

Wow. You may have found another bug. There are 82467 ṃ and 6 ṁ in the suttas. Maybe the 6 are typos. In any event, I have now made the two dotted m’s sound the same. Here are the revised sounds. This is listening torture. The first two are soooo close but not.

saghe (not a real word)
sa ghe
sa ghe (not a real word)
sa n ghe (not a real word)
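A tally like the ṃ/ṁ count mentioned above could be reproduced with a few lines of Python. This is only a sketch: the function name is invented, and it assumes the sutta text is already loaded as a string. The normalization step matters because a dotted m can also be stored as a plain "m" followed by a combining dot, which would otherwise be missed.

```python
import unicodedata
from collections import Counter

M_DOT_BELOW = "\u1e43"  # ṃ
M_DOT_ABOVE = "\u1e41"  # ṁ


def count_dotted_m(text: str) -> Counter:
    """Count both spellings of the dotted m in a piece of text."""
    # NFC folds "m" + combining dot into the single precomposed letter,
    # so both storage forms are counted the same way.
    normalized = unicodedata.normalize("NFC", text)
    return Counter(ch for ch in normalized if ch in (M_DOT_BELOW, M_DOT_ABOVE))
```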

That’s worrisome. Added to Release-Plan to investigate. :white_check_mark:

4 Likes

Oh… please don’t be too enthusiastic about this one. ṃ and ṁ are not different sounds, they are just variants in spelling and mean exactly the same. Some scholars prefer one way, some the other. On SC you can toggle between the two by typing ALT+M, see Easter Egg 🥚 1 :sweat_smile:

I simply use ṁ because I have an easy keyboard shortcut for that one. :wink:

And please compare the “saṁghe” in your first voice example to the “saghe”, “saṁghe”, and “saṅghe” in the second—there’s a clear difference! I mean in the previous example I still can’t detect any “ng” sound at all while in all the latter examples there is one (slightly different for each), even in “saghe”. So I’m not sure what Aditi is doing here… :face_with_raised_eyebrow:

2 Likes

Whew. Then my fix was proper. Thank you!

The first sound example is wrong. You did indeed find a bug. :smiley:

It was wrong because the ṁ was passed directly through to AWS Polly Aditi, who shrugged her shoulders and said “What is that?”. Aditi does not know ṁ, it is not one of her phonemes. So we heard what Aditi guessed, which is…random. The fix was to translate ṁ to ŋ for Aditi. SCV already handled ṃ by translating it to ŋ.
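To make the fix concrete, here is a minimal sketch in Python. It is not SCV's actual code: the function names are hypothetical, and wrapping the result in an SSML `<phoneme>` element is an assumption about how the phonemes might be handed to the voice.

```python
import unicodedata
from xml.sax.saxutils import escape, quoteattr

# Both spellings of the dotted m map to the same velar nasal.
NASAL_FIX = str.maketrans({"\u1e41": "ŋ", "\u1e43": "ŋ"})  # ṁ, ṃ -> ŋ


def fix_anusvara(text: str) -> str:
    """Replace ṁ and ṃ with ŋ so the voice never sees an unknown letter."""
    # Normalize first so "m" + combining dot is handled as well.
    return unicodedata.normalize("NFC", text).translate(NASAL_FIX)


def ssml_phoneme(word: str, ipa: str) -> str:
    """Wrap a word in an SSML phoneme element carrying its IPA string."""
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(word)}</phoneme>'
```

With this, `fix_anusvara` turns both spellings of the word into the same string, which is the behavior described above for ṃ and now also for ṁ.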

I’m really glad you’re able to hear the differences in the latest sample. I really struggle.

Here are the Aditi phonemes for each of the four words:

  • sɐgʰe
  • sɐŋgʰe
  • sɐɳgʰe
  • sɐngʰe

I can see the differences. You can hear them. Maybe one day I will too.

The important thing is that there are audible differences. The actual pronunciation can vary between voices, but the sounds should be distinguishable for any given voice. Better voices will speak Pali more clearly so that we can all distinguish each syllable. Aditi works for now, but should be replaced with better alternatives as they become available.

1 Like

I only hear the differences as long as the words are in a nice list next to each other and I know the order in which they are going to be pronounced. If I encountered one of them alone it would be really hard to tell which one it is!

But it would be interesting if my friend can distinguish them. Next time I meet her I can ask her.

Actually, that is what I think saghe should sound like; but it doesn’t.

As for the word saṅghe / saṁghe, I think they are not different words, but the spelling preferences have changed over time, and so we sometimes find different versions in a text. But this is probably a guess, and maybe someone else has more reliable knowledge about this?

I think overall, Aditi is doing a good job, and the more I listen the more I get used to her way of pronunciation.

One other thing I have to get used to is when she inserts vowels where there are none, or seems to change the vowel. Or at least to my ear it sounds that way.

This happens for example when there are double consonants, but not always; it sounds as if there is a vowel in-between. Another case is when there is a tilde ~ over a letter, like ñ. Sometimes it sounds as if there is a vowel, mostly e, inserted, but not always. And in cases when a word ends with -iṁ, I’ve already come across instances where it actually sounded more like -aṁ.

I’m really not sure if this is only to my ear, or if other people notice the same?

2 Likes

I’ll defer to the ordained on this, but it does make sense given the span of time. There is also n with dot under in the mix. I have been working hard to apply specific and different phonemes to each of these subtle sounds using this as my guide.

Double consonants are one of the reasons Aditi and Raveena are candidate voices. The English voices don’t articulate the consonants separately. The AI voices are patterned on human voices, so we may be hearing the particular human’s intonation of double consonants. I think the problem here may be cadence. When we chant, each syllable normally takes a beat and long vowels take two beats. With chanting cadence, the space between consonants is heard as nothing at all because it is the space between sounds. Aditi has conversational sloppy cadence. That is regrettable and also why eventually we would need to replace Aditi.

I speak Spanish and the ñ sounds just like Español. There is indeed a vowel turn in this odd sound. When we recite the Spanish alphabet, we say “énye” here. That is my understanding. Pali may have different requirements.

I have noticed this as well and have tried to fix it but apparently failed. Perhaps this sound combination does not occur in Hindi. Inconsistency of pronunciation is the sign that Aditi is making things up. This will be tricky to fix given her disability.

You and I speak multiple languages and may be more sensitive to audible differences. Single language people will struggle more I think.

Thank you for listening! You are providing great feedback. Aditi’s recitation will be common to all SCV listeners. My hope is that listening to Aditi’s Pali alternating segment-by-segment with the listener’s native language may bring us all together in recitation.

2 Likes

This is really a nice idea! :heartpulse:

1 Like

Hi Karl, I have a question / feature request: Is it possible that when starting to listen to a sutta the user can have some information on how long it would take to listen to this sutta (depending on which option is selected: translation only, or Pali and translation; and probably that’s also different for different voices)?

When reading a sutta as a sighted person I can very quickly get an overview of how long my text is, and so have a basis to decide whether or not this is what I want to read right now. Listening to suttas on SCV doesn’t offer this information, or did I miss something?

I have no idea how big a challenge this is on the technical side, or if it is at all feasible—but would be very nice to have it! :heart_eyes:

2 Likes

@Aminah, this feature is subtle and we should all put our heads together.

What is known to me is the number of segments.

  • DN33 has 1158 segments
  • AN5.233 has 12 segments

DN33 English takes about 2 hours to listen to. And I’m guessing that DN33 Pali/English would take 4 hours. That’s a lot of time. The thought occurred that one might want to break down such listening into manageable logical portions, especially for offline listening MP3 downloads. This consideration made me scratch my head at the prospect of splitting a long download at an arbitrary segment boundary and that seemed…abrupt and undesirable.
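As a back-of-the-envelope sketch, the segment count alone already gives a usable estimate. The numbers below come straight from the DN33 figures above (1158 segments, about 2 hours of English); everything else is an assumption: the function name is invented, segments are treated as equally long, and the Pali/English factor of 2 is only the guess stated above.

```python
# DN33: roughly 2 hours of English spread over 1158 segments.
SECONDS_PER_SEGMENT = 2 * 60 * 60 / 1158  # about 6.2 seconds


def estimate_seconds(segments: int, with_pali: bool = False,
                     pali_factor: float = 2.0) -> float:
    """Rough listening time for a sutta, from its segment count only."""
    seconds = segments * SECONDS_PER_SEGMENT
    # pali_factor is a guess; the real ratio may well differ.
    return seconds * pali_factor if with_pali else seconds
```

By this estimate the 12 segments of AN5.233 come to a bit over a minute of English. A real implementation would probably measure the generated audio per voice rather than rely on an average.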

SCV breaks long suttas into numbered sections if they are present. DN33 has twelve such sections. The SCV sections are different than Bhante Sujato’s translation of sections. By Bhante Sujato’s translation, DN33 has three sections. Ones-Fours, Fives-Sevens, Nines-Tens.

With sections, we could break up long recitations at SCV-section boundaries (vs. Bhante Sujato section boundaries, which are supersets). Then we could have an option for people to give a desired time for a single MP3 file. We would use that length to break up suttas or reading lists of suttas at SCV section boundaries. E.g., it would break up DN33 between Threes and Fours.
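One possible way to do that split is a simple greedy pass over the section durations. This is a sketch under assumptions, not a design decision: the function name is hypothetical, durations are taken as already known in seconds, and a section longer than the limit is simply given a file of its own rather than being cut mid-section.

```python
from typing import List, Sequence


def split_at_sections(section_seconds: Sequence[float],
                      max_seconds: float) -> List[List[int]]:
    """Group consecutive section indexes into files of at most max_seconds."""
    files: List[List[int]] = []
    current: List[int] = []
    current_len = 0.0
    for index, seconds in enumerate(section_seconds):
        # Start a new file when adding this section would exceed the limit.
        if current and current_len + seconds > max_seconds:
            files.append(current)
            current, current_len = [], 0.0
        current.append(index)
        current_len += seconds
    if current:
        files.append(current)
    return files
```

For a 90-minute limit one would call it with `max_seconds=5400`. The same function would work for a reading list if the sections of all its suttas are concatenated into one sequence.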

Today, it is possible to get playing lists that would take a full day or longer to listen to and this translates to a challenge for the file download feature I am currently working on. I can’t think of anybody liking a 24 hour MP3. I personally would like shorter than two hours, perhaps 90 minutes at most.

In terms of priority, I think this has to be part of V1 because of file downloads.

And in terms of Sabbamitta’s request, that’s (0.5d) for just the SuttaPlayer time estimate.
I made you read all the above to get here.
:japanese_ogre:

3 Likes

When anusvara (ṁ) is followed immediately by a consonant, it normally changes to the “class nasal”.

For the list of class nasals, see here:

So, e.g., ṁt becomes nt, ṁp becomes mp, and so on. Because of this rule, it always seems to me that in Pali, the correct spelling should be saṅgha. Sanskrit, however, may be different, but I don’t know enough about it to say.

In the case of ṁg or ṁk, the nasal (ṅ) is in fact exactly the same sound as the anusvara, so the difference is a mere spelling convention.
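For the software side, the rule can be sketched as a lookup on the consonant that follows the anusvara. The five consonant classes are the standard ones (velar, palatal, retroflex, dental, labial); the function name is invented, and leaving the anusvara untouched before any other letter or at the end of a word is a simplifying assumption of this sketch.

```python
import unicodedata

# Class nasal for each consonant class, keyed by the first letter of the
# consonant (so "g" also covers the aspirated "gh", and so on).
CLASS_NASAL = {}
for nasal, first_letters in [
    ("\u1e45", "kg"),            # ṅ before k, kh, g, gh
    ("\u00f1", "cj"),            # ñ before c, ch, j, jh
    ("\u1e47", "\u1e6d\u1e0d"),  # ṇ before ṭ, ṭh, ḍ, ḍh
    ("n", "td"),                 # n before t, th, d, dh
    ("m", "pb"),                 # m before p, ph, b, bh
]:
    for letter in first_letters:
        CLASS_NASAL[letter] = nasal

ANUSVARA = {"\u1e41", "\u1e43"}  # ṁ and ṃ


def apply_class_nasal(word: str) -> str:
    """Rewrite an anusvara as the class nasal of the following consonant."""
    chars = list(unicodedata.normalize("NFC", word))
    for i in range(len(chars) - 1):
        if chars[i] in ANUSVARA and chars[i + 1] in CLASS_NASAL:
            chars[i] = CLASS_NASAL[chars[i + 1]]
    return "".join(chars)
```

Applied to saṁgha this yields saṅgha, matching the spelling preferred above.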

Manuscripts vary between one and the other. In theory, this might be meaningful: we might find that, say, manuscripts from a certain lineage nasalize in a certain way; or that early and late manuscripts differ in some way. But I have never seen anything that suggests such a finding.

5 Likes

Bhante Sujato, thank you very much for this definitive table.

@Aminah, I shall need to work through this entire table for v1.0. Let’s call it (1w). This is not a minor undertaking. The challenge is to handle vowel/consonant rules. It is also critical to get Aditi’s pronunciation aligned with expectations. If you want a December release, then we should push file download into v1.1 to work on correcting Aditi instead. Or we can delay v1 till Jan.

3 Likes

I would rather guess that Pali / English would take more than double the time of the English. The Pali is very repetitive, and the English often abbreviated.

2 Likes

For your purposes, ṁ = ṃ = ṅ. Meanwhile, the pronunciation is already better than 95% of the Pali out there!

6 Likes

And ṇ should be left as a different sound, I assume?

2 Likes

ṇ is a different sound: it’s retroflex, along with ṭ, ḍ, etc. But that is a fairly rare sound, not found in many modern languages, Tamil being an exception, I believe. So I would guess that the voices don’t support it. Hardly any modern speakers of Pali use it, either.

If we can get it happening, great. If not, it would be acceptable to substitute the retroflex with the corresponding dental sounds (i.e., treat them as if the underdot does not exist).
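If a voice turns out not to support the retroflex sounds, the fallback could be as small as this sketch. The function name is hypothetical and only the lowercase letters ṭ, ḍ and ṇ are handled. The anusvara ṃ also carries an underdot but is deliberately left alone, since it is not a retroflex.

```python
import unicodedata

# Retroflex letters and the dental letters that would stand in for them.
RETROFLEX_TO_DENTAL = str.maketrans({
    "\u1e6d": "t",  # ṭ -> t
    "\u1e0d": "d",  # ḍ -> d
    "\u1e47": "n",  # ṇ -> n
})


def drop_retroflex(text: str) -> str:
    """Treat retroflex consonants as if the underdot did not exist."""
    return unicodedata.normalize("NFC", text).translate(RETROFLEX_TO_DENTAL)
```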

3 Likes

Yes. We can get it. “s ɐ ɳ gʰ e”
And Sabbamitta can hear the difference while concentrating fiercely on the four-word audio clip above.
:grinning:
:pray:

5 Likes

saṇghe wasn’t in that list… only sanghe which would just follow Bhante @sujato’s suggestion to treat it as if the underdot does not exist.

2 Likes