SuttaCentral

SC-Voice: feedback for fine-tuning Portuguese TTS voice (aka Ricardo)

sc-voice

#1

SC-Voice is being upgraded to read aloud the suttas available in Portuguese on SC, as discussed in this topic: https://discourse.suttacentral.net/t/wanted-translator-for-sc-voice-interface

This topic is meant to capture feedback and suggestions from Portuguese speakers on how to improve the pronunciation of the text-to-speech (TTS) voice, aka Ricardo.

At this stage the Portuguese TTS service is available on the staging server. For reference, use the link below to listen to MN118:

https://35.176.116.11/scv/index.html?r=0.035696870896822386#/sutta?ips=6&lang=pt&locale=pt&maxResults=5&search=mn118&showId=false&showLang=0&vnameRoot=Aditi&vnameTrans=Ricardo


#2

Hi. :wave:


#3

Olá! :wave:

(I also added the tag “sc-voice”.)


#4

So, I found this website: Free Text-To-Speech for Brazilian Portuguese language and MP3 Download | ttsMP3.com and I played with a few words.

In a nutshell, the ê in êxtase should read like the e in cedo.

As for arahant, it should be read the way Ricardo reads ararrante.

Hope the above makes sense to you!


#5

Ah. My ear can’t discern the subtlety of the “e”'s. Here is another attempt:

  • <phoneme alphabet="ipa" ph="'es.sta.se"/>

And an update for arahant:

  • <phoneme alphabet="ipa" ph="'aɾaɾɾante"/>

That link you posted uses the same Ricardo (i.e., AWS). Therefore, you should be able to play and use the exact phonemes given here. I tested it with the above. Glad you found the site. It makes our job much easier!
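
For anyone who wants to try candidate pronunciations on that site, here is a minimal sketch in Python of how a test snippet could be assembled. It assumes the usual SSML form in which the phoneme element wraps the written word; the function name `phoneme_ssml` is made up for illustration and is not part of SC-Voice.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(word: str, ipa: str) -> str:
    """Wrap `word` in an SSML phoneme element carrying its IPA pronunciation.

    Hypothetical helper for experimenting; not taken from the SC-Voice code.
    """
    # quoteattr chooses a quoting style that is safe for the leading stress mark (')
    attr = quoteattr(ipa)
    return f'<speak><phoneme alphabet="ipa" ph={attr}>{escape(word)}</phoneme></speak>'


print(phoneme_ssml("êxtase", "'es.sta.se"))
# prints: <speak><phoneme alphabet="ipa" ph="'es.sta.se">êxtase</phoneme></speak>
```

Pasting the printed line into the TTS site should let you compare it against the plain spelling.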


#6

Yes, we’re getting there.
I would say êxtase now sounds good.
For arahant, it does not sound like what I get when I use the website to pronounce it spelt as ararrante. Can you just adjust the code to read arahant as ararrante wherever it appears in the text?


#7

Now I understand. Judging from the list of vowels in the link, I would say they did not code Ricardo to pronounce the peculiar ê; it only has e and ɛ.


#8

Yes. We need to find the closest.

In addition, for arahant, we can get back the Portuguese “ch” by removing the ending IPA “e”. This sounds odd to me but perhaps the Portuguese listeners will see the “t” when hearing the “tch”:

  • <phoneme alphabet="ipa" ph="'aɾaɾɾant"/>

The primary goal is semantic fidelity: can the suttas be understood without ambiguity?


#9

It does not sound like what I get when I ask Ricardo to read ararrante aloud on that website… Is there a way to make it so? :thinking:


#10

Sometimes it’s just not possible to get it 100%; the main goal is that it is understood without ambiguity.


#11

Don’t worry. I get it. :wink:

I just want to confirm whether we can get this one right, as it is quite a frequent and relevant term.

As per the above, it would be perfect if we could get the machine to read arahant as if it had been given ararrante instead.

:anjal:


#12

Here is another try. Unfortunately, we currently need to use IPA. The “sounds like” isn’t supported by Voice:

  • <phoneme alphabet="ipa" ph="'a.ɾa.han.t͡ʃĩ"/>
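
To illustrate the idea of reading a word with a fixed IPA pronunciation wherever it appears, here is a rough Python sketch. It is not how Voice does it internally: the `LEXICON` table, the `apply_lexicon` function, and the sample sentence are all made up for illustration, and it assumes the SSML form in which the phoneme element wraps the written word.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lexicon: written word -> IPA string proposed in this thread.
LEXICON = {
    "arahant": "'a.ɾa.han.t͡ʃĩ",
    "êxtase": "'es.sta.se",
}


def apply_lexicon(text: str, lexicon: dict[str, str]) -> str:
    """Return SSML in which every whole-word lexicon hit is wrapped in a phoneme tag."""
    if not lexicon:
        return "<speak>" + escape(text) + "</speak>"
    # One capturing group per word, so m.lastindex tells us which entry matched.
    words = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(f"({re.escape(w)})" for w in words)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

    parts = []
    pos = 0
    for m in pattern.finditer(text):
        parts.append(escape(text[pos:m.start()]))  # untouched text before the hit
        ipa = lexicon[words[m.lastindex - 1]]
        parts.append(
            f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(m.group(0))}</phoneme>'
        )
        pos = m.end()
    parts.append(escape(text[pos:]))  # remaining tail
    return "<speak>" + "".join(parts) + "</speak>"


# Made-up sample sentence, only to show the substitution.
print(apply_lexicon("O arahant alcança o êxtase.", LEXICON))
```

The word boundaries keep longer words that merely contain an entry from being rewritten, and the original spelling stays inside the tag, so the displayed text is unchanged.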

I do have a question about pronunciation of foreign words. The word arahant is a Pali word. Should we pronounce and spell Pali as much like Pali as possible? It’s not clear what our translation policy should be.

Here is Aditi speaking arahant directly. Note that it sounds quite different than ararrante:

  • arahant


#13

Very good! I would say you could drop the final ĩ at the end, making it:

<phoneme alphabet="ipa" ph="'a.ɾa.han.t"/>


#14

  • <phoneme alphabet="ipa" ph="'a.ɾa.han.t͡ʃ"/>

And in context. I find it fascinating that the last “arahant” has a hard “t” just as in Pali. :open_mouth:

Anagarika @Sabbamitta, v1.8.7 has the latest Portuguese pronunciations.


#15

Thanks for your patience and time @karl_lew !

I will wait until it is released and available at https://voice.suttacentral.net before sharing it in other online forums and inviting others to check it and give their feedback here.

:anjal:


#16

Gabriel, I’ve sent gnlaera a Github invitation to join sc-voice. This should allow you to participate freely in issues relating to sc-voice. You’ll also be given full access to the source code so that you can make changes directly as you see fit. Please coordinate your changes with the team to minimize merge conflicts (i.e., two people editing the same file).

Anagarika, if Gabriel approves #187, we are ready for release pending your approval. All automated tests pass for v1.8.7.

Gabriel, Anagarika Sabbamitta is the Voice Product Owner (PO). She directs and approves Voice features and releases.


#17

Approved! Sadhu! Sadhu! Sadhu!
:anjal:


#18

Where would I see it?


#19

Hmm… I’m not sure. Are you logged into Github?

Here is the issue for you to close:


#20

I have just replied with a comment saying I approve it. I could not find a button to close it.
Sorry, but I have no idea how this GitHub thing works! haha