SC-Voice: feedback for fine-tuning Portuguese TTS voice (aka Ricardo)

Count on me. I found a few online resources that are very helpful for writing down in IPA-speak how words should sound.

Apart from the most obvious tools (Wikipedia, dictionaries, etc.), I found https://easypronunciation.com to be very helpful. Note that it has an hourly word limit unless you pay for it.

Happy to help!

:anjal:

3 Likes

:star_struck: Thank you so much!

3 Likes

@Gabriel_L, I just added the Portuguese ToC page. Isn’t there a Portuguese word for table of contents, like for example “índice”? The title should be different from the English title; we can’t have two pages with the same title. For now I made it “ToC-PT”, but if you have a better suggestion I am happy to change it.

I also added “(inglês)” to the bits that won’t be translated into Portuguese.

Please check if everything is okay.

@karl_lew, could you please set your iPhone to Portuguese for a minute and take that same screenshot for the Offline Listening page again? Thanks!

2 Likes

Reviewed both, made one change only. Good to go.

2 Likes

Did you change the title of the ToC page? What did you call it? I can’t find it anymore.



:mag_right: Found it and updated the link on the PT Home page accordingly, so that it can be accessed again! :grin:

2 Likes

@Gabriel_L, working on the Procurar Wiki page:

For these examples:

  • mn1/en/sujato is the English translation of MN 1 by Bhikkhu Sujato
  • mn1/en/bodhi is the English translation of MN 1 by Bhikkhu Bodhi

you have:

mn1 / pt / sujato é a tradução para o inglês do MN 1 por Bhikkhu Sujato
mn1 / pt / bodhi é a tradução para o inglês do MN 1 por Bhikkhu Bodhi

This cannot work, since there are no PT translations by Bhikkhu Sujato or Bhikkhu Bodhi. It should rather be mn1/pt/portuguese_translator, and in case there is more than one, you can give a second example, as for EN. (You can insert your and Marco’s names as the first translator, even if those translations are not yet available; and if there is already a legacy translation for that sutta, this can be the second example. Better still: choose a sutta that has a legacy translation.)

Please also review the following paragraphs in the same sense. Thank you!
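
To make the pattern behind these examples explicit, here is a minimal sketch that splits such a reference into its three parts (sutta, language, translator). The function name `parse_sutta_ref` is made up for illustration and is not part of Voice; how Voice itself parses these references is not shown in this thread.

```python
def parse_sutta_ref(ref: str) -> dict:
    """Split a reference like 'mn1/en/sujato' into its three parts.

    Hypothetical helper for illustration only; not taken from Voice.
    Spaces around the slashes are tolerated, as in 'mn1 / pt / sujato'.
    """
    parts = [p.strip() for p in ref.split("/")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected sutta/lang/translator, got: {ref!r}")
    sutta, lang, translator = parts
    return {"sutta": sutta, "lang": lang, "translator": translator}


en_ref = parse_sutta_ref("mn1/en/sujato")
pt_ref = parse_sutta_ref("mn1 / pt / sujato")
```

Here `pt_ref` pairs the language `pt` with the translator `sujato`, which is exactly the combination that does not exist: the language code and the translator name have to belong together.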


Also, we should maybe adjust the letters in the “Scoring relevance” section. For EN we have M (matches) and F (fraction). Maybe for PT it should be C (correspondências) and F (fração)?


In the “RegEx” section:

EN has the example

e.g., root.*suffering

PT has

por exemplo, root.*Sofrimento

This too cannot work. The example should rather represent how this term is translated in the PT texts, preferably in the new segmented translations (that are of course not on SC yet, but will be at some point). The description should always represent what is found in the “real” product (the same for search term examples in another paragraph further up the page).
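
A small sketch of why the mixed pattern fails, using Python's `re` module as a stand-in for whatever regex engine Voice uses (an assumption). The Portuguese sample text and the pattern `raiz.*sofrimento` are made-up illustrations, not the wording of any actual PT translation.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


# Illustrative segments; the Portuguese wording is hypothetical.
english_segment = "the root of suffering"
portuguese_segment = "a raiz do sofrimento"

# The EN example works against English text.
en_ok = matches(r"root.*suffering", english_segment)

# The half-translated pattern finds nothing in Portuguese text,
# because the word "root" never occurs there.
pt_mixed = matches(r"root.*Sofrimento", portuguese_segment)

# A pattern built from the words actually used in the PT text does match.
pt_ok = matches(r"raiz.*sofrimento", portuguese_segment)
```

The real PT example should use whatever terms the segmented Portuguese translations actually contain.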


I edited _Configurações: Resultados da pesquisa_
to _Configurações: Buscar resultados_ as in the PT settings on Voice.


Find the page here: Home · sc-voice/sc-voice Wiki · GitHub


I also added this sentence to the “Índice e conteúdo” page:

Translating Voice UI gives some guidelines for translating the Voice interface into other languages (inglês).

Could you please translate it as well? Thanks!!

1 Like

Reviewed it. Please leave as it is. Thanks
:anjal:

3 Likes

Just a heads-up for @karl_lew and @sabbamitta: I have just added a few customwords to Voice.JSON

:anjal:

3 Likes

You’re just incredible, Gabriel. I never dived into the waters of editing pronunciation, and you seem to swim quite naturally there! :swimming_man: :fish:

3 Likes

Thank you Gabriel!

I have pulled your changes and will incorporate them into a new release.

3 Likes

Hi @Gabriel_L, Voice is going to look into some pronunciation issues starting next week or so. Rummaging around in our Backlog, I still find this issue. At the time it was created, I understood that you hadn’t managed to solve it yourself. Is this still the case, or can this issue be closed?

2 Likes

I don’t think it has been solved. Will check again in the coming days. :anjal:

3 Likes

I listened to MN140 and it seems the issue is not gone. :anjal:

1 Like

Thanks for checking!

1 Like

Note that I also changed the phonetic mapping, adding a specific entry for the word “ereto” (which means straight/upright/erect).
:anjal:

2 Likes

@Gabriel_L, just to let you know, @karl_lew has built a new tool in the Voice admin section to make pronunciation editing easier. We also have instructions on this in About Voice.

We are now both working on pronunciation, Karl for English, myself for German. Are you still finding Portuguese pronunciation bugs?

Thanks for the heads-up. I'm not looking at it right now.
I will have a look at it in a couple of months.
Have you got any usage stats for Portuguese? I assume not many people are using it now, right?

:anjal:

2 Likes

I can only look into the caches. Right now I see in production:

  • an_pt_beisert_ricardo 16.1MB
  • dn_pt_beisert_ricardo 0.7MB
  • mn_pt_beisert_ricardo 23.2MB
  • sn_pt_beisert_ricardo 3.2MB

Just for comparison, the biggest cache for a translation voice is at the moment:

  • sn_en_sujato_amy 848.5MB

The biggest Pali cache is:

  • an_pli_mahasangiti_aditi 2765.3MB

Pali generally has much bigger caches than the translations.

2 Likes

That will work well for us. We are currently looking at English and German pronunciation with heads exploding. After a few months equanimity may grace us again.

3 Likes