TTS option

bksubhuti · April 7, 2019, 6:57am

It was brought to my attention that you have a TTS interface. Why not make it standard for all the suttas that one views? The current interface brings you to a different page. It would be better if it were integrated into the sutta page viewing system.

I recently created a whole TTS book. Perhaps you could do that with your voices.
Note: I prefer suttas to be read in a male voice because they were “written” in a male voice, mostly the Buddha’s.

But what voice are you using? Usually you do open source stuff. I’ve never heard of an open source voice that good.

I’d like to see the books made into TTS, with both a fast and a slow version compiled. Once you get used to it, a faster reader is better.

For now, I like my version because I can put it on my phone or Sony voice recorder and walk with it.

Please consider releasing a book.

sabbamitta · April 7, 2019, 3:28pm

Thank you for making us aware of your development. It is so great that various approaches to text-to-speech are popping up lately!

The SC-Voice interface has been developed especially to serve the needs of blind and visually impaired users; each supported Sutta on the SC main site has a link to the respective Voice page on its Suttaplex card, so users can choose whether they prefer to read or to listen. SC-Voice is an independent development, but is integrated into the main site to this degree right now and might be more in the future.

SC-Voice uses voices from AWS Polly with some adjustments for pronunciation errors. There is one voice for Pali (Aditi) and three for English (Amy, Raveena, and Russell, each with a different speed).

You can download any Sutta or Sutta playlist from Voice as MP3 with the voice of your choice, either Pali, English, or both languages segment by segment.

So far each of SC-Voice’s voices has its friends, and I am pretty sure your version will also find those who like it! Sadhu for your work!

For the latest developments of SC-Voice check here.

bksubhuti · April 7, 2019, 7:23pm

Thank you for the information. My guess is that you have a deal worked out with Amazon for donated voice streams. If it is possible, please write a script to compile books for download. The voices are very clear sounding and seem to come with free redistribution rights.

If you are paying per letter, it makes sense to compile all of the information once and stream it yourself. If you are not paying, all the more reason to make books.

It would be good if the audio button did not bring you to a completely different page and server…

In any case, great job on finding quality tools for the website. It is all very nice, including the main site and this discussion board.

sabbamitta · April 7, 2019, 8:34pm

@karl_lew and @Aminah, what do you think?

karl_lew · April 7, 2019, 10:46pm

We have no deal with Amazon that affords us special privilege. We are like any paying customer. That said, AWS Polly has billing rates that are quite reasonable. For example, AWS Polly charges:

1,000 requests at 1,000 characters per request = 1 million characters (about 23 hours, 8 minutes of audio): $4.00
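The arithmetic behind that figure can be sketched as follows. This is only an illustration that uses the two numbers quoted above ($4.00 per million characters, and roughly 23 hours 8 minutes of audio per million characters). The function names are made up for this example, and actual AWS pricing may differ from what is quoted here.

```python
# Figures quoted in the post: $4.00 per 1 million characters,
# which corresponds to roughly 23 hours 8 minutes of audio.
PRICE_PER_MILLION_CHARS = 4.00
SECONDS_PER_MILLION_CHARS = 23 * 3600 + 8 * 60  # 83,280 seconds


def polly_cost(characters: int) -> float:
    """Estimated charge in dollars for synthesizing `characters` characters."""
    return characters / 1_000_000 * PRICE_PER_MILLION_CHARS


def audio_seconds(characters: int) -> float:
    """Estimated audio length in seconds, using the quoted ratio."""
    return characters / 1_000_000 * SECONDS_PER_MILLION_CHARS


if __name__ == "__main__":
    chars = 1_000 * 1_000  # 1,000 requests of 1,000 characters each
    hours = audio_seconds(chars) / 3600
    print(f"${polly_cost(chars):.2f} for about {hours:.1f} hours of audio")
```

By the same ratio, a quarter of a million characters would cost about one dollar.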

What Voice does have is caching technology that issues only a single AWS Polly request for each unique text segment, so that “evam me suttam” would be sent through TTS and billed only once. This caching technology is completely open source, and we have verified via Aminah that others can indeed set up their own Voice servers to do whatever they wish using the best existing commercial technology. Voice also has a special Pali adaptation for Aditi, which is the AWS Polly hi-IN voice. All of this is open source and available to you.
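The caching idea can be sketched roughly like this. It is not the actual Voice implementation: the `SegmentCache` class, the `fake_tts` stand-in and the in-memory dictionary are all assumptions made for illustration, and a real cache would store audio on disk and call a real TTS service.

```python
import hashlib
from typing import Callable, Dict


class SegmentCache:
    """Synthesize each unique (voice, text) pair at most once."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize       # hypothetical TTS backend
        self._store: Dict[str, bytes] = {}  # in-memory only, for illustration
        self.billed_requests = 0            # how often the backend was called

    @staticmethod
    def key(voice: str, text: str) -> str:
        """Content-derived key: same voice and text always give the same key."""
        return hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()

    def audio_for(self, voice: str, text: str) -> bytes:
        k = self.key(voice, text)
        if k not in self._store:
            # Cache miss: this is the only place a billable request happens.
            self._store[k] = self._synthesize(voice, text)
            self.billed_requests += 1
        return self._store[k]


def fake_tts(voice: str, text: str) -> bytes:
    """Stand-in for a real TTS call; returns placeholder bytes, not audio."""
    return f"{voice}:{text}".encode("utf-8")


if __name__ == "__main__":
    cache = SegmentCache(fake_tts)
    for _ in range(3):
        cache.audio_for("Aditi", "evam me suttam")
    print(cache.billed_requests)  # 1
```

Keying on the text itself, rather than on the sutta it appears in, is what lets a phrase repeated across many suttas be billed once.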

To use Voice technology with AWS Polly, you will need to set up your own AWS account and will be billed for your own usage. AWS has a generous 12-month Free Tier for which Voice qualifies, so if you can process the TTS for your book within that 12 months, it may well cost you nothing at all on your own AWS account.

Currently we are investigating Voice Sound Modules (VSM). A VSM is a collection of sounds smaller than 1 GB. We anticipate having one VSM for each Nikaya, translation, and voice combination. VSMs will not require AWS Polly since they will have already been entirely processed through AWS Polly. VSMs are simply the AWS Polly output. The VSM technology will allow us to easily manage differential sound updates for corrections in translations. Only sounds for changed text would need to be updated in a revised VSM. The VSM technology may be of interest to you if you anticipate changes to translated content over time.
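A differential update of this kind might be sketched as below. The VSM format is not described in this thread, so everything here is an assumption for illustration: segments are modelled as a plain mapping from a made-up segment id to its text, and a hash of the text decides whether the stored sound is still valid.

```python
import hashlib
from typing import Dict, Set, Tuple


def manifest(segments: Dict[str, str]) -> Dict[str, str]:
    """Map each segment id to a hash of its text."""
    return {
        seg_id: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for seg_id, text in segments.items()
    }


def diff_segments(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Set[str], Set[str]]:
    """Return (ids that need new audio, ids whose audio can be dropped)."""
    old_m, new_m = manifest(old), manifest(new)
    # New or changed text: the hash is missing or different in the old manifest.
    resynthesize = {sid for sid, h in new_m.items() if old_m.get(sid) != h}
    # Segments that no longer exist in the revised translation.
    remove = set(old_m) - set(new_m)
    return resynthesize, remove


if __name__ == "__main__":
    # Hypothetical segment ids and texts, for illustration only.
    old = {"seg-1": "So I have heard.", "seg-2": "At one time", "seg-3": "old line"}
    new = {"seg-1": "So I have heard.", "seg-2": "At one time,", "seg-4": "new line"}
    print(diff_segments(old, new))
```

In this toy run only `seg-2` and `seg-4` would need new sounds, `seg-3` would be dropped, and the unchanged `seg-1` would be left alone.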

Voice will exit the AWS Free Tier later this year. We have carefully engineered Voice to minimize ongoing costs in the years to come. For example, we expect to incur charges for passing the Vinaya through AWS Polly in future years. Thankfully, anybody can use Voice to create whatever they wish by simply establishing their own AWS account. In doing so, they will benefit from the 12-month Free Tier offered by Amazon itself. The Voice team, however, will shortly no longer be able to participate in the Free Tier.

bksubhuti · April 8, 2019, 5:51am

I did not understand the VSM. Does VSM mean a collection of AWS Polly TTS readings for each nikaya, or

some type of custom voice-building thing, or

some type of dictionary of pronounced words?

If the latter is the case, it might not work so well with realistic voice modules since they are tonal, unlike my Microsoft David, which is not so bad for speed reading.

Yes, I hope you do something quickly (like make books) while it is free. It does seem pretty cheap for 23 hours of license-free voice output of that quality.

My 1999 programming skills (msvcpp and dao) are ancient and dead at this stage. At best, I can do shortcodes in WordPress :). I also made a robe pattern spreadsheet and a font converter macro for Pali in LibreOffice.

Please let me know if you compile books. I hope you do a few speeds as long as it is free.

At 23 hours for $4, if you miss the one year deadline, I would be interested in arranging an allowable donor for some or all of the nikayas.

Aminah · April 8, 2019, 6:05am

In short, to my mind it’s probably not an immediate priority, nor necessarily the best answer for providing offline capability. Also, it’s worth considering in relation to Bhante Sujato’s recording project and the podcast project, which might result in something a step or two closer to the request (maybe in conjunction with a browser downloader plugin).

sujato · April 8, 2019, 6:54am

Thanks for the feedback and suggestions.

We do aim to provide the texts as EPUB 3.1 audiobooks. 3.1 introduces an audio spec as a native part of EPUB. This works great: it highlights the sentence or segment as it is read. The spec is not widely adopted yet (mainly, so I understand, due to copyright battles, yet again). But some ereaders do support it. It is a great option, as it makes the production of the audiobook super-easy: just give an XML file with the start and stop times of each segment, and that’s it.
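To give a rough idea of what such a timing file looks like, here is a small sketch. It assumes the audio spec in question is EPUB Media Overlays, which describes each segment as a SMIL `par` element pairing a text fragment with an audio clip. The file names, segment ids and timings below are invented for illustration, and a complete EPUB would also need the overlay declared in its package document.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_overlay(
    text_file: str, audio_file: str, segments: List[Tuple[str, float, float]]
) -> str:
    """Build a SMIL document from (segment id, start, stop) timings in seconds."""
    ET.register_namespace("", SMIL_NS)  # serialize SMIL as the default namespace
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for seg_id, start, stop in segments:
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par-{seg_id}"})
        # The text element points at the element to highlight in the XHTML file.
        ET.SubElement(par, f"{{{SMIL_NS}}}text", {"src": f"{text_file}#{seg_id}"})
        # The audio element gives the clip to play while it is highlighted.
        ET.SubElement(
            par,
            f"{{{SMIL_NS}}}audio",
            {
                "src": audio_file,
                "clipBegin": f"{start:.3f}s",
                "clipEnd": f"{stop:.3f}s",
            },
        )
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names, segment ids and timings.
    print(build_overlay("mn1.xhtml", "mn1.mp3", [("s1", 0.0, 2.5), ("s2", 2.5, 6.0)]))
```

The reading system does the highlighting itself, so the producer only has to supply the timings.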

As for pure audio download, we will add that in due course as well.

The human-read suttas as we make them will be free on github, so anyone can do anything they like with them.

The reason for doing this is that adding an audio interface ends up creating more complexity on an already complex site. It’s not just a matter of putting a button saying “read this sutta”; we have to handle different authors, different speakers, start/stop/seek, and so on. So you end up either settling for a very crippled interface or replicating most of SCV on the main site. Perhaps in a perfect world this could be done, but, sad to say, it appears that we live in an imperfect world. Given our development constraints, adding a link to SCV is easy and works well enough.

bksubhuti · April 8, 2019, 7:22am

Delete my content and posts

karl_lew · April 8, 2019, 2:05pm

Yes, a VSM is a collection of AWS Polly TTS readings for a translated Nikaya. It is an implementation and distribution consideration, not really an end-user feature. Basically it just means that we can deliver TTS updates for a Nikaya quite cheaply.

Voice.suttacentral.net allows you to specify a list of suttas that you can download for listening as a single MP3. For example, if you wanted to listen to all the suttas on the root of suffering, just click the download link for root of suffering and you will get a custom MP3 audio file having the suttas you chose spoken by the AWS Polly voice you selected in Settings. The default voice is Amy, who speaks slowly and clearly. Some among us prefer Raveena, who speaks a lot faster. For those with high-frequency deafness, Russell is the ideal voice since it has a lower register. You can also choose to download the translation alone or interleaved Pali/English for study.

In general, the current “download what you search for” approach has proven quite flexible and has allowed listeners to customize their downloads according to study preference. For logistical reasons, the download size is limited to three hours, which means that it should handle any one sutta (e.g., DN33 is two hours long) or a combination of smaller suttas such as those found in AN.
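For anyone planning a longer listening list, the three-hour limit suggests splitting it into several downloads. The sketch below is not something Voice does itself, as far as this thread says. It is only one way a listener might group suttas in order. Apart from the two-hour figure for DN33 mentioned above, the names and durations are placeholders.

```python
from typing import Dict, List

MAX_SECONDS = 3 * 3600  # the three-hour download limit mentioned above


def split_playlist(
    durations: Dict[str, int], limit: int = MAX_SECONDS
) -> List[List[str]]:
    """Group suttas, in the given order, into batches no longer than `limit`."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for sutta, seconds in durations.items():
        if seconds > limit:
            raise ValueError(f"{sutta} alone exceeds the limit")
        if current and used + seconds > limit:
            # The next sutta would not fit, so close the current batch.
            batches.append(current)
            current, used = [], 0
        current.append(sutta)
        used += seconds
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # DN33 at two hours is from the post; the other entries are placeholders.
    playlist = {"DN33": 7200, "sutta-a": 3000, "sutta-b": 1800}
    print(split_playlist(playlist))  # [['DN33', 'sutta-a'], ['sutta-b']]
```

Keeping the original order, rather than packing as tightly as possible, preserves the sequence a listener chose for study.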

The ability to search for suttas also currently allows downloads such as AN1, which will allow you to listen to or download all the suttas in AN1 as a single MP3.

A more complete solution along the lines of what you suggest (i.e., entire book) would probably be better met by the SuttaCentral EPUB 3.1 audiobooks mentioned by Bhante Sujato. In that case the TTS is performed on the actual hardware device. Although devices such as Kindle do have amazing onboard TTS capabilities, I suspect progress in the hardware area will suffer greatly since it would compete with and undercut Audible, the Amazon audiobooks service.

Voice.suttacentral.net currently downloads MP3s, but we are investigating higher-fidelity downloads with OPUS, which will be especially suited for human recordings.

I noticed that there was a recent upload of Microsoft David’s TTS versions of Bhante Sujato’s AN1 translations. Is this what you meant?

The sc-voice github wiki does include a separate Wiki page for each sutta. For example the MN1 wiki page points to different audio renderings of MN1. The wiki pages are automatically included in the voice.suttacentral.net web page rendering.

The Voice team has discussed offline implementations of voice.suttacentral.net on, for example, a Raspberry Pi. This would permit sutta study in monasteries without reliance on the internet. The offline implementation would support search as well, which might be of value to monastics. However, an offline implementation is a lot of work and will probably not happen this year.

bksubhuti · April 8, 2019, 5:08pm

Delete my content and posts

sujato · April 8, 2019, 10:55pm

Free yourself.

Snowbird · April 11, 2019, 3:49pm

I’ve never tried it, but supposedly the TTS feature never left the Kindle when they stopped having audio jacks. You just need an audio adapter. Looks like you can get it from Amazon for US$20 or make your own for less than that.

When you say ereaders, are you referring to e-ink devices? Or just apps that would need a standard tablet to run?

bksubhuti · April 11, 2019, 8:21pm

Delete my content and posts

sujato · April 11, 2019, 9:56pm

I’m only aware of apps; I haven’t looked into ereaders. Standards compliance is really bad in the EPUB world, but if we build to spec, hopefully the devices will appear. In any case, if someone wants the feature, all we really need is a single functioning app that supports it.

karl_lew · April 12, 2019, 6:54pm

Ah! That explains why the AN cache got filled! The first-time download of any sutta from SC Voice is currently painfully slow, as each segment has to be passed through AWS Polly. We are working on technology that will populate the caches automatically and quickly. Right now the worst experience is the first download.

Thank you for creating an index for SC Voice suttas.

bksubhuti · April 12, 2019, 8:10pm

Delete my content and posts