SuttaCentral Voice Assistant

karl_lew · September 19, 2018, 4:39pm

Wow. That’s an awesome sutta and also incorporates the numbered section challenge. Much work incoming.

p.s., I really love listening to AN2.1-10. It’s quite … gory.

Aminah · September 19, 2018, 4:41pm

That is marvellous, but how can I share in the gory joy?

karl_lew · September 19, 2018, 4:42pm

That is all you need to type. And this also exposes a new requirement to pass a URL to a given sutta for SC-Voice. Thanks again!

Aminah · September 19, 2018, 4:44pm

Oh right, sorry, I completely forgot they are given as one file, I was trying to look them up individually.

Shucks, know the feeling! Have a cup cake?

karl_lew · September 19, 2018, 4:46pm

Oh? I didn’t realize that. Hmm. OK. That’s a new requirement, to deduce file location from a specific sutta id. I thought the dash was given.

Aminah · September 19, 2018, 5:00pm

Yeah, now, really, be cautious as to how accurately I can advise on this point, but where most files have individual ids, some are grouped, as in: https://suttacentral.net/an2.1-10/en/sujato

To the best of my understanding this follows the root texts.

When looking for some info (and for certain tasks), I have to say, I just find Old SuttaCentral much, much easier, and this would be one such instance (although I guess it would actually be even easier to just look at the files in the sc-data repo). E.g. https://legacy.suttacentral.net/an1 will show you at a glance how the files are cut up.

On a separate note, one further possible tweak suggestion: I may have missed its applicability in certain cases, but in all that I’ve come across so far, it just feels a bit unnecessary to have a drop down for the sub-headings under the title. I’d maybe just have all of that info given.

Aminah · September 19, 2018, 5:09pm

An absolute joy!

I really have to offer the most enthusiastic congratulations I can possibly muster. My ears can be a bit finicky about readings and accents and so on, and I can find it a bit difficult not to get caught up on those things, so it is the most unlikely feat that you’ve managed to produce two voices I feel really comfortable listening to. I hope you will accept my blushing confession in good spirits: when the idea was initially presented I thought it was really wonderful (especially knowing what was motivating you), but I didn’t think it would be especially relevant to me, as it’d just be so unlikely that I’d click with an automated voice. Turns out I was super wrong. Thank you so much.

karl_lew · September 19, 2018, 6:25pm

The condensation of the title block is actually for voice assistance. Most web pages are overwhelming in content for those who need assistance. In the future, the title drop down will have lots of meta information:

translator
button to download MP3 (or OGG for quality)
button to recite ENTIRE sutta (!). This could be a 25 MB listen.
parallels
etc.

The inability to scan visually is what prompts the draconian use of the <details> HTML element. You will also have noticed the expandable sections in the Settings, where most web pages would just slam all those options onto a single page. Screen readers read everything on a page, so progressive disclosure is an essential feature for cognitive navigation.

Some blind people cope with content overwhelm by turning up the speed on their screen readers to a blistering pace. Unfortunately, those blind people will have had years to develop such listening skills. For the aging populace, gradual blindness happens well after one might acquire such skills easily. The compromise is to use the details block with all the drop downs.

That said, I think that although visual assistance is important, SC-Voice may also be of interest to sighted folks. With that in mind, it may make sense to consider an option to “open up” the content for viewing. This could be an auto-sectioning option.

Thanks for all the background on file organization. I hadn’t realized just how complicated things get!

Aminah · September 19, 2018, 6:36pm

Yes, makes total sense. I was hesitant to say anything because it did have the feel of “there’s something purposeful that I can’t quite see yet.”

Viveka · September 19, 2018, 11:10pm

I just had a listen to Amy (AN2.1-10) - absolutely brilliant!

Unfortunately, I have no idea at all with regard to any of the ‘tech speak’.

Is there an index or something about which suttas are available in voice?

sujato · September 19, 2018, 11:42pm

Just to clarify this point, we really try to make one sutta = one html file = one ID. But in some cases it just doesn’t work because the suttas are too fragmentary. So in the case of AN Ones and Twos, and the various peyyala series, each file contains a range of suttas. In many cases, especially the peyyalas, there are hardly any substantive individual suttas in such ranges. But where possible, the individual sutta IDs are contained within the HTML file and can be directly addressed through them.
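To make the grouping concrete, here is a minimal sketch of how a tool might deduce which file holds a given sutta when some files cover a range. It is not SC-Voice's actual code: the `FILE_IDS` list is a hypothetical sample, `parse_id` and `find_file` are made-up names, and the id pattern only handles the simple `an2.1-10` style of range.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical sample of file ids: most are single suttas, some cover a range.
FILE_IDS = ["an1.1-10", "an2.1-10", "an2.11-20", "mn1", "sn36.31"]

# Assumed id shape: collection letters, a number, then optionally ".N" or ".N-M".
_ID_RE = re.compile(r"^([a-z]+)(\d+)(?:\.(\d+)(?:-(\d+))?)?$")

ParsedId = Tuple[str, int, Optional[int], Optional[int]]


def parse_id(sutta_id: str) -> ParsedId:
    """Split an id such as 'an2.1-10' into ('an', 2, 1, 10)."""
    m = _ID_RE.match(sutta_id.strip().lower())
    if m is None:
        raise ValueError(f"unrecognised sutta id: {sutta_id!r}")
    coll, book, start, end = m.groups()
    start_n = int(start) if start is not None else None
    # A single sutta such as 'an2.5' is treated as the range 5-5.
    end_n = int(end) if end is not None else start_n
    return coll, int(book), start_n, end_n


def find_file(sutta_id: str, file_ids: List[str]) -> Optional[str]:
    """Return the file id whose range contains sutta_id, or None if no file does."""
    coll, book, start, end = parse_id(sutta_id)
    for fid in file_ids:
        f_coll, f_book, f_start, f_end = parse_id(fid)
        if (f_coll, f_book) != (coll, book):
            continue
        if start is None and f_start is None:
            return fid  # whole-number ids such as 'mn1'
        if start is not None and f_start is not None and f_start <= start and end <= f_end:
            return fid
    return None
```

With the sample list, `find_file("an2.5", FILE_IDS)` resolves to the grouped file `an2.1-10`, while an id outside every listed range yields `None`.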

Gabriel_L · September 21, 2018, 7:58am

Hi,
I am not able to get it to play SN36.31 for me.

The error it reports is

file not found:/home/ubuntu/sc-voice/local/sc/sn/en/sn36/sn36.031.po

Am I doing anything wrong?

Aminah · September 21, 2018, 9:32am

Can’t be certain, but I’m guessing this is the same as the issue reported above (same error message). If so Karl explained:

karl_lew · September 21, 2018, 3:31pm

Releasing v0.2.6 in 4 minutes (8:20PDT). RELEASE COMPLETE

Release notes

The primary purpose of this release is to broaden the content available.

You can now specify voice and sutta in the URL. Here is MN1 by Raveena.

On a nerdy note, the following graph shows the inbound AWS Polly traffic vs. the outbound HTTP traffic to the world. Notice that we only rely on AWS Polly for uncached suttas. Cached suttas are simply replayed without needing AWS Polly.
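The caching idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not how SC-Voice is implemented: `cached_audio` is a hypothetical helper, the cache is a plain directory of files keyed by a hash of voice and text, and `synthesize` stands in for whatever function actually calls the speech service.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_audio(
    text: str,
    voice: str,
    cache_dir: str,
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    """Return audio for (text, voice), calling synthesize only on a cache miss."""
    # Key the cache on both voice and text so each voice gets its own entry.
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.mp3"
    if path.exists():
        # Cache hit: replay the stored audio without touching the speech service.
        return path.read_bytes()
    # Cache miss: synthesize once, then store the result for later requests.
    audio = synthesize(text, voice)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

Passing the synthesizer in as a parameter keeps the sketch independent of any particular speech API and makes it easy to try out with a stub.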

Releases will be less frequent in the future. Current focus will be on integrating SC-Voice with SuttaCentral.

Gabriel_L · September 21, 2018, 11:37pm

So is that really happening? If so, that’s great news indeed!!

karl_lew · September 22, 2018, 12:09am

Aminah has added a GitHub issue for us here.
Additionally, I’ve reached out to PaliAudio about the best way to integrate access to their content and proposed that we might discuss such integration in this forum. And Blake has kindly shed light on the proper API for SC REST access to suttas so that SC-Voice can access suttas such as Snp1.8, which isn’t part of the vast resource of Bhante Sujato’s translations on GitHub.

These are early days yet with “many miles to go”.

sujato · September 22, 2018, 12:48am

It is, but for the moment we will just have a simple integration, probably just a link on each sutta card pointing to voice.suttacentral.net.

Excellent. Also just to let you know, I have been supplied with a set of over 3000 sutta recordings by Ven Buddharakkhita, a monk staying here at Bodhinyana. He reads them with a lovely Irish lilt! But, unfortunately, as they are recordings of Ven Bodhi’s translations, they are under copyright and most of them cannot be published. We will select the texts that have been released under Creative Commons (about 10%) and can add them.

Meanwhile, as we add more sources, we need to consider how that will impact the UI. Essentially we need to supply some more metadata for each recording, at minimum:

Translator
- Let the user know what is human and what is machine.
Reader
Acknowledgement or link to original source where applicable.
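One possible shape for that per-recording metadata, purely as a sketch: the `RecordingMeta` class and its field names are hypothetical, chosen only to mirror the list above, and are not an agreed schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordingMeta:
    """Hypothetical per-recording metadata mirroring the list above."""

    sutta_id: str
    translator: str
    reader: str
    is_machine_voice: bool  # lets the UI say what is human and what is machine
    source_url: Optional[str] = None  # acknowledgement or link, where applicable

    def credit_line(self) -> str:
        """Build a one-line credit suitable for display next to a recording."""
        kind = "machine voice" if self.is_machine_voice else "human reader"
        line = (
            f"{self.sutta_id}: translated by {self.translator}, "
            f"read by {self.reader} ({kind})"
        )
        if self.source_url:
            line += f", source: {self.source_url}"
        return line
```

For example, `RecordingMeta("an2.1-10", "Sujato", "Amy", True).credit_line()` produces a credit that names the translator and marks the reader as a machine voice.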

Some other UI thoughts!

As far as theming goes, it’s actually not that dissimilar to SC, maybe use our gold color as an accent somewhere?
I’m not convinced by the addition of the Buddha image in the background. It seems like unnecessary bandwidth for a voice-oriented site. Also, when it comes to accessibility, anything at all distracting is likely to impact a certain subset of users.
Perhaps remove the default font size, so that font-size is entirely set by user agents? This might help visually impaired users who like to set a large font size.
I wonder what the thought is behind making it a dark theme by default? My understanding was that black on white is generally considered the most accessible. But again, certain users will definitely prefer white on dark, so maybe add a theme selector?
A more advanced functionality, but I wonder if we could do something like scroll to match the reading? Or display text per-segment as it is being read? Or maybe enlarge or highlight the text being read? Then a user can full-screen it with large font and follow along. Like Dhamma karaoke!

Viveka · September 22, 2018, 1:07am

@sujato @karl_lew
If this kind of thing is helpful, I would be more than happy to record readings of the Sujato translations of the Suttas, and send them here. It is something that my non-computer skills are capable of, and that would be a pleasure to do.

If this would be helpful, just let me know 1) what format the recordings should be in, and 2) which suttas or texts to start with.
I’d consider it a wonderful project to undertake. I could send a trial recording, so that you can assess clearness of voice (I don’t have much of any accent having been raised multi-lingual).

Anyway - the offer is there

sujato · September 22, 2018, 1:30am

Well, that would be just wonderful!

Thanks so much!

My thought would be that it’s best to start with AN or SN, as the other recorded suttas are mostly from DN and MN.

For the technical side of things, best to let Karl specify what he needs.

The only thing that I would say is that it would be best to invest in quality recording input up front. Recording will take a lot of time, and it’s worth ensuring that the source files are the best quality. I have almost zero experience with recording for computer, but for any kind of quality recording the mic is the critical element. Unless you have one already, I would strongly recommend investing in a good mic before starting. If cost is a problem, SC can buy it for you. I don’t know much about this area, but something like this looks like a good place to start. Shure is legendary!

http://www.shure.com/americas/products/microphones/motiv/mv5-condenser-microphone-for-ios-and-usb

Viveka · September 22, 2018, 1:36am

Okkies,

@karl_lew, I’ll let you direct me to the equipment and methods that will work best, and then roll up my sleeves