SuttaCentral Voice Assistant

karl_lew · November 1, 2018, 2:00pm

SuttaCentral Voice Assistant v0.8.0 release at UTC 14:05 (~5 minutes) COMPLETED

Features

Automatic sectioning: Suttas will be broken up into sections automatically. Section breaks are determined by text segment leading numbers as well as maximum number of segments in a section. Sectioning will help assisted users navigate long suttas such as DN33.
Sutta Player: audio player that recites sutta in both Pali and English, segment by segment with Previous, Play/Stop, and Next controls.
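
A minimal sketch of the sectioning rule described above, assuming each segment is a plain string and that a "leading number" looks like `1.` or `2)` at the start of the text. The function name, the regex, and the default `max_segments` are illustrative assumptions, not SC-Voice's actual code.

```python
import re

# Assumption: a leading number is digits, an optional "." or ")", then whitespace.
LEADING_NUMBER = re.compile(r"^\s*\d+[.)]?\s")

def split_into_sections(segments, max_segments=50):
    """Group text segments into sections.

    A new section starts when a segment begins with a number, or when the
    current section already holds max_segments segments.
    """
    sections = []
    current = []
    for text in segments:
        starts_numbered = bool(LEADING_NUMBER.match(text))
        if current and (starts_numbered or len(current) >= max_segments):
            sections.append(current)
            current = []
        current.append(text)
    if current:
        sections.append(current)
    return sections
```

The size cap matters for long unnumbered runs: without it, a sutta with few numbered segments would end up as one very long section.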

Bugs

SC-Voice fails with “Error: Maximum text length has been exceeded” when searching for kamma. Fixed by eliminating redundant SSML. This fix only reduces SSML by about 1/4, so a longer Pali segment could still blow up. Hopefully there won’t be one.

This release has changed the way I study the suttas. Now I regularly listen to the Pali text together with the English text. In this way I am slowly learning Pali; it simply seeps in as I listen. This is how I learned German in college: I listened to alternating native/translated language segments. Hopefully this new feature may be of some use to others as well.

The auto-sectioning feature has also proven important, since I can now hop around a sutta like DN33 with my eyes closed.

Up next I will be looking at playing collections of suttas as @anon87721581 has kindly suggested. This will simplify listening to collections such as Saṃyutta Nikāya, which are sub-divided into chapters. The proposed implementation will be twofold:

search will now permit a chapter designation such as SN22, and will display in numerical order all suttas in that chapter.
there will be a Play List button in search results that will allow you to play your search results.

As always, please report bugs, inconsistencies, mispronunciations (especially Pali!). Also, now is a good time to request features as we converge on version 1.0.0, which will be the stable version of SuttaCentral Voice Assistant. With the release of version 1.0.0, SuttaCentral Voice Assistant will switch to maintenance mode for bugs and mispronunciations and content updates.

suaimhneas · November 3, 2018, 4:44pm

I tend to listen to these things offline (download MP3 files or whatever and put on a music player, still have an ipod). I’d say the optimal download unit for the SN for MP3s would be the sub-chapter. A full chapter/vagga with perhaps hundreds of suttas would be too much for a single mp3. It’s nice to download individual SN suttas (perhaps often 2 minutes of listening each). It would be nice though, at some stage, if there was an option just to download an MP3 for a single subchapter (usually around 10 suttas for perhaps half an hour of listening).

karl_lew · November 4, 2018, 1:58pm

I have been listening offline to DN33 which is two hours and 41.15MB. For me that is one full day of meditation broken up into morning and afternoon. That might (?) be as large as one might want. DN33 is 5884 lines. In comparison, SN22 is 40531 lines, almost seven times larger. To subdivide that rather large chapter into subchapters, I would propose that we allow sutta ranges such as SN22.120-139 or SN22.135-155. There might also be a need for a list of suttas such as DN33, MN44, SN22.1-10.

I’m glad you requested offline listening specifically. That is actually a separate feature. Currently the Sutta Player composes audio for each segment on demand as you listen. The brief pause to fetch the audio seems to be tolerable and may even be somewhat helpful for listening comprehension. However, it does require a persistent internet connection, which rules out offline listening. For offline listening, an entire MP3 must be generated for downloading. Generating MP3s may take minutes to complete, so I’ll need to put an upper limit on such computation. For example, imagine requesting SN22, then waiting 5 minutes for a 300MB MP3, then throwing it away because it was too large. Yukky for all.
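
The 300MB figure can be roughly checked from the DN33 numbers quoted earlier in the thread (5884 lines, 41.15MB). The sketch below assumes MP3 size grows linearly with line count, which is only an approximation, and the 50MB cap is a hypothetical value chosen for illustration.

```python
# Reference figures quoted in the thread for DN33.
DN33_LINES = 5884
DN33_MB = 41.15

def estimate_mp3_mb(lines):
    """Estimate MP3 size in MB, assuming size grows linearly with line count."""
    return DN33_MB * lines / DN33_LINES

def within_limit(lines, max_mb=50.0):
    """Return True if the estimated MP3 stays under a hypothetical size cap."""
    return estimate_mp3_mb(lines) <= max_mb

# SN22 has 40531 lines, so the estimate is about 283 MB,
# in line with the "300MB" guess above.
sn22_estimate = estimate_mp3_mb(40531)
```

A check like `within_limit` could reject an oversized request before any audio is generated, rather than after a five-minute wait.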

I’ll tackle playlists for SuttaPlayer first for v0.9.0 and tackle playlist download for v1.0.0. The audio cache for SuttaCentral Voice is reaching 1GB and we’ve chewed up 50% of our disk allocation so far. I’ll need to make the cache more efficient by cleaning out large MP3s that arise out of custom playlist downloads. This is a good change to make for robustness, but it’s also a bit of work so I’ve deferred it until now. We’ll deal with that for v1.0.0.
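
One possible shape for the cache cleanup, as a sketch only: given the cached files and their sizes, pick the largest files for deletion until the cache fits a budget. The largest-first policy, the function name, and the dictionary input are assumptions for illustration; the thread does not say how SC-Voice will actually evict files.

```python
def files_to_evict(cached, budget_bytes):
    """Return names of cached files to delete, largest first.

    cached maps file name to size in bytes. Files are selected until the
    remaining total is at or below budget_bytes.
    """
    total = sum(cached.values())
    evict = []
    by_size = sorted(cached.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        if total <= budget_bytes:
            break
        evict.append(name)
        total -= size
    return evict
```

Largest-first targets the big playlist MP3s while leaving the small per-segment audio files, which are cheap to keep and are reused across suttas.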

Here’s the release plan

suaimhneas · November 4, 2018, 6:10pm

Thanks for all the work on this. The idea of specifying sutta ranges sounds like a good solution for the SN. Possibly one way around having the app not chew up too much cache space when generating the MP3s would be to only allow MP3 generation and download for limited ranges (perhaps a limit of 20 suttas in a range in the SN). The majority of subchapters wouldn’t exceed that length (there are a few places in the SN where they do but 20 suttas is still an adequate download size and much more convenient than one-by-one download).

karl_lew · November 11, 2018, 3:24pm

Support for playlists is surprisingly involved. I was hoping to release something this week but that is not possible and will need at least another week. I’ve realized that folks will want all the following:

SN29 find all suttas in SN29.
SN29.1-15 find all suttas from SN29.1 to SN29.15 (noting that SN29.15 is actually in SN29.11-20)
MN1-3 find suttas MN1, MN2, MN3
mn3, mn2, mn1 find suttas to play in listed order
mn2,mn2,mn1 play MN2 twice followed by MN1 to gain deeper insight into MN2 while reviewing MN1
mn1/en/sujato, mn1/en/bodhi play suttas by two translators for comparison
etc.

Rather than handle each of these incrementally, I’ve found it easier to think about them as a single design with multiple use cases. If I’ve missed anything in the above considerations, let us know…
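
To make the "single design" idea concrete, here is a hedged sketch of one parser that covers all of the query forms listed above. The grammar, the `PlaylistEntry` fields, and `parse_playlist` are hypothetical names invented for this example. Resolving a range against the real catalogue (for instance, noticing that SN29.15 lives in SN29.11-20) is deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed grammar: <collection><number>[.<first>][-<last>][/<lang>[/<translator>]]
ENTRY = re.compile(
    r"^(?P<coll>[a-z]+)(?P<num>\d+)"
    r"(?:\.(?P<sub>\d+))?"
    r"(?:-(?P<last>\d+))?"
    r"(?:/(?P<lang>[a-z]+)(?:/(?P<translator>[a-z]+))?)?$"
)

@dataclass(frozen=True)
class PlaylistEntry:
    collection: str            # e.g. "sn" or "mn"
    number: int                # chapter or sutta number, e.g. 29
    sub_first: Optional[int]   # first sutta within a chapter, if given
    range_last: Optional[int]  # end of a range, if given
    lang: Optional[str]        # ISO language code, if given
    translator: Optional[str]  # translator name, if given

def _to_int(value):
    return int(value) if value is not None else None

def parse_playlist(text: str) -> List[PlaylistEntry]:
    """Parse a comma-separated playlist, keeping order and duplicates."""
    entries = []
    for raw in text.split(","):
        item = raw.strip().lower()
        if not item:
            continue
        m = ENTRY.match(item)
        if m is None:
            raise ValueError(f"unrecognized playlist entry: {raw.strip()!r}")
        entries.append(PlaylistEntry(
            collection=m["coll"],
            number=int(m["num"]),
            sub_first=_to_int(m["sub"]),
            range_last=_to_int(m["last"]),
            lang=m["lang"],
            translator=m["translator"],
        ))
    return entries
```

For example, `parse_playlist("mn2,mn2,mn1")` yields three entries in the listed order, with MN2 repeated, so the "play twice" case needs no special handling.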

sujato · November 12, 2018, 12:27am

Karl, this all sounds great. I have a couple of questions/requests.

One thing that I don’t expect you to do, but which it would be nice to see in the design, is the capacity to expand to other languages. In SC, we do this simply by keying off the ISO codes. So I’m wondering if this is something that the current design will allow.
What about straight Pali texts? We have some nice recital of Pali (and possibly other languages). I’m wondering if we can support this?

I apologize if these things have been discussed, I can’t remember everything!

Once we are ready for 1.0, I’d like to invite you to meet with our development team, and present the app to them. They’d love to learn about your work, and this will help us to further integrate voice.suttacentral with the main app.

Currently I am proposing that on the main site we place an icon for audio. Click that, and it takes you to the relevant sutta on voice.suttacentral. That should be a nice simple way to promote the voice site.

I had considered integrating more advanced audio functions on the main site, i.e. the ability to play a sutta straight from the main site. But the more I thought about it the more complex it became, until essentially we’d have to embed voice.suttacentral into the main site. Which would be bad! So anyway, let me know if you have any thoughts about this.

karl_lew · November 12, 2018, 1:21pm

Yes. This is already in the design. ISO codes are used throughout SC-Voice. For example, Aditi uses “hi-IN” for text-to-speech.

The current SC-Voice has an option for “Show only Pali text”. The Sutta Player will speak only Pali with that option. The default is to “Show both Pali text and translated text”. With the default option, each text segment is spoken first in Pali, then in the language of translation (e.g., “en”). If the source Pali texts are also segmented as returned from SuttaCentral API, then the SuttaPlayer will play them as well, since the SuttaPlayer is just looking for the ISO “pli” branch of the text segment. In this manner we can all recite together in Pali. My own inclination would be towards bilingual text segments (e.g., pli+en) that teach meaning of and speaking of Pali. My hope is that we all come to chant Pali. Pali is an excellent language for chanting–the sounds of the language are naturally and consistently spoken with a steady stream of exhalation that promotes mindfulness and immersion. English somehow lacks the sonority of Pali and seems more suited to expressing ideas in writing rather than chanting.
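
The playback order described here can be sketched as follows, assuming each segment is a dictionary keyed by ISO language code (for example `"pli"` and `"en"`). The dictionary shape and the function name are illustrative assumptions, not the SuttaCentral API's actual format.

```python
def playback_queue(segments, translation="en", pali_only=False):
    """Build the list of (language, text) items to speak, in order.

    Each segment is spoken first in Pali (if it has a "pli" entry), then in
    the translation language, unless pali_only is set.
    """
    queue = []
    for seg in segments:
        if "pli" in seg:
            queue.append(("pli", seg["pli"]))
        if not pali_only and translation in seg:
            queue.append((translation, seg[translation]))
    return queue
```

Because the languages are looked up by ISO code, supporting another translation language in this sketch only means passing a different `translation` value.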

I really look forward to meeting the team. I prefer working in teams. This will be quite interesting and will require sharing desktop screen. I think Hangout does that. It’s good to wait until 1.0. Currently the internal design has not stabilised (i.e., witness all the hair-pulling I’ve gone through with playlists). But in a few weeks I think we’ll have a stable 1.0. I have started realizing that the SuttaPlayer is actually what delivers the core value of SC-Voice, since it speaks and shows text segment by text segment. Thank you to both you and @Aminah for nudging SC-Voice into this important direction.

Yes. Embedding isn’t really workable because of all the cognitive dissonance with the clashing UI approaches. SuttaCentral is designed for rich visual interaction. SuttaCentral Voice is designed for the blind. Combining the two would be a big mess. However, your sutta audio link suggestion is minimally invasive and elegant. Indeed, one might call it … “iconic”.

One topic that may be of interest is how SC-Voice search may help SC search. SC-Voice search understands text segments, and can provide smaller result sets quickly for certain queries. We might discuss an SC search option that allows SC to offer similar search results. This would be a simple invocation of the SC-Voice search API without any UI changes to SC.

anon87721581 · November 13, 2018, 4:06am

Is this the birth of an AI Goddess ? Perhaps one day she can translate pali texts on her own ?

sujato · November 13, 2018, 9:17am

Thanks for the answers, I have not been following development closely enough, obviously!

Yes, usually we use Hangouts. Our only problem will be timing, as we are in the States, Europe, and Australia.

Cool, excellent, I had had the same thought.

karl_lew · November 13, 2018, 2:31pm

AI can indeed be used to help ensure a consistent translation. However since AI lacks any personal experience in suffering or its remedy, I fear that we will be stuck with humans for the foreseeable future. Humans are brilliant in their ability to generate new ways to suffer. This is why we need humans to translate the Dhamma.

anon87721581 · November 13, 2018, 2:32pm

anon87721581 · November 14, 2018, 3:45pm

@karl_lew

Hi karl,

Is there a way to download a whole set of suttas at once ?

karl_lew · November 14, 2018, 5:18pm

Playlists have been tricky to implement. ETA on multiple sutta download of a single MP3 is December. See the Release Notes on the sc-voice/sc-voice GitHub wiki.

That said, I’m really excited to be working on this feature since it opens up so many study possibilities!

anon87721581 · November 14, 2018, 5:44pm

Wonderful !

Aminah · November 16, 2018, 12:50pm

Karl, at this point, after all the extraordinary work you’ve done (and particularly having some vague sense of the extra trickiness playlists have presented), I’m loath to do anything other than offer a standing ovation, let alone make any feature requests. And yet… I wonder how you feel about the idea of adding a pause facility?

karl_lew · November 16, 2018, 4:07pm

Aminah, please say more?

Downloaded MP3s pause when I double tap my AirPods or remove them from my ears. I can also pause the iPhone by pressing Pause.

On the SuttaPlayer, simply hit the space bar. The Pause/Play icon button has the keyboard focus, so pressing the spacebar toggles between Pause and Play.

If there are other use cases for Pausing that I have missed, please let us know!

Aminah · November 16, 2018, 4:20pm

Do you know, I even completely forgot that above I praised the pause function I accidentally found by hitting the space.

It seems that somewhere within iterations this might have been lost (or else something is screwy at my end), because this doesn’t work for me any more. Also, when I commented this morning (as per the above, just doing things quickly), I totally forgot about the space and just tried to hit the play icon again (in some apps play/pause functions are assigned to the same button, sometimes with icons switching to match whatever mode is currently active), but found the only way I could make it stop was refreshing the page.

(I’m only describing desktop browser use).

Aminah · November 16, 2018, 5:04pm

Back again!

In connection to (the latter part of):

This stops the reading rather than skipping over. I’m guessing it’s not a very frequent issue, but just thought I’d flag it up.

karl_lew · November 16, 2018, 5:17pm

Hmm. Most perplexing. It does work for me provided that the Play/Pause icon has the light blue highlight that indicates focus. If the Play/Pause icon loses focus, then the spacebar won’t work anymore. Do you see the light blue highlight when pressing the spacebar?

Oh. And I just realized that there is another place where sound is played. In the search results you will see a speaker icon. It is actually possible to hit multiple speaker icons and you will hear a chorus of voices. The only way to stop a particular voice is to hit the speaker button for that voice or refresh the page as you did. Were you in the Sutta Player or search results?

karl_lew · November 16, 2018, 5:18pm

Aha. I just noticed that myself today. Thanks!