SuttaCentral Voice Assistant

As folks are diving into the spoken suttas, mispronunciations are surfacing.

First, thank you all very much for listening so mindfully!
Second, here is a list of current mispronunciations that will be fixed in the next release.

Raveena will always pronounce Pali better than Amy. This is because Raveena’s phoneme basis is larger than Amy’s and includes phonetic support for “bh” and other subtleties. This is why all the Pali search results are spoken by Raveena. Interestingly, I’ve personally started to learn Pali just by listening to Raveena speak search results. As you search for stuff, please do listen to Raveena’s Pali quotes–there should be ample room for correction here.

:pray:

5 Likes

SuttaCentral Voice Assistant v0.7.0 Release at UTC 1435 (~5min) RELEASE COMPLETED

  • Romanized Pali search: Now you can type “jhana” and directly get Bhante Sujato’s translations. When a search pattern consists entirely of [a-z] characters, it is expanded into a full regular expression with Pali characters. For example, if you enter “jhana”, then SC-Voice will search for “jh(a|ā)(n|ṅ|ñ|ṇ)(a|ā)” anywhere in the sutta file, which includes Pali as well as English. Bhante Sujato translates jhāna as absorption, and the search results will show absorption in the English quote and jhāna in the Pali quote (e.g., AN10.72). Similarly, entering “sariputta” will find “Sariputta” or “Sāriputta”.
  • DN33 voice failure: At two hours spoken by Amy, DN33 exceeded SC-Voice internal limits. DN33 is now downloadable for offline listening.
  • Mispronunciations: ariyasaccan'ti, preoccupations, purifications, immeasurables, purities, staunchly, smith’s (and about 350 more corrections)
  • Downloader alert: Chrome downloads are silent for assisted users, so SC-Voice now has its own download alert that notifies the user when a download has completed or has timed out after 60 seconds.
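
The romanized search expansion described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not SC-Voice's actual code: only the "a" and "n" expansions are quoted in the release notes, so the other table entries, the function names, and the choice to leave non-[a-z] patterns untouched are all assumptions.

```python
import re

# Only the "a" and "n" rows come from the release notes above.
# Every other row is an assumption added for illustration.
PALI_VARIANTS = {
    "a": ["a", "ā"],
    "n": ["n", "ṅ", "ñ", "ṇ"],
    "i": ["i", "ī"],        # assumed
    "u": ["u", "ū"],        # assumed
    "t": ["t", "ṭ"],        # assumed
    "d": ["d", "ḍ"],        # assumed
    "l": ["l", "ḷ"],        # assumed
    "m": ["m", "ṁ", "ṃ"],   # assumed
}


def expand_romanized(pattern: str) -> str:
    """Expand an all-[a-z] pattern into a regex that also matches Pali diacritics."""
    if not re.fullmatch(r"[a-z]+", pattern):
        # Assumption: anything that is not purely [a-z] is searched as typed.
        return pattern
    parts = []
    for ch in pattern:
        variants = PALI_VARIANTS.get(ch)
        parts.append("(" + "|".join(variants) + ")" if variants else ch)
    return "".join(parts)


def matches(pattern: str, text: str) -> bool:
    """Case-insensitive search, so "sariputta" also finds "Sāriputta"."""
    return re.search(expand_romanized(pattern), text, re.IGNORECASE) is not None
```

With this table, `expand_romanized("jhana")` yields the same expression quoted in the bullet, and `matches("sariputta", ...)` accepts both spellings of the name.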

With this release, we can now search for things like “third jhana” as a mixture of romanized Pali and English:

The next release will wrap up this feature cycle with auto-sectioning so that assisted users can traverse huge suttas such as DN33. With that release, SuttaCentral Voice features will be frozen as v1.0.0 and followed by minor releases for mispronunciations. If you have any feature requests, please make them now while the code is understood and before it evaporates from my brain–it takes a while for the future me to understand what the past me was doing.

Bhante @sujato, I’ll be submitting the v0.8.0 release to you for review and approval for re-release as v1.0.0. The v0.8.0 release will just have suttas sectioned by leading numbered text segment (with the section UI shown in MN1). I can easily use SC-Voice with my eyes closed or half-open, so I think it works for at least one assisted user. I no longer use grep on the command line since SC-Voice does better. I will be spending my time listening to the suttas and will fix mispronunciations as discovered but have no further plans after v1.0.0 unless there is additional need.
:pray:

6 Likes

As ever, superb, superb, long-going applause!

:anjal:

Well, I’m terribly sorry to abuse this kind invitation, but I’ll throw everything I can at you (users are just horrible! :grinning:)…

The one thing I really would appreciate above all, would be if my voice preference could be remembered globally.

After that, some more casual thoughts you shouldn’t take too seriously:

  • an option to choose if suttas opened from search results open in a new window or not.
  • an option for segment highlighting as the sutta is read
  • this may 1) come out in the wash with integration with SC ‘main’, or 2) be too much bother to bother with, but friendly URLs would be lovely. It’d be great if, e.g., I wanted to send someone a link to a particular sutta and could do so just by knowing a simple pattern, as with SC.
4 Likes

Aminah, you are horribly kind in the best way. :pray:

This one is subtle and we’ll need to explore. Currently SCV has two voices for sutta recitation (SlowAmy and FastRaveena) and a third voice (SlowRaveena) for search results. The sutta recitation voices are remembered globally in the URL as iVoice=0 for SlowAmy and iVoice=1 for FastRaveena. This means that simply bookmarking a page freezes your preferences. To bookmark an empty SCV page with your global preferences, just click on the upper left Dhamma wheel to clear the page, then set your preferences and bookmark the result. Now you’ll always have the page set up as you like. You can even mail your URLs to other people so that they can hear and see your exact setup. I think (?) you can get what you want globally by doing this rather roundabout trick of bookmarking your preferred blank search page. Perhaps an easier way will occur to one of us.
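
As a rough sketch of the bookmarking trick: the iVoice values below are the ones given above, but the base address is a placeholder and the assumption that the preference travels as an ordinary query-string parameter (and that 0 is the default) is mine, not something confirmed in this thread.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# iVoice values as described in the post above.
VOICES = {0: "SlowAmy", 1: "FastRaveena"}

# Hypothetical placeholder, not the real SCV address.
BASE_URL = "https://example.org/scv"


def bookmark_url(i_voice: int, base_url: str = BASE_URL) -> str:
    """Build a bookmarkable URL that carries the recitation voice preference."""
    if i_voice not in VOICES:
        raise ValueError(f"unknown iVoice value: {i_voice}")
    return base_url + "?" + urlencode({"iVoice": i_voice})


def voice_from_url(url: str) -> str:
    """Read the recitation voice back out of a bookmarked URL."""
    query = parse_qs(urlparse(url).query)
    # Assumption: a missing iVoice parameter falls back to 0 (SlowAmy).
    i_voice = int(query.get("iVoice", ["0"])[0])
    return VOICES[i_voice]
```

Opening `bookmark_url(1)` would then always start with FastRaveena, which is the "frozen preferences" effect described above.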

Oh, and if the URLs are long and cumbersome, just use goo.gl or any other URL shortening website. Here’s an example of DN33: https://goo.gl/nYVK9r

The search results currently use SlowRaveena (a third voice) to speak the Pali quotes (which are quite the tongue twisters). Sadly, Amy butchers Pali search results rather horribly and I dread offering her as a viable option (she simply lacks the phonemes and would assault all our ears to the point of demented screaming). However, I was lazy and applied SlowRaveena to the English search quotes as well, since I didn’t think anyone would actually click the English search results. (The English quote button is actually not available for assisted users, who will presumably be suffering with the vocalizations of their screen reader, which is truly horrible with Pali.) Are search result voices fine as is, or did you want customization of search result voices, such as global Amy for English search results?

Ah! I can easily make search results take over current window. This gives you the simple option of CTRL-click to open up a new window. I’d assumed that everybody wants a new window, but will change that for v0.8.0.

I now have a vague idea of how this might be possible. One way to handle this would be to have a sutta player that shows adjacent quotes (just like a search result). When a quote is spoken, the player would automatically bring up and read aloud the next segment. You would see quotes in succession as they are spoken. Such a player would have Back, Stop/Play, Next buttons. It would also recite in English or Pali continuously depending on your selection of Play buttons (one for English and one for Pali as in search results). Is this vague vision of interest?

I too would like the linking of voice.suttacentral.net to http://50.18.90.151. Only @blake and Bhante @sujato can make this possible. I cannot do this. It requires domain owner permission.

Yes. It’s not quite as easy as http://sutta.central.net/dn33, which can be typed without needing the website. Once we get the voice.suttacentral.net link I can start experimenting with a similar syntax

5 Likes

Karl, you are working at such a fast pace that I can’t keep up to follow. I highly appreciate your work, but unfortunately am not able to test and give feedback. I simply can’t find the time. :stopwatch:

But what you do is of high value to me! :heart:

5 Likes

:heart_eyes: How marvellous! Thanks! It’s like Christmas and puppies and I didn’t even have to wait for a new release. This is brilliant.

For me it’s not so much that they are too long and cumbersome, it’s that they’re not workout-able. I guess it might make more sense if I mention the actual example case that first made me think about it. I was talking to someone about updating the reading guide thread that lists the suttas given in Bhikkhu Bodhi’s In the Buddha’s Words with sutta links (as the D&D sutta links no longer work); it only takes a minute to regex them in. At that point I thought it would be kind of neat to link to audio versions, and that is how I came to look at SCV’s URLs.

I only mention this by way of explanation, and already presumed it was likely not waters to step into now, that is to say,

Amen! :grin:

Most certainly, but I’d emphasise the ‘vague’ and I’d not let it knock you off course with core development. It’d be nice to have but I don’t think essential. Again to give you the ‘backstory’: the readings are pretty incredible, but as we know some pronunciations are a bit off. I was listening to a sutta and wasn’t sure about a word, so I just wanted to scroll down and see how far the reading had got in order to check the mystery word.

I can’t second this heartily enough. I know your original impetus was to assist people with visual impairments. My sight is actually pretty good at the moment, but I am dyslexic and can really find reading a bit of a mission. It is such a gift to me to have the option of just wiping out a layer of strain.

3 Likes

Makes total sense. I like Bhante Sujato’s suggestion of an audio link on the sutta card. A sutta card is the gateway to all resources for a sutta. This means that simply using the SC link as you have been doing would provide access to all the audio versions. Such an implementation would probably fall to @blake and/or the UI design team for SC itself under Bhante Sujato’s gentle guidance.

Perfectly reasonable. I’ll add it to the list for v0.8.0. I would like that as well for the same reason.

Ahhhhhhhh. Thanks for the clarification. I just came back from walking meditation and was wondering if you were thinking about hearing and seeing the suttas in a way analogous to the way kasinas engage the eyes. With voice technology, you could actually sit meditation and simply regard the suttas spoken in both English and Pali with corresponding text on the screen in front of you. Imagine doing just that for two hours with DN33 (!). I can’t read that well right now, but I think that would be quite the immersive experience. Just walking to DN33, I’ve started noticing odd things, like a particular street marking the point where I hear the start of the “Three’s” section. One will eventually be able to walk the streets in one’s head to hear the suttas. :hushed:

4 Likes

Just to say another sadhu! along with everyone else, and to confirm that, as soon as you are ready to go, we will be happy to shift to voice.suttacentral.net.

As far as integration with the main site goes, we can launch a basic integration, simply mentioning Voice on the Home page, quite quickly, and deeper integration when we can.

It would be really nice to find some ways to promote Voice through places that would reach people with visual disabilities. I wonder how that could be done?

6 Likes

Bhante, thank you for your kind offer. I think we are ready to link voice.suttacentral.net to static IP 50.18.90.151. This will provide a single point of reference for those who wish to memorize. It will also provide a single point of reference for SuttaCentral content links. With this single DNS binding, we can easily move to different cloud vendors, etc. in the future.

I did some research on this and found a mixture of things:

  • organizations such as American Foundation for the Blind (AFB) will focus their site on resources to help all blind people rather than any particular religious organization.
  • We could post an answer on Quora but it’s a bit random.
  • The Jewish solution for this problem is well-developed and suggests that resources for visual assistance should be offered within the context of the hosting religious site, such as SuttaCentral or AccessToInsight. I would recommend this contextual approach as most useful since it specifically targets blind people interested in Buddhism.

Before approaching AccessToInsight, I believe we should follow your recommendation of a link on the Home Page, perhaps under a section for Communities and/or Resources. The design and layout of any Home page is quite subtle and balances many considerations–which puts it “well above my pay-grade”. Once we are happy with a SuttaCentral solution, one might approach AccessToInsight to see if they would be interested in working together on resources for the visually impaired. The approach taken by SuttaCentral Voice rests squarely on the text segmentation you have pioneered and the discussion with AccessToInsight would potentially become intricate for that very reason. SuttaCentral Voice does provide links to other audio resources such as Pali Audio and frankk’s recordings (with more to come from Viveka and others). However, I think the folks at AccessToInsight might be interested in spoken versions of their own translations shown and played from within SCV itself. Hence the requisite intricacies of what to do about text segmentation and/or copyright.

5 Likes

SuttaCentral Voice Assistant v0.8.0 release at UTC 14:05 (~5 minutes) COMPLETED

Features

  • Automatic sectioning: Suttas will be broken up into sections automatically. Section breaks are determined by text segment leading numbers as well as maximum number of segments in a section. Sectioning will help assisted users navigate long suttas such as DN33.
  • Sutta Player: audio player that recites sutta in both Pali and English, segment by segment with Previous, Play/Stop, and Next controls.
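
To illustrate the sectioning rule in the first bullet, here is a minimal sketch. It assumes segments arrive as plain strings and that a "leading number" means digits at the start of the segment text; the function name and the default limit of 100 segments are hypothetical, not SC-Voice's actual values.

```python
import re
from typing import List

# Assumption: a "leading number" is one or more digits at the start of a segment.
LEADING_NUMBER = re.compile(r"^\s*\d+")


def section_segments(segments: List[str], max_segments: int = 100) -> List[List[str]]:
    """Group segments into sections.

    A new section starts when a segment begins with a number, or when the
    current section already holds max_segments segments.
    """
    sections: List[List[str]] = []
    for text in segments:
        starts_numbered = LEADING_NUMBER.match(text) is not None
        if not sections or starts_numbered or len(sections[-1]) >= max_segments:
            sections.append([])
        sections[-1].append(text)
    return sections
```

The size cap matters for suttas with long unnumbered stretches: without it, a single section could grow as large as the whole sutta and defeat the purpose of navigation.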

Bugs

  • Searching for kamma fails: SC-Voice reported “Error: Maximum text length has been exceeded” when searching for kamma. Fixed by eliminating redundant SSML. This fix only reduces SSML by about 1/4, so a longer Pali segment could still blow up. Hopefully there won’t be one.

This release has changed the way I study the suttas. Now I regularly listen to the Pali text together with the English text. In this way I am slowly learning Pali–it simply seeps in as I listen. This is how I learned German in college. I listened to alternative native/translated language segments. Hopefully this new feature may be of some use to others as well.

The auto-sectioning feature has also proven important, since I can now hop around a sutta like DN33 with my eyes closed.

Up next I will be looking at playing collections of suttas as @anon87721581 has kindly suggested. This will simplify the listening of collections such as Saṃyutta Nikāya, which are sub-divided into chapters. The proposed implementation will be twofold:

  1. search will now permit a chapter designation such as SN22, and will display in numerical order all suttas in that chapter.
  2. there will be a Play List button in search results that will allow you to play your search results.

As always, please report bugs, inconsistencies, and mispronunciations (especially Pali!). Also, now is a good time to request features as we converge on version 1.0.0, which will be the stable version of SuttaCentral Voice Assistant. With the release of version 1.0.0, SuttaCentral Voice Assistant will switch to maintenance mode for bugs, mispronunciations, and content updates.

9 Likes

I tend to listen to these things offline (download MP3 files or whatever and put them on a music player; I still have an iPod). I’d say the optimal download unit for the SN for MP3s would be the sub-chapter. A full chapter/vagga with perhaps hundreds of suttas would be too much for a single MP3. It’s nice to download individual SN suttas (perhaps often 2 minutes of listening each). It would be nice though, at some stage, if there was an option just to download an MP3 for a single subchapter (usually around 10 suttas for perhaps half an hour of listening).

2 Likes

I have been listening offline to DN33 which is two hours and 41.15MB. For me that is one full day of meditation broken up into morning and afternoon. That might (?) be as large as one might want. DN33 is 5884 lines. In comparison, SN22 is 40531 lines, almost seven times larger. To subdivide that rather large chapter into subchapters, I would propose that we allow sutta ranges such as SN22.120-139 or SN22.135-155. There might also be a need for a list of suttas such as DN33, MN44, SN22.1-10.

I’m glad you requested offline listening specifically. That is actually a separate feature. Currently the Sutta Player composes audio for each segment on demand as you listen. The brief pause to fetch the audio seems to be tolerable and may even be somewhat helpful for listening comprehension. However, it does require a persistent internet connection, which rules out offline listening. For offline listening, an entire MP3 must be generated for downloading. Generating MP3s takes time and may take minutes to complete. I’ll need to put an upper limit on such computation. For example, imagine requesting SN22, then waiting 5 minutes for a 300MB MP3, then throwing it away because it was too large. Yukky for all.
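
The 300MB figure can be checked with back-of-the-envelope arithmetic from the DN33 numbers quoted in this thread (5884 lines, 41.15MB, two hours). The sketch below assumes MP3 size and duration scale linearly with line count, which is only an approximation; the 50MB cap and the function names are hypothetical.

```python
from typing import Tuple

# Reference figures for DN33 as quoted in this thread.
DN33_LINES = 5884
DN33_MB = 41.15
DN33_HOURS = 2.0


def estimate_mp3(lines: int) -> Tuple[float, float]:
    """Estimate (megabytes, hours), assuming both scale linearly with line count."""
    scale = lines / DN33_LINES
    return DN33_MB * scale, DN33_HOURS * scale


def within_limit(lines: int, max_mb: float = 50.0) -> bool:
    """Check a request against a size cap. The 50 MB default is hypothetical."""
    size_mb, _ = estimate_mp3(lines)
    return size_mb <= max_mb
```

For SN22 at 40531 lines, `estimate_mp3(40531)` comes out a little under 300MB and well over half a day of audio, so a check like `within_limit` could reject the request before any audio is generated.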

I’ll tackle playlists for SuttaPlayer first for v0.9.0 and tackle playlist download for v1.0.0. The audio cache for SuttaCentral Voice is reaching 1GB and we’ve chewed up 50% of our disk allocation so far. I’ll need to make the cache more efficient by cleaning out large MP3s that arise out of custom playlist downloads. This is a good change to make for robustness, but it’s also a bit of work so I’ve deferred it until now. We’ll deal with that for v1.0.0.

Here’s the release plan

4 Likes

Thanks for all the work on this. The idea of specifying sutta ranges sounds like a good solution for the SN. Possibly one way around having the app not chew up too much cache space when generating the MP3s would be to only allow MP3 generation and download for limited ranges (perhaps a limit of 20 suttas in a range in the SN). The majority of subchapters wouldn’t exceed that length (there are a few places in the SN where they do but 20 suttas is still an adequate download size and much more convenient than one-by-one download).

2 Likes

Support for playlists is surprisingly involved. I was hoping to release something this week but that is not possible and will need at least another week. I’ve realized that folks will want all the following:

  • SN29 find all suttas in SN29.
  • SN29.1-15 find all suttas from SN29.1 to SN29.15 (noting that SN29.15 is actually in SN29.11-20)
  • MN1-3 find suttas MN1, MN2, MN3
  • mn3, mn2, mn1 find suttas to play in listed order
  • mn2,mn2,mn1 play MN2 twice followed by MN1 to gain deeper insight into MN2 while reviewing MN1
  • mn1/en/sujato, mn1/en/bodhi play suttas by two translators for comparison
  • etc.

Rather than handle each of these incrementally, I’ve found it easier to think about them as a single design with multiple use cases. If I’ve missed anything in the above considerations, let us know…
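
One way to read the forms listed above as a single grammar is sketched below. It is only an illustration under stated assumptions: the `SuttaRef` type, the regular expression, and `parse_playlist` are hypothetical names, a bare chapter such as SN29 is returned unexpanded because listing its suttas needs a catalog, and ranges are expanded numerically without knowing that SN29.15 actually lives inside SN29.11-20.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SuttaRef:
    uid: str                          # e.g. "mn1", "sn29.3", or "sn29" for a chapter
    lang: Optional[str] = None        # e.g. "en"
    translator: Optional[str] = None  # e.g. "sujato"


# collection + number, optional ".sub", optional "-end", optional "/lang/translator"
ITEM = re.compile(
    r"^(?P<coll>[a-z]+)(?P<num>\d+)(?:\.(?P<sub>\d+))?(?:-(?P<end>\d+))?"
    r"(?:/(?P<lang>[a-z]+)(?:/(?P<translator>[a-z]+))?)?$"
)


def parse_playlist(spec: str) -> List[SuttaRef]:
    """Expand a comma-separated playlist into references, in listed order."""
    refs: List[SuttaRef] = []
    for raw in spec.split(","):
        m = ITEM.match(raw.strip().lower())
        if m is None:
            raise ValueError(f"unrecognized playlist item: {raw!r}")
        if m["sub"] is not None:
            # "sn29.1-15": the range runs over the number after the dot.
            first, prefix = int(m["sub"]), f"{m['coll']}{m['num']}."
        else:
            # "mn1-3" or a bare "sn29": the range runs over the main number.
            first, prefix = int(m["num"]), m["coll"]
        last = int(m["end"]) if m["end"] is not None else first
        if last < first:
            raise ValueError(f"descending range: {raw!r}")
        for n in range(first, last + 1):
            refs.append(SuttaRef(f"{prefix}{n}", m["lang"], m["translator"]))
    return refs
```

Duplicates and ordering are preserved on purpose, so "mn2,mn2,mn1" plays MN2 twice before MN1, and "mn1/en/sujato, mn1/en/bodhi" yields two references to the same sutta with different translators.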

5 Likes

Karl, this all sounds great. I have a couple of questions/requests.

  1. One thing that I don’t expect you to do, but which it would be nice to see in the design, is the capacity to expand to other languages. In SC, we do this simply by keying off the ISO codes. So I’m wondering if this is something that the current design will allow.
  2. What about straight Pali texts? We have some nice recital of Pali (and possibly other languages). I’m wondering if we can support this?

I apologize if these things have been discussed, I can’t remember everything!

Once we are ready for 1.0, I’d like to invite you to meet with our development team, and present the app to them. They’d love to learn about your work, and this will help us to further integrate voice.suttacentral with the main app.

Currently I am proposing that on the main site we place an icon for audio. Click that, and it takes you to the relevant sutta on voice.suttacentral. That should be a nice simple way to promote the voice site.

I had considered integrating more advanced audio functions on the main site, i.e. the ability to play a sutta straight from the main site. But the more I thought about it the more complex it became, until essentially we’d have to embed voice.suttacentral into the main site. Which would be bad! So anyway, let me know if you have any thoughts about this.

6 Likes

Yes. This is already in the design. ISO codes are used throughout SC-Voice. For example, Aditi uses “hi-IN” for text-to-speech.

The current SC-Voice has an option for “Show only Pali text”. The Sutta Player will speak only Pali with that option. The default is to “Show both Pali text and translated text”. With the default option, each text segment is spoken first in Pali, then in the language of translation (e.g., “en”). If the source Pali texts are also segmented as returned from SuttaCentral API, then the SuttaPlayer will play them as well, since the SuttaPlayer is just looking for the ISO “pli” branch of the text segment. In this manner we can all recite together in Pali. My own inclination would be towards bilingual text segments (e.g., pli+en) that teach meaning of and speaking of Pali. My hope is that we all come to chant Pali. Pali is an excellent language for chanting–the sounds of the language are naturally and consistently spoken with a steady stream of exhalation that promotes mindfulness and immersion. English somehow lacks the sonority of Pali and seems more suited to expressing ideas in writing rather than chanting.
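
The playback order described above can be sketched as follows. The segment shape (a dictionary keyed by ISO code, with "pli" for the Pali branch) and the function name are assumptions made for illustration, not the actual SuttaCentral API or SuttaPlayer code.

```python
from typing import Dict, Iterator, List, Tuple


def play_order(
    segments: List[Dict[str, str]],
    translation: str = "en",
    pali_only: bool = False,
) -> Iterator[Tuple[str, str]]:
    """Yield (language, text) pairs in the order they would be spoken.

    Each segment is spoken first in Pali (if it has a "pli" branch), then in
    the translation language, unless only Pali is requested.
    """
    for segment in segments:
        if "pli" in segment:
            yield ("pli", segment["pli"])
        if not pali_only and translation in segment:
            yield (translation, segment[translation])
```

Because the language is just a key, another translation language could be played by passing a different ISO code as `translation`, provided the segments carry that branch.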

I really look forward to meeting the team. I prefer working in teams. This will be quite interesting and will require sharing desktop screen. I think Hangout does that. It’s good to wait until 1.0. Currently the internal design has not stabilised (i.e., witness all the hair-pulling I’ve gone through with playlists). But in a few weeks I think we’ll have a stable 1.0. I have started realizing that the SuttaPlayer is actually what delivers the core value of SC-Voice, since it speaks and shows text segment by text segment. Thank you to both you and @Aminah for nudging SC-Voice into this important direction.

:heart:

Yes. Embedding isn’t really workable because of all the cognitive dissonance with the clashing UI approaches. SuttaCentral is designed for rich visual interaction. SuttaCentral Voice is designed for the blind. Combining the two would be a big mess. However, your sutta audio link suggestion is minimally invasive and elegant. Indeed, one might call it … “iconic”. :rofl:

One topic that may be of interest is how SC-Voice search may help SC search. SC-Voice search understands text segments, and can provide smaller result sets quickly for certain queries. We might discuss an SC search option that allows SC to offer similar search results. This would be a simple invocation of the SC-Voice search API without any UI changes to SC.

6 Likes

Is this the birth of an AI Goddess? Perhaps one day she can translate Pali texts on her own?

1 Like

Thanks for the answers, I have not been following development closely enough, obviously!

Yes, usually we use hangouts. Our only problem will be timing, as we are in the States, Europe, and Australia.

Cool, excellent, I had had the same thought.

2 Likes

AI can indeed be used to help ensure a consistent translation. However since AI lacks any personal experience in suffering or its remedy, I fear that we will be stuck with humans for the foreseeable future. Humans are brilliant in their ability to generate new ways to suffer. This is why we need humans to translate the Dhamma.

4 Likes

:smile: :smile: :smile:

4 Likes