Listen to suttas in order on SC-Voice

sabbamitta · February 23, 2024, 7:31am

Coming from this issue, we may discuss possible options here. @karl_lew

I am describing how to do this with the new SC-Voice app; the old one will probably die soon, but at the moment it is also still working.

Currently it is possible to listen to suttas in order by doing the following:

Open SC-Voice.net
If you are doing this for the first time, you will see some small green tutorial cards that point out some essential functions; you may follow these to get familiar with the main functionality, or dismiss them and read the more detailed user instructions in the wiki of the app later on
Open a search card by clicking the icon in the top right corner
Type for example sn1.1, press “enter”
Click on the result SN 1.1 → this opens a sutta card
Play the sutta by using the left play button (plays current segment and pauses) or the right play button (plays from current segment to end of sutta)
When you’ve finished the sutta, click on the sn1.2 link at the bottom on the right side → this opens a new sutta card
You may close cards that you have finished listening to in order to keep your screen tidy
There are links to the previous and next suttas both at the top and the bottom of each sutta card

Another option is to get a list of suttas to play on a search card:

Open a search card by clicking the icon in the top right corner
Type for example sn1.1-5, press “enter”; the default max number of results is 5, if you want more you can change it in > general > search results
Click on the first sutta on your search card, then the second, etc.
It is also possible to have playlists that don’t follow the order of the canon, for example by typing sn1.1,an3.5,mn2 on a search card
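
As a rough illustration of the search patterns above (a range such as sn1.1-5, or a comma-separated list such as sn1.1,an3.5,mn2), here is a minimal sketch of how such a string could be expanded into a playlist. The function name `expand_playlist` and its parsing rules are assumptions made for illustration, not SC-Voice's actual implementation; in particular, SC-Voice may treat ranges differently.

```python
import re


def expand_playlist(spec: str, max_results: int = 5) -> list[str]:
    """Expand a search string into an ordered list of sutta IDs.

    Hypothetical helper: handles comma-separated IDs and a trailing
    'first-last' range on the final number (e.g. 'sn1.1-5').
    """
    suttas: list[str] = []
    for part in spec.split(","):
        part = part.strip().lower()
        if not part:
            continue
        # 'sn1.1-5' -> prefix 'sn1.', first 1, last 5
        m = re.fullmatch(r"(.*?)(\d+)-(\d+)", part)
        if m:
            prefix, first, last = m.group(1), int(m.group(2)), int(m.group(3))
            suttas.extend(f"{prefix}{n}" for n in range(first, last + 1))
        else:
            suttas.append(part)
    # Mirror the "max number of results" setting mentioned above
    return suttas[:max_results]


print(expand_playlist("sn1.1-5"))
print(expand_playlist("sn1.1,an3.5,mn2"))
```

The `max_results` default of 5 simply mirrors the default mentioned above.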

Another option is to get a topic-oriented playlist:

Type for example root of suffering on a search card → depending on your max result setting, this will show you up to 7 suttas that talk about the root of suffering

Unfortunately there is no option at the moment to listen to multiple suttas by just one single click.

The old Voice still has the option to download a playlist (of a similar sort to those described above for the new Voice), which the new Voice is not (yet) able to do; this may come closer to what you’d like to do?

Please let us know if these options are helpful for you; or else why it is so important for you to listen to multiple suttas by just one single click.

karl_lew · February 23, 2024, 3:29pm

Thank you, Ayya.

When hearing the suttas during walking meditation, I would often realize that several suttas were related in their use of a common key phrase. Quite often, these suttas are short, individually not long enough to last an entire walk. So I added the feature to allow me to absorb all the suttas related to one topic on one walk. This steady listening was quite immersive since I could simply walk and listen without having to interrupt my walk to click a play button. Listening in this way I would sometimes be so immersed that I would end the walk having counted a certain number of breaths and not know quite where I was.

Now that I have a better understanding of the suttas, I no longer need such long listening sessions, but I do have a fond memory of those long walks. As a result, the newer Voice doesn’t have that feature yet because nobody has asked for it. In fact, nowadays, I just listen to DN33 or DN44 if I need a longer listen, and that’s one click.

milkii · February 23, 2024, 7:12pm

I’ve been requested to add my voice, so here I am!

To start from the beginning: I made the suggestion because I’d like to listen to the suttas in the manner of processing an ordered set, as it were. I guess there’s the set of the Nipāta/Āgama, Nikāya, and Piṭaka sets.

I listen to lots of audiobooks whilst I’m doing household chores or tasks that I can get into the flow of without requiring a large share of cognitive energy. (And at times Radio 4 shows, but there is more flow-interrupting energy and time required to find and queue up different shows when they usually end after half an hour.)

A large part of the reason for this is that I have ADHD, so not eating up all the “cycles” of my mind makes it hard to focus on that general task, which is a) unproductive, b) upsetting, both in affect and mind. Another constitutional trait of ADHD is memory loss, so coming back to the texts over and over is a form of learning aid. Further, there is the trait of hyperfocusing, which can lead someone to forget to stop a task, and having a defined start and end boundary is an aid. Plus, on the memory side again, it’s easier to remember where one finished so one can restart in the correct place.

There is, indeed, the issue that, to misuse my metaphors, “discomfort is the salty truth”, and, as ChatGPT helped me remember/summarise/think about less, there’s potential for the arising of intellectual attachment, frustration and confusion, misinterpretation and doubt, and attachment to views.

My counterbalancing point/rationale/user story is the nature/dhamma of “tranquility through pariyatti”, as well as the bootstrap/ladder-pulling-up thrust of the Rathavinīta Sutta. It’s the more wholesome “evil”!

Anyhow. That’s just me

In general terms, and I’m trying to be more laconic than intelligence-insultingly glib (though the Buddha was a bit of a cheeky jester at times, no?): a) that’s how the baskets come to us*, b) a linear order is the average way people consume media, and c) as above, accessibility and functionality.

*(notwithstanding any rearrangement of order by sects/etc. (from the argued prior āgama orientated orientation that… y’know…))

The overarching UX point/principle is: this is all to be made for the average person. But, plus, it’s also to be made for the divergent-from-average-not-sanity person. (Accessibility is for both the non-average and the average person, IMHO, because, IDK, you can never tell/predict the forms of human capacity that will encounter the.) The example of the cookie config system on the Voice site really sticks out in my memory as a “that’s a most sensible way/framing for everyone” when I first encountered it.

(While I’m talking about such general things and the argued pan-user ontology approach to accessibility: QR codes aren’t just to give dopamine hits to thigh-rubbin’ technologists, but help people with movement and/or perception problems access the suttas to, say, have their phone open a page so it can do Text To Speech. You don’t even have to ask ChatGPT this; you can just google “do qr codes help disabled people” if you wish to dive into learning about this.)

Anyway. I’ve hyperfocused here and spent over three straight hours composing this message, so I’m going to stop writing/editing, hit Reply now, and stand up and do something else.

sabbamitta · February 23, 2024, 8:04pm

Thank you so much for your message @milkii !

I don’t think this would be an argument not to do something. We prefer to respond to real needs that someone has than to a theoretical use case. So thanks for your explanations!

This sounds like a very valid use case, and I used to do this a lot myself. Currently I am not doing it, simply because I don’t have a suitable device. The tablet that I have doesn’t give me enough output volume so that I could understand the words while doing something else.

The question is now: what would best help you? Would downloading a playlist and then playing it as an MP3 be a good solution? This is what the old Voice does, with some limitations:

it only creates a single audio file as MP3, so there is no way to navigate to a specific sutta in the list
there is a certain limitation as to the length of the playlist; downloading the entire DN, for example, would be way too much

It would certainly be possible to create a similar feature again for the new Voice (though not for tomorrow).

Would that be helpful?

Would it be helpful to have a “repeat” button in order to repeat a sutta again and again when it’s finished, until you stop it?

Just thoughts—not sure how this would look on the technical side. @karl_lew ?

karl_lew · February 23, 2024, 11:44pm

@milkii thank you for taking this time to help us all. What’s most important here is that your usecase is real–you live it. Although common “best practice” is to use clearly defined “persona”, I’d personally like to avoid that artificial process here and simply chat about what works for each of us, as you have just done. Thank you.

I admit to being a bit more of a rebel myself and tend to drive everyone around me completely mad by “doing things out of order”. It helps my brain learn quickly by personal association. The Tipitaka is a bit too linear for me, a bit claustrophobic in its mandate of order. In contrast, having read your post about the ADHD mind, I am inclined to understand that ADHD chaos is the flood that threatens and order the lifeline that guides. From your post, I imagine ADHD as having to herd 100 mental cats every single waking moment to get one thing done.

Although our usecases seem superficially separate, some things stand out that apply to all of us three.

We would all like to listen to suttas around the house. I would suggest a bluetooth speaker. They are convenient and loud enough to cook by. For vacuuming, I use a Bluetooth noise cancelling headphone.
We would all like to listen to suttas in some sequence natural to us. Some would choose a bunch of suttas by search term. Others would prefer a “tipitaka tour” (yes I would enjoy that too when I can’t think of a search term). So we need a quick and easy way to set up a playlist for listening.
We would all like to know where we are and be able to stop at any moment to resume later. We need a bookmark in each of our playlists.
We would all like to know about paths yet to be explored. The Tipitaka has many paths. Which ones have we seen and which ones can we look forward to?

Does that sound about right? I want to be absolutely sure we all agree on the problem we’d like to solve together.

sabbamitta · February 24, 2024, 12:23pm

Or a herd of fleas?

Thank you for summarizing this so well! I fully agree with your problem description. Let’s see what @milkii says.

milkii · February 26, 2024, 6:50pm

Oh yea, I forgot, lol, I’m autistic as well.

The co-morbidity between ADHD and autism is fairly large. The combination is funny, cos one goes from wanting order to not being able to create or maintain order. A new term, “AuDHD”, has been emerging. I swerved into a stream of creators posting Instagram Reel videos a while back, though I get fewer now as the algorithm got the assumed point that I know enough about it.

(Another symptom of autism is audio processing problems. For many years I used headphones to listen to music and radio whilst doing chores, but kept garrotting myself. I’m 40 and only got diagnosed with both autism and ADHD, plus Postural Orthostatic Tachycardia Syndrome (PoTS, an autonomic blood pressure issue with various effects), around two years ago. My first disability benefit purchase was Active Noise Cancelling ear buds, Sony WF-1000XM4, which allowed me to listen to things without volume change and comprehension issues.)

Anyway. I’m trying to rush this so I can get back to hyperfocus on my Main Thing, and I’ve written a lot then copyedited it right back, then started again a couple of times, and I realise I need more input and to ask questions to best reflect. I’ve only used the old voice, and very briefly searched the repos.

Is there an ARCHITECTURE.md or one thread I can use to grok things? Sorry for being lazy and the labour.

You mention MP3, but I see Opus when searching sc-voice. Opus being the obvious winner, not that Apple makes it easy, with Safari being the only browser that doesn’t support it directly, though I guess the decoding is done at the level of JS or Wasm (what a turning of the wheel Wasm is).

Short files would enable playlists, so cool if the new one is doing that.

Was audio generated on the fly by Amazon Polly before? What was the compute cost? Was it cached or not? What would the storage cost be? How does it work now? What is hosted, what is FaaS (I guess you can call Amazon Polly that?) What is server cached? Has client-side TTS been considered? Has it been considered that users might well not mind giving a large whack of space to the PWA to have a lot of audio files cached locally for offline use? What is the calculation for each basket? What do browsers allow again in terms of local PWA storage? How much is localStorage, how much is IndexedDB? How much does Apple hating on PWAs affect anything?

Playlists; could be a JSON object I guess, or a literal XSPF in localStorage and processed by Wasm.
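
To make the "playlist as a JSON object" idea concrete, here is a minimal sketch of a playlist with a last-played bookmark that survives serialization. The field names (`name`, `suttas`, `bookmark`) and helper functions are hypothetical, chosen only for illustration; nothing here reflects an existing SC-Voice data format.

```python
import json


def make_playlist(name: str, suttas: list[str]) -> dict:
    """Create a playlist whose bookmark starts at the first sutta."""
    if not suttas:
        raise ValueError("a playlist needs at least one sutta")
    return {
        "name": name,
        "suttas": list(suttas),
        "bookmark": {"sutta": suttas[0], "segment": 0},
    }


def set_bookmark(playlist: dict, sutta: str, segment: int) -> None:
    """Remember where listening stopped."""
    if sutta not in playlist["suttas"]:
        raise ValueError(f"{sutta} is not in this playlist")
    playlist["bookmark"] = {"sutta": sutta, "segment": segment}


def remaining(playlist: dict) -> list[str]:
    """Suttas still to be played, starting at the bookmarked one."""
    start = playlist["suttas"].index(playlist["bookmark"]["sutta"])
    return playlist["suttas"][start:]


playlist = make_playlist("tour", ["sn1.1", "sn1.2", "sn1.3"])
set_bookmark(playlist, "sn1.2", 4)
saved = json.dumps(playlist)    # a string that could be kept in browser storage
restored = json.loads(saved)
print(remaining(restored))
```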

The user could state whether they want to cache lots or not, and the maximum amount of space to be made available before the first-in/least-used files get removed to make more space. The player could skip over playlist entries it doesn’t have a local audio file for, or it could download them only “just in time”: start downloading on play, or 30 seconds before finishing the previous file, or one file in advance (it could also start playing files it hasn’t finished downloading). The
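
The space-limited cache suggested here can be sketched as a least-recently-used (LRU) store with a byte budget. This is only an illustration of the idea under stated assumptions: the class name `AudioCache` and its methods are hypothetical, and it holds audio in memory rather than in real browser storage.

```python
from collections import OrderedDict
from typing import Optional


class AudioCache:
    """Keeps audio files up to max_bytes, evicting the least recently used."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._files: "OrderedDict[str, bytes]" = OrderedDict()

    def __contains__(self, key: str) -> bool:
        return key in self._files

    def get(self, key: str) -> Optional[bytes]:
        data = self._files.get(key)
        if data is not None:
            self._files.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key: str, data: bytes) -> None:
        if key in self._files:
            self.used -= len(self._files.pop(key))
        self._files[key] = data
        self.used += len(data)
        # Evict oldest entries until we are back under the user's budget.
        # Note: a single file larger than the budget is evicted immediately.
        while self.used > self.max_bytes and self._files:
            _, evicted = self._files.popitem(last=False)
            self.used -= len(evicted)


cache = AudioCache(max_bytes=10)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")            # "a" is now more recently used than "b"
cache.put("c", b"1234")   # exceeds the budget, so "b" is evicted
print("b" in cache, cache.used)
```

A player could check `key in cache` to decide whether to skip an entry or fetch it just in time.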

What is the, if any, integration now or planned between voice and text? And if there are lots of connections between suttas/text/files, what is the view on JSON-LD?

A repeat button would be cool.

“Tour” order playlists sounds very cool. To thread together certain topics.

Aside from a last-played bookmark/placemarker, there’s also the path of multiple bookmarks, but IDK.

I can’t code, and I’ve never used certain development patterns, like behaviour driven development, though I’ve done some UX/IA thought/creating, and some basic use of personas, and the black-box side of QA (usually I’m good at feature ideas and breaking things). But yeah, I don’t like personas, though I do like user stories as the human side of the minimum-viable workflow of a use case, or sommit like that!

And I prefer the newer practice of saying “good practice” over “best practice”, given there is often more than one “best” way.

sabbamitta · February 26, 2024, 8:19pm

Thank you @milkii again for your detailed reply.

I am not sure if I can answer all your questions, as I am not very familiar with all the technical things that you mention.

But as it happens, @karl_lew and I have just been discussing the requested feature, and we thought it could be useful to have a play button that, when clicked once, plays until the end of the current folder, or until you stop it.

This would mean, when you are in the AN Ones, it would play until the end of AN1; when you are in AN4, it goes until the end of AN4. And when you are in MN it goes until the end of MN—there are no sub-folders in this nikaya. And so on, with all collections.
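
A minimal sketch of the folder rule described above, assuming that a sutta ID with a dot (such as an4.10) belongs to the folder named by everything before the dot, and that an ID without a dot (such as mn44) belongs to the collection as a whole. The function `folder_of` is hypothetical and only illustrates the examples given here; the real folder layout may differ for other collections.

```python
import re


def folder_of(sutta_id: str) -> str:
    """Return the folder a sutta would play to the end of.

    Assumed rule: 'an4.10' -> 'an4' (numbered sub-folder),
                  'mn44'   -> 'mn'  (no sub-folders).
    """
    m = re.match(r"([a-z]+)(\d+)(\.)?", sutta_id.strip().lower())
    if not m:
        raise ValueError(f"unrecognized sutta id: {sutta_id!r}")
    letters, number, dot = m.groups()
    return f"{letters}{number}" if dot else letters


for sid in ["an1.5", "an4.10", "mn44"]:
    print(sid, "->", folder_of(sid))
```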

This may seem like a lot, of course, but when you have had enough you just stop it.

When it starts playing a new sutta it opens the respective sutta card and closes the finished one, so that you don’t end up with dozens of opened cards on your screen.

A second feature could be to add a play button to a search card so that you could start playing your search results list; again, it would open a new sutta card and close the previous one as you progress through the suttas in your list.

For all this, no download would be needed. The data of the suttas you’ve listened to will be stored in your browser cache, so you should be able to listen again later even with no internet access.

Would that sound like something that would help you? (Even without answering all your questions; maybe Karl can respond to those.)

Perhaps it would be good to try out the new Voice as well. The old one’s lifetime will be limited, and we don’t know how long it’s still going to work, so it would be good to get a bit familiar with the new one before the old one dies.

karl_lew · February 26, 2024, 9:21pm

Wow. That’s a tangle.

https://admin.sc-voice.net

Hi, @milkii ! Welcome back.

We use AWS Polly cached into S3 keyed by text content hashes. AWS charges are therefore minimal and also bounded given the invariance of content. The monthly bill is about USD20 with most of that going for the EC2 server. Linode is cheaper, but we are strongly dependent on AWS Polly because of the special work we did to customize phonemes for Pali. We don’t use the AWS Neural voices, since the standard ones are perfectly intelligible. Desktop TTS is currently impractical because of phoneme customization and would probably require AWS Polly to offer that.

On mobile we use IndexedDB with a user option to clear the sound cache. Since phone storage is rapidly increasing due to video demands and high-res cameras, I doubt that audio impact will be an issue for users. Although Opus is better, I just use MP3 because of its ubiquity. Our goal is intelligibility, not high-fidelity.

PWA restriction is due to changes made for EU rules. I have an iPhone in USA. EU has broad internet coverage, so the PWA offline usecase is presumably less of an issue for EU. In the USA, if I walk down the street, I might not have Internet and I live in Silicon Valley.

Integration is by segment. A segment of text is associated with audio for that segment by the hash of the text in the segment.
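
The idea of keying audio by a hash of the segment text can be sketched as follows. This is an illustration only: the hash algorithm (SHA-256), the key layout, the voice name, and the function name `audio_key` are all assumptions, not the scheme actually used with AWS Polly and S3.

```python
import hashlib


def audio_key(voice: str, text: str) -> str:
    """Derive a cache key from the voice and the exact segment text.

    Because the key depends only on content, unchanged text always maps
    to the same audio file, so it never needs to be synthesized twice.
    """
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return f"{voice}/{digest}.mp3"  # hypothetical key layout


k1 = audio_key("some-voice", "So I have heard.")
k2 = audio_key("some-voice", "So I have heard.")
k3 = audio_key("some-voice", "So I have heard!")
print(k1 == k2, k1 == k3)
```

The same text always yields the same key, while any edit to the text yields a new one, which is what keeps the cache bounded when the content itself rarely changes.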

The suttas are highly integrated with almost mathematical precision, bound together by rigorous repetition of key phrases throughout. They are also integrated by SuttaCentral segment number, which facilitates cross-language comparison. I suppose JSON-LD could be wrapped around segment URLs for handing references around the web, but I can’t quite see a need for JSON-LD within the application, since a segment identifier like an1.219:0.2/es/ebt-deepl is our atomic unit of reference that suffices for everything we do.
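
For readers unfamiliar with such identifiers, here is a minimal sketch that splits an1.219:0.2/es/ebt-deepl into its parts. Reading the parts as sutta, segment number, language, and translation source is an assumption based on the example itself, and the names `SegmentRef` and `parse_segment_ref` are hypothetical.

```python
from typing import NamedTuple


class SegmentRef(NamedTuple):
    sutta: str    # e.g. "an1.219"
    segment: str  # e.g. "0.2"
    lang: str     # e.g. "es" (assumed: language code)
    author: str   # e.g. "ebt-deepl" (assumed: translation source)


def parse_segment_ref(ref: str) -> SegmentRef:
    """Split 'sutta:segment/lang/author' into its components."""
    try:
        scid, lang, author = ref.split("/")
        sutta, segment = scid.split(":")
    except ValueError:
        raise ValueError(f"unexpected segment reference: {ref!r}") from None
    return SegmentRef(sutta, segment, lang, author)


print(parse_segment_ref("an1.219:0.2/es/ebt-deepl"))
```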

Ayya Sabbamitta and I were discussing extending the “Play To End” button functionality, which currently plays segments in the current sutta. So perhaps we could add, for example, a “Play to End of Nikaya folder”. Including options for a shuffle or repeat on the Nikaya folder “playlist” would also be possible. The nice thing about the Nikaya folder paradigm is that it is predefined and bounded (e.g., “play all 26 suttas in AN1” in order). The UI works with “sutta cards”, each of which is basically a playlist of the segments in a sutta, so the card is itself a placeholder (for the sutta) with a segment placeholder for whatever is playing. Since you can have multiple sutta cards open, by definition, you have multiple bookmarks…

I propose that we first explore the “Play to End of Nikaya folder” and work out the bugs and kinks in that feature. Later we can experiment with the “Play my search results” feature.

@milkii, yes do try the new sc-voice.net. It’s quite different from the old Voice. Ayya and I only use the new Voice because it is more useful. For example, it has tri-lingual text.

sabbamitta · February 26, 2024, 9:28pm

Haha in Germany, when I walk down the street, I do certainly not have internet, unless I use the mobile data of my phone (and have reception for the phone).

Thank you for answering all of @milkii 's questions so well!

karl_lew · February 26, 2024, 9:30pm

Oh well. So much for my theories.

But you have an Android phone and that can have PWA.

sabbamitta · February 26, 2024, 9:32pm

Usually while walking down the street I need my attention for where I want to go and what I need to do, and I don’t listen to suttas. I’m certainly not missing PWA.

I might perhaps use the internet to look up the next bus connection or so while walking down the street.

karl_lew · February 26, 2024, 9:35pm

I used to work in San Francisco, which took me 90m each way on the train, so the PWA is for cases like that or for people on retreat in the wilderness.

karl_lew · February 29, 2024, 6:12pm

I am now looking at PlayToEnd. Thinking on our conversation and “play to end of tipitaka folder”, I realized that that is actually not what I want. What I want is:

play for a certain time (e.g. 30m, 1hour, 90m, 2hours)
at the end of the sutta choice of: 1) repeat sutta, 2) stop playing, 3) continue to next sutta in tipitaka

What made me change my mind was that I count my breaths while listening to the sutta. This yields a fixed time for listening. In addition, when listening to DN33, it would take several days of walking to get through once, after which I wanted to repeat. However, if I am listening to AN, I would simply want the next sutta. The AN suttas are easily digested and are not as dense as DN33, which requires constant repetition.

Also, as we discussed, I would like to re-use the currently playing card for the next sutta if possible so that we don’t end up with card clutter.

@milkii, Ayya @sabbamitta how does that work for you?

sabbamitta · February 29, 2024, 6:23pm

This would work for me too, although I rather wouldn’t plan my listening according to a specific time. But in case the chore that I am doing is finished earlier than the programmed listening period, nothing would prevent me from just stopping the play.

I understand that this choice is made before starting the play?

Although I am not listening while walking, I agree that there are cases where repetition is preferable over continuing with another sutta.

Definitely!

karl_lew · February 29, 2024, 7:00pm

… or in settings…or both. (whatever is easiest to implement)

karl_lew · March 1, 2024, 7:19pm

@milkii, Ayya @sabbamitta here is a new setting for SC-Voice.net. The setting controls maximum play time for extended play.

We will also add the following options for “PlayToEnd”

stop at end of sutta
replay sutta continuously
continue to next sutta (will continue to end of tipitaka or maximum play time)

The default will be “stop at end of sutta”. The “replay sutta continuously” option allows for concentrating on a single sutta. Lastly, the “continue to next sutta” option permits systematic traversal of the suttas in tipitaka order.
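
The interplay between the three options and the maximum play time can be sketched as a small simulation. This is not the EBT-Vue3 code: the names `AtEnd` and `plan_playback` are hypothetical, durations are made-up inputs, and the sketch assumes that a sutta already playing is allowed to finish even if the maximum play time is reached partway through it.

```python
from enum import Enum


class AtEnd(Enum):
    STOP = "stop at end of sutta"
    REPEAT = "replay sutta continuously"
    NEXT = "continue to next sutta"


def plan_playback(suttas, start, mode, max_seconds):
    """Return the sutta IDs that would be played, in order.

    suttas: ordered list of (sutta_id, duration_seconds) in tipitaka order.
    A new sutta (or repeat) starts only while elapsed time < max_seconds.
    """
    if any(duration <= 0 for _, duration in suttas):
        raise ValueError("durations must be positive")
    played, elapsed, i = [], 0.0, start
    while i < len(suttas) and elapsed < max_seconds:
        sutta_id, duration = suttas[i]
        played.append(sutta_id)
        elapsed += duration
        if mode is AtEnd.STOP:
            break
        if mode is AtEnd.NEXT:
            i += 1  # REPEAT keeps i unchanged and plays the same sutta again
    return played


collection = [("an1.1", 60), ("an1.2", 60), ("an1.3", 60)]
print(plan_playback(collection, 0, AtEnd.NEXT, max_seconds=100))
print(plan_playback(collection, 0, AtEnd.REPEAT, max_seconds=150))
```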

[Image: Screenshot from 2024-03-01 11-07-11]

For custom playlists, we will investigate a future implementation that would basically play search results.

sabbamitta · March 1, 2024, 7:23pm

Looks great, thank you!

karl_lew · March 3, 2024, 1:56am

EBT-Vue3 v1.447.0 is available with Play collection.

Briefly, what this feature does is play a sequence of suttas, one after the other, until the maximum play time. @milkii, Ayya @sabbamitta, hopefully this feature will address, at least in part, our need to listen to a collection of suttas for a time. I will use this feature a lot when listening to short suttas such as AN1.1-10. Although useful, this is a big change for a new feature, so I would also expect to see some bugs. Please see how this feature works for you and share your experience.

Thanks!

sabbamitta · March 3, 2024, 6:46am

Wow, this is GREAT! Thank you so much for this!

I just briefly tested it, starting with Thig 1.1, and it seems to work absolutely as expected. At the end of the first sutta I hear the end bell sound, then I can take a short breath, and the playback continues with the next sutta.