SuttaCentral Voice Assistant

karl_lew · September 18, 2018, 2:58am

SuttaCentral Voice Assistant v0.2.4 is now available for evaluation. I’ve tested the site on a Chromebook with ChromeVox on as well as off, so it should work in both sighted and assisted modes. The dark theme is permanent and reserves color for greater contrast highlighting interaction. Features and controls are minimalist to prevent cognitive overload for assisted users.

Content is currently limited to MN, SN, KN, AN as found on GitHub. Only MN1 is expanded, but additional expansions will be added as requested. Please be alert for mispronunciations!

Mobile support is partial. On the iPad and iPhone, audio clips will only play for about 4 minutes, so suttas will be cut short. It’s an Apple limitation. Those of you with Android phones will probably be fine. We’ll need to investigate a solution for iPad/iPhone (e.g., MP3 player app or SC-Voice app or your suggestion).

Feedback is very much welcomed. This is an early prototype with much room for growth. If you know anybody with visual impairments, please ask for their feedback. Although I can use SC-Voice with my eyes closed, that’s just one data point. We need others.

Take SC-Voice for a walk if your phone permits. Use SC-Voice when doing housework or walking meditation. Let us know how you use it and how you would like to use it.

p.s., Viveka and Aminah, thanks for your feedback. The 1 second pauses really help out. I’ve slowed Amy down to -30% and have corrected Brahma pronunciation.

Viveka · September 18, 2018, 7:02am

Congratulations and thank you!!!

I look forward to trying it!

Aminah · September 18, 2018, 8:03am

WOW! This is incredible!
Mighty congratulations, indeed!

Sorry, but this is just a quick fire note and in addition to giving due salutes I wanted to note a tiny ‘iron-out’ point in terms of pronunciation.

In MN41 (and therefore everywhere else I guess) “bowed” is pronounced as per the bow of an archer rather than the bow of respect.

Again, thank you so much.

Gabriel_L · September 18, 2018, 12:54pm

Listening to MN118. This is awesome! Well done! Sadhu, sadhu, sadhu!
It pronounces Pali much better than many people I know!
I reckon the speed it reads could be a little faster. Maybe that could be made an option?

Aminah · September 18, 2018, 1:00pm

Yep, agree on both points.

As already noted, I agree on this too.

sabbamitta · September 18, 2018, 4:12pm

Me too!!!

Don’t have much time right now to listen, but just took a sample, and find it ABSOLUTELY VERY AWESOME!!!

Aminah · September 18, 2018, 5:00pm

Hullo again,

Just a couple more notes in case of interest:

1) I don’t know if maybe (that is, I haven’t checked suitability beyond the particular point given here) it would be an idea to skip <h> tags outside of the opening hgroup div. For a reader the following from DN11 is perfectly sensible, for a listener it is slightly disjointed:

For a third time, Kevaddha made the same request,
and the Buddha said this:

1. The Demonstration of Psychic Power

“Kevaddha, there are three kinds of demonstration, which I declare having realized them with my own insight.

2) Karl, you’re right the pauses make a massive, massive difference, but I think they should probably be avoided in at least most instances of hyphenated words (eg. “non-conflict” and “self-mortification” in MN139).

3) A tiny secondary detail: would be nice if the Enter key could be used to submit a search.
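
Point 1 above could be sketched roughly as follows. This is only an illustration, not SC-Voice's actual code: it assumes the opening heading group is marked up as `<div class="hgroup">`, and the names `SpokenTextExtractor` and `spoken_text` are made up for the example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SpokenTextExtractor(HTMLParser):
    """Collect text to read aloud, skipping headings outside the hgroup div."""

    def __init__(self):
        super().__init__()
        self.div_stack = []   # one flag per open <div>: True if it is the hgroup
        self.skip_depth = 0   # > 0 while inside a heading that should be skipped
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            self.div_stack.append("hgroup" in classes)
        elif tag in HEADING_TAGS and not any(self.div_stack):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()
        elif tag in HEADING_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def spoken_text(html):
    """Return the list of text chunks a listener would hear."""
    parser = SpokenTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Markup is assumed for illustration; the title is a placeholder.
sample = """
<div class="hgroup"><h1>Placeholder title</h1></div>
<p>and the Buddha said this:</p>
<h3>1. The Demonstration of Psychic Power</h3>
<p>Kevaddha, there are three kinds of demonstration</p>
"""
print(spoken_text(sample))
```

The title inside the hgroup is kept, while the numbered section heading in the body is dropped, so the narration runs straight from "and the Buddha said this:" into the speech.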

sabbamitta · September 18, 2018, 5:58pm

I very much like the speed the way it is. Making it optional would certainly be fine.

karl_lew · September 19, 2018, 1:14am

Thank you all for the very detailed and helpful feedback. I will post a new release once I have applied your recommendations.

For the speed difference, I think I will introduce Raveena, who can move along at a fast clip. We will need to be alert for Raveena’s mispronunciations, which will be different than Amy’s. Raveena is more accurate at Pali than Amy and is better suited for fast cadences. This will give us Amy reciting as currently at a meditative pace while Raveena will speak at a faster pace suitable for review.
I will make MN41 expandable.
I’m working on the dashes.
The ENTER for search is a great suggestion.
DN11 has numbered text headings. How interesting! This is new. I can make these collapsible sections as for MN1.
bowed and bowed. Wow! Good catch.

Gabriel_L · September 19, 2018, 4:22am

I suggest you make that an option: some will prefer Amy, others will prefer Raveena.

I have just listened to MN19 and had to increase the speed to 1.2x the original to get to a balanced speed for my ears and mind (most desktop MP3 players allow that).

If my maths is right, that means that a -16% slowdown factor would be my choice.
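
For anyone who wants to check that arithmetic, here is a quick sketch. It assumes "-30%" means the voice runs at 70% of its default rate and that the player's 1.2x simply multiplies that rate.

```python
# Assumption: "-30%" means the voice runs at 70% of its default rate,
# and a player speed of 1.2x multiplies that rate.
amy_rate = 1.0 - 0.30      # 0.70 of the default speed
player_speedup = 1.2       # speed chosen in the MP3 player

effective_rate = amy_rate * player_speedup
slowdown_percent = (effective_rate - 1.0) * 100

print(f"effective rate: {effective_rate:.2f}")          # effective rate: 0.84
print(f"equivalent setting: {slowdown_percent:+.0f}%")  # equivalent setting: -16%
```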

Aminah · September 19, 2018, 9:50am

This is just going on my experience coding html files, but I’m guessing it will at least hold for the English segmented texts: when coding things up we try to be quite particular about hyphens, en and em dashes. Closed compounds should only ever use hyphens and should be easy to separate from dashes.
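
If the texts really are that consistent, separating the two cases could be as simple as the sketch below. The SSML-style `<break>` marker is an assumption about how pauses are expressed (SC-Voice may do it differently), and `add_dash_pauses` is a made-up name.

```python
import re

# Assumed pause marker in SSML style; SC-Voice may represent pauses differently.
PAUSE = '<break time="1s"/>'

# En dash (U+2013) or em dash (U+2014), together with any surrounding spaces.
DASH = re.compile(r"\s*[\u2013\u2014]\s*")


def add_dash_pauses(text):
    """Insert a pause at en/em dashes; plain hyphens in compounds are untouched."""
    return DASH.sub(f" {PAUSE} ", text)


print(add_dash_pauses("non-conflict"))
print(add_dash_pauses("she spoke\u2014then fell silent"))
```

One caveat: an en dash used for a number range would also get a pause with this rule, so ranges would need their own handling.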

Actually, I have to say, I’ve changed my mind a little about preferring a faster read, I think I’d probably appreciate both faster and slower in different contexts. In turn, I even more heartily agree about the option point, which if I understand Karl’s message correctly, is his/your intention anyway.

Much continued general applause!

karl_lew · September 19, 2018, 3:55pm

Upgrading to SC-Voice v0.2.5 at 9AM PST…

Aminah · September 19, 2018, 4:00pm

That’s NOW!

I shall have to celebrate with a sutta.

karl_lew · September 19, 2018, 4:17pm

Update completed
http://50.18.90.151/scv/

Changes in v0.2.5:

The most significant change is the addition of Raveena. Raveena barrels along like a Tesla on insane mode. She can also go faster if we need super fast skimming. But let’s evaluate Raveena as is. Raveena is fairly intense–she grabs the listener by the throat and propels them mercilessly through a sutta. To hear this, try AN2.1-10. Your hair will rise and fear will enter your heart. Choose Raveena in Voice Settings. She has a different vocabulary than Amy, so be on the alert for errors.
MN41 is now expanded. Please check it for errors. Also, if you have expansion requests, please do suggest them. I did take a look at MN118, but I have no idea how to expand that as a human being.
Expanded content is highlighted on the right with a gray border. You can see this in MN1 and MN41. This highlight is critical since it designates auto-generated content. It is therefore not a direct translation and will probably have grammatical errors (usually mismatch between verb/subject number). I.e., these errors are due to software, not Bhante Sujato.
The word hyphens are hopefully fixed.
bowed is now not bowed
The Search/Delete icons are gone unless folks really want them back as an option. Thanks to Aminah’s ENTER suggestion, they are just clutter now.
I did not have a chance to explore auto-sectioning with numbered text segments. It’s part of the larger issue of auto-sectioning, which relates in part to the iPhone problem. For example auto-sectioning to enable iPhone listening is probably high value, and that criss-crosses with the numeric sectioning. There will be options for this. Indeed if you have thoughts on auto-sectioning, please share.
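
To make the auto-sectioning idea concrete, here is one possible sketch, not a design decision. It assumes each segment's audio length in seconds is already known, and it groups consecutive segments so that each section stays within the roughly 4-minute limit seen on iPad/iPhone. The name `split_into_sections` is hypothetical.

```python
# The roughly 4-minute playback limit mentioned for iPad/iPhone, in seconds.
MAX_SECONDS = 4 * 60


def split_into_sections(durations, limit=MAX_SECONDS):
    """Group consecutive clip durations (seconds) so each section fits the limit."""
    sections, current, total = [], [], 0.0
    for seconds in durations:
        # Start a new section when adding this clip would exceed the limit.
        if current and total + seconds > limit:
            sections.append(current)
            current, total = [], 0.0
        current.append(seconds)
        total += seconds
    if current:
        sections.append(current)
    return sections


print(split_into_sections([100, 100, 100, 30, 200]))
# [[100, 100], [100, 30], [200]]
```

A single segment longer than the limit still ends up alone in its own section, so very long segments would need to be split further.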

Aminah · September 19, 2018, 4:23pm

Brilliant!

Haven’t tested out the actual audio yet, but have used the wonderful new ENTER function! (By the way, this reminds me: I forgot to mention how stoked I was to see that you’d already built in SPACE pause/play functionality. I was listening to a sutta yesterday and was like “wait, what was that?” Intuitively, from other applications, I just hit space, and it was only after it did what I was hoping that I thought, “brilli-beans, there’s no reason to expect that it would have!”)

Some quick feedback:

https://suttacentral.net/sn10.3

karl_lew · September 19, 2018, 4:26pm

Try again. I have implemented service throttling to avoid this, but I may need to re-tune the queue width. Right now it slams 20 concurrent voice requests at a time to AWS Polly. SC-Voice does catch the responses, so simply trying again should work. Just think of it as poking Amazon: “move! move! move!”.
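
For the curious, the throttling idea looks roughly like the sketch below. It is not the SC-Voice code: `synthesize` is a hypothetical stand-in for the real voice request, and the retry count and back-off are made-up values. Only the queue width of 20 comes from the description above.

```python
import asyncio

QUEUE_WIDTH = 20   # concurrent voice requests, as described above


async def synthesize(text):
    """Hypothetical stand-in for a voice request; swap in a real client call."""
    await asyncio.sleep(0.01)
    return f"audio:{text}"


async def throttled_synthesize(texts, width=QUEUE_WIDTH, retries=3):
    """Run requests with at most `width` in flight, retrying failed ones."""
    gate = asyncio.Semaphore(width)

    async def one(text):
        async with gate:
            for attempt in range(retries):
                try:
                    return await synthesize(text)
                except Exception:
                    if attempt == retries - 1:
                        raise
                    # Brief, growing pause before trying again (assumed values).
                    await asyncio.sleep(0.1 * (attempt + 1))

    # gather() returns results in the same order as the input texts.
    return await asyncio.gather(*(one(t) for t in texts))


results = asyncio.run(throttled_synthesize([f"segment {i}" for i in range(50)]))
print(len(results))  # 50
```

Re-tuning the queue width then comes down to changing one number.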

Whoa. Wait a minute. That’s a missing FILE! D’oh. Those files only have one leading zero. I’ll need to fix that in a future release. Thank you!

Aminah · September 19, 2018, 4:30pm

Yes, I wondered if it was something like that. So to get to AN2.1-10 what do I need to type as is?

karl_lew · September 19, 2018, 4:32pm

You can’t. You are doomed to wait.

Some SN have .001.po, others have .01.po. It’s fixable.

Oh wait. You asked about AN2.1-10. Yes that works. Your typo revealed a bug for SN2.10
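
One way the padding mismatch could be handled is to try each zero-padding width in turn. This is a sketch only: the file name pattern and the sample listing are made up for illustration, and the real repository layout may differ.

```python
def candidate_names(prefix, number, widths=(3, 2, 1)):
    """Yield a file name for each zero-padding width: .003.po, .03.po, .3.po."""
    for width in widths:
        yield f"{prefix}.{number:0{width}d}.po"


def find_file(prefix, number, available):
    """Return the first candidate name present in `available`, or None."""
    for name in candidate_names(prefix, number):
        if name in available:
            return name
    return None


# Made-up listing for illustration; real file names may follow another pattern.
available = {"sn10.03.po", "sn1.001.po"}
print(find_file("sn10", 3, available))   # sn10.03.po
print(find_file("sn1", 1, available))    # sn1.001.po
print(find_file("sn2", 10, available))   # None
```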

Aminah · September 19, 2018, 4:36pm

LOL. Fair dos.

In the meantime:

This is just going back from a distant thought from whenever ago (ie. not something I’ve thought about in a mighty long while), but perhaps MN8?

Aminah · September 19, 2018, 4:38pm

Yes it was*, that’s because that was a bug I found already on the first version. Then I read your release note and wanted to hear the new voice, but also found I couldn’t get to an2.anything.

* well almost anyway, it was actually for sn10.3