Are SuttaCentral translation sources on github current?

karl_lew · July 29, 2018, 12:52am

I was delighted to find SuttaCentral translations on GitHub. This was incredibly exciting because I saw the opportunity to learn Pali in the context of studying the suttas. For example:

msgctxt "mn4:1.2"
msgid ""
"ekaṃ samayaṃ bhagavā sāvatthiyaṃ viharati jetavane anāthapiṇḍikassa ārāme."
msgstr ""
"At one time the Buddha was staying near Sāvatthī in Jeta’s Grove, "
"Anāthapiṇḍika’s monastery."

I’ve been struggling to find a resource such as this that shows English and Pali phrase for phrase with lots of metadata. This is exactly what I was looking for. With this I can go from the English phrase to the Pali and then see how that phrase is used elsewhere in the suttas. Wow! Wow! Wow!
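To make the idea concrete, here is a minimal sketch of reading entries in the format shown above and looking up segments by a Pali fragment. It assumes every entry starts with msgctxt, that strings use plain double quotes, and that there are no escape sequences; the names parse_po and find_segments are made up for illustration and are not part of any SuttaCentral tooling.

```python
import re

# The single entry quoted above; a real file would hold many such entries.
PO_TEXT = '''
msgctxt "mn4:1.2"
msgid ""
"ekaṃ samayaṃ bhagavā sāvatthiyaṃ viharati jetavane anāthapiṇḍikassa ārāme."
msgstr ""
"At one time the Buddha was staying near Sāvatthī in Jeta’s Grove, "
"Anāthapiṇḍika’s monastery."
'''

FIELD = re.compile(r'^(msgctxt|msgid|msgstr)\s+"(.*)"$')


def parse_po(text):
    """Parse msgctxt/msgid/msgstr entries, joining continuation strings.

    Assumption: each entry begins with msgctxt and contains no escapes.
    """
    entries, current, field = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match:
            field, value = match.groups()
            if field == "msgctxt" and current:
                # A new msgctxt closes the previous entry.
                entries.append(current)
                current = {}
            current[field] = value
        elif field and line.startswith('"') and line.endswith('"'):
            # Continuation line: append to the field currently being read.
            current[field] += line[1:-1]
    if current:
        entries.append(current)
    return entries


def find_segments(entries, pali_fragment):
    """Return entries whose Pali text (msgid) contains the fragment."""
    needle = pali_fragment.lower()
    return [e for e in entries if needle in e.get("msgid", "").lower()]


entries = parse_po(PO_TEXT)
for entry in find_segments(entries, "jetavane"):
    print(entry["msgctxt"], "|", entry["msgstr"])
```

With the full set of files loaded, the same lookup would list every segment ID where a given Pali phrase occurs, each next to its English rendering.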

Is this study approach feasible? I.e., is this the actual source for the sutta translations displayed on SuttaCentral?

p.s., if you need software engineers, let me know how I can help!

sujato · July 29, 2018, 2:10am

Hey Karl,

I know what you mean, it’s very cool to see the source code. I’m actually very excited with our approach, and the more we go on, the more I am discovering that segmenting the texts is incredibly useful. It underlies, for example, the transcription of manuscripts, which is something we were not even thinking of at the time.

You can of course see the pali/English for my texts on SC itself. We’ll be adding Brahmali’s Vinaya translations to that soon.

To address the question in the title, the texts on GitHub are a little behind the ones on our translation engine, Pootle. We update them every so often.

Sure, we can always use more programmers! May I ask, what is your area of expertise?

karl_lew · July 29, 2018, 3:44am

Bhante Sujato (is this form of address correct?),

I currently code on Linux and can work with front-end UI frameworks as well as back-end systems. I have a slight preference for JavaScript for both domains simply because I am a bit tired of the tower of Babel that is modern-day programming. Now retired, my past experience is listed on LinkedIn. Currently I try to give back to the world via open source projects. I’ve helped out with mathjs and have my own garden software project. As you can see, I am familiar with GitHub and use it daily. I retired because I am gradually going blind and would like to spend my time on right action, helping others as needed. – karl.m.lew (gmail)

sujato · July 29, 2018, 10:20am

Nice.

Boy, do we have some JS for you.

So do we!

That looks very nice.

I am sorry to hear that. I wonder whether that could be an area you’d consider helping out with? I have always wanted to implement proper accessibility in SC, and we have tried to follow best practice for a11y. However we have not really assessed the site as well as we should have. I would dearly love to look at things like how the site works with screen readers and so on. What do you think?

karl_lew · July 29, 2018, 1:40pm

Oh interesting! I currently just make the page superlarge to read. I’ve tried to use accessibility software and it just makes me want to scream. It’s like trying to drink the ocean with a straw. So I do not use accessibility software, it doesn’t help me. I’d rather just listen to Dhamma talks or Audible books with Ajahn Chah stories than wade through SuttaCentral blind.

What I seek is conversational interaction. Just this:

K: “Hey SuttaCentral, what is delight?”
SC: “delight is the root of suffering. (MN 1 Thanissaro)”
K: “Continue…”
SC: "Mulapariyaya Sutta: The Root Sequence, translated from the Pali by
Thanissaro Bhikkhu © 1998, … "
K: “What did Sujato say here?”
SC: “relishing is the root of suffering (MN 1 Sujato)”

To support this style of interaction would require the translation content in github as a start.
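As a rough sketch of how small the first step could be, the three kinds of request in the exchange above can be recognized with a handful of fixed phrase patterns. Everything here is an assumption for illustration: the intent names, the patterns, and the function parse_utterance are invented, and no speech recognition or lookup is involved, only the mapping from already-transcribed text to an intent.

```python
import re

# Hypothetical intents, one per kind of request in the sample dialogue.
PATTERNS = [
    ("define", re.compile(
        r"^(?:hey suttacentral,\s*)?what is (?P<term>.+?)\??$", re.I)),
    ("continue", re.compile(r"^continue\W*$", re.I)),
    ("compare", re.compile(
        r"^what did (?P<translator>\w+) say here\??$", re.I)),
]


def parse_utterance(text):
    """Map transcribed speech to an intent plus any captured slots."""
    cleaned = text.strip().strip('“”"')
    for intent, pattern in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return {"intent": intent, **match.groupdict()}
    return {"intent": "unknown"}


for said in ["Hey SuttaCentral, what is delight?",
             "Continue…",
             "What did Sujato say here?"]:
    print(said, "->", parse_utterance(said))
```

Answering "Continue" or "What did Sujato say here?" would additionally need the app to remember which passage was read last; that state is left out of this sketch.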

Note: Alex, my dharma buddy and an initiate monk in Thailand, mentioned that this post might be taken as overambitious and inappropriate for a first project. And I agree with him. I merely wanted to clarify that there is a problem in that our society places excessive value on accessibility software. I actually think it’s impossible to make an existing website useful to all and make the site effectively accessible as well. I learned this trying to address accessibility at Cengage for our educational online software. I know that’s a radical position. But I watch myself and others use voice assistants with ease. I can ask Siri what the weather is like with my eyes closed. It works. What I’ve stated is a problem, and a hope that someday someone or some team might make it possible for blind people to study the Dhamma.

As far as helping out, I am perfectly happy cleaning software toilets one bowl at a time. Even with a toothbrush.

Aminah · July 29, 2018, 8:47pm

Over on GitHub, there is an audio support issue. In it Bhante mentions the ambition of developing things so that someone can say something like, “Hey Siri, read me a sutta!”. The issue even features on the wish list of his 2018 Roadmap.

I’m afraid I don’t know anything about dedicated accessibility software, but heartily cheer making things as accessible as they can be: it’s a massive win all round.

karl_lew · July 29, 2018, 11:57pm

Thank you Aminah! Every day I learn something new about SuttaCentral. I did not know about the existing issue.

I really would advocate the voice assistant approach since it flows so naturally in a hands- and eyes-free environment. My own bad experience with accessibility relates to the horror of form field traversal for the blind. Consider for example asking blind people to use the SC advanced search with its myriad fields.

My own inclination would be to have something very simple. An app with a single text field with a microphone icon. That’s it. A very simple voice-enabled text field to handle a few core use cases. Do one thing well first (e.g., read me a sutta about Fear). Then do another thing well (e.g., what does bhava mean?). Repeat. Yes, there would need to be playback controls, but for accessibility an overall design approach should favor brutal minimalism. Minimalism reduces cognitive overload and user fatigue.

sujato · July 30, 2018, 12:38am

This sounds fabulous!

It seems to me the whole approach used by the tech titans for speech is still very basic I/O. It’s just powered by AI to try to parse questions and generate a passable output. My understanding—and it is a very basic understanding!—is that we feed them (a) the data (i.e. text or in this case audio files) and (b) a path(s) for user interaction, which is what you’ve already started above. If the design is robust, it can be applied to the different platforms without (hopefully!) too much difficulty.

The standard for these things seems to be: “don’t suck so bad that the user gives up after less than a minute”. If we can create something that responds to meaningful questions, and which guides the user well, then we can pass at least this test. So aim to provide a simple, useful service.

karl_lew · July 30, 2018, 5:36am

It is indeed fabulous to share a common understanding of the problem!

I’ve used voice assistants a lot, but have zero experience with implementation. If readers out there would like to offer suggestions and insight, feel free to chime in with advice/caveats/etc. Absent such help there will be requisite bumbling, research and inquisitive mayhem. Iteratively. With feedback to/from all interested parties.

Here’s a starter list of considerations:

speech-to-text: probably available via some API provided by the hardware vendor (e.g., Apple, Amazon, Google).
text to query: can be esoteric AI or simply a dumb text parser accepting a handful of useful phrases. A dumb parser seems remarkably charming for a prototype.
query engine: will understand how to search SC translations for pertinent information (e.g., read me a sutta about Fear). This is a bit hand-wavy. A search for suttas having “fear is” sorted by frequency of mention in body/title might suffice. Or might not. This is where a prototype could help us understand the efficacy of proposed search algorithms. It would certainly help us understand the scope of effort required.
media player: will we do text-to-speech or simply play existing audio if available?
result navigation: in most cases multiple matches will be returned for a query. What navigation UI should we have for the prioritized match and alternatives?
etc.
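For the query engine item, the "sorted by frequency of mention in body/title" idea might look something like the sketch below. The document shape (id, title, body), the function rank_by_phrase, and the choice to count a title hit double are all assumptions, and the sample documents are placeholder text, not sutta content.

```python
def rank_by_phrase(documents, phrase, title_weight=2):
    """Rank documents by how often the phrase occurs in title and body.

    title_weight is an assumed knob: a mention in the title counts
    this many times as much as a mention in the body.
    """
    needle = phrase.lower()
    scored = []
    for doc in documents:
        score = (title_weight * doc["title"].lower().count(needle)
                 + doc["body"].lower().count(needle))
        if score > 0:
            scored.append((score, doc["id"]))
    # Highest score first; ties broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Placeholder documents, invented purely to exercise the ranking.
DOCS = [
    {"id": "doc-a", "title": "On fear", "body": "Fear is mentioned once here."},
    {"id": "doc-b", "title": "Other", "body": "fear is here, and fear is there."},
    {"id": "doc-c", "title": "Unrelated", "body": "Nothing relevant."},
]

print(rank_by_phrase(DOCS, "fear is"))
```

The first id returned would be the prioritized match and the rest the alternatives, which is one way to feed the result navigation question. Whether raw counts are good enough is exactly what a prototype would tell us.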

I’ll start looking into these next week and report findings. Thoughts and suggestions are most welcome. If anything more pressing requires attention, let me know.

Mat · June 18, 2019, 6:44am

Just saw this thread. Any progress with the AI?

karl_lew · June 18, 2019, 1:23pm

To demonstrate where AI is today on speech-to-text, Google Assistant currently is quite capable. Just now I spoke, “OK Google, what is the root of suffering?”

So the capability is definitely there. Here is a video demonstrating the Google Assistant SDK

One cautionary note is that this technology is quite monetizable and will most likely not be free. In addition, Google Assistant does not know Pali. Therefore, integrating Google Assistant with SuttaCentral or even Voice has some interesting challenges. I defer to Anagarika @Sabbamitta for guidance on when and how we might integrate GA with Voice in the future.