Cool hacker stuff to do with SuttaCentral? redbean!

sujato · April 21, 2022, 12:55am

There are a few places that use our data for applications: Buddhanexus, SC Voice, and others as well. But I’m sure there are lots of ways we can share the Dhamma. Our content is freely available on Github, and the REST API is accessible.

Someone recently suggested suttas by email. Great idea!

Here’s mine.

The single coolest, most astonishing, awesome, and sheer tech in the last few years is Justine Tunney’s redbean, based on the αcτµαlly pδrταblε εxεcµταblε.

When any developer hears about this, they lose their bearings. Things that seem real, no longer are. That which was impossible becomes imperative. Seriously, read the comments on Hacker News:

I just spent 30 minutes reading about this. I’m so shocked that I’m logging in to comment after 5 years of lurking. As far as I’m concerned, this is literal magic.

this is the best programming-related thing I’ve seen on the internet in a long time

it’s a kind of thought I don’t think I could ever come up with. Mind blown.

It’s hard to grok cause there are no words to describe it.

Everything she does has this level of jaw-dropping amazingness.

So what is it? Well, I’m far below the level of incomprehension of the programmers who couldn’t grok it. Justine is a next-level programmer who casually announces things like running the latest Google machine learning AI … on the original IBM PC from 1981.

But basically redbean is a wrapper for web-ish applications that run offline in a single zip file. It builds on her work with the αcτµαlly pδrταblε εxεcµταblε, which is where the real magic happens.

She discovered, by reading the compiled source code it seems, that there are certain patterns and accidental overlaps and lack of overlaps in the assignment of the binary codes that define the start of the various operating systems. These arbitrary patterns are very old; they were written when the operating systems were made and not touched since. She found that by writing instructions for different operating systems in the gaps not used by other systems, she could ignore the different APIs for running code on different systems, and just run the same thing everywhere. That’s right: just drop a single file of a few bytes and you can run it on any computer.

Not satisfied with this, she bundled it up with a web server that includes Lua and SQLite so you can just write a web app and run it anywhere.

So this is my challenge: who wants to build a version of SC based on redbean? It doesn’t have to do everything that SC does, just something that you find cool. Use Lua and SQLite for maximum simplicity. Add other tech to redbean if you like, but these are the core.

Load SC’s data into SQLite with Lua, query it, and serve a million responses per second on a home laptop. Make it a website, or just let people download and run it themselves on any computer.
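As a rough illustration of the "load into SQLite, then query" step, here is a minimal sketch. It is written in Python rather than redbean's Lua so that it runs anywhere without a redbean build, and it assumes the texts arrive as JSON objects mapping a segment ID to a string. The `demo1`/`demo2` IDs, the table layout, and the function names are all made up for the example and are not SC's actual schema.

```python
import json
import sqlite3

# Illustrative sample only: assumes a {segment_id: text} JSON shape.
# The "demo" IDs are hypothetical placeholders, not real SC identifiers.
SAMPLE_JSON = """
{
  "demo1:1.1": "So I have heard.",
  "demo1:1.2": "At one time the Buddha was staying near Sāvatthī in Jeta’s Grove, Anāthapiṇḍika’s monastery.",
  "demo2:1.1": "So I have heard."
}
"""


def load_segments(conn, raw_json):
    """Create the table if needed and insert one row per segment."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS segment ("
        "id TEXT PRIMARY KEY, uid TEXT NOT NULL, body TEXT NOT NULL)"
    )
    rows = [
        # The part before the colon is treated as the text's uid (an assumption).
        (seg_id, seg_id.split(":", 1)[0], text)
        for seg_id, text in json.loads(raw_json).items()
    ]
    conn.executemany("INSERT OR REPLACE INTO segment VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def sutta_text(conn, uid):
    """Return the segments of one text in the order they were inserted."""
    cur = conn.execute(
        "SELECT body FROM segment WHERE uid = ? ORDER BY rowid", (uid,)
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
load_segments(conn, SAMPLE_JSON)
print(sutta_text(conn, "demo1"))
```

The same two steps (bulk insert, then a parameterized `SELECT`) should carry over to Lua inside redbean, though the exact binding calls would differ.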

Then laugh and laugh maniacally as the world of reason and rationality crumbles at your feet.

karl_lew · April 21, 2022, 2:06pm

RedBean is indeed cool.

We’re on a slightly different tack. We’re trying to switch Voice apps to be serverless where possible. For example, Dhammaregen is hosted by GitHub and only requires a Voice API server. Dhammaregen is an instance of EBT-Site, which allows folks to create their own searchable Dhamma website in a no-code manner.

olastor · April 29, 2022, 10:00pm

Dear Bhante, thank you for sharing this reference. I read the post and am also quite amazed by the whole Cosmopolitan Libc library, which I did not know before. Unfortunately, the normal redbean binary from the website gives me a segmentation fault (Fedora Linux), but it works well when building it locally.

I gave it a shot, uploaded the code at Sebastian K / Portable SC · GitLab and the latest build can be downloaded here (~ 117MB, all English sutta and root texts). I tested this version on Windows where it was working okay (firewall seems to complain though).

I did not use SQLite, but just scraped the same API URLs (running locally) that the service worker would cache, and stored them in the filesystem, where the Lua server just serves them directly because the file path matches the API path. Handling the different possible query parameter combinations (especially “siteLanguage”) is a bit of a challenge; I hope I covered the most important cases.
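One way to picture the "file path matches the API path" idea, including the query parameter problem, is a small function that turns a request URL into a deterministic file name. This is only a sketch in Python, not the Lua code from the Portable SC repository. The `api-cache` root, the `language` parameter, and the naming scheme are assumptions for illustration; only `/api/suttaplex/sutta` and `siteLanguage` come from this thread.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def cache_path(url, root="api-cache"):
    """Map a request URL to a deterministic relative file path."""
    parts = urlsplit(url)
    # Sort the parameters so that ?a=1&b=2 and ?b=2&a=1 share one cached file.
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    path = parts.path.strip("/") or "index"
    if not params:
        return f"{root}/{path}.json"
    # Percent-encode keys and values so they are safe inside a file name.
    suffix = "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}" for key, value in params
    )
    return f"{root}/{path}/{suffix}.json"


print(cache_path("/api/suttaplex/sutta?siteLanguage=en"))
```

The scraper and the server would both call the same function, so any parameter combination that was scraped resolves to the same file when it is requested later. Combinations that were never scraped would still miss, which matches the difficulty described above.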

If time permits, I will try to look into fixing issues that are currently present, multi-lingual builds and perhaps try to use SQLite to make a basic search offline.

ekay · April 29, 2022, 10:01pm

Welcome to the community @olastor ! Enjoy exploring all the wonderful resources available here.

If you have any questions or need any help, please feel free to ask; just tag @moderators in a topic or send a PM.

with Metta

sujato · April 29, 2022, 10:31pm

OMG that’s amazing.

I’m trying it now and getting a segmentation fault. Can you tell me if I’m doing it right?

download redbean.com
chmod +x redbean.com
./redbean.com -v

(ubuntu 20.04)

olastor · April 30, 2022, 9:07am

Yes, that should generally work. The docs say you should use bash, so maybe sometimes bash -c "./redbean.com -v" can help.

It looks like segfaults on Linux x86 are also experienced by other users. I uploaded my local build here, which maybe works better (but then only on Linux probably). Using the debugging version from redbean.dev (.dbg) seems to not segfault, but it gives an error when trying to zip files into it. If I find out how to fix this, using the debugging version might give the best compatibility.

sujato · April 30, 2022, 10:47am

Hmm, same result.

The linked thread seems to have been resolved, is it the same issue? If not, we should probably open a new one.

OMG now that works! Wow, I am grinning so much right now! Most things are just there, the content, styles. A few bugs with routing and things, but 100% usable.

Congrats, that’s just amazing!

olastor · April 30, 2022, 10:10pm

I am not sure, but I think the version 1.5 that you can download on the website is already a bit behind, and on the master branch it seems to be fixed. I opened a new issue, but it’s already been closed for that reason. I have now switched to building the binary directly in the CI/CD pipeline.

Thank you! Glad it works on your machine, as well. Could you please tell me what routing bugs (and other) you spotted so that I can address them?

sujato · May 1, 2022, 7:50am

Minor bug, the loading animation on the orange top bar doesn’t stop

Sometimes, on Home page, I click the “suttas” card and get a “you’re offline” message. (http://127.0.0.1:8080/sutta?view=normal). There are a bunch of failed to load resource errors in the console, as well as this:

Uncaught (in promise) TypeError: Invalid attempt to iterate non-iterable instance.
In order to be iterable, non-array objects must have a [Symbol.iterator]() method.
    at q (8830.7ccc950fcd44c89c89c3.js:2:8461)
    at B.<anonymous> (8830.7ccc950fcd44c89c89c3.js:2:15647)
    at d (main.faee22e11e5850a4b360.js:203:75428)
    at Generator._invoke (main.faee22e11e5850a4b360.js:203:75216)
    at Generator.next (main.faee22e11e5850a4b360.js:203:75853)
    at o (main.faee22e11e5850a4b360.js:203:81138)
    at s (main.faee22e11e5850a4b360.js:203:81341)

Now, clicking back in the browser from this error page, I’d expect to go back Home, instead I go to the correct page I should have been at before: http://127.0.0.1:8080/pitaka/sutta

A general bug, the navigation breadcrumbs don’t appear in the black nav bar.

General question: this won’t update, will it? Unfortunately you grabbed our source when we had an embarrassing bug on the Home page, the three pitaka cards were disordered. This is fixed now.

olastor · May 1, 2022, 10:24am

Thanks for those details, I noticed I didn’t include the data of “/api/suttaplex/sutta” which is why “/sutta” does not work. It is also not included in the PWA offline data, is there a reason for it? When I open this page it starts making a lot of requests (> 1000) to an “available voices” endpoint (probably for each text) which are just 404s offline.

The loading animation problem and the breadcrumbs bug do not appear for me, but I am currently using a newer build, so I’ll try to post another download link once I’ve fixed some of the issues (maybe that’ll resolve it).

No, you’d always need to download another build file to receive updates. I am currently pinning a commit of the master branch of suttacentral for building the interface and have updated it to include the fix of the card layout. I am trying to create automated builds (that update the data maybe every month or so), but will probably then only bump the commit version for the interface manually. One question: What are you deploying to production? I noticed there is a production branch, but it seems outdated. Are you just checking out a certain commit of the master branch?

sujato · May 1, 2022, 9:52pm

I’m not really sure, but it looks like it calls to SC Voice, which wouldn’t work offline.

great

Ahh, I thought we were using production, but it looks like that’s changed. We changed the bundler a little while ago, maybe Hongda switched to using master. Anyway, just use master, you should be good.

sabbamitta · May 2, 2022, 7:02am

Unfortunately so—it has always been our wish to make Voice work offline too, but currently that is far from close … so to speak.

olastor · May 15, 2022, 9:01pm

I worked a bit more on this “proof of concept” and builds of the latest tag will now appear on this site, which tries to add some docs: sc-portable (everything basically still only for testing use).

olastor · May 15, 2022, 9:06pm

Perhaps you have already considered this or it’s not suited, but modern browsers can make use of a Web API for speech synthesis. An example is https://mdn.github.io/web-speech-api/speak-easy-synthesis/ . The quality is of course rather low compared to AWS.

Khemarato.bhikkhu · May 16, 2022, 12:24am

This won’t work for Pāḷi (or older browsers) but I think it would be a nice option for the translation voice for people like me with a modern, mobile browser but with a high-latency internet connection. What do you think, @karl_lew ?

sabbamitta · May 16, 2022, 6:03am

Yes, and most particularly: Try to get them to speak Pali!

Karl has put a lot of effort into optimizing the AWS Hindi voice Aditi for speaking Pali, and she is pretty good now. Inbuilt browser TTS systems can’t do that.

Oh, only saw your post now …

sabbamitta · May 16, 2022, 12:53pm

I just tried this sentence in English

At one time the Buddha was staying near Sāvatthī in Jeta’s Grove, Anāthapiṇḍika’s monastery.

The word “Anāthapiṇḍika” is way too much of a challenge!!!

karl_lew · May 16, 2022, 12:54pm

Offline Voice is indeed the mountain to climb, especially for Pali and for low-bandwidth users. Offline Voice with full audio caches running on a personal computer, even a Raspberry Pi, looks quite doable. I’ve been thinking about audio caching using GitHub, perhaps with a repository for each nikaya. What this might enable is something like “donate a speaking EBT-Pi to a monastic”.

I did try listening to Speech synthesiser, but it was much too jarring for study. Indeed, I regularly listen to bilingual Pali/English recordings for study, and I shudder at how Pali would be butchered. AWS narrators bridge that critical Pali gap for me, and it is currently simpler to store such audio rather than re-create it using alternate technology.

I have tried AI voice compression techniques for several months and can get really close to the AWS sound. However, it’s a race between my declining mental capacity and what is thereby achievable. Currently, the audio storage is Occam’s razor of a solution.

olastor · May 16, 2022, 7:07pm

Wouldn’t it be possible to do caching on a word (or “n-gram”) basis? Listening to some of the Pali text on SC Voice, it seems to me like the pronunciation of each word is independent of the sentence it’s in. Not sure if that would work out well, but just storing the audio of unique words and then combining those into sentences offline (“on the fly”) would be more powerful and space efficient (if it’s possible).
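To make the word-level idea a bit more concrete, here is a minimal Python sketch of the bookkeeping only. It assumes that each word can be synthesized once and its clip reused unchanged, which is exactly the part that may not hold for real speech. `fake_synthesize` is a hypothetical stand-in that returns placeholder bytes instead of audio, and none of the names come from SC Voice.

```python
import re
import unicodedata

# Sample lines; the opening line repeats, as it does across many suttas.
SENTENCES = [
    "Evaṁ me sutaṁ.",
    "Evaṁ me sutaṁ.",
    "Ekaṁ samayaṁ bhagavā sāvatthiyaṁ viharati.",
]


def tokenize(text):
    """Normalize to NFC, lowercase, and split into word tokens."""
    text = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", text)


def build_clip_cache(sentences, synthesize):
    """Call synthesize() once per unique word and keep the result."""
    cache = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            if word not in cache:
                cache[word] = synthesize(word)
    return cache


def assemble(sentence, cache, gap=b""):
    """Join the cached clips for a sentence, with an optional gap between words."""
    return gap.join(cache[word] for word in tokenize(sentence))


def fake_synthesize(word):
    # Hypothetical stand-in for a TTS call: placeholder bytes, not real audio.
    return word.encode("utf-8")


cache = build_clip_cache(SENTENCES, fake_synthesize)
total_words = sum(len(tokenize(s)) for s in SENTENCES)
print(f"{len(cache)} clips cover {total_words} spoken words")
```

With real audio, the clips would at least need a shared sample rate and some silence between words, and the result would lose any sentence-level intonation, so this only shows where the space saving would come from.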

Were you using TensorFlow for this? Just asking, because they do have a “TensorFlow.js” port which works in the browser, with a CLI tool that can reduce the size of models by quantization. I used that in another instance to reduce a model from 80MB to ca. 20MB (with quality tradeoffs, of course), but it had nothing to do with audio and perhaps those are much bigger. It seems someone on the internet has tried something like this before with standard TTS models.

Jhana4 · May 16, 2022, 7:12pm

There is already a really nice service for sending a sutta a day. I just started using it. Snowbird’s. It includes direct links to alternate translations including SC.