SuttaCentral

Bilara i/o


#1

For all geeks and devs out there, we just rolled out a cool feature for Bilara. You can now export the data from Bilara in a range of file formats, and re-import it again!

This is a pre-pre-alpha feature. It will break and destroy everything! Use only if you do not value your own life or that of your loved ones!!!

Only kind of kidding! In my very limited testing it seems to work fine, but best be cautious. I’m sharing it in the hope that someone might want to test it, play around, and find some bugs.

This is not a user-facing feature. It is developed for two main use cases:

  1. Internal SC work, if we need to make bulk or complex changes to the bilara-data
  2. Consumption of bilara-data in external apps

How does it work? First, clone bilara-data:

Go to the .scripts folder. Set your Python version to 3.7.2. (Other versions may work if you have a different one installed.) Then run something like:

pip3 install -r requirements.txt

Ready to go! Let’s export dn1 as a LibreOffice spreadsheet:

./sheet_export.py dn1 dn1.ods

Edit it, save, and run:

./sheet_import.py dn1.ods

Et voila, your changes appear in the bilara data file. Of course you should never do this unless you’re on official SC business, but anyway you can see how it works.

You can easily do something like this, too:

./sheet_export.py dn dn.tsv --include root,translation+en

“Export the whole of DN as a tsv file, including only the root text and English translation”.

It uses Pyexcel under the hood, so you have a wide range of formats to choose from.

http://docs.pyexcel.org
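
For the second use case, consuming an export in an external app, here is a minimal Python sketch that reads a TSV export using only the standard library. The column names (`segment_id`, `root`, `translation-en`) and the sample rows are assumptions for illustration, not confirmed output of `sheet_export.py`, so check the header row of your own export first.

```python
import csv
import io


def read_segments(tsv_text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical sample: real header names and contents come from your own export.
SAMPLE = (
    "segment_id\troot\ttranslation-en\n"
    "dn1:0.1\troot placeholder A\ttranslation placeholder A\n"
    "dn1:1.1.1\troot placeholder B\ttranslation placeholder B\n"
)

rows = read_segments(SAMPLE)
for row in rows:
    print(row["segment_id"], "|", row["translation-en"])
```

To read a real file, pass `open("dn.tsv", encoding="utf-8", newline="").read()` instead of `SAMPLE`.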

@karl_lew @jared @Florian @Vimala


#2

Bhante, this is quite fantastic! Thank you!

In a week or two I had planned to revisit the Voice sutta storage in order to support i18n. Indeed the structure that Anagarika Sabbamitta and I had settled on looks remarkably like the Bilara translation folder. We had needed separation by language/nikaya rather than nikaya/language, so the Bilara structure is actually perfect. :open_mouth: :tada:

What this means is that Voice can use Bilara data directly for all its sutta needs. Content updates will then be exquisitely simple: git pull. :man_cartwheeling:
The Bilara data is so cleanly structured that Voice will just read the Bilara data directly:

JSON.parse(fs.readFileSync(suttapath).toString());
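
For anyone consuming the data from Python rather than Node, a rough equivalent might look like the sketch below. It assumes each data file is a flat JSON object mapping segment IDs to strings; the file name, IDs, and text here are hypothetical placeholders, and `load_segments` is a helper invented for this example.

```python
import json
import tempfile
from pathlib import Path


def load_segments(path):
    """Read one JSON file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Self-contained demo: write a hypothetical sample file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.json"  # hypothetical file name
    sample = {"dn1:0.1": "placeholder title", "dn1:1.1.1": "placeholder text"}
    sample_path.write_text(json.dumps(sample), encoding="utf-8")
    segments = load_segments(sample_path)

# Python dicts keep insertion order, so segments print in file order.
for segment_id, text in segments.items():
    print(segment_id, text)
```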

:warning:…?

As I scan the Bilara repository, I notice a small omission. I can’t find the blurbs that Voice needs to display. Will the blurbs also live here? :thinking:

The ending of defilements comes only when the truth is seen. But seeing the truth comes about due to a vital condition. In this way, twelve factors leading to freedom are united with the twelve factors leading to suffering.


#3

No worries!

Excellent! I spoke with @michaelh about this yesterday, and he will be using the bilara data too.

Eventually, yes. They are already in JSON, so there won’t be any change in the format, but the repo will shift to bilara. We’ll do this at some point in the future when we want to start translating them.


#4

I followed the steps and turned them into a Colab notebook.

https://colab.research.google.com/drive/1-dGdBJmSF-3O7_64fEGQx66OrT1T0v7a

It may not be very useful, but anyone can play with the notebook.


#5

Thanks! Have you managed to run it yourself?


#6

Yes, I ran them too, although I didn’t change the .ods file. I just exported it and imported it back.

It may also be useful for some analysis and visualization, though I haven’t done that yet.