SuttaCentral

Getting started on Bilara


#1

This post is for technical discussion among SuttaCentral developers. If you have a general question or comment, please create another thread for it. Thanks!

Bilara is SuttaCentral’s in-house translation app, and it lives here:

https://bilara.suttacentral.net

History

With the deprecation of Pootle, the translation engine on which I made my nikaya translations, we were in need of a new Computer Assisted Translation (CAT) tool. Determining that available options did not suffice, Blake proposed that we build our own as a thin client for Github. We christened it “Bilara”, which is Pali for “cat”.

The data is kept as JSON key-value pairs in a Github repo, and the front end is built on Polymer’s LitElement.
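
As a rough illustration of what "JSON key-value pairs" means here, a segmented text can be pictured as a flat JSON object mapping segment IDs to strings. The IDs and text below are made up for the example; the actual key scheme in the repo may differ.

```python
import json

# Hypothetical segmented text: a flat JSON object in which each key is a
# segment ID and each value is the text of that segment.
raw = """
{
  "example1:0.1": "Title of the text",
  "example1:1.1": "First segment of the text.",
  "example1:1.2": "Second segment of the text."
}
"""

segments = json.loads(raw)

# Print each segment ID next to its text.
for segment_id, text in segments.items():
    print(f"{segment_id}\t{text}")
```

Because the structure is flat, a root text and its translation can share the same keys, which is what makes matching them up segment by segment straightforward.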

I am using bilara for my Dhammapada translation, and now that @brahmali’s Vinaya translation is there, he can start using it too.

Philosophy

The guiding philosophy is: Thou shalt not build into bilara anything that can be done on Github.

So a lot of things like user accounts, data storage, version history, statistics, and so on, all of which were built in to Pootle, can be offloaded to Github and we can keep our client simple.

We keep development simple by only actively supporting modern Chrome. Other browsers may well work, but we are not testing for them. Same applies to mobiles and tablets.

State of the app

Currently bilara is a Minimum Viable Product (MVP). It does the essential things that a CAT must do, but no more. Assume everything is broken and you won’t be disappointed.

  • Edit translations.
  • Add/remove texts.
  • Offer translation suggestions.
  • Sync data with Github.
  • User accounts via Github.

That’s it.

  • Navigation is primitive but works fine.
  • There’s no search.

Apart from basic editing/data sync, good find-replace is my number one priority.

Here are the outstanding issues if you are interested, or if you want to add your own!

Accounts

  • Make an account on Github. If possible use something like your real name, it helps to remember who is who!
  • Request write access to the bilara-data repo.
  • Hopefully that will be all. See above re “assume everything is broken”.

Interface

  • Click on nav items to open and see what’s there.
  • On translation pages, click on right hand column to edit a segment.
  • Click on translation suggestion to accept it.
  • Press enter to commit a segment.
  • If you click away without pressing enter, a warning sign will appear. The text exists in your browser but has not been committed.
  • Once you press enter, the text is committed and a check mark appears.
  • There’s a bit of a weird bug where the app works too fast for some browsers, so the text fetching doesn’t happen. If you click on a text page and nothing appears, just go back and then return, it should load.

Search: introduction to the Dark Arts

For now, search has to be done with workarounds.

Github search

Just go to the relevant repo and search, it works fine.

Refine queries using advanced search, e.g. if you want to search only your own translations. Further details:

https://help.github.com/en/articles/understanding-the-search-syntax

Note, queries with whitespace must “be quoted”.
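
Here is a minimal sketch of putting such a query together, assuming the repo lives at `suttacentral/bilara-data` (the owner name is my assumption) and using a made-up path for one translator's files. `repo:` and `path:` are standard Github search qualifiers; the function name and the `type=code` URL parameter are illustrative.

```python
from urllib.parse import urlencode


def build_search_url(phrase, repo, path=None):
    """Build a Github search URL for an exact phrase within one repo."""
    # Wrap the phrase in double quotes so whitespace does not split it
    # into separate search terms.
    terms = [f'"{phrase}"', f"repo:{repo}"]
    if path:
        # Optionally restrict the search to files under one path.
        terms.append(f"path:{path}")
    query = " ".join(terms)
    return "https://github.com/search?" + urlencode({"q": query, "type": "code"})


# Hypothetical repo and path values, for illustration only.
url = build_search_url(
    "noble truths",
    repo="suttacentral/bilara-data",
    path="translation/en/sujato",
)
print(url)
```

The same query string (quoted phrase plus qualifiers) can also be typed straight into the search box on the repo page.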

Local search

Pull the repo locally and use whatever you want to search. I use Sublime Text, it searches rapidly over the whole corpus and allows regex, whole word, match case, etc.
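
If you would rather script it, here is a minimal sketch of a regex search over a local checkout. It assumes each `.json` file is a flat object of segment ID to text; the function name is made up for this example.

```python
import json
import re
from pathlib import Path


def search_corpus(root_dir, pattern, flags=0):
    """Return (file, segment_id, text) for every segment matching pattern."""
    regex = re.compile(pattern, flags)
    hits = []
    for path in sorted(Path(root_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Skip files that are not flat key-value objects.
        if not isinstance(data, dict):
            continue
        for segment_id, text in data.items():
            if isinstance(text, str) and regex.search(text):
                hits.append((str(path), segment_id, text))
    return hits
```

For example, `search_corpus("bilara-data", r"\bnoble\b", re.IGNORECASE)` would do a whole-word, case-insensitive search, where `bilara-data` is wherever you cloned the repo.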

Pootle?

The biggest limitation of both these approaches is that you only see the text or translation, not both side by side. You then can search using the segment ID to get both at once, but that is clumsy. Until this is fixed properly, the only way I know to easily get text and translation together is to use Pootle. Not very helpful, I know, but can be useful sometimes.
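
For anyone comfortable with a little scripting, matching on segment ID can be automated locally. This is only a sketch: it assumes the text and the translation are two flat JSON objects sharing the same segment IDs, and the IDs and strings below are placeholders.

```python
def side_by_side(root_segments, translation_segments):
    """Pair each root segment with its translation by segment ID.

    Segments that have no translation yet get an empty string.
    """
    rows = []
    for segment_id, root_text in root_segments.items():
        translated = translation_segments.get(segment_id, "")
        rows.append((segment_id, root_text, translated))
    return rows


# Placeholder data standing in for two files loaded with json.load().
root = {"x:1.1": "root text one", "x:1.2": "root text two"}
translation = {"x:1.1": "translation one"}

rows = side_by_side(root, translation)
for segment_id, root_text, translated in rows:
    print(f"{segment_id} | {root_text} | {translated}")
```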

Local editing

Since the data is just JSON files in Github, you can easily sync it locally, make changes there, and push it back. IN THEORY!

This is especially useful for bulk find/replace. Use Sublime or other text editor to make the changes, and push them all at once, voila!
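
The same kind of bulk change can be scripted. Below is a minimal sketch, again assuming flat JSON objects of segment ID to text. The function name, the `dry_run` switch, and the two-space indentation on write are my own choices, not anything the app prescribes; check how the files in the repo are actually formatted before writing, or the diffs will be noisy.

```python
import json
import re
from pathlib import Path


def bulk_replace(root_dir, pattern, replacement, dry_run=True):
    """Apply a regex replacement to every segment under root_dir.

    Returns {file path: number of changed segments}. With dry_run=True
    nothing is written, so the effect can be reviewed first.
    """
    regex = re.compile(pattern)
    changed = {}
    for path in sorted(Path(root_dir).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            continue  # not a flat key-value file, leave it alone
        updated = {
            key: regex.sub(replacement, value) if isinstance(value, str) else value
            for key, value in data.items()
        }
        count = sum(1 for key in data if data[key] != updated[key])
        if count:
            changed[str(path)] = count
            if not dry_run:
                # ensure_ascii=False keeps diacritics readable in the file.
                path.write_text(
                    json.dumps(updated, ensure_ascii=False, indent=2) + "\n",
                    encoding="utf-8",
                )
    return changed
```

Run it once with `dry_run=True` to see which files would change, then again with `dry_run=False`, and review the result with `git diff` before pushing.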

But! There has been some instability in terms of managing merge conflicts. It can still be done, but use with caution until we are confident this has been resolved.

Proofing

There is currently no support for “suggestions” and other proofing tools. If anyone wants to do proofing, we’ll have to work out an ad hoc solution for now. Two possible approaches:

  • lo-tech: make a thread here on D&D and just paste in text.
  • hi-tech: make a branch on Github and use pull requests.

#2

I really like Pootle. Can @Marco and I carry on using it with our AN and SN translation for the time being?

A key and useful feature of Pootle is the search function. It allows you to find where, for example, you translated something a certain way in the past, so you can change it consistently across translations.


#3

Pootle is great for translators. It’s terrible for developers. We have wasted countless hours wrestling with it. Use it for now, it’s not going anywhere, but we will be transitioning to Bilara as soon as we can.

Never fear, we will have a great find-replace function on Bilara, but these things take time.


#4

Is it on the horizon? And how far away might it be?


#5

Thanks bhante.

If I may suggest, it would be great to have a more interactive share-screen session with you and/or anyone skilled in using Bilara, so as to get @marco and me comfortable with transitioning from one tool to another. Is this something @Aminah can help with as coordinator of the translation work? (That’s her role, isn’t it?)

For us the foremost thing will be to keep reviewing each other’s translations as we go through our checklists, and to track our progress through them.

In my case, there is a clear knowledge gap regarding how exactly Github is used in the process. I admit I was spoiled by the simplicity and ease of use of Pootle: you just log in, pick the next text in the list, enter your translation and move on. No “sync” is required, my peer can see and review what I did almost instantly, etc.

What do you think?

:anjal:


#6

If I could get write permission, I could start poking around and segmenting some of the Chinese translations.


#7

Make a user account. Ask to be added to the repo. That’s it.

That is exactly how Bilara works.

Folks, I love your enthusiasm, but please hold off for just a bit. One thing at a time. We need to onboard Brahmali to start with, sort out any issues that arise, then the next thing. Otherwise we are going to be spending our time answering all sorts of questions.

Blake and I will talk at our meeting tomorrow and I’ll get back to you then.


#8

Bhante, I have already got a Github account (I think), who and how should I ask for this?


#9

That’s so great to see how this is going ahead! :dolphin: :cupcake: :lollipop: :pie:


#10

Please, not yet. We will let you know when we are ready for the next step.


#11

I recently came across a reference to a tool called SmartCat. Apparently it is free. Have you guys already tested it, or is it not kosher?


#12

Yes, we looked into this. Smartcat and Weblate are two of the systems that we considered.

Weblate is technically somewhat similar to Pootle, and is the natural point for migration. It is a good project, but suffers a number of flaws for us: the interface is for translating site UI, not text; and it requires a different data structure than we use. We could get around these, but at the end of the day, it is a legacy programming stack, and you are still hacking to make it work.

Smartcat was considered, and it seems like a good option for businesses. But it is not open source, so you essentially have to put your data in the hands of a profit-seeking corporation. This goes against the SC ethos. To be sure, Github is also a corporate entity (now owned by Microsoft), but the key point is that the data is in Git and we all have local copies, each with proper version control. We control the data.

Ultimately we decided to roll our own. Key reasons for this:

  • All these systems are built on legacy “thick-client” stacks. Leveraging the power of Git, we can offer similar functionality in a much sleeker and lighter way.
  • We can build to precisely our data requirements and ensure that we don’t run into problems of scale such as we did with Pootle.
  • We can optimize the interface for translating texts, whereas all comparable systems are designed to translate software.

We appreciate your patience as we build a system for translating Buddhist texts for the coming decades. :pray:


#13

Thanks for your attention and reply bhante.
No worries, no hurry. And please, feel free to bring @marco and me on board if there is anything we can help with, even if it is manual data cleaning / fixing to aid the transition between translation tools.
:anjal:


#14

That’s great news Bhante @sujato . Can’t wait to start working on Bilara. :slight_smile: