Sutta Central Development Updates

Updates for the previous week:

  1. Improvements to search behaviour: it is now easier to search by sutta title or uid, with improved relevancy. Including numbers in a search (such as DN 2) produces fewer irrelevant results. It is to be expected that there will be some teething problems with a new system, and it will require further tuning as time goes on. Fortunately, Elasticsearch is infinitely more tunable than the old system we were using.

  2. Implemented zero downtime when updating search indexes. When changes are made to the code which indexes content, it sometimes becomes necessary to completely rebuild the index. Previously this resulted in downtime, because a delete-and-build procedure was followed, and during that time search either returned an error or returned only a subset of the possible results.
    Now a technique called aliases is used. While a new index is being built, the old one continues to be used. Once the new index is ready, a swaparoo is performed, and only once the new index is being used is the old one deleted. The swaparoo is seamless and there is no downtime at all. As well as eliminating downtime, it also lowers the barrier to implementing improvements.
    As a part of implementing this functionality I also performed significant refactoring of the search engine code, improving maintainability and comprehensibility. Refactoring is defined as ‘behavior-preserving transformations’, so by definition time spent refactoring adds no new features whatsoever, but it improves the code which provides those features. (The best way to know that code needs refactoring is if you’d hate to be the poor sap who inherits it.)

  3. Implemented the back end for autocomplete. Unfortunately, autocomplete is not magical, and it is up to a developer to decide what can be autocompleted. At the moment I am going with sutta titles and text titles in all languages. It knows what language a text title ‘is’, and if an autocomplete is chosen the search will be performed in that language. Or that is the plan anyway. Titles might turn out to be a dud, with a better approach being a vocabulary of words derived from titles. With the scheme I’m using it’s very easy to change the vocabulary.
    As far as I know big search engines actually autocomplete based on other people’s searches. This is obviously useful, but also carries the danger of causing feedback (i.e. some searches are popular simply because they were popular), which is a great idea when you’re into riding trends, but I’m not sure about the applicability to Sutta Central.
    Work remains to be done on the front end (in browser) aspect of autocomplete, but I expect this to be operational on Monday or Tuesday.
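
For the curious, the alias swap described in item 2 can be sketched roughly as below. This is only an illustration of the idea using an in-memory stand-in, not the actual Sutta Central code: `SearchCluster`, `rebuild_with_alias` and the index names are all made up for the example. In Elasticsearch itself the equivalent repointing is done through its index aliases API.

```python
class SearchCluster:
    """Tiny in-memory stand-in for a search server (hypothetical)."""

    def __init__(self):
        self.indexes = {}  # index name -> list of documents
        self.aliases = {}  # alias name -> index name it points at

    def search(self, alias, term):
        # Searches always go through the alias, never a concrete index name.
        docs = self.indexes[self.aliases[alias]]
        return [doc for doc in docs if term.lower() in doc.lower()]


def rebuild_with_alias(cluster, alias, new_index, documents):
    """Build new_index, repoint alias at it, then drop the old index."""
    old_index = cluster.aliases.get(alias)
    # 1. Build the new index; the alias still points at the old one,
    #    so searches keep working against the old data.
    cluster.indexes[new_index] = list(documents)
    # 2. The "swaparoo": repoint the alias in a single step.
    cluster.aliases[alias] = new_index
    # 3. Only now is the old index deleted.
    if old_index is not None and old_index != new_index:
        del cluster.indexes[old_index]
    return old_index


cluster = SearchCluster()
rebuild_with_alias(cluster, "suttas", "suttas_v1", ["Brahmajala Sutta"])
rebuild_with_alias(cluster, "suttas", "suttas_v2",
                   ["Brahmajala Sutta", "Samannaphala Sutta"])
print(cluster.search("suttas", "samanna"))
```

Because callers only ever know the alias name, nothing on the searching side has to change when an index is rebuilt.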


More updates:

Uncovered a potential database corruption problem which could cause texts to become unreachable on the site. Added code to automatically repair this condition.

More work on autocomplete - this turned out to tie into language selection as well, and also a new way of doing dropdowns for selections. Over time the number of languages hosted on Sutta Central has positively proliferated. This has caused some issues with the interface, especially on small screens, and as of now no work has been done on discoverability of translations secreted away in obscure corners of the site.

Although full interface localization is not yet in the pipeline, it will be possible to select a ‘translation language’ which will customize pages, making links to translations in the selected language appear prominently and making it clear at a glance which divisions actually contain translations in the chosen language. This will be of benefit both to English users and bilingual users, and will improve discoverability of translations in the less well-known collections.
Some of these features are usable right now on

Also investigated methods for two-way integration with Discourse. For Sutta Central: where is this text being talked about? For Discourse: automatic generation of links to Sutta Central.
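
To give an idea of what automatic link generation could look like, here is a minimal sketch that turns sutta references such as MN3 or en/dn6 into Markdown links. It is not the integration itself. The URL layout (`BASE_URL` followed by an optional language code and the uid), the restriction to the DN, MN, SN and AN collections, and the names `UID_PATTERN` and `linkify` are all assumptions made for the example.

```python
import re

# Assumed URL layout: BASE_URL/uid or BASE_URL/lang/uid
BASE_URL = "https://suttacentral.net"

# Optional two-letter language prefix, then a collection code and a number.
# Only four collections are recognised in this sketch.
UID_PATTERN = re.compile(
    r"\b(?:(?P<lang>[a-z]{2})/)?(?P<uid>(?:dn|mn|sn|an)\d+(?:\.\d+)?)\b",
    re.IGNORECASE,
)


def linkify(text):
    """Replace sutta references in text with Markdown links."""

    def to_link(match):
        parts = [BASE_URL]
        if match.group("lang"):
            parts.append(match.group("lang").lower())
        parts.append(match.group("uid").lower())
        # Keep the reference as written for the link text.
        return "[{}]({})".format(match.group(0), "/".join(parts))

    return UID_PATTERN.sub(to_link, text)


print(linkify("Compare MN3 with en/dn6."))
```

A real version would also need to skip references that are already inside a link, which this sketch does not attempt.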


Latest Updates:

Firstly, the translation language chooser is working now. The site will remember your choice of language and make it easier to find translations in that language. This should also be a big improvement for the mobile interface. As a part of this we also have our first true localized strings: in the divisions menu in the header, if a division has translations in your chosen language, it is bolded. The title string is localized and correctly pluralized, so it won’t say ‘1 translations in English’, and it shouldn’t make the equivalent mistake in other languages either. Of course at the moment we are using Google-translated strings, but if it’s stuffed up we can blame Google :).
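
As a rough sketch of what ‘correctly pluralized’ involves, something along these lines would do for languages with a simple singular/plural split. The `TEMPLATES` table, the sample strings and the function names are invented for the example, and the plural rule is deliberately simplistic: many languages have more than two plural forms, so a real implementation needs per-language rules.

```python
# Hypothetical message table: one template per plural category per locale.
TEMPLATES = {
    "en": {
        "one": "{n} translation in {lang}",
        "other": "{n} translations in {lang}",
    },
    "de": {
        "one": "{n} Übersetzung auf {lang}",
        "other": "{n} Übersetzungen auf {lang}",
    },
}


def plural_category(locale, n):
    # Simplified rule that only suits languages like English and German;
    # the locale argument is where per-language rules would hook in.
    return "one" if n == 1 else "other"


def translation_count_title(locale, n, lang_name):
    """Return the localized, pluralized title string for n translations."""
    forms = TEMPLATES.get(locale, TEMPLATES["en"])  # fall back to English
    return forms[plural_category(locale, n)].format(n=n, lang=lang_name)


print(translation_count_title("en", 1, "English"))
print(translation_count_title("en", 3, "English"))
```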

Search suggestions now function, hopefully making it easier to find suttas by title. The suggestion database is basically sutta and translation titles, and pali terms in the dictionary.
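
To illustrate the idea of suggesting from a list of titles that each carry a language, here is a minimal prefix-matching sketch. It is not how the site does it (the real suggestions come from the search engine); the `Suggester` class and the sample entries are made up for the example.

```python
import bisect


class Suggester:
    """Prefix suggestions over (title, language) pairs (hypothetical)."""

    def __init__(self, entries):
        # Sort by lowercased title so a prefix lookup can use binary search.
        self._entries = sorted(
            (title.lower(), title, lang) for title, lang in entries
        )
        self._keys = [entry[0] for entry in self._entries]

    def suggest(self, prefix, limit=5):
        prefix = prefix.lower()
        # First position whose key is >= the prefix.
        start = bisect.bisect_left(self._keys, prefix)
        results = []
        for key, title, lang in self._entries[start:]:
            if len(results) >= limit or not key.startswith(prefix):
                break
            # The language travels with the title, so a chosen suggestion
            # can be searched for in that language.
            results.append((title, lang))
        return results


# Sample vocabulary, for illustration only.
suggester = Suggester([
    ("Brahmajala Sutta", "pi"),
    ("Brahmanimantanika Sutta", "pi"),
    ("The Root of All Things", "en"),
])
print(suggester.suggest("brah"))
```

Swapping the vocabulary is just a matter of passing a different list of entries, for example words derived from titles or dictionary terms.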

I also, after extensive debugging, stomped an issue which was causing stability problems and occasionally making the site slow and unresponsive.

All sounds good. And we’re now hotlinking suttas like this MN3 or even this en/dn6. I am unreasonably excited.