Updates for the previous week:
-
Improvements to search behaviour: it is now easier to search by sutta title or uid, with improved relevancy. Including numbers in a search (such as DN 2) produces fewer irrelevant results. It is to be expected that there will be some teething problems with a new system, and it will require further tuning as time goes on. Fortunately, Elasticsearch is far more tunable than the old system we were using.
-
Implemented zero downtime when updating search indexes. When changes are made to the code which indexes content, it sometimes becomes necessary to completely rebuild the index. Previously, this would result in downtime because a delete-and-rebuild procedure was followed, and during that time search either returned an error or returned only a subset of the possible results.
Now a technique called aliases is used. Searches go through an alias, a name that points at whichever index is current. While a new index is being built, the old one continues to be used. Once the new index is ready, a swaparoo is performed, and only once the new index is in use is the old one deleted. The swaparoo is seamless and there is no downtime at all. As well as eliminating downtime, this also lowers the barrier to implementing improvements.
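As a rough sketch of the swaparoo: Elasticsearch has an `_aliases` endpoint that takes a list of actions and applies them atomically, so removing the alias from the old index and adding it to the new one can happen in a single request. The snippet below only builds the request body. The alias and index names (`search`, `search_v1`, `search_v2`) and the helper name `build_alias_swap` are made up for illustration and are not the ones used on the site.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the JSON body for a POST to Elasticsearch's /_aliases endpoint.

    Both actions travel in one request, so at no point is the alias
    left pointing at nothing.
    """
    return {
        "actions": [
            # Stop the alias pointing at the old index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and point it at the freshly built one.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, for illustration only.
body = build_alias_swap("search", "search_v1", "search_v2")
print(json.dumps(body, indent=2))
```

Only after this request has succeeded would the old index be deleted, which matches the order described above.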
As a part of implementing this functionality I also performed significant refactoring of the search engine code, improving maintainability and comprehensibility. Refactoring is defined as ‘behavior-preserving transformations’: by definition, time spent refactoring adds no new features whatsoever, but it improves the code which provides those features. (The best way to know that code needs refactoring is if you’d hate to be the poor sap who inherits it.)
-
Implemented the back end for autocomplete. Unfortunately, autocomplete is not magical, and it is up to a developer to decide what can be autocompleted. At the moment I am going with sutta titles and text titles in all languages. It knows what language a text title ‘is’, and if an autocomplete suggestion is chosen, the search will be performed in that language. Or that is the plan anyway. Titles might turn out to be a dud, with a better approach being a vocabulary of words derived from titles. With the scheme I’m using it’s very easy to change the vocabulary.
As far as I know, big search engines actually autocomplete based on other people’s searches. This is obviously useful, but it also carries the danger of creating a feedback loop (i.e. some searches are popular simply because they were popular), which is a great idea when you’re into riding trends, but I’m not sure about the applicability to Sutta Central.
Work remains to be done on the front end (in-browser) aspect of autocomplete, but I expect this to be operational on Monday or Tuesday.
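To make the title-based scheme a bit more concrete, here is a minimal in-memory sketch of the idea: each vocabulary entry carries its language, suggestions are matched by prefix, and whichever suggestion is chosen tells the search which language to use. This is not the actual back end, which lives in Elasticsearch. The names `Suggestion` and `autocomplete`, the sample titles, and the plain prefix matching are all assumptions for illustration.

```python
from typing import NamedTuple


class Suggestion(NamedTuple):
    title: str
    lang: str  # language code of the title, e.g. "pi" for Pali


# Illustrative sample vocabulary; swapping it for a list of words
# derived from titles would not change the code below.
VOCABULARY = [
    Suggestion("Brahmajala Sutta", "pi"),
    Suggestion("The Prime Net", "en"),
    Suggestion("The Fruits of Recluseship", "en"),
]


def autocomplete(prefix, vocabulary, limit=5):
    """Return up to `limit` entries whose title starts with `prefix`."""
    needle = prefix.casefold()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    matches = [s for s in vocabulary if s.title.casefold().startswith(needle)]
    return matches[:limit]


# Each suggestion knows its own language, so a chosen suggestion
# can be used to run the search in that language.
for suggestion in autocomplete("the", VOCABULARY):
    print(suggestion.title, "->", suggestion.lang)
```

Because the vocabulary is just a list of (text, language) pairs, changing what gets autocompleted means changing the list rather than the matching code.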