Voice release v2.5: Search memorizer

sabbamitta · December 5, 2020, 2:15pm

… Finally, it was done! The new Voice version 2.5 has just been released.

You may not see much difference compared to the previous version. Yet @karl_lew has done a lot of work that is not so visible.

Firstly, we have built a new search memorizer. Before, if people searched for the same thing a second time, that search took just as long as the first one, even though the result was the same. “Root of suffering” returns 7 Suttas, whether you search for it the first time or a subsequent time. So we thought, why not remember these results? This way, a repeated search returns its result much faster! This seemed especially useful as we could see in our logs that most searches were done with terms from our examples list.

AN5.155:6.3:
It’s when the mendicants memorize the teaching—
statements, songs, discussions, verses, inspired exclamations, legends, stories of past lives, amazing stories, and classifications.
This is the first thing that leads to the continuation, persistence, and enduring of the true teaching.

We thought: “Why not teach our Voice robot to do the same thing?” So that’s what we did.

The new search memorizer now caches the search results, so if the same search request is made again, the result is already there! This works both for Voice and for scv-bilara, our command line search tool.
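
For the curious, here is a minimal sketch of the idea in Python. It is not the actual Voice code: the class name `SearchMemoizer`, the stand-in `slow_search` function, and the way queries are normalized before lookup are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class SearchMemoizer:
    """Remembers search results so that a repeated query skips the slow search."""

    def __init__(self, search_fn: Callable[[str], List[str]]) -> None:
        self._search_fn = search_fn            # the slow, uncached search
        self._cache: Dict[str, List[str]] = {}

    def search(self, query: str) -> List[str]:
        # Assumption: queries that differ only in case or spacing are the same search.
        key = " ".join(query.lower().split())
        if key not in self._cache:
            self._cache[key] = self._search_fn(query)
        return self._cache[key]

    def clear(self) -> None:
        # Forget everything, for whenever cached results must be discarded.
        self._cache.clear()


calls: List[str] = []


def slow_search(query: str) -> List[str]:
    # Hypothetical stand-in for the real search; records each call for counting.
    calls.append(query)
    return [f"result for {query}"]


memo = SearchMemoizer(slow_search)
first = memo.search("root of suffering")    # runs the slow search
second = memo.search("Root of Suffering")   # answered from memory
```

In this sketch only the first call reaches `slow_search`; the second one is served from the dictionary.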

This means that since I have just searched for “root of suffering”, this term won’t take as much time as it used to. From the day of the release onward, the memory cache will gradually fill up, and more and more terms will be available almost instantaneously. However, this also means that changes made to the texts won’t show up in your search results as long as the old results are still cached. This isn’t very relevant for English, as Bhante Sujato now makes edits only occasionally, but it is more relevant for German, where my translations are continually evolving and new texts are being added. In the future, that may also be the case for other languages.

Therefore, on each content update the search memory cache will be cleared so that new search results can appear. But from now on we won’t update content as often as we used to. Having a faster search means that you will have to wait longer for new content to appear.

As many of you may have noticed, over the past weeks the SuttaCentral main site has been down a couple of times, and this also led to outages of Voice. Voice depends on SuttaCentral in various ways. It pulls information like legacy texts, titles, legacy author information etc. from SuttaCentral; basically, all the data that are not in bilara-data, of which Voice has its own independent copy.

In order to reduce these dependencies we have built another cache to store these data, except for the legacy texts; getting our own version of those would be another huge task. But with the new SC-API cache Voice is able to stay up when SC is down, even if it can’t show legacy texts. At the very least it will then show a relevant error message if you try to access a legacy text (for example, a translation by Bhikkhu Bodhi) and Voice is unable to respond to your request.
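
The behaviour described above can be sketched roughly like this. Again, this is only an illustration in Python and not how Voice implements it: `FallbackCache`, `fetch_title`, and the `upstream` flag are hypothetical names, and a plain `ConnectionError` stands in for whatever failure a real request to SuttaCentral would produce.

```python
from typing import Callable, Dict


class FallbackCache:
    """Answers from a local copy, so known data survive an upstream outage."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                  # call to the remote site
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> str:
        if key in self._store:
            return self._store[key]
        try:
            value = self._fetch(key)
        except ConnectionError as err:
            # Nothing cached and the site is down: report it clearly.
            raise LookupError(f"{key} is not cached and the site is unreachable") from err
        self._store[key] = value
        return value

    def rebuild(self) -> None:
        # Refresh every known key; keep the old value if the site is down.
        for key in list(self._store):
            try:
                self._store[key] = self._fetch(key)
            except ConnectionError:
                pass


upstream = {"up": True}   # hypothetical switch simulating an outage


def fetch_title(sutta_id: str) -> str:
    # Hypothetical stand-in for a request to the remote API.
    if not upstream["up"]:
        raise ConnectionError("site is down")
    return f"Title of {sutta_id}"


cache = FallbackCache(fetch_title)
cache.get("an5.155")               # fetched once while the site is up
upstream["up"] = False
cached = cache.get("an5.155")      # still answered during the outage
```

Asking for something that was never cached while the site is down raises a `LookupError` with a readable message, which mirrors the "relevant error message" case for legacy texts.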

To keep up-to-date with changes in the SuttaCentral data, this cache has to be rebuilt from time to time.

While fixing a download bug, it turned out that we needed to switch to a new operating system for our Voice servers in order to get the latest version of ffmpeg running, a software tool required for building the download files. At the same time, we also upgraded our own machines to the new OS so that our local Voice installs stay compatible.

On this occasion, a completely new downloader has been built, and we also integrated new download formats. Voice now allows downloads in MP3, Opus, or OGG format; the latter two are much smaller in file size than MP3, so that more sound files can be stored in the same space.
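
To give an idea of what producing these formats with ffmpeg can look like, here is a small Python sketch that only builds the command line without running it. The encoder chosen for each format is an assumption (the Voice downloader may well use different encoders and options), and `ffmpeg_command` is a hypothetical helper, not part of Voice.

```python
from typing import Dict, List

# Assumed encoder per download format; the actual Voice settings may differ.
ENCODERS: Dict[str, str] = {
    "mp3": "libmp3lame",
    "opus": "libopus",
    "ogg": "libvorbis",
}


def ffmpeg_command(source: str, target_stem: str, fmt: str) -> List[str]:
    """Build (but do not run) an ffmpeg call converting source to the given format."""
    if fmt not in ENCODERS:
        raise ValueError(f"unsupported download format: {fmt}")
    # -i names the input file, -c:a selects the audio encoder.
    return ["ffmpeg", "-i", source, "-c:a", ENCODERS[fmt], f"{target_stem}.{fmt}"]


opus_cmd = ffmpeg_command("sutta.wav", "sutta", "opus")
```

The resulting list could be handed to `subprocess.run` on a machine where ffmpeg is installed with those encoders.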

We hope to have smoothed out all bugs in the new downloader, but in case we haven’t, please help us find them!

Following this request by @Snowbird, we have added a new verse output format to our command line search tool, scv-bilara, which at a later date we’d also like to make available as a tool for general use with a user-friendly interface. It turned out that this output format wasn’t quite what Snowbird needed, but I guess all Voice devs that have tried it out are very happy to have it! So a big thank you to Snowbird for the request!

We also found another Pali pronunciation issue, which the SuttaCentral team has kindly fixed for us by removing some numbers from the root text that should not be there.

And a few more bugs have been fixed as they arose … see all issues of this release here.

Thanks to all our users for being with us. You make us feel that our efforts are not in vain!!

Please make use of our feedback thread or call us by typing @devs-voice.

Stay tuned! We keep developing awesome features … at least Karl and I find them awesome, if nobody else, and we’re having much fun! Thank you so much, Karl, for working together in this way!