We have now updated the BuddhaNexus website with new TML translations for the commentaries. They are already a lot better now that we have also added several commentarial text translations, especially those from https://www.ancient-buddhist-texts.net/ (Ven. Anandajoti’s translations), which gave us 45,000 additional matches for training purposes.
We experimented a bit with the number of epochs the system should run. A higher number of epochs worked best for Early Buddhist texts, and a lower number for commentarial texts. In any case, the full data of each run can be found in the GitHub repo: https://github.com/BuddhaNexus/segmented-pali
We have also started another project to gather all the aligned data we can find between ancient languages and English to use as training data. The intention is to add Sanskrit and Chinese as well and run the TML on those at some point in the (far) future.
The site's responsiveness has also been improved: on small screens the view selection is now a dropdown box instead of radio buttons, and there is a really full “full-screen” mode.
In table view mode there is also a download button that lets you download all matches as an .xlsx spreadsheet. I intend to add the same for the number-view mode.
Thank you all for your feedback, esp. @Snowbird, @mikenz66 and Bhante @sujato!