TML: translating all Pāli texts into English

We have now updated the BuddhaNexus website with new TML translations for the commentaries. They are already a lot better now that we have also added several commentarial text translations, especially the ones from https://www.ancient-buddhist-texts.net/ (Ven. Anandajoti’s translations), which gave us 45,000 additional matches for training purposes.

We experimented a bit with the number of epochs the system should run. We found that a higher number of epochs worked best for Early Buddhist texts and a lower number for commentarial texts. In any case, the full data of each run can be found in the GitHub repo: https://github.com/BuddhaNexus/segmented-pali

We also started another project to gather all the aligned data we can find between ancient languages and English for use as training data. The intention is to add Sanskrit and Chinese as well, and to run the TML on those at some point in the (far) future.

The site’s responsiveness has also been improved: on small screens the view selection is now a dropdown box instead of radio buttons, and there is a really full “full-screen” mode.

In table-view mode there is also a download button that lets you download all matches as an .xlsx spreadsheet. I intend to add the same for the number-view mode.

Thank you all for your feedback, especially @Snowbird, @mikenz66 and Bhante @sujato!

This is an amazing project. I just hope there is a plan to include Anna too, not just the atthakatha and tika.

The reason is that before Ven. Buddhaghosa wrote the atthakatha he wrote the Visuddhimagga, and he did this to avoid duplication: anything common to the four nikayas went into the Visuddhimagga, while anything specific to a particular sutta went into the atthakatha. You will find that the atthakatha often references the Visuddhimagga, because the Visuddhimagga and the atthakatha form an inseparable unit.
