Towards a multilingual comparison model
The holy grail of parallel research using machine learning tools has always been comparison across languages. A parallel is most valuable when a text found in one language has also been translated into another. We can then infer that the text already existed before the respective schools split.
Applying machine learning to this task across all languages (Pali, Chinese, Sanskrit, Tibetan) is extremely daunting and requires an enormous amount of computing power, which we at BuddhaNexus simply do not have at present. However, we have come up with some strategies that make cross-language comparisons possible and are currently working on them. We hope to run some tests comparing Pali and Sanskrit in the near future, as these languages are fairly similar.
As a first step, we have implemented a multilingual view on BuddhaNexus.net that displays automatically generated sentence alignments between a given Sanskrit text and its Tibetan translation in table form. More of these texts will be added later.
These matches between Sanskrit and Tibetan are also incorporated in the Text View mode.
The Sanskrit corpus has been considerably enlarged and now also contains the collection of the Digital Sanskrit Buddhist Canon (DSBC, University of the West). A few other texts obtained from individual scholars and some additional texts from SuttaCentral will be incorporated shortly (e.g. Sanskrit fragments from the Turfan finds and the Udānavarga de Subaši).
The Sanskrit collection has also been reorganized. For practical and structural reasons, the corpus of Buddhist texts is thematically arranged in accordance with the organization of the Tibetan Buddhist Canon. An overview of the new organization can be found in the sidebar.
We have also widened our cooperation with other databases by linking to them. Links to the following websites are currently available: SuttaCentral and VRI for Pāli, GRETIL, DSBC and SuttaCentral for Sanskrit, BUDA and rKTs for Tibetan, and CBETA, SuttaCentral and CBC@ for Chinese. You can find these links at the top of selected texts.