Ven. Anandajoti made me aware of this very interesting project:
The GitHub repository for it is here:

This repository contains the code and input data for the calculation of possible quotations and similar passages within the gretil corpus based on SIF-weighted averages of word vectors. The output is both a set of tables as well as the visual representation above.
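
For anyone wondering what "SIF-weighted averages of word vectors" means in practice, here is a rough Python sketch of the idea, not the repository's actual code. Each word gets a weight a / (a + p(word)), so frequent words count for less. A passage is then the weighted average of its word vectors, and two passages are compared by cosine similarity. The function names, the value of `a`, the toy tokens, and the random vectors are all my own assumptions for illustration. The full SIF method also removes a common principal component, which I leave out here.

```python
from collections import Counter

import numpy as np


def sif_weights(corpus, a=1e-3):
    """Map each word to a / (a + p(word)), p being its relative corpus frequency."""
    counts = Counter(word for passage in corpus for word in passage)
    total = sum(counts.values())
    return {word: a / (a + count / total) for word, count in counts.items()}


def passage_vector(passage, vectors, weights):
    """SIF-weighted average of the word vectors of one tokenized passage."""
    known = [w for w in passage if w in vectors and w in weights]
    if not known:
        # No usable words: fall back to a zero vector of the right size.
        dim = len(next(iter(vectors.values())))
        return np.zeros(dim)
    stacked = np.array([weights[w] * vectors[w] for w in known])
    return stacked.mean(axis=0)


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


# Toy data (hypothetical): real use would need trained word vectors
# and properly tokenized texts.
corpus = [
    ["evam", "me", "sutam"],
    ["evam", "me", "sutam", "ekam", "samayam"],
    ["ekam", "samayam", "bhagava"],
]
weights = sif_weights(corpus)
rng = np.random.default_rng(0)
vectors = {word: rng.normal(size=8) for word in weights}

embedded = [passage_vector(p, vectors, weights) for p in corpus]
for i in range(len(corpus)):
    for j in range(i + 1, len(corpus)):
        print(i, j, round(cosine(embedded[i], embedded[j]), 3))
```

The expensive part on a real corpus is the last step: comparing every passage with every other passage grows quadratically with the number of passages, which is where the computing power goes.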

We can do the same thing for Pali texts, and possibly between Pali and Sanskrit texts. It would show a whole lot of possible parallels and connections between suttas.

This will be of substantial benefit to research on parallels. There is only one very big catch: it needs a whole lot of computing power, far more than I have. I can probably adapt the program to work with the Pali sources, but running it would be a very different matter.

Another very interesting project this guy has written is an alignment between Sanskrit and Tibetan texts:


