… And making it impossible for independent research teams such as ours to provide translation services for, let’s say, Pali to Arabic or Swahili based on SC data. That’s a pity. I think if there is one benefit to this type of technology, it is in helping marginalized communities that are off the radar of both large companies and the academic mainstream to get access to the material in reasonable time.
I am reminded of the Newar Buddhists in Nepal, who petitioned Google time and again to add their language to the system but were always refused on the grounds that there are just not enough Newari speakers for Google to take any interest.
Now our Dharmamitra project has been very successful in empowering the Tibetan community to access Western material in their own language. Our work on Pali<>Tibetan (which now needs to be stopped, since we have to remove SC data from the training process) created, for the first time in history, a situation in which Tibetan monks don’t have to learn the colonizer’s language first in order to read the Pali canon in their own language. It was one of the most moving moments of my career when I saw Tibetan monks pasting Pali canon snippets into our system and getting a meaningful translation out of it. So, as people might have guessed, I’ll remain positive about this type of technology.
I respect Bhante’s stance here and I will of course stick to his request. But I want to add the angle that his ban effectively hurts the smaller teams that actually care about his opinion, while the bigger teams that scrape the web indiscriminately will just continue. So the academics lose, and big tech wins.