AI-2: Machine translations of suttas are the wrong solution for the wrong problem

sujato · April 9, 2024, 8:01am

An Shi-Gao, known by legend as a Parthian prince, settled in the Chinese city of Luoyang in 148 CE, where he proceeded to translate over a dozen texts into Chinese from Sanskrit (or something similar). He was the first translator of Buddhist texts in China, beginning the long, slow process that created the Chinese canon we have today. Many years later, a similar process was undertaken in Tibet.

These two translation projects are great monuments of faith. They have lasted for hundreds of years, and formed the basis of Buddhist culture for entire nations. And they have done so because they have a genuine worth. We want to read them because they express the Dhamma, and because we can feel the Dhamma through a unique human voice. Every translator is imperfect in their own way.

Today, translation of Buddhist texts is proceeding faster than ever. SuttaCentral hosts some 500 MB of translations, and we are just one project of many. The work is far from finished, and many texts remain untranslated: Pali commentaries, as well as most of the Chinese and Tibetan canons.

Translation is hard, and it requires a range of skills that not everyone has. But it’s not that hard, and there are not so few people with the skills. I translated the entire corpus of Pali suttas in a few years. I’ve never had any formal Dhamma education, nor have I any institutional support. I know what it takes. In my estimation, there are dozens of people who use this forum who are more than capable of translating suttas.

It remains a mystery to me why we see so little effort in this regard from all the great institutions, the monasteries, organizations, universities. There are thousands of people there and billions of dollars. All it takes is some leadership and focus.

Normally a translator can translate about 2,000 words/day. That’s how I worked out that I could finish the Pali suttas. And that’s how I know that a few teams of dedicated translators could translate the untranslated texts in maybe a decade. There’s nothing stopping us.
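To make that arithmetic concrete, here is a minimal sketch. Only the 2,000 words/day rate comes from the paragraph above; the corpus sizes, team size, and 250 working days per year are hypothetical figures chosen to show the calculation, not measurements of any actual canon.

```python
import math

def translator_days(total_words: int, words_per_day: int = 2000) -> int:
    """Working days one translator needs, rounding up partial days."""
    return math.ceil(total_words / words_per_day)

def project_years(total_words: int, translators: int,
                  words_per_day: int = 2000,
                  working_days_per_year: int = 250) -> float:
    """Years for a team that splits the corpus evenly among its members."""
    words_each = math.ceil(total_words / translators)
    return translator_days(words_each, words_per_day) / working_days_per_year

# Hypothetical corpus sizes, used only to illustrate the arithmetic.
print(translator_days(1_000_000))                  # 500
print(project_years(50_000_000, translators=10))   # 10.0
```

Under these assumed numbers, one person covers a million words in 500 working days, and ten people cover fifty million words in about a decade.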

Several projects are, in fact, working on this. Numata has a long-standing plan to translate the Chinese canon. 84000 is proceeding steadily, creating high quality translations of Tibetan texts. Alex Wynne plans to translate the Pali commentaries. These projects are proceeding, and the gaps are being filled.

So why do we need machine translations? What problem are we solving? Why do we need to burn massive computers for a month to generate an AI model that can spit out some auto-generated text based on Buddhist scripture?

What problem are we solving?

The singer and tech entrepreneur will.i.am has been developing both AI startups and education programs. Time reports him nailing the answer:

The investment that society has put in AI (artificial intelligence) surpasses the investment for HI, which is human intelligence. It’s so lopsided that subconsciously we know that we haven’t invested in our youth, in our communities. We haven’t invested in humanity to keep up with intelligent machines. … It’s hard to raise money for kids. It’s inhumane that’s even a fact. Why is it so hard to raise money for education? That keeps me up at night — I don’t understand it.

The work that has been done on creating digital texts was done out of faith. The Pali text used by SuttaCentral ultimately stems from the work of the Vipassana Research Institute in Pune, Maharashtra. That’s one of the poorest areas in the world, where the Buddhist community is comprised of Dalits. These people were formerly outcastes from Hindu society, beneath the lowest of the low, despised slaves whose existence was not merely justified but required by God. They found freedom through the work of Dr. Ambedkar, who taught them that the Buddha showed a way for them to lift themselves up. Now, there are Ambedkarite Buddhists here in Sydney, working in IT mostly, valued friends and devotees of the Buddhist community.

And we at SuttaCentral, as well as most other Pali projects around the world, rely on the work of their people every day. Can you imagine what they were working with in Pune in the 1990s? It must have seemed amazing to them. I imagine they got hold of a bunch of computers, and human volunteers sat and typed on them. It must have been a lot of people! Doing something so meaningful, joining together to create something great; I wonder how this changed the lives of those who did this? Clacking away on clunky old computers from the 90s, doing work that is still just as valuable today.

Not too much has changed in some ways, as monastics and Buddhist devotees are still working, providing volunteer labor out of faith. This work is scraped up and used by the AI engines, with never a thought to give anything back.

SuttaCentral pioneered the use of the liberal CC0 license, which waives any kind of legal limitations on use of our texts. However we also note that texts should be used “in accordance with the values and principles of the Buddhist community”. This is deliberately vague. But what are those values and principles?

One of them is generosity (dāna). No traditional Buddhist would think of using a temple, taking their freely offered books, and learning from the Sangha without offering something in return. Any Buddhist could tell you that the Sangha offers dhammadāna (“the gift of the teachings”), while the lay community offers in return āmisadāna (“the gift of material support”). This can take many forms: a donation when hearing a teaching; an offering of alms-food; help with chores in the monastery; volunteering for the organization, and so on. But we would never just take without giving. It’s not proper.

But that’s what they do. They take and they take and they never give a single thing back. These projects have used our work for years, and we’ve never got a thing in return. Rich western institutions gobbling up the work of Buddhists. It fundamentally breaks the Buddhist tradition, violating the balance that was established by the Buddha, and by which the Buddhist community has thrived for 2,500 years.

What we should be doing is nourishing people. We should be training scholars and teaching languages. And this should be done in a properly Dhammic context: not just academic scholarship devoid of meaning, but genuine living education that draws out people’s spiritual potentials and talents. Education based on facts and faith. Students should be taught, not just the grammar and meaning of ancient texts, but the art of expressive and meaningful writing, which is by far the hardest part of translation. We need leaders who will inspire people, and we need funding and institutional support.

What happens when you do this is quite remarkable. Because the most important difference between a machine “translation” and a human translation is not in the words they output. It’s that one is a machine, and one is a person. When a person does this work, they spend their days reading the words of scripture, internalizing them, and finding a way to express them. Their work is exploratory and integrative. It changes them. It can’t help but change them.

I’m a different person than I was before I did my translations. Look, maybe I’m not a great person; most days I struggle to be a reasonably good one. I have my moments. But sheesh, you should have known me before! My jokes were even worse! Every day, the understanding that I have learned helps me. It helps in my meditation, and in dealing with the issues in my life. It helps in my teaching and communicating. I’ve learned at a deep level, and many problems that I see younger students struggle with are just solved issues for me.

That’s what you do when you have human translators. You make better humans. Look at the contributions of monastics like Venerables Bodhi or Analayo or Brahmali. Their translation work is a part of their life and part of who they have become as human beings. It elevates them, and it elevates all of us.

This can never happen from a machine translation. The machine makes things too easy. It spits out an answer, seemingly solving every hard problem. But of course it doesn’t. It just pretends the problems aren’t there.

These things take time. I usually work on things pretty quickly. This article you’re reading now, I wrote in a couple of hours, plus some fiddling around. But I’ll spend a day, sometimes several days, just getting one word of Pali right. Mostly the conventional answer is okay, and I could get away with just sticking with what everyone knows. But I can’t do that. There’s a tickle in my brain that won’t let me. It smells when something is not quite right. When I say, “Okay, good enough” and move on, it goes quiet for a bit, then comes back to nag me. It won’t rest until I’ve followed it through to its end.

When you can just get any translation created by a machine like that, it degrades the humanity of the translator. This is why, all over the world, creatives are protesting AI. Journalists, artists, actors, authors, painters, photographers, filmmakers, songwriters. They’re all seeing their work slurped up by these AI engines, which then spit out a half-baked travesty of creation.

Someone sent Nick Cave a song created by ChatGPT “in the style of Nick Cave”. He responded on his Red Hand Files:

I understand that ChatGPT is in its infancy but perhaps that is the emerging horror of AI – that it will forever be in its infancy, as it will always have further to go, and the direction is always forward, always faster. It can never be rolled back, or slowed down, as it moves us toward a utopian future, maybe, or our total destruction. Who can possibly say which? Judging by this song ‘in the style of Nick Cave’ though, it doesn’t look good, Mark. The apocalypse is well on its way. This song sucks.

What ChatGPT is, in this instance, is replication as travesty. ChatGPT may be able to write a speech or an essay or a sermon or an obituary but it cannot create a genuine song. It could perhaps in time create a song that is, on the surface, indistinguishable from an original, but it will always be a replication, a kind of burlesque.

Songs arise out of suffering, by which I mean they are predicated upon the complex, internal human struggle of creation and, well, as far as I know, algorithms don’t feel. Data doesn’t suffer. ChatGPT has no inner being, it has been nowhere, it has endured nothing, it has not had the audacity to reach beyond its limitations, and hence it doesn’t have the capacity for a shared transcendent experience, as it has no limitations from which to transcend. ChatGPT’s melancholy role is that it is destined to imitate and can never have an authentic human experience, no matter how devalued and inconsequential the human experience may in time become.

A songwriter knows this. Why is it that we cannot see it?

Our problem is not that we don’t have enough translations. It’s that we have lost faith in humanity. Buddhists of the past, out of faith, poured their hearts and minds into scriptural work, creating something glorious that has lasted for millennia. We push a button and create disposable pap from a machine.

Why not both? Why not treat machine translation as a “good enough” stopgap that supplies some basic needs, while also sponsoring real translations. Why not both, indeed.

Why not both horse and cart, and petrol driven vehicles? Why not both paper letters and email? Why not both landlines and mobile phones? Why not both corner stores and Amazon?

What we have learned, over and over again, is that people won’t use what is best, they’ll use what is there. And what is there is devices with screens. And every big tech company is devoting billions to dig AI into them as deeply as possible. It’s there, whether you like it or not.

When we, as a Buddhist community, make our own AI projects, we are lending our voices to the AI overlords. Say what you will, do things differently if you can, but the message is still the same. AI is good actually and machine translations are just fine by the Buddhist community.

Why is it that while the artists and singers, the journalists and poets, are fighting to keep their humanity, we just give ours away?

What faith have we lost that has brought us to such a state?

Sasha_A · April 9, 2024, 12:25pm

Unfortunately, if in order to understand the meaning of a sutta, one has to not only open two or three versions of its translation, the original Pali, and become familiar with the peculiarities of the views of a particular translator, then there will always be not enough translations. And in this case, AI makes it possible to do the impossible: not only to perform the translation, but also to rewrite it according to the user’s requirements. For example, if the user thinks it is correct to use this definition of a term, the AI can rewrite the entire translation to fit this new definition. That’s first of all. Secondly, English is not the only language spoken in the world.

This is not a loss of faith in humanity; progress is evidence of a sober realisation of the limitations of human capabilities and the desire to expand them. If anything, AI is just the opposite: it is evidence of faith in the human ability to transcend the limitations of our own nature and do the impossible.

Isn’t it great that jobs that used to take years and require special abilities can now be done by anyone? Isn’t it a great benefit to any Buddhist that now instead of hoping and waiting for a translation of a text of interest, perhaps for years, it can be obtained in acceptable quality right here and now?

What is the criterion of value here? Labour and time spent on translation? The benefit to the translator himself and his pleasure in working on the translation? Or is it the accessibility and quality of previously inaccessible material for Buddhists?

This is because, unlike people in these professions, Buddhist texts are usually translated by Buddhists themselves, and not to earn money for worldly happiness. In general, only the work of the creative professions, whose quality is no different from the pap that anyone can get from a machine at the touch of a button, will lose value. It’s just that automation has now reached these professional areas, just as it had reached more technical ones before. However, as in all other areas of progress, the value of the work of truly outstanding authors will only increase, and those who are passionate about their work will continue to create, even if no one else values their work.

Moreover, everything that was created before AI is increasing in value right now, and will only continue to increase - this is the new “low-background steel”, but in the field of data.

What kind of faith is this and what is its object if it can be lost from the fact of the appearance of a new instrument in the world or, in general, from the fact of some change in the world?

Venerable, I also do translations of Buddhist texts. For me, this is primarily an opportunity to work with the text on a deeper level than just reading it. And I do it absolutely for free, in the public domain and essentially anonymously. Secondly, I translate only those texts that I personally find extremely important and that I want to share with those who cannot read them due to a lack of translations in my native language. And for myself, I see only great benefit and joy in the fact that everyone will have the opportunity to read a text on the Dhamma in a foreign language at the touch of a button.

Isn’t following the Dhamma yourself and making the Dhamma accessible to others the only real way to change the world for the better?

Nessie · April 9, 2024, 4:12pm

sujato:

What happens when you do this is quite remarkable. Because the most important difference between a machine “translation” and a human translation is not in the words they output. It’s that one is a machine, and one is a person. When a person does this work, they spend their days reading the words of scripture, internalizing them, and finding a way to express them. Their work is exploratory and integrative. It changes them. It can’t help but change them.

I’m a different person than I was before I did my translations. Look, maybe I’m not a great person; most days I struggle to be a reasonably good one. I have my moments. But sheesh, you should have known me before! My jokes were even worse! Every day, the understanding that I have learned helps me. It helps in my meditation, and in dealing with the issues in my life. It helps in my teaching and communicating. I’ve learned at a deep level, and many problems that I see younger students struggle with are just solved issues for me.

That’s what you do when you have human translators. You make better humans. Look at the contributions of monastics like Venerables Bodhi or Analayo or Brahmali. Their translation work is a part of their life and part of who they have become as human beings. It elevates them, and it elevates all of us.

Very well said, Bhante. I think this essay sets a mark for our discussions, and I hope our generation will happen to encounter that substantial change of direction towards humanity in our process of thinking which you indicated already by the sentence: the most important difference between a machine “translation” and a human translation is not in the words they output. It’s that one is a machine, and one is a person. It correlates perfectly with one of my central ideals, when I taught students not just to do empirical work (generating and evaluating data) but to do so for their own growth of self-awareness and humanity. And it gives deep joy to see this in one of two forums which I visit often expecting deep Buddhist musings…

yeshe.tenley · April 9, 2024, 6:26pm

How about using AI for discovering parallels or similarities? Would this be a good use for AI in your estimation Bhante?

BethL · April 9, 2024, 8:07pm

Bhante, from your initial thread yesterday, I saw these noteworthy ideas – indeed, this may dryly summarize all of them:

1. AI will evolve to cross-analyze & index suttas to “produce essays which are of higher quality than any human can.” I (BethL) call this better meta analysis for study and other purposes.
2. AI will evolve to produce more volume at higher quality more consistently by non-specialists. Supposedly this has been the trajectory for all tool-based evolutions in human history.
3. AI quickly enables “first-cut” translations (but not the finished product). When working across multiple languages, this saves time & effort (i.e., increases output or volume at the initial phase of translation).
4. Following on #3, this becomes more important when time and near-term capability is of the essence for some of us.
5. The cat is out of the bag, as they say. Given that, we can only mitigate risk and cause the least harm possible. This involves taming AI as best we can. (Of course, this isn’t problem-solving; it’s risk acceptance and mitigation. But it was mentioned in several posts in yesterday’s thread.)

This goes to the “volume & pace” proposition that several spoke to in yesterday’s thread. This seems to be people’s main focus. Does today’s situation warrant that we take on the AI risks?

May this come to fruition.

It would be interesting to hear people speak to this more directly.

That AI is a wholly transactional thing (like anything in computing) is not so bothersome as the fact that this will be obscured for most people – and some of that on purpose. So on the server side we must be motivated by faith, generosity, intelligence, and compassion with supreme mindfulness. So that those on the client side are not wholly deceived.

Alas, can a greedy corporation or nation-state act this way… sigh…history says no. The underlying tendencies. It’s playing out in real time today. Guns, Germs, & Steel.

sujato · April 9, 2024, 9:06pm

The percentage of readers who are competent to do this approximates to zero.

Thanks Nessie, it sounds like you’re doing your students a great service.

That’s essentially what BuddhaNexus does, and I certainly have less problem with it than with translations. Basically by doing that they are making something that is sort of like a “product recommendation” algorithm. “You liked this sutta, perhaps you might like this one!”.

Something like this, or like, say, using language models to improve search, don’t run the risk of being directly mistaken for human. So I’m kind of on the fence about it. There are other problems with this tech, such as energy consumption, although in a small operation like BuddhaNexus that’s not such an issue.

The only kind of applications that I feel are 100% justified are for accessibility purposes, such as text to speech, which is a real lifesaver for people with vision impairment.

Great point, thanks.

Viveka · April 9, 2024, 11:05pm

Am I really the only one who is looking at the suttas as the Buddha’s instructions for practice for Liberation? It is likely an artifact of the self-selection of individuals on this forum… those with a different focus are secluded and renouncing the worldly. But I believe in and support the work of SuttaCentral, in making the Buddha Dhamma accessible, so I will give a reminder of this perspective. To me the underlying purpose of all this effort is to keep the Path to Nibbana accessible for those who wish to Practice. So for me the question comes down to what supports the Path to Awakening and what hinders it.

For a long time I have watched how people mistake cognition and intellectual understanding for being on the Path. As far as I am concerned, the Buddha has used Language to communicate what lies beyond language. This was a ‘Super-Power’ of the Buddha to be able to do this.

In order to understand the Dhamma one has to see and know beyond the words. If the words were the only thing, then no problem - do what you will. But they are not. For me the words are like the Buddha’s fingers - pointing to the exit from Samsara.

This insatiable greed for knowing every word the Buddha said and rapacious search for any logical inconsistency… Where in the N8fP or the 37 wings of enlightenment is this listed as a factor? This is misunderstanding of Practice. Ayoniso Manasikara. Truly, if contemplated in line with the 8 fold Path, one does not need that much text… the 4 Noble Truths are enough. Ok one needs well developed spiritual faculties to be able to make do with this - so ok 10 suttas - 20 … that will suffice. No-one has had as much access to the texts as now… yet there are less awakened Beings. It is not the quantity of words that is the essential thing. It is depth of practice.

The Buddha said that one of the things required for the continuation of the Buddha Sasana is the existence of enlightened Beings… Those who can demonstrate - experientially - what the Buddha was pointing at.

An experience of any of the Higher states of Consciousness (even cessation of consciousness) surpasses any and all ‘words’ used to describe it.

The problem is that the words are literally the ‘fingers of the Buddha’, (the words of another), pointing where we have to look, using levels of consciousness that transcend the ordinary. It has seemed to me that this scholarly focus (when misapplied) is one where people become experts in ‘the Finger’ and have abandoned trying to see and experience what the finger is pointing at.

The finger (texts) is a tool! It is a component of the raft.

If one practices and sees for oneself, then the texts take on their proper place. There is no need for ‘perfect’ consistency across every sutta. Individual words are given their correct importance - not that much… If it complies with the 4 Noble Truths… or even more fundamentally with - “All that has a cause to arise has a cause to cease” - that is the pertinent matter. Once you see and transcend, then the role of the teachings is understood. Arguments and speculation about the latter stages of the Path are a waste of time - cultivate them to be experienced each for themselves. Be Patient - not greedy for instant gratification…

Even without AI, I have been observing this trend with a feeling that there is widespread Ayoniso Manasikara spreading like a virus - to the detriment and limitation of the potential for Liberation.

Enter AI and this situation becomes exponentially worse. When I speak of the disadvantages of AI, I do so in relation to the Dhamma - to the goal of Awakening, and not in relation to AI in a broader context.

When I read the suttas, I read between the words as much as reading each word itself. I don’t grasp the words, I let them gently wash over me and I allow this pointing to have an effect within Samadhi. This is where seeing and knowing occurs. You really don’t need much at all. Once you learn how to see, you can see for yourself. No intellect/logic required. It is experienced. Experiential Knowing.

I have constantly been amazed at how oblique the meanings are. It is NEVER what you think or imagine it will be EVER

When this level of Knowing has occurred then one can see how some seemingly incongruous angles of descriptions are all unified. The problem isn’t with the words - the problem is with conceptualisation.

Quite simply the focus is on the wrong thing. Ayoniso Manasikara. I can guarantee that it is not the quality of the translation that will be the factor that decides whether you will get enlightened or not. I have to qualify this, because certainly if the translation is really wrong and misleading then it will lead you in the wrong direction. And this is why statements like this are a real concern.

So a user (not personalising this to Sasha but in general terms), who has no level of Enlightenment, who does not have perfected Right View, is going to re-write the suttas and create a whole new ‘pointing’ from the perspective of Ignorance, in line with their preferences (fully derived from craving), while fully enmeshed in Delusion. Counterfeit and decoy Dhamma. This then gets incorporated into the data and used/weighted as ‘equally valid’, and so the statistical drift of meaning goes… ceaseless spiral downwards… The lid on the coffin…

Already it is so difficult to stumble across the actual Buddha Dhamma in this world with its profusion of ‘personal dhamma’ - just each person’s preferences, opinions and papancha. The blind leading the blind. There is SO much chaff in what is called ‘buddhism’ in the world… finding the needle in the haystack is easy by comparison.

Now we have AI telling us what the true Dhamma is. I have read some of these ‘answers’ by AI… It is worse than just gobbledygook… Because ‘they’ have access to so much data it can appear to be irrefutable in its ability to argue and cite references… at least when relying on intellect, logic and language. Beyond this it is absolutely meaningless. This appearance/illusion of substantiation convinces those who have not yet been able to see for themselves.

But as so many have pointed out, this is the way of the world. The Buddha saw this from the beginning - this inevitable erosion of the Dhamma > back towards ‘flowing with the stream’ - the path of least resistance…

It is not good or bad - it just is, and I have no doubt that the Dhamma will disappear from the world until at some time in the future a new Buddha will appear. After all poor humans are quite similar to AI in some ways. We too are conditioned in a cascading feedback loop - drawing us ever downwards deeper into the vortex. It takes a lot to dispel the vortex and to be free from the conditioned.

But still, this does not mean that we should not try to act as skillfully as possible - to see beyond what the world feeds us, not to just go with the flow, and not simply gobble up whatever is fed…

Good on you Bhante @sujato for resisting the vortex. The pressure is immense - this illusion that there is no choice but to go with the momentum.

There is a choice. Once the vortex has been dispelled it is seen for what it is - as Nothing… simply the momentum of conditioning… This vortex of Samsara will continue - but one can choose whether to step out or not.

Snowbird · April 9, 2024, 11:26pm

One issue that I haven’t seen discussed is the fact that machine translations are not consulting the commentarial tradition as an influence on the translation of root texts. And even if a pile of commentarial data was dumped into the blender, it’s still not going to be able to have the ability that the human translator has to discern how much weight or value to place on any information in the commentaries.

Sasha_A · April 10, 2024, 11:32am

In this case, the readers don’t need to be competent in translation. This competence is delegated to the AI.

The problem this solves is the translator’s incompetence in understanding the meaning of the text he is translating. For example, when a translator takes on the role of an expert not in the field of translation, but in the field to which the text he is translating belongs. In the context of translating suttas, this will be a situation where the translator, being an ordinary person, that is, without having the right views and without having at least entered the stream, either does not realise or simply does not know the limits of what he can be sure of in his interpretation of the text he is translating.

The competence to choose the correct definition of a term from the Dhamma lies in the realm of having the right views and not in the realm of the competence of a translator, philologist or historian.

Are the available translations made by people who have the right view?

BethL · April 10, 2024, 2:22pm

Bhante, with gratitude and respect for raising these important questions in the AI threads. As you mentioned, there are other subjects you’d rather be contemplating…

I’m putting this into the grey “meta analysis” box where there seems to be some wiggle room in these evolving threads.

The carbon-greedy nature of computing is not solved by anything. Other than reducing the use of computing. Quite soon (window is closing) we have to go low-carbon in the high-consuming countries. Otherwise, we will be talking about many other things besides AI (I don’t need to name the other things … everyone reading these threads already knows).

I’m changing my perspective to low-carbon going forward … we will never collectively get to zero carbon because I’m convinced humanity cannot live without carbon-based stuff.

As such, I’m happy with the heavy consumers (like the US) massively scaling back – causing much of our privileged lifestyle to come to a screeching halt – so that low-carbon uses are maintained into the low-carbon future. In fact, such an apocalyptic-like pause in the US could be a window for investing in human intelligence again. That way we can teach children to relate intelligently to AI.

There must be a way, going forward, to firewall the technology for this type of application. Given that, it will never be a 100-percent solution.

SebastianN · April 10, 2024, 7:53pm

I totally get the point about Dāna. BuddhaNexus was hopelessly underfunded (I did a lot of work for free, and Ven. Vimala did of course all their work for free, which never got valued accordingly by the institutions taking advantage of that software, which I think is a grave case of ignorance).
I was a poor grad student back then living month to month and there was no way for me to donate even a small amount. Right now I am still a poor grad student, working on a different project.
For Dharmamitra, the situation is not so much different to be honest. We haven’t received a penny from any Buddhist institution that is using the translation system via monlam.ai. This is all, so far, research charity.
I think it’s more than fair that money flows back to the communities that created these translations in the first place; I will keep this in mind.

BuddhaNexus actually uses machine translation under the hood to create the semantic mapping (basically the encoder stack of a machine translation transformer that puts out the similarity score of two sentences as a logit).
I guess we have to re-evaluate all of that now. We had larger scale plans to map the Pali canon and the Chinese canon with each other to find similar passages, but since we need to pull the English translations now out of the language model stack, that will make this task more or less impossible. Perhaps best to work on a different problem now.
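For readers unfamiliar with the kind of semantic mapping mentioned above, here is a minimal sketch of scoring sentence pairs by the similarity of their vector representations. It is not the BuddhaNexus pipeline: the `embed` function is a hypothetical bag-of-words stand-in for a transformer encoder, cosine similarity is swapped in for the logit score described in the post, and `best_match` is an illustrative helper name.

```python
import math
from collections import Counter


def embed(sentence):
    # Hypothetical stand-in for an encoder: a bag-of-words count vector.
    # A real system would return a dense vector from a trained model.
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    # Dot product over shared tokens, divided by the product of the norms.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, candidates):
    # Return the candidate sentence most similar to the query.
    q = embed(query)
    return max(candidates, key=lambda c: cosine_similarity(q, embed(c)))
```

With a shared multilingual encoder in place of `embed`, the same loop could in principle compare passages across languages, which is where the dependence on translation data comes in.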

Regarding the overall question of whether we, all ethical considerations aside, need machine translation for this material or not, I only have to look at the fact that less than 10% of the Chinese canon is translated, and there we have the answer.

Viveka · April 10, 2024, 10:43pm

Yes you are right of course… and it is hard to know. However, using the system prescribed by the Buddha with constant group reciting/reviewing (including those with RV) for passing on teachings, distortions were kept to an impressively low level. However, distortions (purposeful or accidental) become quick, easy, and I’d say inevitable, with this technology.

With best wishes and much metta to all who work to preserve and share the teachings. May we all find a wise way to proceed.

sujato · April 10, 2024, 11:23pm

Thanks Sebastian, that’s really important. I’m grateful that you’re taking this on board.

richard.nagyfi · April 12, 2024, 11:30am

This was a great one!

Snowbird · April 18, 2024, 9:07pm

One great way to test the “intelligence” of these LLMs is to ask this question:

It takes three hours to dry three towels on my clothesline. How long will it take to dry six towels?

I tried this on Meta’s Llama 3, its most advanced AI, just released yesterday:

This is such an obviously simple problem, and it fails completely. How then can we trust that these tools will do a good job translating very difficult texts or answering very difficult questions in line with the Buddha’s teachings?

BTW, we get a more detailed but equally wrong answer from Claudio:

Khemarato.bhikkhu · April 18, 2024, 11:45pm

Llama 3 is a 70B parameter model. This is smaller than even GPT-3, which has 175B:

If it takes three hours to dry three towels, it will likely take the same amount of time to dry six towels since you’re doubling the load.

And GPT-4 (1.7T) is even more nuanced:

If it takes three hours to dry three towels on your clothesline, drying six towels would also take three hours, assuming you have enough space on your clothesline to hang all six towels at once and they are exposed to similar conditions of air and sunlight. The drying time is not dependent on the number of towels but on the conditions and the space available to dry them. If the towels can all be hung out simultaneously without overcrowding, they should dry in the same amount of time.

This is one of the examples they give to justify the claim that larger models are getting “smarter.”
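The reasoning in that last answer can be written out as a small sketch. The function name `drying_hours` and the `line_capacity` parameter are made up for illustration; the only assumption is that towels hung at the same time dry in parallel, so the total time depends on how many batches the line needs, not on the towel count itself.

```python
import math


def drying_hours(towels, line_capacity, hours_per_batch=3.0):
    # Towels on the line at the same time dry in parallel, so what
    # matters is the number of batches needed to hang them all.
    if line_capacity <= 0:
        raise ValueError("line_capacity must be positive")
    if towels <= 0:
        return 0.0
    batches = math.ceil(towels / line_capacity)
    return batches * hours_per_batch
```

If the line holds all six towels, the answer is three hours; the time only doubles when the line holds three at a time and a second batch is needed.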

Dheerayupa · May 8, 2024, 2:48am

A bit off topic here, Bhante. But just for your information, as a professional translator, I can do 2,000 words a day, but… a big but here… but it usually takes much longer to provide excellent-quality translation.

The work in my previous life (before Australia) involved sensitive issues with particular tones and nuances, and was of such great importance that no mistake in any aspect of translation was allowable. We needed to do lots of research, including but not limited to asking the writer (where possible) or the authorised person for the ‘real’ message they intended to convey.

After the first draft, we had to review our own work before giving it to another competent translator (e.g. our supervisor or the person authorised to issue the Thai version) for another review.

Then, we did another review for typos and suchlike that might have escaped at least two pairs of eyes before publishing the translation.

On one occasion, the translation of a message from one head of state to another got read by nearly 10 people, including a person close to the latter, before I finalised it.

Buddhist texts are so significant for human beings that they need more than one day to get 2,000 words properly translated.

So, I don’t hope to finish my translation project before I die.

Snowbird · May 8, 2024, 3:26am

Well, I hope you have a very healthy and long life!!

Dheerayupa · May 8, 2024, 4:29am

Thank you. I’ll try!

sujato · May 8, 2024, 6:45am

That is a very good point, although on the other side, since much of the content of the suttas is repeated, it would probably work out that you actually translate 1,000 words/day in order to complete 2,000 words. Still, on the other other hand, the context and research required to translate is more complex than any modern text. So … well, I guess we just do our best!