AI-16: What to do?

I agree, both on this and on the nature of corruption in the texts. And your work on your website demonstrates these points wonderfully.

Which is why I don’t see much difference between human and computer translations. They’re all statistically prone to error due to the same tendency toward (human or computer) bias. Reading Ven. Sujato and Ven. Thanissaro side by side, sometimes they say complete opposites. If monks can diverge that much in their translations, what does it matter whether the source of the dispute is a computer or a bhikkhu?

So, the caveat with every translation should be the benchmark “This is a faithful attempt at translation.” Even the actual words of the Buddha aren’t the reality itself, only an approximation. They’re a guide to practice. And hopefully, any endeavour to expound the dhamma will make these remarks abundantly clear. Which is all the more reason that such widespread LLM translations should be spearheaded by our monks.

So, with all that potential room for error, I think providing a basic workable template to people who’ve never heard of the dhamma can rouse interest in searching out the Pāli and applying their own critical skills on the path. I think that’s a net benefit.

Your insight is always valuable good friend. :lotus:

1 Like

Agree. I wish you the best. At the end of the day, every tool is just a tool. Using them appropriately can save time and provide a benefit. Using a hammer inappropriately may break one’s fingers. That’s a decision for each individual. I guess some here are concerned we may use tools inappropriately, just like some here go into passionate prolonged discussions with others they believe are misguided or wrong.

1 Like

I see it more as a corrective to the billions of dollars being pumped into hype. Even those who believe in AGI think it has the potential to kill lots and lots of people. Maybe we should believe them when they ask for regulation?

Actually a great analogy. The reason flying machines are so safe these days is because there is a lot of public oversight in the design, production, and operation of airplanes. You have to keep track of every screw that goes into building an airplane. Right now, the CTO of OpenAI can’t even answer basic questions about their training data.

Healthcare is a great example, as there have been many examples of diagnostic AIs that perform great on their test sets and then perform terribly in the real world, usually because the machine just learned some bias in the training set rather than learning how to actually recognize e.g. cancer or pneumonia or whatever. If such AIs were blindly trusted to replace doctors… yes, they really would have led to preventable deaths.

Thankfully medical device companies have mostly been ethical in testing their devices before deploying them “at scale.” Again, I don’t think reminding people that new technologies are dangerous and should be understood before being deployed in the real world is “a witch-hunt.” It’s just… good engineering.

I mean… were they wrong? :rofl:

I mean, it certainly seems that way, given the regulatory regime we have and the low price of the technology. But that doesn’t have to be true. Laws and cultural norms can change. You don’t see many people playing with radium in their basements these days, for example.

I think your assumption here is that such applications (high-quality, AI-powered tools) can be built? I think Bhante’s point is that they can’t. That he’d rather have the tools be obviously bad than subtly bad (and that those really are the only two options).

Not saying I agree with him, btw. I, personally, think some useful tools can be built in this space. I’m just trying to bridge the communication gap here a bit as (I believe) I understand both sides of this.

Sorry if you’ve spoken about this already, Dogen, but may I ask about your technical background?

My own experience reviewing AI models is that… um… it’s really truly Hard. Like… interpretability is an unsolved problem, Hard.

I frankly don’t think Bhante Sujato has anywhere near the relevant technical skills to review your fine-tuning.

Is it though? I thought it was their base models that they run against all those benchmarks. No? The basic idea has been that throwing more and more high-quality data at larger and larger models is what leads to the LLMs’ “general intelligence.” Considering that all your sources of data are likely already on GitHub / the wider web (and thus in OpenAI’s training set) I don’t see how your fine tuning is going to significantly improve the base models…

May I ask how you’re fine tuning? On what data? What your theory is for why it’ll be better than the current state-of-the-art models? What is your plan to evaluate the accuracy of your fine-tuned model?

In the industry we had a saying: “In God we trust. All others must bring data.”

And these LLMs aren’t gods yet!

4 Likes

I think everyone here makes really good points and there is simply no right or wrong answer at the moment.

I am very grateful to Bhante for trying to take a stance against the dangers of AI, just like he took a stance against climate change or bitcoin. I agree that most people are unaware of how the technology works and what its short- and long-term effects will be, and there is a need to raise awareness. I found the essays and comments really helpful in providing a different take on the subject, outside of my own AI bubble.

I agree that the narrative is mostly about how bad AI is (if it’s unable to reach human level, then it’s an unreliable hoax; if it is able to reach human level, then it’s a threat), but people are free to express their opposing views. The AI field itself is divided, with people regularly updating their opinions and future forecasts.
One thing that I have not really seen brought up as a danger is that the masses will be radicalized in the near future, as they have no understanding of the rapid changes in our times and just want to be protected and to find a single scapegoat for their problems.

AI is indeed levelling the field, bringing everyone to a similar level, but this also has negative consequences. Fewer people are required to do the same job in a contracting job market, and having more people compete with quality products only creates a noisy environment where it’s almost impossible to succeed.
But the technology is here to stay, and it’s really hard to believe that the entire human race can collectively agree not to do linear algebra anymore so that we can stop any kind of progress in the field altogether.

I can understand the reason behind a full ban on using LLMs for translating sacred texts, but I am not a Pali translator, so I can not tell how useful a tool like this would be in that work.
I do not see, however, why search through vector databases should not be allowed (yes, there is no guarantee that the best result will always be recovered, but no other solution guarantees that either).

This is really helpful, thanks for the explanation!

So the thing about recent “AI” is that the field spent decades trying really hard to encode rules into these models (such as language experts encoding grammar rules, or adding some well-known moves to a chess machine). These models work okay for a while, then plateau and seldom become better, no matter how many additional exceptions we add to the encoded rules.
The bitter lesson is that the AI models that do best are the ones where we humans step aside, where we do not try to encode our own human ideas into them, but instead provide the models with larger and larger amounts of compute and data (while creating architectures that are able to learn).

Traditional NLP models are far worse than LLMs and make even more mistakes. The translations or search results they provide are usually orders of magnitude worse. They too have been hyped as solutions for analysing large corpora of text. Ever since the Cold War they’ve been expected to “work really soon” and make the job of translators unnecessary. It does not matter how many language experts worked on them; language simply can not be bounded by a set of rules.

LLMs are trained on a large corpus of text (larger than the entirety of Wikipedia with all its languages). They can incorporate opposing, crazy, wild ideas without any issue of cognitive dissonance. You simply can not train an LLM only on sacred, high-quality text, because of the data requirement. What you need this data for is to allow the model to “learn language”. Once a model is capable of generating text about concepts, you can then fine-tune it further. The reason LLMs do well even on rare, archaic languages, despite the small amount of text available, is that they have already learned multiple languages by seeing weird texts from all over the internet. I can understand if people are reluctant to use a model for translating sacred texts that was partly trained on 4chan, but there is currently no other way, and those models still have a better understanding of human nature and interactions.

You can not really do that, for these same reasons. You can not 100% fine-tune them either. The models WILL hallucinate, even with RAG.

@NgXinZhao happy birthday!

3 Likes

Also, one thing we should try to keep in mind is to separate the general pros and cons for / against AI from the question of whether we should use AI in any way for Buddhist texts.

I believe most of the hate comes from people being fed up with the current state of the economy while seeing AI rapidly getting better and likely taking away their jobs in the near future, not because the tech in general is so good at killing and so bad at translation.

1 Like

Of course. I’m the son of an ancient computer scientist, with another brother in the field. My technical proficiency is limited in that I can barely scrape together some code. I have, however, worked professionally as a product manager on CS products, so I have a good understanding of the technologies. My brother personally oversees fine-tuning of these LLM models for commercial applications.

On the most basic level, LLMs are transformer models that try to finish your sentence. If I say “Once upon a time”, it gives me “there was a princess” based on a probability distribution over possible continuations.
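
To make “finish your sentence” concrete, here is a minimal sketch of next-token prediction. It assumes the Hugging Face transformers library and uses GPT-2 purely as a stand-in model; the prompt and model choice are illustrative, not anything specific to the projects discussed here.

```python
# Minimal sketch of next-token prediction with an off-the-shelf model.
# Assumes the Hugging Face `transformers` library; GPT-2 is a stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # scores for every vocabulary token at each position

# Turn the scores for the position after the prompt into probabilities
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([token_id.item()])!r}: {prob.item():.3f}")
```

The model doesn’t “know” the fairy tale; a continuation like “there was a princess” simply sits high in that probability distribution.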

The reason these models can emulate dialogue and refrain from certain topics (ask ChatGPT for pornography or illegal materials and it will deny your request, even though it can provide those things, as numerous jailbreaks prove) is that they can be fine-tuned to a certain degree.

It is a challenge to fine-tune a model in a general sense. In specialised fields, like customer-service bots that reliably provide answers (because companies measure these things in painstaking detail), it’s rather easy, even if a somewhat laborious endeavour from the get-go.

But for something like the Pāli suttas, it would be comparatively easy. Most of the suttas are repetitions. Breaking them down into their constituents (khandhas, haha), I would hazard we would have something like 20 strong repetitions (think jhāna formulas, DO formulas, etc.), about 100 slight variations, and then some of the more unique cases.

Afterwards, you can manually make certain that particular passages are always translated the same way. Then you can ask the bot to flag all the unique semantic constructions that do not neatly fall within those parameters and see how it handles them. Manually assist there, if necessary.
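
Purely as an illustration of what “pinning” the repeated formulas could look like, here is a sketch of a tiny fine-tuning dataset written as prompt/completion pairs in JSONL. The file name, the pairing format, and the English renderings are all hypothetical; a real project would use the overseer-approved wordings.

```python
# Hypothetical sketch: repeated Pali formulas paired with fixed English
# renderings, written out as JSONL for supervised fine-tuning.
# The renderings below are illustrative only, not authorised translations.
import json

formula_pairs = [
    {
        "prompt": "Translate into English: vivicceva kāmehi vivicca akusalehi dhammehi ...",
        "completion": "Quite secluded from sensual pleasures, secluded from unskillful qualities ...",
    },
    {
        "prompt": "Translate into English: khīṇā jāti, vusitaṁ brahmacariyaṁ ...",
        "completion": "Rebirth is ended, the spiritual life has been completed ...",
    },
]

with open("pali_formula_finetune.jsonl", "w", encoding="utf-8") as f:
    for pair in formula_pairs:
        f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```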

On the whole, for 5–10% of the work it would take Bhante Sujato to translate the entire Pāli corpus, you could have a bot that does literally the same work with 99% precision.

Later on, by merging this bot with modules for other languages, you could have the same spirit of the text in any language you want with only a slight initial parameter configuration.

This is all very possible with the current technology.

This is all true. But we also have to differentiate the technology from its largest user. Cars are not the Ford Motor Company. ML is a very strong approach to handling large amounts of data.

Precisely for reasons of oversight, I think Venerable Sujato is in a unique position to dictate how such Pāli bots are trained: using only copyright-free data and providing reliably accurate dhammic translations that the Venerable has signed off on. That covers the ethics of production, deployment, and content alike.

2 Likes

Thanks for the background :relieved:

You mean that he signs off on each actual translation (“human in the loop”) or that its raw translations meet his standards (“human eval”)?

Either way, it’ll be very hard when you start talking about other languages that e.g. Bhante doesn’t speak. A bot might have good recall of the English which it was trained on but then flub the Portuguese. So then you need a Portuguese translator… who has to translate the suttas… so we’re back to square one.

And if you mean that the machine can copy and paste their translations of the repeated pericopes: Bilara already does that.

So I’m still inclined to mostly agree with @Snowbird here:

Since, in the end, that’s what even you need, right? High-quality, human translations in the target languages.

2 Likes

Let me provide a working analogy of the process:

  1. The bot provides a rough draft of the jhāna formulas.
  2. An overseer provides feedback on special terms and fine-tunes.
  3. From then on, the bot translates all such instances with the same phrasing every single time (a rough code sketch follows below).
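
A rough sketch of step 3 under these assumptions: once the overseer signs off on a rendering, every later occurrence of the formula gets that exact wording instead of a fresh machine draft. The formula string and its rendering below are placeholders, not approved wordings, and real matching would need to be fuzzier than exact substrings.

```python
# Sketch of locking in overseer-approved renderings for recurring formulas.
approved_renderings = {
    "paṭhamaṁ jhānaṁ upasampajja viharati":
        "enters and remains in the first absorption",
}

def translate_segment(pali: str, draft_translation: str) -> str:
    """Prefer the approved wording for known formulas; otherwise keep the bot's draft."""
    for formula, rendering in approved_renderings.items():
        if formula in pali:
            return rendering
    return draft_translation

# Example: a recurring formula always comes out the same way
print(translate_segment("... paṭhamaṁ jhānaṁ upasampajja viharati ...", "some machine draft"))
```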

The difference is that currently, translations require a complex set of skills - linguistic understanding, dhamma knowledge, and most importantly time.

With the translation bot in place, we would have access to a database of, say, at most 500 terms that are very important. Then it would take a special session with authorised monks to train the translators and find the exact wordings that are proper in the target language. I think it could be condensed to a week optimistically, at most a month.

These bots would no doubt get more sophisticated in time, but could provide a substantial base to start spreading dhamma. PTS translations are less than ideal compared to our current corpus, but there’s no doubt that they kicked off a very important process.

It’s much easier to train 100+ such overseers than to train them as actual translators.

I’d be interested in hearing @SebastianN’s thoughts on these points, seeing as he’s actually working on these things.

Do you mean stock phrases? Most pericopes are not stock phrases and many stock phrases are not pericopes.

Otherwise, all good points. :wink:

2 Likes

Yes, thanks. That’s what I get for trying to sound smart!

:egg:
:sweat_smile:

You mean the part where I agree with you? :joy: Well, that is usually a safe bet! :blush:

2 Likes

Now I know for sure you weren’t lying about being a PM! :rofl:

2 Likes

:dotted_line_face: guilty as charged! :sweat_smile:

1 Like

IMO Bhante’s essay series is establishing policy (first essay), substantiating the rationale, and suggesting a practical path forward (last essay, this thread). It feels like people are using this last one to keep bringing up reactions and thoughts that were already posted.

Well, maybe it’s more like sitting around at a coffeehouse and having an ongoing conversation. I suppose this last essay is the best place to do that.

This rationale was covered in an earlier essay, around The Why for AI in Translation Work. (One of the early ones – maybe #2.) There was some agreement that mass proliferation of the suttas in multiple languages at a certain volume and speed is not considered a major enough objective for SuttaCentral to rationalize AI (compared to the current volume and speed). I don’t mean to say everyone agreed on this point, but it was discussed at some length.

I hear and appreciate Sebastian’s recurring comment that the policy is hugely disruptive to his team’s work. I’ve thought about this often, when reviewing these threads, because I can only imagine how difficult this has been. At the same time, as someone who’s had to develop and enforce painful organizational policies in my career, I’ve not found there’s ever an easy way to do it that isn’t disruptive.

Most policy I’ve been involved with is developed in the first place because something got out of the gate and has been misused. This is how policy usually gets a bad name, but there you have it. So it is usually retroactive, protective in nature, and universal in reach (to ensure its effectiveness, at the cost of worthwhile exceptions). It is decidedly painful and disruptive, especially for those who might fall within the exception category.

Managing exceptions takes a huge level of effort that is generally not sustainable. I would argue it takes more time and effort (hence funding) to manage exceptions than to ensure compliance. The cost-benefit rapidly breaks down but usually no one wants to be the person who stands up at the meeting and says, We’re not doing exceptions anymore. When you do, expect lonely days at the cafeteria… no accolades waiting for you there.

:pray:t2: :elephant:

1 Like

Thank you BethL for your words!
I can understand the desire to regulate the use of AI, but if we look at the larger picture around Buddhist translation/technology, enforcing strict policies has repercussions and I don’t see much positive value in doing so.

  1. The big ones with a questionable ethical compass (OpenAI etc.) don’t care even a tiny bit
  2. Those who care are hurt the most. And who are the ones who care?
  • Small research teams on university funding (us)
  • (Volunteer) translators, especially those who don’t have English as their mother tongue. There are people in this forum who spoke up about their frustration, and there are people who decided not to speak up but are frustrated anyway.

We decided to remove data by people who publicly announced they are against our work, but we can’t and won’t ask everybody, since that is just not feasible.

The larger question that I think about is: when did being against technological progress ever work out for the better? I find all the talk about AI regulation and policy rather ridiculous. Usually it comes from early-20s Silicon Valley males who have some sort of inverted superiority complex and feel that this powerful technology needs to be “regulated”, while in reality the people using it are perfectly capable of regulating their own use of the software and don’t need some larger entity to tell them what they should and shouldn’t have access to. If somebody doesn’t want to use Google Translate, ChatGPT or whatnot in their work, that’s great, I appreciate that, all the luck to you. But having strong opinions about what other people should or shouldn’t do is never a good sign, in my opinion.

Since Dogen raised the question what we do at dharmamitra.org: We build translation models that facilitate machine translation for Buddhist translation projects. We also build search applications that use semantic embedding systems to map between Buddhist source texts preserved in different languages (Pali, Sanskrit, Chinese, Tibetan) so users can see whether sentence A in Pali has a corresponding sentence in Tibetan or Chinese and if so, at what position in the large canonical datasets these passages are found.
We might also put up some sort of RAG ‘chat with the Buddhist tradition’ thing at some point. But truth be told, the information retrieval part is the much more difficult part here, and putting an LLM on top of that (or not) is really just an afterthought that we might do once we are there.
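
For readers wondering how that kind of cross-lingual matching works in principle, here is a toy sketch: embed sentences into a shared vector space and rank candidates by cosine similarity. This is not the dharmamitra.org pipeline, just the general idea; the model name is one public multilingual model among many, and the sentences are illustrative.

```python
# Toy sketch of semantic matching across languages with sentence embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

pali_sentence = "sabbe saṅkhārā aniccā"
candidates = [
    "All conditioned things are impermanent.",
    "The king entered the city with a great retinue.",
    "Greed, hatred, and delusion are fires.",
]

# Encode the query and the candidates into the same vector space
vecs = model.encode([pali_sentence] + candidates)
query, cand = vecs[0], vecs[1:]

# Cosine similarity between the query and each candidate
scores = cand @ query / (np.linalg.norm(cand, axis=1) * np.linalg.norm(query))
for sentence, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {sentence}")
```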

4 Likes

Have you checked Bro. Piya Tan’s translations? Maybe you can just use his. There are other alternatives.

Well, you could ask Ajahn Thanissaro. I’m guessing that he is one of your major sources. And he tends to have a fairly rigorous copyright policy. He answers his phone for about an hour every day so it wouldn’t be hard to ask him.

Not saying that you have to, but just that it’s completely feasible.

Maybe we are talking about different things. When I think of regulating AI, I’m thinking about stuff like facial recognition and law enforcement. I totally don’t think law enforcement is capable of regulating itself.

And the voices here on the forum that are concerned about AI aren’t against technological progress. From what I can tell they are some of the most techy people here. What people are questioning (I think) is whether or not this is actually progress.

I think this is a really good point. It’s been such a far-ranging discussion that it’s hard not to talk past each other.

Or just languages that don’t have an established vocabulary for Buddhist terms. I am guessing that unless we are talking about Indic languages it’s going to be a crap shoot.

No, that’s not what “accurate translation” means. I think you mean something like “output in line with their training model.”

Well, they are using two different root texts, so rare cases of opposite meanings can happen. I’d have to see examples to say more.

But your point is still interesting. It sounds like you are saying that if even humans can disagree on translating a text, then why not let an LLM give it a try. I think the opposite conclusion should be drawn. They are not translating things differently because they are wrong. They could be translating things differently because the texts are very difficult. And if the texts are so difficult that even human experts come to different conclusions, then why in the world would we trust an ignorant LLM that doesn’t even understand the meaning?

I think a lot of this comes down to what is “good enough.” If I’m writing a little python script to do a task for me (I don’t know python) then there is a really wide range of “good enough” for the script. If the script does what I need it to do, it doesn’t matter how many milliseconds it does the job in, or how readable the code is, or how maintainable it is, or (probably) if there are any security issues.

However some of us feel that the “good enough” threshold for translating suttas needs to be much, much (much?) higher.

3 Likes


That seems to be the case indeed. I was thinking this discussion was about text-generating language models, as that was Bhante’s original context. If you want to talk about face recognition, okay, that’s your thing; I wasn’t expecting that to be the topic here.

Knowing a bit of Pāli, I think it’s very likely they’re using the same root text here, though I could be wrong. This is literally how the same passage can be bent one way or another. SN22.54:

Bhikkhu Sujato
If a mendicant has given up greed for the feeling element … perception element … choices element … consciousness element, the support is cut off, and there is no foundation for consciousness. Since that consciousness does not become established and does not grow, with no power to regenerate, it is freed.

Being free, it’s stable. Being stable, it’s content. Being content, they’re not anxious. Not being anxious, they personally become extinguished.

They understand: ‘Rebirth is ended … there is no return to any state of existence.’”

Bhikkhu Thanissaro
“If a monk abandons passion for the property of consciousness, then owing to the abandonment of passion, the support is cut off, and there is no landing of consciousness. Consciousness, thus not having landed, not increasing, not concocting, is released. Owing to its release, it is steady. Owing to its steadiness, it is contented. Owing to its contentment, it is not agitated. Not agitated, he (the monk) is totally unbound right within. He discerns that ‘Birth is ended, the holy life fulfilled, the task done. There is nothing further for this world.’”

This is not an arcane, pedantic difference of translation; in fact, these are both examples of their respective opinions. Ven. Thanissaro, coming from the Forest lineage, is a proponent of what they call “consciousness that doesn’t land”, which is clearly explained as the nature of nibbāna on his website.

Bhikkhu Sujato in fact wrote a whole article against him, saying nibbāna isn’t a type of consciousness.

Both translations are very obviously skewed to their preferences. I think we couldn’t find a more important divergence in the whole Buddhist doctrine than this one. Nibbāna either is a type of consciousness or it isn’t.

Well, I think there are arguments to be made. Humans have existential problems and are biased; they have a stake in reading things as what they wish they were.

I would trust a program (not specifically ChatGPT, but generally) that is trained specifically on the Pāli corpus to analyse its contents and come to a less biased conclusion, because it only cares about the internal logic. It doesn’t have a stake in the result and doesn’t have people shouting at it for its opinion; its own nibbāna isn’t at stake, nor are its childhood dreams or what its teachers said.

Removing the problem of Nibbāna, what is the gist of Buddhavacana?

“Don’t do anything evil,
And put what’s good into practice.
To purify one’s own mind:
This is the teaching of Buddhas." EA1.1

Does it really matter whether sukha is translated as bliss or joy? Devas as gods or deities? What major source of obstacle and confusion can there be (one that’s greater than the very nature of nibbāna)?

2 Likes

But isn’t that then old-school NLP MT? Because an LLM has to be trained on existing translations, doesn’t it?

Not necessarily. You can teach an LLM algorithm both English and Pāli and then have it merge the two worlds together. It learns “King” and “Queen” in one language, and in another they appear in more or less the same “space”. Take “King”, remove man / add woman, and it gives “Queen”.
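
The classic toy demonstration of that “same space” arithmetic, with made-up three-dimensional vectors just to show the idea (real embeddings are learned from data and have hundreds of dimensions):

```python
# Toy demonstration of the king - man + woman ≈ queen analogy.
# The vectors are invented for illustration, not learned embeddings.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

target = emb["king"] - emb["man"] + emb["woman"]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The nearest word to (king - man + woman) comes out as "queen"
nearest = max((w for w in emb if w != "king"), key=lambda w: cosine(emb[w], target))
print(nearest)  # queen
```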

This is why my main request isn’t even that Bhante allows his translations to be used for ML training (it really doesn’t matter, though people do use it for convenience because it’s easier to fine tune if you have translations available).

ML algos are powerful, potentially weaponised technologies that are concerned with information. They’re not like guns, where you can’t use a gun for much else than shooting something to death. ML can be used for evil, or it can be used for good.

Being cautious, advising regulation, common sense: these are all good, but we should also figure out how to use ML for good, precisely to combat the chaos and confusion some people want to sow using ML themselves.

This is the nature of my printing press analogy: large-scale reproduction of information. It would be good if our leaders spearheaded the sensible projects and led the way in printing the good books.