Language tools and Pali translations

If you are interested in tools for linguistic analysis you might find some of the resources listed in this guide of interest. The type of layout for analysis is familiar to students of linguistics. In my day it was mainly paper and pencil, but digital means were emerging. :slight_smile:

Thanks Gillian, I had a quick look but these tools are a bit limited and outdated. I think things have progressed a lot, especially in the last few years.

For example, Digital Pali Dictionary already recognises and classifies inflected Pali words, and in addition it is able to do sandhi splitting and compound deconstruction. My understanding (correct me if I am wrong @bran) is that DPD (or something similar) is already integrated into SC so most of the functionality for word translation and lexical analysis is already there.

What @sujato was suggesting was that the translation feature in SC be enhanced to support the trilinear format as advocated by @johnk. We all agree the initial implementation should be relatively simple and leverage what is already there and the focus should be to create content.

What @bran and I were discussing is that we both believe a word-by-word analysis isn’t enough for an accurate translation; a grammatical analysis at the sentence level is required. I then created three examples (all taken from the exercises for Warder Lessons 1-12, which form the content for this course) to illustrate why a grammatical analysis is necessary. @bran also created a great example on the GitHub issue thread.

It should be possible to do a grammatical analysis by training a language model to detect common idioms and sentence patterns, and then it should be able to automatically deconstruct and analyse a sentence. However, it will need a tagged training data set, which is why creating content should be the first priority.
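To make "tagged training data" a bit more concrete, here is a minimal Python sketch of what a single tagged record might look like. The field names (`form`, `morphology`, `role`) and the one-JSON-object-per-line layout are my own assumptions for illustration, not an existing SC or DPD format. The point is only that each word carries both its word-level classification and its use in the sentence.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TaggedToken:
    form: str        # the word as it appears in the text
    morphology: str  # what a word-level lookup can report (hypothetical labels)
    role: str        # how the word is used in this sentence (hypothetical labels)


@dataclass
class TaggedSentence:
    pali: str
    translation: str
    tokens: list[TaggedToken]


# A hand-tagged example sentence; the tags are illustrative only.
example = TaggedSentence(
    pali="putto naraṁ passati",
    translation="The son sees the man.",
    tokens=[
        TaggedToken("putto", "nominative singular", "agent"),
        TaggedToken("naraṁ", "accusative singular", "patient"),
        TaggedToken("passati", "present, 3rd person singular", "action"),
    ],
)

# Serialise to a single JSON line, keeping the Pali diacritics readable.
line = json.dumps(asdict(example), ensure_ascii=False)
print(line)
```

A collection of such lines, one per sentence, is the kind of content that would have to be created by hand before any model could be trained on it.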

I apologise if this post comes across as a bit patronising; in truth I probably know a lot less than my words imply. My background isn’t software development. I worked as a management consultant, so I earned my living by talking, not necessarily doing. However, I did have the opportunity to play around with deep learning and language models before I retired, and I have kept up to date with the latest developments in AI, so I have some sense of what is possible.


Not patronising. This always happens with talk between AI and Linguistics folk. My short answer is that “grammar” can’t be properly represented in just a single line, as is often assumed to be possible. (One of Warder’s strengths is that he recognises this). … Also, one difficulty that keeps cropping up in the teaching and learning of Pali comes out of the assumption that the grammars of the two languages are more similar than they really are.

A frequent difficulty is the conflation of grammatical structure and grammatical function. Two separate lines for these please!

I’ve just realised that you are talking about developments of SC in the future? Because of the heading of this thread, I thought you were making proposals for the first few weeks of beginners study with Bhante S! …

I am neither an AI nor linguistics expert, so apologies for any error in use of terminology (for either). I am just a beginner student in Pali.

I wasn’t saying that a sentence’s meaning can be fully deduced from just one sentence, so please don’t misunderstand me. I am saying a sentence’s meaning cannot sometimes be deduced just by a word by word translation. Obviously the more the context, the better. The good thing about a language model is that it can ingest the entire Tipitaka in one hit, so it should be able to derive a lot of context.

I also don’t think I made any assumption (implicitly or explicitly) about whether Pali and English grammar are similar, or are you bringing all these points for an imaginary “strawman”? :slight_smile:

Warder did say in the Introduction that:

The analysis and the learning of any language should be based on the study of sentences, that is, of the language as it is actually found in use. It is useful to study words in order to understand the sentences, but, like roots and stems, isolated words are in fact mere abstractions devised by grammarians for the analysis of language. (In the Indian tradition of writing “words” are not separated and each sentence appears as a continuous piece, as in speech. Only by grammatical analysis can words be abstracted: marked by certain “inflections”.)

So it is on that basis that I am using the term “grammatical analysis at the sentence level” - my apologies if I am using it incorrectly. I don’t know what you mean by either “grammatical structure” or “grammatical function” as I have never used either of those two phrases. Maybe you can explain your last paragraph to non-experts like me?


@bran and I are talking about a proposal by Bhante, as per this post:

I have no proposals for how Bhante should run this course; I am eagerly awaiting it, like others I am sure. I also have no proposals or recommendations for how people should study Pali, as each individual has their own learning style.

Haha. :rofl: I remain confused about whether we ended up discussing formats for learning Pali or for displaying more information on SC itself. I think your original comment was probably about learning and then the conversation snowballed in the way conversations often do. …. No matter. :rofl:.

I presume the conversation on GitHub to be about SC tho, and hope to find time soon to add some thoughts to it.

Depends on what you mean by original comment. I think you’ll find if you actually read the thread carefully, it will all become clear. Our minds work in mysterious ways - we misread something, and then everything else we read reinforces the original misunderstanding.

I suspect you read fast and form an opinion without carefully considering what the poster actually said, as opposed to what you thought the poster had said. For example, you interpreted what I said as “displaying more information on SC itself.” No, I didn’t say that, read what I said again carefully.

And at no point were we discussing “formats for learning Pali” - I am not sure how you managed to get that impression. I thanked John for his trilinear approach in generating his answer key for Warder, and Bhante Sujato said SC should implement that feature. So it’s enabling a different approach to translation, not “more information.” The discussion then started from there. I did say I may adopt the trilinear approach, but that was for working out the passages in Rune’s PBT book, not for Warder.

Very short answer: an item like a word can be one thing and can do something at the same time. Just as people have different roles or jobs, nouns can function in different ways, eg as object or subject of a sentence.

Short answer: EG:

1. Putto naraṁ passati. The son sees the man (de Silva Lesson 2 Ex 3).

In Pali the nominal functions SUBJECT and OBJECT are indicated by the inflectional endings -o and -aṁ: so whatever the word order, putto naraṁ passati or naraṁ putto passati have the same meaning because the inflections tell us what is subject and what is object. In contrast, in English, word order tells us which noun is functioning as subject and which as object: The son sees the man vs The man sees the son.
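For anyone who thinks in code, a minimal Python sketch of this point follows. It is a toy under loud assumptions: the `ENDING_TO_FUNCTION` table and the `assign_functions` helper are made up for illustration, they cover only the two endings mentioned above, and anything without one of those endings is simply treated as the verb. Real Pali endings are far more ambiguous than this.

```python
# Toy illustration only: covers just the two endings discussed above.
ENDING_TO_FUNCTION = {
    "o": "subject",   # nominative singular ending, as in putto
    "aṁ": "object",   # accusative singular ending, as in naraṁ
}


def assign_functions(sentence: str) -> dict[str, str]:
    """Assign a grammatical function to each word by its ending, ignoring word order."""
    functions = {}
    for word in sentence.split():
        for ending, function in ENDING_TO_FUNCTION.items():
            if word.endswith(ending):
                functions[function] = word
                break
        else:
            # Assumption for this toy: any other word is the verb.
            functions["verb"] = word
    return functions


print(assign_functions("putto naraṁ passati"))
print(assign_functions("naraṁ putto passati"))
```

Both calls give the same assignment, which is the whole point: the ending, not the position, carries the function. An English version of the same function would have to look at word order instead.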

I’m sorry, but your answer doesn’t really help me. I still don’t understand what you mean by “grammatical structure” and “grammatical function”?

When I was talking about grammatical analysis, I was referring to the Warder in the introduction.

(In the Indian tradition of writing “words” are not separated and each sentence appears as a continuous piece, as in speech. Only by grammatical analysis can words be abstracted: marked by certain “inflections”.) It is the sentences which are the natural units of discourse and which are the minimum units having precise, fully articulated meaning.

What @bran and I were discussing was that he said the word lookup function in SC already classifies the inflection of the word into case, number etc. And I said that wasn’t enough, as we also need to understand how the word is used in the sentence. And that’s what I meant by “grammatical analysis at the sentence level”.

As an example, see my diagram on the GitHub issue thread where I classified the words in the sentence into agent (kattar) and action (kiriya).

For a fuller example, please see this link where the author has done a trilinear translation of 3 suttas complete with grammatical analysis (note, the author also uses the term “grammatical analysis”):

A Grammatical Analysis of Three Discourses

I am somewhat at a loss why you are trying to correct me on the usage of “grammatical analysis” when I am trying to use it in the sense that Warder is, and also Ānandajoti Bhikkhu. But then I am not a linguistics expert. And I still don’t understand your comment about “grammatical structure” and “grammatical function” and why they need to be on “separate lines”.

It looks like they’re talking about how a noun could be the nominative subject or the accusative object (kamma), which was covered in Warder’s intro. Not to mention other tenses or voice.

There are also grammar sentence trees to see a visual way to relate words


Spot on, @bran - that’s exactly what I was visualising when I attempted to draw my “pseudo” class diagram. I probably should have chosen a more complex sentence. If I get some time, I may try and do a more complex hierarchy diagram in mermaid. Probably on one of the answers to the exercises in Warder.

OK: the SC lookup (which isn’t always accurate btw, tho it is a great tool) is dealing with things that loosely come under structure and your “how the word is used in the sentence” is function.

link please.

Agent and action to my mind are semantic functions rather than grammatical functions, which are also sometimes called grammatical roles.

Try this easy website for starters:
Grammatical functions in the clause

It is a major difficulty that there is more than one theory of linguistics and linguists posit different models of language (and have a history of fighting over them!). Your interest appears to be at a level where a university course in Linguistics would fill your needs. :clap: :

I’ve not read this book, but it seems to be a decent general introduction.

I’m afraid I’m a bit out of touch, having been retired for 15 years and retrained as a visual artist in that time!! :woman_artist:

@bran: how did the elephant fit into your pyjamas? (Wonderful eg btw.)

The issue is that the feature does not at all describe “how the word is used in the sentence” - which was the point I was trying to make.

It is at best guessing the case of the word, based on the inflectional ending. But as we know, Pali inflections are ambiguous - the dative and genitive share the same inflections (although historically they were supposed to be different).
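A small Python sketch of why an ending-only lookup can only guess. The `CANDIDATE_CASES` table and `candidate_cases` function are hypothetical and deliberately tiny (a few masculine a-stem endings); this is not how SC or DPD actually work. It just shows that one ending can map to more than one case.

```python
# Hypothetical, deliberately tiny table of masculine a-stem endings.
CANDIDATE_CASES = {
    "assa": ["dative singular", "genitive singular"],
    "ānaṁ": ["dative plural", "genitive plural"],
    "aṁ": ["accusative singular"],
    "o": ["nominative singular"],
}


def candidate_cases(word: str) -> list[str]:
    """Return every case the ending allows; the ending alone cannot pick one."""
    # Try longer endings first so that "ānaṁ" is matched before "aṁ".
    for ending in sorted(CANDIDATE_CASES, key=len, reverse=True):
        if word.endswith(ending):
            return CANDIDATE_CASES[ending]
    return []


print(candidate_cases("narassa"))
print(candidate_cases("narānaṁ"))
print(candidate_cases("naraṁ"))
```

For narassa the lookup has to return two candidates, and choosing between them needs the rest of the sentence.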

And then there are words like the enclitic me which can serve multiple functions depending on context. Both Rune Johansson and Bhikkhu Bodhi describe examples where one case is substituted for another - so an instrumental can be used as an ablative or dative, for example (the instrumental case is quite complex in Pali as per Warder’s Lesson 8).

That’s why I created the post with the three examples - which you misinterpreted as me talking about learning Pali - those three examples were specifically chosen to show that it is not possible to infer “how the word is used in the sentence” purely from the word inflection.

The first example shows a specific construction pattern, the second is a passive sentence where the use of the cases changes, and the third is the genitive absolute, where the genitive does not indicate possession but is a link (Warder calls it a “nexus”) into a sub clause with a different agent.

I hope you understand why I chose those examples now.

Literal translation #2730

In Pali, it’s a bit complicated, as the case endings actually change for the agent and patient depending on the “voice” of the sentence, as per my second example. By the way, if you look closely at my diagram where I label the agent and action, I am not labelling the agent and action as grammatical attributes of the words, but as a relationship between the sentence and the words.

And in the first example, the “patient” of the sentence actually is the “agent” “governed” by the yenatena construction (which Warder describes very confusingly in the Introduction and doesn’t entirely clear up until Lesson 12) - hence the use of the nominative case rather than accusative.

I really hope you will take the opportunity to reread my examples more carefully now. I think you rushed into it thinking I was talking about learning Pali whereas I am really talking about something quite different.

BTW thanks for providing all those links about linguistics. I really appreciate that you are trying to help. I’ll try and browse through them when I can, but it’s not my focus at the moment.

I only care about learning Pali, as quickly as possible by any means necessary. Then my next focus will be reading the suttas in Pali, which I am starting to do now. My ultimate goal is to attain nibbana, hopefully within this lifetime. I am conceited and deluded enough to think this is within the realms of possibility.

For real, those links are hefty. They include some tools that could make building a literal translation function very easy.

To me, knowing the third noble truth can be a complex thing in practice. Whether it be difficult, or wherever one is, or however one is sitting or standing, or whoever they are, the reality is that it is within the realm of possibility, and nothing else says otherwise.

He fit there with a misplaced modifier :smile: (also, I just now realized it used the word “shot” which is unexpectedly violent)

I am still struggling to understand the first noble truth! Only very recently have I realised what anatta really is, and only after reading the Abhidhamma. The Buddha describes our mind and our consciousness functioning very much like ChatGPT - consciousness and self-awareness are an artificial construct generated through repeated exposure to external and internal stimuli. Our sense of identity and self is impermanent - it disappears after a death and rebirth. Kind of like a complete reset of the weights and layers in a neural net. And THIS causes dukkha (“suffering”).

This was one of the things that is spurring me to study Pali. I now realise it’s very difficult to understand what the Buddha is truly saying in a translation - words like citta, cetasika, are pretty much untranslatable to English, the suttas really need to be studied in Pali. Now when I go back and reread even the simplest suttas, it’s a “mindblowing” experience - there is so much depth and nuance even in the simplest passages that I brushed aside when I read them in English.


Well, we’re all here to learn Pali and the title of the thread was about a Pali class.
So what are you talking about? Should we change the title of this thread to clarify, do you think?