Pāli Primer vs. the Dowling method

mikenz66 · February 1, 2021, 6:21am

Frank is active on DhammaWheel.com. I’ve sent him a PM to see if I’m missing something obvious…

mikenz66 · February 17, 2021, 7:57pm

Here’s some audtip news:

Javier · February 17, 2021, 9:59pm

That Dowling method seems totally unnatural and probably only works for a very small group of people. You don’t learn a language by memorizing grammar, you learn a language by picking up sentences and words here and there, and slowly figuring out how it all comes together (i.e. the grammar).

Incidentally, I have just started with Warder, and my plan is to go through it and then start trying to read Pali passages in some of the readers that are out there, while continuing to build up vocab with AnkiDroid. Ideally, I want to work through Warder pretty quickly so I don’t get bored, and jump into trying to read and translate passages as quickly as possible (which is where the real learning happens).

I studied Greek and Latin in university, and we used a book called Athenaze for Greek, which has you reading passages of (very simple) Greek from the very beginning. This was so much more fun and better than the Latin textbook, which was just grammar and sentence translations, for hundreds of pages. Super tedious.

I think that jumping into reading big chunks of text as early as possible is the best method. I may even just do a few of the early Warder chapters first and then not finish it and just jump to translating passages as soon as possible.

So yeah, my opinion, on learning ancient languages like this anyway, is to just jump into reading big chunks of simple prose ASAP. This has at least 3 advantages:

You’ll be doing what you wanted to do in the first place when you chose to learn the language, you won’t get bored, you won’t feel you’re in a classroom or something.
You’ll see how the language is actually used in the texts, not in isolated sentences built by scholars.
You’ll pick up lots of vocabulary early on, because you will have to.

But hey, I am a total beginner with Pali; I just started Warder’s chapter one yesterday, so take all this with a grain of salt.

Khemarato.bhikkhu · February 18, 2021, 2:48am

To my knowledge this is how monks traditionally learned Sanskrit or Pāli: memorizing a grammar textbook which was written in the target language with progressively more challenging sentences

Javier · February 18, 2021, 11:02am

What I meant and should have said was, that’s not how people naturally learn a language.

Gillian · February 19, 2021, 12:06pm

There are different brains, and different learning styles.
The best advice is to do what works for you.

Sumano · February 19, 2021, 12:30pm

I realize too that at some point you just have to take the plunge to be able to even see what it is that works for you. That’s been my issue. I have a huge resistance to learning Pāli for some reason, but I began yesterday in earnest, so we’ll see if I stay on the wagon!

Javier · February 19, 2021, 4:28pm

Hey! I just started too, and I was wondering if anyone else was just starting out as well and wanted to maybe have a study group?

We could make a Discord or whatever. FYI I am using Warder, and Brahmali’s lectures on Wisdom and Wonders, as well as the YouTube channel learnpali, which has a series on Warder.

I was just reading an article by Gombrich about learning the language and he stressed how good it is to work with others. And then I remembered your post that you had just started too.

Is there anyone who would be interested in a newbie Pali study group online? (Alas, I am in South Florida.)

Pamirs · February 21, 2021, 3:31pm

Thinking more about this thread: as was suggested, the major part of learning is the vocabulary. I wonder:

What is the “go to” Pali dictionary that people use most? I found the PTS dictionary digitally (https://dsalsrv04.uchicago.edu/dictionaries/), but I am wondering if there are other, maybe more recent, resources. Actually, is there a dictionary function on this SuttaCentral site? (The site has so much functionality that I might have just missed it.)
I saw that for other languages there are “frequency” dictionaries that help prioritize vocab learning. Given that the Pali texts are mostly digitized (like on this site), has any corpus analysis been done, for example on the nikayas, that could help Pali students start building vocab?

Sumano · February 21, 2021, 4:50pm

The Digital Pāli Reader is a good resource. I had also thought about frequency dictionaries, but as far as I know none have been compiled for Pāli in a formal way.

Gillian · February 22, 2021, 4:50am

I believe that as yet there are no good frequency lists for Pāḷi. @khagga might like to comment here.

tuvok · February 22, 2021, 9:29am

For me M. Cone’s dictionary is great, but it is still incomplete; one part to go, I think. I wouldn’t expect it in under 8 years, though.

I don’t think there are any frequency lists, but it is easy to generate such lists yourself. I did when I was trying to learn Pali a few years back. If I remember correctly this did the trick:

cat sutta_file.txt | tr -d "()}‘,." | tr "-" " " | tr " " "\n" | sort | uniq -c | sort -rn > freqlist.txt

It should work without issues on any Linux-based terminal. A bit of explanation:

cat sutta_file.txt just reads the file and dumps it to standard output so you can pipe it to the next command. You can put many filenames here.
tr -d removes the characters listed in “” from the input text. Fine-tune this depending on your source files.
tr "-" " " replaces each ‘-’ with a space, and the next tr replaces each space with a newline character, so that you have one word per line.
sort does the obvious; the final sort -rn orders by count numerically, largest first (a plain -r compares the lines as text, which can misorder the counts in some locales).
uniq -c counts repetitions and prints words in the format “count word”.

output should look more or less like this:

 Total words: 147100
   3308 kho
   2217 ca
   2046 vā
   1490 hoti
   1433 na
   1060 te
    925 bhikkhave
    772 pana
    771 bhante
    758 Atha

It might be useful to lowercase them first, I think tr '[:upper:]' '[:lower:]' should do the job.

EDIT: I have no idea where the Total words line comes from; I might have used a slightly different script or edited this manually… The date stamp on the file is 31.07.2014.
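
For anyone who wants the pipeline and the lowercasing step in one place, here is a minimal sketch wrapped in a shell function. The function name freqlist is made up for illustration, and the punctuation list is an assumption that will need adjusting to the source files. Only ASCII punctuation is listed because tr works byte by byte on GNU systems, so curly quotes and capitals with diacritics (such as Ā) may need a locale-aware tool instead.

```shell
# freqlist FILE...: print "count word" lines, most frequent first.
# Steps: strip ASCII punctuation, turn hyphens into spaces, lowercase
# (possibly ASCII letters only, depending on the system's tr), put one
# word per line, drop empty lines, then count and sort numerically.
freqlist() {
    cat -- "$@" |
        tr -d '(){},.;:?!' |
        tr '-' ' ' |
        tr '[:upper:]' '[:lower:]' |
        tr -s '[:space:]' '\n' |
        grep -v '^$' |
        sort | uniq -c | sort -rn
}
```

Usage would be something like: freqlist sutta_file.txt > freqlist.txt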

khagga · February 22, 2021, 4:21pm

Ooh, I got @-ed!

There are a lot of frequency-related files mentioned by @Nibbanka in this post:

Of these, the largest is a huge (22MB) file called sortedFrequencyPali.txt:

Complete word list of all Pali words (about 967,000) as occurring in the CSCD (VRI) Tipitaka edition

Apparently the site on which the file (stored as a .zip) was originally hosted, nibbanam.com, has evaporated, but happily it survives in the Internet Archive:

http://wayback.archive.org/web/20150707075127/http://www.nibbanam.com/sortedFrequencyPali.zip

Here are the first 100 lines:

167140			ca
150824			na
116790			vā
76637				pana
72505				hoti
65284				taṃ
54673				tattha
54515				evaṃ
49782				so
49223				pe
45897				kho
44558				nāma
40389				hi
38357				tassa
38087				te
37980				vuttaṃ
35300				bhikkhave
28660				attho
26537				ayaṃ
25953				viya
25277				tena
23661				tesaṃ
22309				atha
21929				katvā
21675				yaṃ
21550				me
20986				āha
20667				tasmā
20462				idaṃ
20258				yathā
20208				ettha
19663				dhammaṃ
18975				tathā
18879				dhammā
18565				tato
18532				yo
18172				uppajjati
17963				bhagavā
17925				dhammo
17771				attano
17371				bhante
17055				paccayo
16923				ekaṃ
16701				no
16632				dve
15885				paṭicca
15779				bhikkhu
14585				idha
14428				atthi
14268				natthi
14011				kiṃ
13991				ni0
13631				vuccati
13334				cittaṃ
13099				eva
13078				honti
12783				tasmiṃ
12773				hotīti
12619				ye
12318				tīṇi
12237				sā
11286				bhikkhū
11087				iti
11006				yassa
10794				ahaṃ
10652				hutvā
10612				iminā
10566				sace
10461				bhagavato
9957				disvā
9806				imaṃ
9772				saddhiṃ
9720				ceva
9697				gahetvā
9377				pañca
9305				puna
8981				kathaṃ
8856				ime
8819				rūpaṃ
8802				rājā
8764				sī0
8753				tvaṃ
8735				siyā
8416				yena
8269				syā0
8199				ahosi
8145				gantvā
7884				niṭṭhitā
7813				nu
7789				nava
7777				idāni
7564				āvuso
7533				yattha
7425				dhammassa
7374				maṃ
7371				ka0
7342				eko
7250				yasmā
7178				sati
7153				vatvā

After that, just 966,566 lines to go.

It’s a pretty remarkable resource, if a little hard to deal with! If anyone would like a shortened version I’d be happy to put one together and put it online.
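
As a rough idea of what a shortened version could involve, here is a minimal sketch that keeps only the first N entries and skips tokens ending in a digit. It assumes the file layout shown in the excerpt (count, tabs, word), and it assumes that entries such as ni0, sī0, syā0 and ka0 are abbreviations rather than words, which is a guess based on the excerpt alone. The name top_words is hypothetical.

```shell
# top_words N FILE: print the first N "count<TAB>word" entries of a
# frequency file, skipping tokens that end in a digit (assumed here to be
# abbreviations such as ni0 or sī0, not ordinary words).
top_words() {
    local n="$1" file="$2"
    awk -v n="$n" '
        c >= n { exit }                           # stop once N entries are printed
        $2 !~ /[0-9]$/ { print $1 "\t" $2; c++ }  # keep ordinary words only
    ' "$file"
}
```

Usage would be something like: top_words 1000 sortedFrequencyPali.txt > top1000.txt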

Sumano · February 22, 2021, 5:08pm

I would be interested, but you may find that some monastics are notoriously unreliable, so I don’t want to make a promise I can’t keep. If you do get something started, though, I might check it out.

The lists are pretty easy to generate indeed, and I took a look at some of the docs posted, too. The only downside is, unlike a frequency dictionary, the generated lists don’t seem to take into account the declensions/conjugations, etc. So you’ll see dhammā and dhammaṃ show up on the list, for example.
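
One rough way to see how the forms of a single stem add up is to sum every entry that begins with the same string. This is a minimal sketch, assuming a frequency file with the count in the first column and the word in the second, as in sortedFrequencyPali.txt. The name forms_of is hypothetical, and a plain prefix match will also pull in compounds and unrelated words that happen to share the prefix, so it is no substitute for real lemmatization.

```shell
# forms_of PREFIX FILE: list every entry whose word starts with PREFIX,
# followed by a final line with the summed count.
forms_of() {
    local stem="$1" file="$2"
    awk -v stem="$stem" '
        index($2, stem) == 1 { print $1 "\t" $2; sum += $1 }  # prefix match
        END { printf "%d\ttotal\n", sum }
    ' "$file"
}
```

Usage would be something like: forms_of dhamm sortedFrequencyPali.txt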

I remember coming across a Spanish frequency dictionary that, aside from the master list, sorted each word by part of speech (verbs, adjectives, nouns, etc.) and even subject. Of course that’s a huge scientific undertaking, and Pāli is not as popular as Spanish.

Now if we could get a hold of the most frequent verbs and nouns in Pāli, that would be amazing. (Or maybe I’m just looking for a shortcut.)

dayunbao · February 22, 2021, 5:33pm

Hmm, well if we knew the root form of the word, and if there was a predictable pattern to the suffixes, we could infer whether the word was a verb or noun. This would be a fun programming project, although I’m not sure if I have the time to take it on.

tuvok · February 22, 2021, 5:36pm

That could be (partially) fixed with some more intelligent scripting (no, I will not do it), but there is, I think, a bigger problem: words often have a couple of different meanings, depending on context. For example, the definitions of dhamma take a bit over 14 A4 pages in Cone’s dictionary.

I think learning words by frequency is much less useful with Pali than with other (living) languages. If you look at those frequency lists, there are thousands of words that are used very infrequently. Even if you learn all the frequent ones, you will still not be able to read and understand the text. For example, for the DN1 sutta my script gives 1655 entries (with a bit of garbage, I admit). Of those, about 850 entries occur only once. That’s half of the unique words in the text!
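
A count like this can be read straight off a “count word” list. Here is a minimal sketch, assuming input in the format produced by uniq -c; the name hapax_share is hypothetical.

```shell
# hapax_share FILE: report how many entries of a "count word" list
# occur exactly once, and what share of all entries that is.
hapax_share() {
    awk '
        { total++; if ($1 == 1) once++ }
        END {
            printf "%d of %d entries occur once (%.0f%%)\n",
                   once, total, total ? 100 * once / total : 0
        }
    ' "$1"
}
```

Usage would be something like: hapax_share freqlist.txt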

Pamirs · February 23, 2021, 1:55am

Thanks tuvok. As a beginner I think M. Cone is really a huge overkill. I was looking around more and found:

http://www.ahandfulofleaves.org/documents/Concise%20Pali%20English%20Dictionary_Buddhadatta.pdf

Is this legit?

Example of “vā”:

“Vā (particle of disjunction), or; either-or” (Buddhadatta)

“Vā (indecl.) [Ved. vā, Av. vā, Gr. ἤ, Lat. -ve] part. of disjunction: “or”; always enclitic Kh viii. (itthiyā purisassa vā; mātari pitari vā pi). Usually repeated vā – vā (is it so – ) or, either – or, e.g. Sn 1024 (Brahmā vā Indo vā pi); Dh 1 (bhāsati vā karoti vā); PvA 74 (putto vā dhītā vā natthi?). – with negation in second place: whether – or not, or not, e.g. hoti vā no vā is there or is there not D i.61; taŋ patthehi vā mā vā VvA 226. – Combined with other emphatic particles: (na) vā pana not even Pv ii.69 (manussena amanussena vā pana); vā pi or even Sn 382 (ye vā pi ca); Pv ii.614 (isayo vā pi ye santā etc.); iti vā Nd2 420; atha vā Dh 83 (sukhena atha vā dukhena); uda . . . vā Sn 232 (kāyena vācā uda cetasā vā). – In verse vā is sometimes shortened to va, e.g. devo va Brahmā vā Sn 1024: see va4.” (online PTS dictionary)

tuvok · February 23, 2021, 8:38am

Looks great for learning, short and to the point definitions.

dayunbao · February 23, 2021, 8:55am

Context is something that is still beyond Natural Language Processing (NLP), as far as I know.

I haven’t done any checking on this, but I wonder if NLP would be slightly easier with a dead language like Pali, where everything is much more fixed than with a living language that keeps changing. I also wonder if the case system in Pali would make NLP slightly easier, meaning it would be easier to connect subjects, objects, verbs and nouns than in English, for example. I’m guessing that in Pali a sentence like “I saw the man on the hill with a telescope” isn’t as vague as it is in English. For those not familiar with NLP, that sentence is an example of how difficult it can be for an algorithm to understand the meaning of a sentence. Does it mean “I” used a telescope to see the man on the hill? Or the man “I saw on the hill” has a telescope?

khagga · February 23, 2021, 1:38pm

Indeed, this is a super tricky problem. It would be great to be able to query “what is the distribution of frequencies for all forms of the stem dhamma?” And it’s not just the fact that a given stem has many inflections, it’s that any given inflection can show up in numerous different paradigms, with aṃ being a particularly variable example:

[ā]: feminine singular accusative
[r]: masculine singular accusative
[as]: neuter singular nominative
[as]: neuter singular vocative
[as]: neuter singular accusative
[an]: neuter singular nominative
[an]: masculine singular accusative
[an]: neuter singular accusative
[a]: neuter singular nominative
[a]: neuter singular vocative
[a]: masculine singular accusative
[a]: neuter singular accusative
[(m/v)ant]: masculine singular nominative
[(m/v)ant]: neuter singular nominative
[(m/v)ant]: masculine singular vocative
[(m/v)ant]: neuter singular vocative
[(m/v)ant]: masculine singular accusative
[(m/v)ant]: neuter singular accusative
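
The ambiguity listed above can be made concrete as a lookup from an ending to its candidate analyses. This is a minimal sketch that stores only the aṃ entries from this list; am_analyses and candidates are hypothetical names, and the match is a naive string-suffix test, not a morphological parser.

```shell
# am_analyses: the candidate analyses for the ending aṃ, copied from the
# list above as "[stem class]: gender number case".
am_analyses() {
    cat <<'EOF'
[ā]: feminine singular accusative
[r]: masculine singular accusative
[as]: neuter singular nominative
[as]: neuter singular vocative
[as]: neuter singular accusative
[an]: neuter singular nominative
[an]: masculine singular accusative
[an]: neuter singular accusative
[a]: neuter singular nominative
[a]: neuter singular vocative
[a]: masculine singular accusative
[a]: neuter singular accusative
[(m/v)ant]: masculine singular nominative
[(m/v)ant]: neuter singular nominative
[(m/v)ant]: masculine singular vocative
[(m/v)ant]: neuter singular vocative
[(m/v)ant]: masculine singular accusative
[(m/v)ant]: neuter singular accusative
EOF
}

# candidates WORD: print every stored analysis whose ending matches WORD.
# Only aṃ is stored in this sketch; anything else is reported on stderr.
candidates() {
    case "$1" in
        *aṃ) am_analyses ;;
        *)   echo "no analyses stored for: $1" >&2; return 1 ;;
    esac
}
```

Calling candidates dhammaṃ prints all eighteen lines, which shows why the ending alone cannot settle the stem class, gender or case.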

I don’t know if dhamma is one of them, but many stems can of course function as adjectives or nouns, which means that even having stored information about the “built-in” gender of a noun stem (like, dhamma is “inherently” masculine (er… or is it neuter??)) doesn’t mean that stem can only appear with masculine endings, if it’s used adjectivally.

Even worse, there’s things like iṃ that can be verbal (first person aorist, I think?) or nominal (lots of accusatives).

Another clue might be to look at prefixes; Pali seems to prefer to stick an upa- or some other prefix on verbs, some of the time anyway.

It really does seem like a fun programming project. I wonder what kind of logic is built into the automated word lookup on Sutta Central. Since you’ve been looking at the code, perhaps you know, @dayunbao?