Frank is active on DhammaWheel.com. I’ve sent him a PM to see if I’m missing something obvious…
Here’s some audtip news:
That Dowling method seems totally unnatural and probably only works for a very small group of people. You don’t learn a language by memorizing grammar; you learn a language by picking up sentences and words here and there, and slowly figuring out how it all comes together (i.e. the grammar).
Incidentally, I have just started with Warder, and my plan is to go through it and then start trying to read Pali passages in some of the readers that are out there, while continuing to build up vocab with AnkiDroid. Ideally, I want to work through Warder pretty quickly so I don’t get bored, and jump into trying to read and translate passages as quickly as possible (which is where the real learning happens).
I studied Greek and Latin in university, and we used a book called Athenaze for Greek, which has you reading passages of (very simple) Greek from the very beginning. This was so much more fun and better than the Latin textbook, which was just grammar and sentence translations, for hundreds of pages. Super tedious.
I think jumping into reading big chunks of text as early as possible is the best method. I may even just do a few of the early Warder chapters and then, without finishing it, jump to translating passages as soon as possible.
So yea, my opinion, on learning ancient languages like this anyways, is just jump into reading big chunks of simple prose ASAP. This has at least 3 advantages:
- You’ll be doing what you wanted to do in the first place when you chose to learn the language, you won’t get bored, you won’t feel you’re in a classroom or something.
- You’ll see how the language is actually used in the texts, not in isolated sentences built by scholars.
- You’ll pick up lots of vocabulary early on, because you will have to.
But hey, I am a total beginner with Pali. I just started Warder’s chapter one yesterday, so take all this with a grain of salt.
To my knowledge this is how monks traditionally learned Sanskrit or Pāli: memorizing a grammar textbook which was written in the target language with progressively more challenging sentences.
What I meant and should have said was, that’s not how people naturally learn a language.
There are different brains, and different learning styles.
The best advice is to do what works for you.
I realize too that at some point you just have to take the plunge to even be able to see what it is that works for you. That’s been my issue. I have a huge resistance to learning Pāli for some reason, but I began yesterday in earnest so we’ll see if I stay on the wagon!
Hey! I just started too, and I was wondering if anyone else is just starting out as well and might want to have a study group?
We could make a Discord or whatever. FYI I am using Warder and Brahmali’s lectures on Wisdom and Wonders, as well as the YouTube channel learnpali, which has a series on Warder.
I was just reading an article by Gombrich about learning the language and he stressed how good it is to work with others. And then I remembered your post that you had just started too.
Is there anyone who would be interested in a newbie Pali study group online (alas, I am in south Florida)?
Thinking more about this thread: as was suggested, the major part of learning is the vocabulary. I wonder:
- What is the “go to” Pali dictionary that people use most? I found the PTS dictionary in digital form (https://dsalsrv04.uchicago.edu/dictionaries/), but I am wondering if there are other, maybe more recent, resources. Actually, is there a dictionary function on this SuttaCentral site? (The site has so much functionality that I might have just missed it.)
- I saw that for other languages there are “frequency” dictionaries that help prioritize vocab learning. Given that the Pali texts are mostly digitized (like on this site), has any corpus analysis been done, for example on the nikayas, that could help Pali students start building vocab?
The Digital Pāli Reader is a good resource. I had also thought about frequency dictionaries, but as far as I know none have been compiled for Pāli in a formal way.
I believe that as yet there are no good frequency lists for Pāḷi. @khagga might like to comment here.
For me, M. Cone’s dictionary is great, but it is still incomplete, with one part to go I think. I wouldn’t expect it in under 8 years, though.
I don’t think there are any, but it is easy to generate such lists yourself. I did when I was trying to learn Pali a few years back. If I remember correctly, this did the trick:
cat sutta_file.txt | tr -d "()}‘,." | tr "-" " " | tr " " "\n" | sort | uniq -c | sort -r > freqlist.txt
It should work without issues on any Linux-based terminal. A bit of explanation:
- cat ./sutta_file.txt: this just reads the file and dumps it to standard output so you can pipe it to the next command. You can put many filenames here.
- tr -d: removes the characters listed in “” from the input text. Fine-tune this depending on your source files.
- tr "-" " ": substitutes ‘ ’ for ‘-’, and the next one substitutes new-line characters for spaces, so that you have one word per line.
- sort: does the obvious, with -r in reverse order.
- uniq -c: counts repetitions and prints words in the format “count word”.
output should look more or less like this:
Total words: 147100
3308 kho
2217 ca
2046 vā
1490 hoti
1433 na
1060 te
925 bhikkhave
772 pana
771 bhante
758 Atha
It might be useful to lowercase them first; I think tr '[:upper:]' '[:lower:]' should do the job.
EDIT: I have no idea where the Total words line comes from. I might have used a slightly different script or edited this manually… the date stamp on the file is 31.07.2014
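For anyone who wants to try this, here is a minimal sketch of the same pipeline as a reusable shell function, with the lowercasing step folded in. The function name freqlist, the exact punctuation set, and the sed step for curly quotes are my assumptions, not the original script. Note that many tr builds only fold ASCII letters, so a capital Ā would stay as it is.

```shell
# freqlist [FILE...]: hypothetical wrapper around the pipeline from the post.
# Reads the named files (or standard input) and prints "count word" lines,
# most frequent first.
freqlist() {
    cat "$@" |
        sed -e 's/‘//g' -e 's/’//g' -e 's/“//g' -e 's/”//g' |  # drop curly quotes
        tr '[:upper:]' '[:lower:]' |     # many tr builds only fold ASCII letters
        tr -d '(){},.;:?!"' |            # ASCII punctuation; extend to suit your files
        tr -s ' \t-' '\n\n\n' |          # blanks and hyphens become line breaks
        grep -v '^$' |                   # drop any empty line
        LC_ALL=C sort | LC_ALL=C uniq -c |   # count identical words
        LC_ALL=C sort -k1,1nr -k2 |      # highest count first, ties by word
        awk '{ print $1, $2 }'           # normalise to "count word"
}

# Tiny demo on an inline sample instead of a real sutta file.
printf '%s\n' 'Evaṃ me sutaṃ. Atha kho, atha kho.' | freqlist
```

To run it on real files, something like freqlist sutta_file.txt > freqlist.txt should do.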
Ooh, I got @-ed!
There are a lot of frequency-related files mentioned by @Nibbanka in this post:
Of these, the largest is a huge (22MB) file called sortedFrequencyPali.txt:
Complete word list of all Pali words (about 967.000) as occuring in the CSCD (VRI) Tipitaka edition
Apparently the site on which the file (stored as a .zip) was originally hosted, nibbanam.com, has evaporated, but happily it survives in the Internet Archive:
http://wayback.archive.org/web/20150707075127/http://www.nibbanam.com/sortedFrequencyPali.zip
Here are the first 100 lines:
167140 ca
150824 na
116790 vā
76637 pana
72505 hoti
65284 taṃ
54673 tattha
54515 evaṃ
49782 so
49223 pe
45897 kho
44558 nāma
40389 hi
38357 tassa
38087 te
37980 vuttaṃ
35300 bhikkhave
28660 attho
26537 ayaṃ
25953 viya
25277 tena
23661 tesaṃ
22309 atha
21929 katvā
21675 yaṃ
21550 me
20986 āha
20667 tasmā
20462 idaṃ
20258 yathā
20208 ettha
19663 dhammaṃ
18975 tathā
18879 dhammā
18565 tato
18532 yo
18172 uppajjati
17963 bhagavā
17925 dhammo
17771 attano
17371 bhante
17055 paccayo
16923 ekaṃ
16701 no
16632 dve
15885 paṭicca
15779 bhikkhu
14585 idha
14428 atthi
14268 natthi
14011 kiṃ
13991 ni0
13631 vuccati
13334 cittaṃ
13099 eva
13078 honti
12783 tasmiṃ
12773 hotīti
12619 ye
12318 tīṇi
12237 sā
11286 bhikkhū
11087 iti
11006 yassa
10794 ahaṃ
10652 hutvā
10612 iminā
10566 sace
10461 bhagavato
9957 disvā
9806 imaṃ
9772 saddhiṃ
9720 ceva
9697 gahetvā
9377 pañca
9305 puna
8981 kathaṃ
8856 ime
8819 rūpaṃ
8802 rājā
8764 sī0
8753 tvaṃ
8735 siyā
8416 yena
8269 syā0
8199 ahosi
8145 gantvā
7884 niṭṭhitā
7813 nu
7789 nava
7777 idāni
7564 āvuso
7533 yattha
7425 dhammassa
7374 maṃ
7371 ka0
7342 eko
7250 yasmā
7178 sati
7153 vatvā
After that, just 966,566 lines to go.
It’s a pretty remarkable resource, if a little hard to deal with! If anyone would like a shortened version I’d be happy to put one together and put it online.
I would be interested, but you may find that some monastics are notoriously unreliable. So I don’t want to make a promise I can’t keep, but if you do get something started I might check it out.
The lists are pretty easy to generate indeed, and I took a look at some of the docs posted, too. The only downside is, unlike a frequency dictionary, the generated lists don’t seem to take into account the declensions/conjugations, etc. So you’ll see dhammā and dhammaṃ show up on the list, for example.
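As a stopgap, a crude way to see the forms of one stem together is a plain prefix filter over a “count word” list. This is only a sketch: the helper name forms_of is hypothetical, and a prefix match will also pick up compounds and unrelated words that merely start with the same letters, so it is no substitute for real lemmatisation.

```shell
# forms_of STEM [FILE...]: hypothetical helper. Prints every "count word"
# line whose word starts with STEM, then a final line with the summed count.
forms_of() {
    stem=$1
    shift
    awk -v stem="$stem" '
        index($2, stem) == 1 { print; total += $1 }   # word begins with the stem
        END { print total + 0, "TOTAL" }              # sum of the matching counts
    ' "$@"
}

# Demo with counts copied from the first 100 lines of sortedFrequencyPali.txt.
printf '%s\n' '19663 dhammaṃ' '18879 dhammā' '17925 dhammo' \
    '7425 dhammassa' '35300 bhikkhave' |
    forms_of dhamm
```

With a full list in a file, forms_of dhamm sortedFrequencyPali.txt would be the equivalent call.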
I remember coming across a Spanish frequency dictionary that, aside from the master list, sorted each word by part of speech (verbs, adjectives, nouns, etc.) and even subject. Of course that’s a huge scientific undertaking, and Pāli is not as popular as Spanish.
Now if we could get a hold of the most frequent verbs and nouns in Pāli, that would be amazing. (Or maybe I’m just looking for a shortcut.)
Hmm, well if we knew the root form of the word, and if there was a predictable pattern to the suffixes, we could infer whether the word was a verb or noun. This would be a fun programming project, although I’m not sure if I have the time to take it on.
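To make the idea concrete, here is a very rough sketch of guessing from the ending alone. The helper name guess_pos is hypothetical, and the rules are my assumption of a starting point, covering just two well-known patterns (present-tense -ti/-nti and absolutives in -tvā). It misfires readily: the particle iti, for example, gets labelled as a verb.

```shell
# guess_pos WORD: hypothetical helper; prints a rough guess based only on
# the ending of WORD. Two illustrative rules, everything else is "unknown".
guess_pos() {
    case $1 in
        *tvā)      echo "absolutive?" ;;    # e.g. katvā, gantvā
        *nti|*ti)  echo "finite verb?" ;;   # e.g. hoti, honti (but also iti)
        *)         echo "unknown" ;;
    esac
}

# Try it on a few words taken from the frequency list.
for w in hoti honti katvā dhammaṃ iti; do
    printf '%s\t%s\n' "$w" "$(guess_pos "$w")"
done
```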
That could be (partially) fixed with some more intelligent scripting (no, I will not do it), but there is, I think, a bigger problem: words often have a couple of different meanings, depending on context. For example, the definitions of dhamma take up a bit over 14 A4-size pages in Cone’s dictionary.
I think learning words by frequency is much less useful with Pali than with other (living) languages. If you look at those frequency lists, there are thousands of words that are used very infrequently. Even if you learn all the frequent ones, you will still not be able to read and understand the text. For example, for the DN1 sutta my script gives 1655 entries (with a bit of garbage, I admit). Of those, about 850 entries occur only once. That’s half of the unique words in the text!
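The share of once-only entries can be read straight off such a list. A minimal sketch, assuming the “count word” format shown earlier; the function name hapax_share is hypothetical, and lines whose first field is not a number (such as a Total words line) are skipped.

```shell
# hapax_share [FILE...]: hypothetical helper. Reads "count word" lines and
# reports how many entries have a count of exactly 1.
hapax_share() {
    awk '
        $1 ~ /^[0-9]+$/ { total++; if ($1 == 1) once++ }   # numeric counts only
        END {
            if (total)
                printf "%d of %d entries occur once (%.0f%%)\n",
                       once, total, 100 * once / total
        }
    ' "$@"
}

# Tiny demo on an inline list instead of a real freqlist.txt.
printf '%s\n' 'Total words: 7' '3 kho' '2 ca' '1 evaṃ' '1 sutaṃ' | hapax_share
```

On a real list the call would be hapax_share freqlist.txt.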
Thanks tuvok. As a beginner I think M. Cone is really huge overkill. I was looking around more and found:
http://www.ahandfulofleaves.org/documents/Concise%20Pali%20English%20Dictionary_Buddhadatta.pdf
Is this legit?
Example of “vā”:
“Vā (particle of disjunction), or; either-or” (Buddhadatta)
“Vā (indecl.) [Ved. vā, Av. vā, Gr. ἤ, Lat. -ve] part. of disjunction: “or”; always enclitic Kh viii. (itthiyā purisassa vā; mātari pitari vā pi). Usually repeated vā – vā (is it so – ) or, either – or, e. g. Sn 1024 (Brahmā vā Indo vā pi); Dh 1 (bhāsati vā karoti vā); PvA 74 (putto vā dhītā vā natthi?). – with negation in second place: whether – or not, or not, e. g. hoti vā no vā is there or is there not D i.61; taŋ patthehi vā mā vā VvA 226. – Combined with other emphatic particles: (na) vā pana not even Pv ii.69 (manussena amanussena vā pana); vā pi or even Sn 382 (ye vā pi ca); Pv ii.614 (isayo vā pi ye santā etc.); iti vā Nd2 420; atha vā Dh 83 (sukhena atha vā dukhena); uda . . . vā Sn 232 (kāyena vācā uda cetasā vā). – In verse vā is sometimes shortened to va, e. g. devo va Brahmā vā Sn 1024: see va4.” (online PTS dictionary)
Looks great for learning, short and to the point definitions.
Context is something that is still beyond Natural Language Processing (NLP), as far as I know.
I haven’t done any checking on this, but I wonder if NLP would be slightly easier with a dead language like Pali, where everything is much more fixed than with a living language that keeps changing. I also wonder if the case system in Pali would make NLP slightly easier, meaning it would be easier to connect subjects, objects, verbs and nouns than in English, for example. I’m guessing that in Pali a sentence like “I saw the man on the hill with a telescope” isn’t as vague as it is in English. For those not familiar with NLP, that sentence is an example of how difficult it can be for an algorithm to understand the meaning of a sentence. Does it mean “I” used a telescope to see the man on the hill? Or the man “I saw on the hill” has a telescope?
Indeed, this is a super tricky problem. It would be great to be able to query “what is the distribution of frequencies for all forms of the stem dhamma?” And it’s not just the fact that a given stem has many inflections, it’s that any given inflection can show up in numerous different paradigms, with aṃ being a particularly variable example:
- [ā]: feminine singular accusative
- [r]: masculine singular accusative
- [as]: neuter singular nominative
- [as]: neuter singular vocative
- [as]: neuter singular accusative
- [an]: neuter singular nominative
- [an]: masculine singular accusative
- [an]: neuter singular accusative
- [a]: neuter singular nominative
- [a]: neuter singular vocative
- [a]: masculine singular accusative
- [a]: neuter singular accusative
- [(m/v)ant]: masculine singular nominative
- [(m/v)ant]: neuter singular nominative
- [(m/v)ant]: masculine singular vocative
- [(m/v)ant]: neuter singular vocative
- [(m/v)ant]: masculine singular accusative
- [(m/v)ant]: neuter singular accusative
I don’t know if dhamma is one of them, but many stems of course can function as adjectives or nouns, which means that even having stored information about the “built-in” gender of a noun stem (like, dhamma is “inherently” masculine (er… or is it neuter??)) doesn’t mean that stem can only appear with masculine endings, if it’s used adjectivally.
Even worse, there are things like iṃ that can be verbal (first person aorist, I think?) or nominal (lots of accusatives).
Another clue might be to look at prefixes; Pali seems to prefer to stick an upa- or some other prefix on verbs, some of the time anyway.
It really does seem like a fun programming project. I wonder what kind of logic is built into the automated word lookup on Sutta Central. Since you’ve been looking at the code, perhaps you know, @dayunbao?