Pāli Primer vs. the Dowling method

Ooh, I got @-ed! :slight_smile:

There are a lot of frequency-related files mentioned by @Nibbanka in this post:

Of these, the largest is a huge (22MB) file called sortedFrequencyPali.txt:

Complete word list of all Pali words (about 967.000) as occurring in the CSCD (VRI) Tipitaka edition

Apparently the site on which the file (stored as a .zip) was originally hosted, nibbanam.com, has evaporated, but happily it survives in the Internet Archive:


Here are the first 100 lines:

First 100 lines of sortedFrequencyPali.txt
167140			ca
150824			na
116790			vā
76637				pana
72505				hoti
65284				taṃ
54673				tattha
54515				evaṃ
49782				so
49223				pe
45897				kho
44558				nāma
40389				hi
38357				tassa
38087				te
37980				vuttaṃ
35300				bhikkhave
28660				attho
26537				ayaṃ
25953				viya
25277				tena
23661				tesaṃ
22309				atha
21929				katvā
21675				yaṃ
21550				me
20986				āha
20667				tasmā
20462				idaṃ
20258				yathā
20208				ettha
19663				dhammaṃ
18975				tathā
18879				dhammā
18565				tato
18532				yo
18172				uppajjati
17963				bhagavā
17925				dhammo
17771				attano
17371				bhante
17055				paccayo
16923				ekaṃ
16701				no
16632				dve
15885				paṭicca
15779				bhikkhu
14585				idha
14428				atthi
14268				natthi
14011				kiṃ
13991				ni0
13631				vuccati
13334				cittaṃ
13099				eva
13078				honti
12783				tasmiṃ
12773				hotīti
12619				ye
12318				tīṇi
12237				sā
11286				bhikkhū
11087				iti
11006				yassa
10794				ahaṃ
10652				hutvā
10612				iminā
10566				sace
10461				bhagavato
9957				disvā
9806				imaṃ
9772				saddhiṃ
9720				ceva
9697				gahetvā
9377				pañca
9305				puna
8981				kathaṃ
8856				ime
8819				rūpaṃ
8802				rājā
8764				sī0
8753				tvaṃ
8735				siyā
8416				yena
8269				syā0
8199				ahosi
8145				gantvā
7884				niṭṭhitā
7813				nu
7789				nava
7777				idāni
7564				āvuso
7533				yattha
7425				dhammassa
7374				maṃ
7371				ka0
7342				eko
7250				yasmā
7178				sati
7153				vatvā

After that, just 966,566 lines to go. :exploding_head:

It’s a pretty remarkable resource, if a little hard to deal with! If anyone would like a shortened version, I’d be happy to put one together and post it online.
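
For anyone who would rather trim the file themselves, here is a rough Python sketch of one way to do it. It assumes only what the excerpt above shows: each line is a count, some tabs, then a word, and the file is already sorted from most to least frequent. The function names, the `limit` parameter, the output filename, and the UTF-8 encoding are my own assumptions, not anything documented for the file. Dropping entries that contain a digit (such as `ni0` or `sī0`) is also just a guess that those are not ordinary words; switch it off if you want them.

```python
def parse_frequency_lines(lines):
    """Yield (count, word) pairs from lines shaped like '167140<tabs>ca'."""
    for line in lines:
        parts = line.split()  # splits on any run of tabs or spaces
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or unexpected lines
        yield int(parts[0]), parts[1]


def shorten(lines, limit=1000, drop_with_digits=True):
    """Return the first `limit` entries, relying on the input's existing order."""
    kept = []
    for count, word in parse_frequency_lines(lines):
        # Assumption: tokens containing a digit (e.g. 'ni0') are not real words.
        if drop_with_digits and any(ch.isdigit() for ch in word):
            continue
        kept.append((count, word))
        if len(kept) == limit:
            break
    return kept


def write_short_list(src_path, dst_path, limit=1000):
    """Write a shortened copy; the UTF-8 encoding is assumed, not confirmed."""
    with open(src_path, encoding="utf-8") as src:
        entries = shorten(src, limit=limit)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for count, word in entries:
            dst.write(f"{count}\t{word}\n")


if __name__ == "__main__":
    sample = ["167140\t\t\tca", "13991\t\t\t\tni0", "13631\t\t\t\tvuccati"]
    print(shorten(sample, limit=2))  # [(167140, 'ca'), (13631, 'vuccati')]
```

To run it against the real file, something like `write_short_list("sortedFrequencyPali.txt", "top1000.txt", limit=1000)` should do, where `top1000.txt` is just a placeholder name. Because it stops reading as soon as it has enough entries, it never needs to load all 22MB.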