SuttaCentral

Frequency-ordered Pali vocab flashcards

If you’re a user of the TinyCards flashcard app, I’ve just finished uploading four decks of Pali flashcards:

I created these from the frequency-sorted word list from Schmidt’s dictionary (linked here in the forum: A better Pali Dictionary) and the definitions & parts of speech from Sutta Central’s copy of the Concise Pali-English Dictionary.

The decks I uploaded are about 400 words, but I ran my script against the entire frequency-sorted vocab list and produced a master list of over 10,000 cards as a tab-delimited file. I’ve uploaded that file to Google Drive, if anyone would like to make themselves another few thousand flash cards or upload it into another program: https://drive.google.com/open?id=1MKCP06wqxOGjBSnaxzgvpXdU2qmV41_L
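For anyone curious what that kind of script might look like, here is a minimal sketch, not the actual script. It assumes the frequency list is already loaded as a list of words and the dictionary as a mapping from each word to an entry with `pos` and `definition` keys. Those key names and the function names `build_cards` and `to_tsv` are hypothetical, and the real CPED JSON layout may differ.

```python
def build_cards(frequency_words, dictionary):
    """Return (word, part_of_speech, definition) rows in frequency order.

    `dictionary` is assumed to map a word to a dict with "pos" and
    "definition" keys (a hypothetical layout, for illustration only).
    """
    cards = []
    for word in frequency_words:
        entry = dictionary.get(word)
        if entry is None:
            continue  # word has no dictionary entry, so no card is made
        cards.append((word, entry["pos"], entry["definition"]))
    return cards


def to_tsv(cards):
    """Serialize the rows as tab-delimited text, one card per line."""
    return "".join("\t".join(card) + "\n" for card in cards)
```

Words missing from the dictionary are silently skipped here; a real run would probably want to log them for checking.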

I can provide smaller CSV files if needed for importing into another flashcard program, but you should be able to manage it using a spreadsheet program or text editor. I also haven’t checked closely for errors, though I did find a typo or two in the electronic version of the CPED! Feedback welcome.
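If you would rather script the split than use a spreadsheet, something along these lines should work. It is only a sketch: the deck size of 400 simply mirrors the uploaded decks, the function names are made up for illustration, and nothing is assumed about the columns beyond their being tab-separated.

```python
import csv
import io


def chunk(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def tsv_to_csv_decks(tsv_text, size=400):
    """Convert tab-delimited text into a list of CSV strings, one per deck."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    decks = []
    for deck_rows in chunk(rows, size):
        out = io.StringIO()
        # The csv module quotes any field containing commas or quotes.
        csv.writer(out, lineterminator="\n").writerows(deck_rows)
        decks.append(out.getvalue())
    return decks
```

Each returned string can then be written to its own file and imported as a separate deck.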

14 Likes

Sadhu! I imagine a Pāli frequency dictionary is hard to come by!

2 Likes

Could you say a little about the app? I’m not familiar with it.

1 Like

TinyCards is a spaced repetition flashcard app created by the makers of Duolingo, a popular language learning app. It doesn’t have all of the features and customization options that something like Anki has, but is also much simpler to use out of the box.

It’s also possible to create and share decks of flashcards with the user community. There were a few Pali decks focused on terminology, but no general vocabulary material that I could find.

2 Likes

Thanks!

I’m actually more of an Anki man myself, but as you’ve used CSV format, any Anki user who wants to use your file should be able to convert it easily.

Instructions here:

2 Likes

Thanks so much, this is fantastic!

We had a stab a little while ago at making something like this of our own, but it never made it to the real world.

I just went through the first lesson, and my only comment would be that for absolute beginners, simpler definitions might help. In some cases the definitions are ambiguous or overlapping, which is fine, it’s just how language is, but it’s confusing for a beginner. And there is no context to help understand which meaning applies. Also there’s a lot of grammatical jargon! I’ve not used such apps myself, and don’t know how it all works, so please take this as a mere first-timer’s thoughts.

Some examples:

  • for pana I get to choose “particle of disjunction”, “copulative or disjunctive particle”, or “indeclinable”. Now of course “particle” and “indeclinable” mean more-or-less the same thing. It is an indeclinable, and is it copulative or disjunctive or both?
    • Suggest: choices for pana = “but”, “or”, “and”
  • Now I get “na”. Here one choice is: “and; yet; but; out the contrary; and now; more over. [(Adversative and interogative particle) ind.]”. Yikes!
    • Suggest: “and”, “yet”, “not”
  • Now I have hoti, for which the correct answer is “hū = a”. Not sure how a beginner is supposed to understand what this means.
    • Suggest: hoti = “it is”

And so on, anyway you get my drift. I understand it’s a job of work to go through them all, but maybe at least the first few lessons could be made clearer?


If a few users are using these things, maybe we could collect them somewhere?

5 Likes

Yes, I’ve noticed playing with it in the app that it is quizzing you on the part of speech in addition to the definition, which might not be what everyone wants.

One option would be to merge the POS and definition fields: make the part of speech part of the definition, rather than something that it’s going to quiz you on independently. You would still see the parts of speech and derivation as part of the back of the card. The downside here is that as the definitions get longer, the text becomes too small to read on a phone.
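Roughly, the merge would turn each three-field row into a two-field card. This is a sketch only: the `(pos) definition` layout for the back of the card and the function name `merge_pos` are my own assumptions, not something the app requires.

```python
def merge_pos(word, pos, definition):
    """Return a two-field card: the word, and a back reading '(pos) definition'.

    If no part of speech is recorded, the back is just the definition.
    """
    back = f"({pos}) {definition}" if pos else definition
    return (word, back)
```

With only two fields per card, there is no separate part-of-speech field left for the app to quiz on.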

The definitions and POS info all come from the CPED, but if there is another source for definitions that would be clearer, I can regenerate the entire set from another JSON dictionary without much trouble. I picked the CPED because it seemed likely to have short definitions, and because I have access to a physical copy that is useful for checking typos. I wasn’t familiar with what is marked as the NCPED, but from what I gather from this thread, using that as a source with the CPED as a fall-back might be an improvement.
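The fall-back idea is simple to sketch, assuming both dictionaries have been loaded into plain word-to-entry mappings. The parameter names `ncped` and `cped` and the returned source label are illustrative only; I have not checked how either JSON file is actually structured.

```python
def lookup_with_fallback(word, ncped, cped):
    """Return (entry, source), preferring NCPED and falling back to CPED.

    Both arguments are assumed to be dicts keyed by headword.
    Returns (None, None) when neither dictionary has the word.
    """
    if word in ncped:
        return ncped[word], "ncped"
    if word in cped:
        return cped[word], "cped"
    return None, None
```

Keeping the source label alongside each entry would make it easy to see afterwards how many cards still rely on the older definitions.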

The disadvantage of using the frequency list for selecting words is that the first few sets are strongly biased towards indeclinables, adverbs, and particles, many of which have very similar or overlapping meanings. It isn’t the best introduction for a beginner to try and distinguish a bunch of words meaning ‘indeed’, ‘but’, and ‘although’, but I was curious how much of a set I could generate with a little scripting. This was intentionally very quick ‘n’ dirty in methodology. I’ve seen some vocab lists from textbooks floating around in the past that might make a better graduated introduction than the drink-from-the-firehose method. For newbies, a glossary might be a more useful source for vocab and flashcard definitions than a formal dictionary, if there is one available that has enough words.

If a few users are using these things, maybe we could collect them somewhere?

I have a GitHub account and I’m happy to contribute things back to the sc-data repo or put my scripts and card sets up there if there are other materials to gather together.

1 Like

Your best source is here:

Here is the direct link for the file:

https://github.com/suttacentral/suttacentral/files/4090192/ncped.zip

Essentially, NCPED is the CPED updated following Cone’s Dictionary of Pali. The file I link to above is the current state of NCPED, which corrects all entries that are included in Cone. The remainder will be updated when she releases the final volumes.

Generally speaking, yes, NCPED has short entries and is a good basic source.

Also a good point. Maybe a hand-curated list for the first ten or so lessons, and let the computer figure it out from there?
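In script terms that could be as simple as putting the hand-picked words first and then continuing in frequency order, skipping anything already covered. A rough sketch, with a hypothetical function name and made-up word lists:

```python
def order_words(curated, by_frequency):
    """Return curated words first, then the rest in frequency order.

    Words that appear in both lists are kept only at their first position.
    """
    seen = set()
    ordered = []
    for word in list(curated) + list(by_frequency):
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered
```

The result can be fed to whatever builds the cards, so the first lessons follow the curated order and the later ones follow the frequency list.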

1 Like