I’d like to share a project I’ve been working on for the last year and a half with the hope that some may find it helpful.
It includes the MN, AN, and the older parts of the KN. SN is in progress, with DN to come later.
My current working title for the project is the Comprehensive Index of Pali Suttas (CIPS).
Below are answers to some of the most common questions I field. Feel free to skip and just start using the index.
I am very much open to any and all feedback. Please let me know how this tool could be made more useful.
Goal of the project
The goal of this index project is to produce a comprehensive index of the Sutta Pitaka (i.e. covering all things) without being exhaustive (i.e. covering every occurrence of every possible thing). It is meant to be used both by people exploring a subject and by people looking for specific things, although the second is much more difficult to do.
This project is a combination of a term index and subject index. Meaning, sometimes citations are to specific Pali terms, and sometimes to the general subject matter of a sutta even if a specific term does not appear.
Terms, subjects, proper names, and similes are all included together into a single index.
My hope is to eventually produce a print-on-demand version of the index to be kept in monastery libraries where computer access is not possible.
What the index is not…
For an exhaustive study of specific Pali words, people should be doing direct term searches, not using an index.
An index is not meant to be a tidy classification or organization of technical terms. At every step I try to ask the questions, “How would someone think to look for this information?” and “Will someone be happy with what they get after following this citation?”
Status as of 2024-05-06
So far the MN and AN have been completely indexed, as well as Kp, Dhp, Ud, Iti, Snp, Vv, Pv, Thag, Thig (which is all I plan to cover in the KN). The SN has been started. The DN will be completed last.
At this point the project should be considered to be in the draft stage. Human-created indexes are done in two stages. In the draft stage, the text is read and headings, subheadings and locators are created and indexed on the fly, trying for some consistency but not always achieving it. In the editing stage, the index is reviewed to lump and split in order to make it more useful. Since this project is still in the draft stage, there will be many cases where similar subheads will eventually be grouped, and others where long lists of citations will need to be differentiated.
The real editing stage will only happen after the completion of the SN and DN.
Creation method
This index was created in the same way that all modern indexes are: by hand, reading each text one by one. Although computers are good at creating concordances of words, they are still very lacking when it comes to creating usable and helpful indexes. There are really no shortcuts.
This is not a compilation of existing indexes (e.g. accesstoinsight.org and the Wisdom Publication translations), although I did reference them to help with term selection. I originally started the index by collecting a few terms from those indexes (e.g. loving-kindness, giving, death) but I found that this was rather limiting and not feasible for the complete project, so I stopped. In the end the complete project could only be accomplished by the traditional method of reading and indexing.
Except for the Vv and Pv, I used Bhante Sujato’s translations as the basis for the work, although I didn’t always choose his vocabulary for the indexed terms (see Headwords below).
Although I would love for people to be able to simply read through the index for enjoyment, the desire to keep terms and subheadings as brief as possible makes this less likely. The exact phrasing of headwords and subheads should not be taken to imply any specific interpretation of the Dhamma. It’s just an index.
Features
Beyond the obvious…
You may notice that similes are usually indexed in two directions. For example, under “dreams” you can find “simile for sense pleasures”. But if you are looking for all the similes for sense pleasures, you can look there and find “like a dream”, “like a grass torch burns holder”, etc. The hope is that this will be useful for people preparing talks on a topic when they would like to include a simile.
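To make the two-directional idea concrete, here is a minimal sketch in Python. It is not how the index is actually built (that is done by hand in a spreadsheet); the record layout, the function name and the `SUTTA-A` locator are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical records: (headword for the image, phrase used as the subhead
# under the topic, topic the simile illustrates, citation).
# "SUTTA-A" is a placeholder, not a real locator from the index.
SIMILES = [
    ("dreams", "like a dream", "sense pleasures", "SUTTA-A"),
    ("grass torches", "like a grass torch burns holder", "sense pleasures", "SUTTA-A"),
]

def build_simile_entries(similes):
    """Return headword -> subhead -> citations, indexed in both directions."""
    index = defaultdict(lambda: defaultdict(list))
    for image_headword, phrase, topic, citation in similes:
        # Direction 1: look up the image and find what it is a simile for.
        index[image_headword]["simile for " + topic].append(citation)
        # Direction 2: look up the topic and find every image used for it.
        index[topic][phrase].append(citation)
    return index

index = build_simile_entries(SIMILES)
print(sorted(index["sense pleasures"]))
```

Each simile is recorded once and produces two entries, so the two directions cannot drift apart.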
On desktop, when you type in a search phrase, diacritics, spaces and punctuation are ignored. Pressing Enter will take you to the first item in the list. Pressing Tab instead will move you down the list.
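For anyone curious how this kind of forgiving search can work, here is a rough sketch in Python (the site itself is JavaScript, so this is only an illustration of the idea). The function names are made up, and two things are my assumptions for the example rather than a description of the site: that case is ignored, and that matching is by substring.

```python
import unicodedata

def normalize(text: str) -> str:
    """Strip diacritics, spaces and punctuation so that 'Sāmañña' matches 'samanna'."""
    # NFD splits a letter like "ā" into "a" plus a separate combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only letters and digits; combining marks, spaces and punctuation drop out.
    kept = "".join(ch for ch in decomposed if ch.isalnum())
    # Ignoring case is an assumption made for this sketch.
    return kept.casefold()

def filter_headwords(query: str, headwords: list[str]) -> list[str]:
    """Return the headwords whose normalized form contains the normalized query."""
    needle = normalize(query)
    return [h for h in headwords if needle in normalize(h)]

matches = filter_headwords("samanna", ["harmony (samagga)", "harmony (sāmañña)"])
print(matches)
```

Normalizing both the query and the headwords in the same way means a user never needs to know how to type Pali diacritics.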
Technical:
- On desktop, if you hover over a citation it will usually either pop up the “blurb” from Sutta Central, or it will contain the contents of the verse (Dhammapada) or first verse (Thag/Thig). Personally I find this to be one of the most useful features of the whole index, potentially saving hours of time following irrelevant leads.
- If you hover over a headword (on desktop) you will see three icons: TXT, M↓ and HTML. Clicking on these will copy the entire entry (minus cross refs) in either text, Markdown, or HTML format.
- Under the gear icon there is a button for “Toggle color”. This will colorize the citations using the color scheme from SuttaFriends.org. Sometimes it’s useful if I want to specifically find citations from a specific book.
Headwords
- When possible, headwords are nouns. Usually plural. Sometimes exceptions are made when a term is well known, e.g. “against the stream.”
- I have shamelessly played fast and loose with the grammar of Pali and English in the headwords. Meaning related nouns, verbs, adj, etc are almost always lumped together. And subheads will not always grammatically match headwords. For example under “eating in moderation” is the subhead “how to.” Matching them up would have been overly cumbersome (“how to be one who practices…” ???) and not that much more helpful to the user.
- Eventually more well-known Pali words will be included as cross-ref headwords. For now, they can be found using the search filter at the top (on desktop).
- Generally headwords will be in English, although some exceptions are made for terms that lack a simple, indexable translation, e.g. Bodhisatta.
- For the most part I have so far ignored term translations other than Bhante Bodhi’s and Bhante Sujato’s. Eventually I will add those of Ajahn Thanissaro as cross refs once all the headwords have been settled. (e.g. there is not yet a cross ref from “stress” to “suffering (dukkha)”)
- In some cases, alongside English headwords that include Pali (e.g. “harmony (samagga)” and “harmony (sāmañña)”), there is also an English-only headword (e.g. “harmony”). The English-only headword should be understood to mean that although a sutta deals with the topic, it doesn’t contain a Pali word for that topic. However, there are countless other cases where there is no Pali word in the headword. These are either cases that aren’t suitable for Pali or just need some cleaning up.
- In an index to a specific translation/book, it is much easier to decide upon terms. For this index I have taken into account the commonly known translation of terms, their ease of indexing, as well as my own preferences. When the index is finished, the use of cross references should mean that no matter what translation one is using, the needed information can be found.
- Don’t forget that you can use Ctrl+f in the browser to search on the whole page. Sometimes this is useful if navigating the head/sub-head words doesn’t give you what you need.
Cross References (cross refs)
The cross refs are an essential part of the index. If you don’t find what you want directly under a headword, you must follow the cross refs if you want to have any hope of finding what you are looking for. Because of the scope of the index, I often opt for splitting things up if there is some justification for doing so. As a user of the index myself, even I fall into the trap of ignoring the cross refs, to my own peril.
One goal of the index is to have important synonyms cross ref’ed. For example, you shouldn’t have to know that a certain simile is about “vipers” and not “snakes.” So I am especially interested in feedback on terms that should have cross refs (except, at this point, for Ajahn Thanissaro’s terms).
I’m also experimenting with creating category xrefs. “Nature” is a great example of that. It can lead you to “plants”, which could further take you to “flowers”, where you will find a list of individual flowers. Is that useful? I don’t know!
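As a small illustration of what following category xrefs amounts to, here is a sketch in Python. The `XREFS` table only contains the three headwords mentioned above, and the function name is hypothetical; the real index stores its cross refs differently.

```python
from collections import deque

# Hypothetical cross-ref table: headword -> headwords it points to.
XREFS = {
    "nature": ["plants"],
    "plants": ["flowers"],
    "flowers": [],
}

def reachable_headwords(start, xrefs):
    """Follow cross refs breadth-first and return headwords in order of discovery."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in xrefs.get(current, []):
            # Skipping headwords already seen keeps circular cross refs from looping.
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen

path = reachable_headwords("nature", XREFS)
print(path)
```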
Citations only to individual suttas?
A book index has the luxury (and heavy burden) of citing specific pages where an index term or subject appears. Since I didn’t want this index to be permanently linked to a specific edition, I am only citing individual suttas.
On the theoretical level, I feel it is problematic for a user to be directed very narrowly to only the part of a sutta that relates to their search topic. Personally I have faith that the Buddha and the arahants composed and collected the suttas as they did for a reason, and that taking a passage out of the context of a larger sutta can be problematic.
On a technical level, even if I was willing to link to segment numbers on Sutta Central, once you start doing that you get into the complications of dealing with ranges of segments and non-contiguous segments. By linking only to one segment it is implied that that is the only place the topic occurs. Unless you get into technical methods like “ff” and so on.
So for the sake of simplicity as well as philosophical belief, I have only linked to whole suttas.
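To show why whole-sutta linking is simpler, here is a tiny Python sketch. It assumes segment IDs shaped like `mn10:4.1` (a sutta ID, a colon, then a segment number); the specific IDs below are illustrative only, not real citations from the index, and the function name is made up.

```python
def whole_sutta_citations(segment_ids):
    """Collapse segment-level hits into one citation per sutta, keeping first-seen order."""
    suttas = []
    for segment_id in segment_ids:
        # Everything before the first colon is taken to be the sutta ID (an assumption).
        sutta = segment_id.split(":", 1)[0]
        if sutta not in suttas:
            suttas.append(sutta)
    return suttas

# Two scattered passages in one sutta plus one passage in another.
hits = ["mn10:4.1", "mn10:4.2", "mn10:30.1", "an4.10:1.1"]
print(whole_sutta_citations(hits))
```

The non-contiguous hits in the first sutta simply become one citation, so there is no need to represent ranges or gaps at all.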
But… The DN does push the limits of this method. Will it really be useful to link to the whole DN16 for a topic that occurs in only one place? I’m considering linking to sections, but that raises the issue of whose sections to link to. I’m still not sure and am putting off the DN till last in hopes of eventually coming up with a good solution.
Appreciation
First off, I need to thank Ven. @Khemarato not only for his technical coding advice but also for feedback on the indexing itself. I have received countless suggestions from him that have greatly improved the project.
And of course also to Bhante @Sujato. Without his segmented translations this index would be far less accurate and useful than it is.
I also need to express my gratitude to the creator of the Digital Pali Dictionary. It is very helpful for this kind of work. But any faults in the Pali in the index are of course only mine.
Feedback is welcome!
I would love to hear any and all feedback. This project will only fulfill its purpose if people can use the index to find the Dhamma that they need. Although I try to get into the mind of the potential user, that can only go so far. Please let me know when you aren’t able to find something you would expect to.
I’m also down for discussing all things related to the philosophy and art of indexing. As you can tell, I’ve been thinking about this all too much over the last year and a half, lol.
Geeky technical details
I am creating the data in a spreadsheet that later gets built into a JSON object which is used by the app. LibreOffice provides some of the functions that bespoke indexing software would, such as auto completion.
When I investigated indexing software I found that the options were not only prohibitively expensive (usually USD 500+) but also extremely antiquated and, despite their claims, did not always play nice with Unicode. Ideally indexing software would save some time and perhaps reduce errors. But it was not to be.
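For the curious, the spreadsheet-to-JSON step could look roughly like the Python sketch below. This is not the actual build script: it assumes the sheet has been exported to CSV, and the column names (`headword`, `subhead`, `locator`), the nested layout and the `SUTTA-A`/`SUTTA-B` locators are all hypothetical.

```python
import csv
import io
import json

# Stand-in for a CSV export of the spreadsheet; the rows are placeholders.
SAMPLE_CSV = """headword,subhead,locator
dreams,simile for sense pleasures,SUTTA-A
sense pleasures,like a dream,SUTTA-A
sense pleasures,like a dream,SUTTA-B
"""

def rows_to_index(csv_text):
    """Group flat spreadsheet rows into headword -> subhead -> list of locators."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        subheads = index.setdefault(row["headword"], {})
        locators = subheads.setdefault(row["subhead"], [])
        # Skip duplicate rows so a locator is listed once per subhead.
        if row["locator"] not in locators:
            locators.append(row["locator"])
    return index

index = rows_to_index(SAMPLE_CSV)
# ensure_ascii=False keeps Pali diacritics readable in the output file.
index_json = json.dumps(index, ensure_ascii=False, indent=2)
print(index_json)
```

Keeping one row per citation in the sheet makes sorting and auto completion easy while indexing, and the nesting only happens at build time.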
The current version of the site is a static html page with no framework.
The website itself is a single page CRA, Create React App. I was learning React at the time I started the project so I just used that.
Unfortunately, early on the size of the index pushed the limits of React. And state isn’t really that big of a part of the app (the only real state thing is where the citations link to). There will be a new release of React in 2024, React 19, that seems like it might solve some of the app’s problems. If it doesn’t I will likely rebuild the whole app using vanilla JS and probably have the index as a simple HTML page that I build in advance. In any case, a document with over 13,000 links (at present!) is always going to push the limits of browsers, and especially mobile devices.