Comprehensive Index of Pali Suttas

I’d like to share a project I’ve been working on for the last year and a half with the hope that some may find it helpful.

It includes the MN, AN, and the older parts of the KN. SN is in progress, with DN to come later.

Note: it may not work on older mobile devices.

My current working title for the project is the Comprehensive Index of Pali Suttas (CIPS).

Below are answers to some of the most common questions I field. Feel free to skip and just start using the index.

I am very much open to any and all feedback. Please let me know how this tool could be made more useful.

Goal of the project

The goal of this index project is to produce a comprehensive index of the Sutta Pitaka (i.e. covering all things) without being exhaustive (i.e. covering every occurrence of every possible thing). It is meant to be used both by people exploring a subject and by people looking for specific things, although the second is much more difficult to do.

This project is a combination of a term index and a subject index. That is, citations sometimes point to specific Pali terms, and sometimes to the general subject matter of a sutta even if a specific term does not appear.

Terms, subjects, proper names, and similes are all combined in a single index.

My hope is to eventually produce a print-on-demand version of the index to be kept in monastery libraries where computer access is not possible.

What the index is not…

For an exhaustive study of specific Pali words, people should be doing direct term searches, not using an index.

An index is not meant to be a tidy classification or organization of technical terms. At every step I try to ask the questions, “How would someone think to look for this information?” and “Will someone be happy with what they get after following this citation?”

Status as of 2024-05-06

So far the MN and AN have been completely indexed, as well as Kp, Dhp, Ud, Iti, Snp, Vv, Pv, Thag, Thig (which is all I plan to cover in the KN). The SN has been started. The DN will be completed last.

At this point the project should be considered to be in the draft stage. Human-created indexes are done in two stages. In the draft stage, the text is read and headings, subheadings and locators are created and indexed on the fly, trying for some consistency but not always achieving it. In the editing stage, the index is reviewed to lump and split in order to make it more useful. Since this project is still in the draft stage, there will be many cases where similar subheads will eventually be grouped, and others where long lists of citations will need to be differentiated.

The real editing stage will only happen after the completion of the SN and DN.

Creation method

This index was created in the same way that all modern indexes are: by hand, reading each text one by one. Although computers are good at creating concordances of words, they are still very lacking when it comes to creating usable and helpful indexes. There are really no shortcuts.

This is not a compilation of existing indexes (e.g. accesstoinsight.org and the Wisdom Publication translations), although I did reference them to help with term selection. I originally started the index by collecting a few terms from those indexes (e.g. loving-kindness, giving, death) but I found that this was rather limiting and not feasible for the complete project, so I stopped. In the end the complete project could only be accomplished by the traditional method of reading and indexing.

Except for the Vv and Pv, I used Bhante Sujato’s translations as the basis for the work, although I didn’t always choose his vocabulary for the indexed terms. (see Headwords below)

Although I would love for people to be able to simply read through the index for enjoyment, the desire to keep terms and subheadings as brief as possible makes this less likely. The exact phrasing of headwords and subheads should not be taken to imply any specific interpretation of the Dhamma. It’s just an index.

Features

Beyond the obvious…

You may notice that similes are usually indexed in two directions. For example under dreams you can find simile for sense pleasures. But if you are looking for all the similes for sense pleasures, you can look there and find like a dream, like a grass torch burns holder etc. The hope is that this will be useful for people preparing talks on a topic when they would like to include a simile.

On desktop, when you type in a search phrase, diacritics, spaces and punctuation are ignored. Pressing Enter will take you to the first item in the list. Pressing Tab instead will move you down the list.
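
To illustrate what ignoring diacritics, spaces and punctuation can mean in practice, here is a minimal sketch in Python. It is not the app's actual code (the app is written in React); the function names `normalize` and `matches` are made up for this example, and it assumes that simple substring matching is good enough.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase the text and drop diacritics, spaces and punctuation."""
    # NFD splits a letter like "ā" into "a" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    kept = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # skip the combining mark, keep the base letter
        if ch.isalnum():
            kept.append(ch.lower())  # spaces and punctuation are not alphanumeric
    return "".join(kept)


def matches(query: str, headword: str) -> bool:
    """Hypothetical filter: does the normalized query occur in the headword?"""
    return normalize(query) in normalize(headword)


print(normalize("harmony (sāmañña)"))
print(matches("satipatthana", "mindfulness meditation (satipaṭṭhāna)"))
```

With this kind of normalization, someone typing without a Pali keyboard can still find headwords that contain diacritics.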

Technical:

  • On desktop, if you hover over a citation, it will usually either pop up the “blurb” from Sutta Central or show the contents of the verse (Dhammapada) or first verse (Thag/Thig). Personally I find this to be one of the most useful features of the whole index, potentially saving hours of time following irrelevant leads.
  • If you hover over a headword (on desktop) you will see three icons: TXT, M↓ and HTML. Clicking on these will copy the entire entry (minus cross refs) in either text, Markdown, or HTML format.
  • Under the gear icon :gear: there is a button for “Toggle color”. This will colorize the citations using the color scheme from SuttaFriends.org. Sometimes it’s useful if I want to specifically find citations from a specific book.
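
As a rough illustration of the copy feature described above, the following Python sketch renders one entry as plain text, Markdown, and HTML. The entry structure, the function names, and the sample citation are all assumptions made for this example; they are not taken from the app or from the index data.

```python
from html import escape

# Hypothetical entry: a headword plus (subhead, locators) pairs.
# The citation is illustrative only.
entry = {
    "headword": "dreams",
    "subheads": [("simile for sense pleasures", ["MN 54"])],
}


def to_text(e: dict) -> str:
    lines = [e["headword"]]
    for subhead, locators in e["subheads"]:
        lines.append(f"  {subhead}: {', '.join(locators)}")
    return "\n".join(lines)


def to_markdown(e: dict) -> str:
    lines = [f"**{e['headword']}**"]
    for subhead, locators in e["subheads"]:
        lines.append(f"- {subhead}: {', '.join(locators)}")
    return "\n".join(lines)


def to_html(e: dict) -> str:
    # escape() guards against characters like "&" or "<" in headwords.
    items = "".join(
        f"<dd>{escape(subhead)}: {escape(', '.join(locators))}</dd>"
        for subhead, locators in e["subheads"]
    )
    return f"<dl><dt>{escape(e['headword'])}</dt>{items}</dl>"


print(to_text(entry))
print(to_markdown(entry))
print(to_html(entry))
```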

Headwords

  • When possible, headwords are nouns. Usually plural. Sometimes exceptions are made when a term is well known, e.g. “against the stream.”
  • I have shamelessly played fast and loose with the grammar of Pali and English in the headwords. Meaning, related nouns, verbs, adjectives, etc. are almost always lumped together. And subheads will not always grammatically match headwords. For example under “eating in moderation” is the subhead “how to.” Matching them up would have been overly cumbersome (“how to be one who practices…” ???) and not that much more helpful to the user.
  • Eventually more well known Pali words will be included as cross-ref headwords. For now, they can be found using the search filter at the top (on desktop).
  • Generally headwords will be in English, although some exceptions are made for terms that lack a simple, indexable translation, e.g. Bodhisatta.
  • For the most part I have so far ignored term translations other than Bhante Bodhi’s and Bhante Sujato’s. Eventually I will add those of Ajahn Thanissaro as cross refs once all the headwords have been settled. (e.g. there is not yet a cross ref from “stress” to “suffering (dukkha)”)
  • In some cases, English headwords with Pali (e.g. “harmony (samagga)” and “harmony (sāmañña)”) will also have an English-only headword (e.g. “harmony”). For these headwords without Pali, it should be understood that although a sutta may cover that topic, it doesn’t contain a Pali word for it. However, there are countless cases where there is no Pali word in the headword. These are either cases that aren’t suitable for Pali or just need some cleaning up.
  • In an index to a specific translation/book, it is much easier to decide upon terms. For this index I have taken into account the commonly known translation of terms, their ease of indexing, as well as my own preferences. When the index is finished, the use of cross references should mean that no matter what translation one is using, the needed information can be found.
  • Don’t forget that you can use Ctrl+f in the browser to search on the whole page. Sometimes this is useful if navigating the head/sub-head words doesn’t give you what you need.

Cross References (cross refs)

The cross refs are an essential part of the index. If you don’t find what you want directly under a headword, you must follow the cross refs if you want to have any hope of finding what you are looking for. Because of the scope of the index, I often opt for splitting things up if there is some justification for doing so. As a user of the index myself, even I fall into the trap of ignoring the cross refs, to my own peril.

One goal of the index is to have important synonyms cross ref’ed. For example, you shouldn’t have to know that a certain simile is about “vipers” and not “snakes.” So I am especially interested in feedback on terms that should have cross refs (except, at this point, for Ajahn Thanissaro’s terms).

I’m also experimenting with creating category xrefs. Nature is a great example of that. It can lead you to plants that could further take you to flowers where you will find a list of individual flowers. Is that useful? I don’t know!

Citations only to individual suttas?

A book index has the luxury (and heavy burden) of citing specific pages where an index term or subject appears. Since I didn’t want this index to be permanently linked to a specific edition, I am only citing individual suttas.

On the theoretical level, I feel it is problematic for a user to be directed very narrowly to only the part of a sutta that relates to their search topic. Personally I have faith that the Buddha and the arahants composed and collected the suttas as they did for a reason, and that taking a passage out of the context of a larger sutta can be problematic.

On a technical level, even if I were willing to link to segment numbers on Sutta Central, once you start doing that you get into the complications of dealing with ranges of segments and non-contiguous segments. Linking to only one segment implies that it is the only place the topic occurs, unless you resort to conventions like “ff.”
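
To make the range problem concrete, here is a small Python sketch that collapses a list of segment numbers into contiguous ranges. It is a simplification under stated assumptions: segments are treated as plain integers, whereas real segment IDs are more structured, and the function names are invented for this example.

```python
def collapse_ranges(segments: list[int]) -> list[tuple[int, int]]:
    """Group segment numbers into (start, end) runs of consecutive values."""
    ranges: list[tuple[int, int]] = []
    for n in sorted(set(segments)):
        if ranges and n == ranges[-1][1] + 1:
            # Extend the current run by one segment.
            ranges[-1] = (ranges[-1][0], n)
        else:
            # Start a new run; this is where non-contiguous segments split.
            ranges.append((n, n))
    return ranges


def format_ranges(ranges: list[tuple[int, int]]) -> str:
    """Render runs as a locator string such as '3-5, 9'."""
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


hits = [3, 4, 5, 9, 12, 13]  # made-up segment numbers for one topic
print(format_ranges(collapse_ranges(hits)))
```

Even in this toy form, a single topic can need several separate locators within one sutta, which is the complication that linking to whole suttas avoids.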

So for the sake of simplicity as well as philosophical belief, I have only linked to whole suttas.

But… The DN does push the limits of this method. Will it really be useful to link to the whole DN16 for a topic that only occurs in one place? I’m considering linking to sections, but that raises the issue of whose sections to link to. I’m still not sure and am putting off the DN till last in hopes of eventually coming up with a good solution.

Appreciation

First off, I need to thank Ven. @Khemarato not only for his technical coding advice but also for feedback on the indexing itself. I have received countless suggestions from him that have greatly improved the project.

And of course thanks also go to Bhante @Sujato. Without his segmented translations this index would be far less accurate and useful than it is.

I also need to express my gratitude to the creator of the Digital Pali Dictionary. It is very helpful for this kind of work. But any faults in the Pali in the index are of course only mine.


Feedback is welcome!

I would love to hear any and all feedback. This project will only fulfill its purpose if people can use the index to find the Dhamma that they need. Although I try to get into the mind of the potential user, that can only go so far. Please let me know when you aren’t able to find something you would expect to.

I’m also down for discussing all things related to the philosophy and art of indexing. As you can tell, I’ve been thinking about this all too much over the last year and a half, lol.

Geeky technical details

I am creating the data in a spreadsheet that later gets built into a JSON object which is used by the app. LibreOffice provides some of the functions that bespoke indexing software would, such as auto-completion.
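
For anyone curious what a spreadsheet-to-JSON step can look like, here is a minimal Python sketch. It assumes the spreadsheet is exported as CSV with three columns named `headword`, `subhead` and `locator`; those column names, the nested output shape, and the sample rows are assumptions for illustration, not the project's actual format or data.

```python
import csv
import io
import json

# Made-up sample rows standing in for a CSV export of the spreadsheet.
SAMPLE_CSV = """headword,subhead,locator
dreams,simile for sense pleasures,MN 54
sense pleasures,like a dream,MN 54
sense pleasures,like a grass torch,MN 54
"""


def build_index(csv_text: str) -> dict[str, dict[str, list[str]]]:
    """Nest rows as headword -> subhead -> list of locators."""
    index: dict[str, dict[str, list[str]]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        subheads = index.setdefault(row["headword"], {})
        subheads.setdefault(row["subhead"], []).append(row["locator"])
    return index


index = build_index(SAMPLE_CSV)
# ensure_ascii=False keeps Pali diacritics readable in the output file.
print(json.dumps(index, ensure_ascii=False, indent=2))
```

The sample rows also show a simile indexed in two directions: once under the image and once under the thing it illustrates.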

When I investigated indexing software I found that the programs were not only prohibitively expensive (usually USD 500+) but also extremely antiquated and, despite their claims, did not always play nicely with Unicode. Ideally indexing software would save some time and perhaps reduce errors. But it was not to be.

The website itself is a single-page app built with Create React App (CRA). I was learning React at the time I started the project so I just used that.

Unfortunately, early on the size of the index pushed the limits of React. And state isn’t really that big of a part of the app (the only real state thing is where the citations link to). There will be a new release of React in 2024, React 19, that seems like it might solve some of the app’s problems. If it doesn’t I will likely rebuild the whole app using vanilla JS and probably have the index as a simple HTML page that I build in advance. In any case, a document with over 13,000 links (at present!) is always going to push the limits of browsers, and especially mobile devices.
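
As a sketch of what building the page in advance could involve, the Python below turns a nested index into a static HTML definition list. Everything here is an assumption for illustration: the data shape, the function names, the sample entries, and the link pattern (sutta ID appended to the SuttaCentral domain). It is not a description of the current app.

```python
from html import escape


def locator_url(locator: str) -> str:
    # Assumed URL pattern: "MN 54" becomes ".../mn54".
    return "https://suttacentral.net/" + locator.replace(" ", "").lower()


def render_index(index: dict[str, dict[str, list[str]]]) -> str:
    """Render headwords as <dt> and subheads with linked locators as <dd>."""
    parts = ["<dl>"]
    for headword in sorted(index):
        parts.append(f"<dt>{escape(headword)}</dt>")
        for subhead, locators in index[headword].items():
            links = ", ".join(
                f'<a href="{escape(locator_url(loc))}">{escape(loc)}</a>'
                for loc in locators
            )
            parts.append(f"<dd>{escape(subhead)}: {links}</dd>")
    parts.append("</dl>")
    return "\n".join(parts)


# Made-up sample data, not taken from the real index.
sample = {"dreams": {"simile for sense pleasures": ["MN 54"]}}
page = render_index(sample)
print(page)
```

Because the HTML is generated once at build time, the browser only has to display it, rather than construct thousands of links at load time.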

If anyone knows React and would like to try to optimize what I have now, that would be most appreciated. Just send me a PM.

19 Likes

Congratulations Bhante! This is a big job to do.
Nice simple and easy to use interface.
Being built in React it should be easy for it to be a PWA, but at least on my iPad it’s not showing as such. Is that something planned?

1 Like

I don’t think I ever even considered it. Maybe that would help with the speed, at least on the second loading. I’ll look into it.

Thanks!!

1 Like

WOW!!!

What can I say … I am speechless!!!

3 Likes

Just happened to look at this entry:

accomplishment in diligence (appamādasampadā)

one with will develop noble eightfold path

My first impression when reading the explanation “one with will develop noble eightfold path” was: “one with will … ??? What sort of will does one need to develop the noble eightfold path?”

Then I looked back at the headword, and it occurred to me: “Ah! It means ‘one with … [the headword] … will develop the noble eightfold path’”.

Perhaps that could be made clearer by inserting a sort of placeholder for the headword in the sentence. I think I have seen a * for that purpose in indexes or dictionaries:

one with * will develop noble eightfold path

4 Likes

Thanks for that feedback! I looked into this and it seems like using a marker like that is not part of normal indexing practice. At least none of the major indexing guidebooks talked about it. It’s not uncommon in dictionaries.

The AccessToInsight.org index (and therefore the one on SC) uses a ~ as a placeholder. I have no idea why it’s not a standard practice. I realize now that not using this convention could make it more difficult for non-native English speakers.

I will revisit that issue when I start the editing process. Thanks!

2 Likes

What an incredible body of work, thank you so much for making this available! I will definitely be cross-referencing this when working through the EBT myself. Such a valuable resource! What is and isn’t a reference to kamma, for example, is subjective. Did you do this alone, or was there an ‘independent’ co-reviewer (e.g., as they do with systematic reviews)?

1 Like

In Voice examples, we also noticed that big suttas such as DN33, DN34 sometimes have singleton phrases. It’s a rather peculiar thing to have these solitary islands of meaning. After some discussion, we are now inclined towards prioritizing “more than 1 sutta and less than 25-50 suttas” as a criterion for inclusion in our indexing examples for a particular body of translations. Our rationale was to illuminate the connections as a priority for understanding and discussion. More suttas provide richer informed discussion.

That said, singletons are fascinating and a singleton index as a separate work would be a great topic for speculative discussion but might not lead to any conclusive consensus given the lack of repetition.

In any case, it should be possible in the index to include the segment ID of DN16 containing such a solitary indexed phrase to expedite lookup in large suttas.

5 Likes

Found a bug:

I tried to set it to “open in SC-Voice”, but it tries to find the old Voice voice.suttacentral.net which doesn’t exist any more. The new one is https://sc-voice.net.

2 Likes

Indeed! In fact all subject indexes are subjective. Term indexes are a little less so (either a term exists in a place or does not). But even with a term index there is a judgement by the creator whether or not an occurrence of a term merits a place in the index.

Usually print indexes in the back of the book are limited by space and time. The space allowed for an index is usually determined before the index is created. And usually indexes have to be created, for the most part, after the book’s contents are designed and the page proof is given to the author. So authors may have as little as two weeks between getting the proofs and their index deadline.

All three of Bhante Bodhi’s nikaya indexes begin with the statement “This index lists significant references only.” With my index, I had neither a space nor time limitation. So I tend to be much more comprehensive in what I include.

That said, it is not exhaustive. So I tend not to index several near-identical suttas with the same term. If someone is looking for all occurrences, then they need to do word/phrase searches. As well, I am always hoping that the user feels that their time is not wasted when they use this index. There will always be misses along with hits, but I hope to have more of the latter than the former.

I’m not aware of any kind of verification process for indexes generally. Most books are lucky to have one at all, let alone one that is peer reviewed. A conscientious book author will often hire a professional indexer (which of course I am not). But I don’t think any index will aim to be the final arbiter of truth or knowledge. It’s just a tool.

Of course if you are volunteering to review my work I would be very happy indeed. :nerd_face: But short of that my hope is that as people use the index they may express their gratitude by taking the time to offer any corrections or suggestions they may have. Until it goes to print (if it ever does) there is always the chance for improvements.

Yes. It’s not so much a technical issue. I can easily link to segments. But if someone is using the index in a print manifestation, then that doesn’t work. Unless of course they have print editions from SuttaCentral. Even still, as I mentioned, once you start linking directly to one occurrence within a sutta, there is generally an assumption that you will link to all of them. The lack of standard numbering of suttas themselves is an issue, all the more so numbering within suttas.

4 Likes

Indeed it looks like it was.

So for everyone on mobile, you should be able to go to your browser menu and select something like “install this app”. Then an icon should appear on your home screen that lets you use the index (kind of) outside your browser. So far it isn’t loading any faster for me, but it means that you can use the index without an internet connection.

Folks may have noticed that on the mobile version there is no built in index search box as there is on desktop. That’s because it was too buggy on mobile. For the full index experience, for now, you will need to be on a desktop.

4 Likes

Thank you for hard work and this effort :hearts:

3 Likes

Congratulations Bhante!

I imagine if the Canon compilation were done properly, this should have been a big task almost as important as the Suttas themselves. It will help make the Suttas much more accessible to new and old readers alike.

If I may offer some help on what I can do (beyond translating the UI to Thai, which I promise to finish someday), I can help with some data analysis or finding some patterns.

For example, here’s a stat on which books the locators come from. If you need some analysis or think of some questions about the data, I volunteer to help.
[Screenshot]

Runnable Colab: Google Colab
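
For readers without access to the notebook, a per-book tally of this kind can be sketched in a few lines of Python. This is not the notebook's code; the locator list is made up, and it assumes each locator starts with the book abbreviation followed by a space.

```python
from collections import Counter


def book_of(locator: str) -> str:
    """Return the book abbreviation, e.g. 'MN' from 'MN 54'."""
    return locator.split()[0]


# Made-up locators for illustration, not real index data.
locators = ["MN 54", "MN 10", "AN 4.10", "SN 12.1", "AN 5.57"]
counts = Counter(book_of(loc) for loc in locators)

for book, n in counts.most_common():
    print(book, n)
```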

1 Like

Yes, it’s all very interesting to think about. I guess in the past lucky people would have been around monastic teachers who had comprehensive knowledge of the canon and could just ask questions. But today most people are not so lucky.

1 Like