On the very idea of a critical edition of the Pali Canon


I just read another article about the critical edition:
Somaratne, G. A. (2015). Middle Way Eclecticism: The Text-critical Method Of The Dhammachai Tipitaka Project. Journal of the Centre for Buddhist Studies, Sri Lanka (Journal of Buddhist Studies). Online here

It is more technical in describing the procedure of the edition and gives a short manuscript background for the Pali suttas. Both are interesting to read, especially for those interested in EBT manuscripts…

List of tipitaka versions' differences

Here is a talk by Wynne on the project at the Oxford Centre for Buddhist Studies:

The talk is not really entertaining, and unfortunately only audio (why?) but gives interesting details about the possibility of an actual Ur-text at Buddhaghosa’s time. Also the announcement that apart from the book-edition all the material including scans of the originals will be available online was new to me.


That’s some very interesting insight. I’ll try to keep this answer simple, but I can post some more technical docs if you’re interested. Generally speaking this is quite experimental and we don’t really have any code for our solution yet.

First, we don’t really produce critical/diplomatic editions at BDRC, but here are a few links on what’s been done by some of our partners:

  • Esukhia is working on a critical edition of the whole Tengyur. It should be published in a few months for a subset of 50 volumes, and the rest by the end of the year. They have some notes on their methodology on their Github repo
  • OpenPhilology got a grant for a related project, the grant proposal has some interesting insight on the notion of Ur texts and we tend to follow that view
  • SARIT has a few critical editions
  • rKTs have built an interface for inputting different versions of the Tibetan Canon. It provides images and a proposed text, based on other editions, that you can copy/paste. You can see it here; I think it’s a pretty good and simple idea
  • the upama platform (seen here in action) is, I think, a good example of a UI for a digital critical edition: you can change the base text, show/hide witnesses, or even hide the critical apparatus altogether if you don’t need it. I think having a UI that adapts to the user (scholars and practitioners have different needs) is important.

Now, let’s get a little bit more technical about data representation. There is a very important use case we want to allow that is not directly possible when all variants are stored as separate, unrelated texts: the ability to reference a portion of a text without tying the reference to a specific edition. For instance, if you come across a late commentary that quotes a text, we want to record the information “this part of the commentary is a quote of this part of this text”. But if you have dozens of versions of the original text, which one will you use? A variation of this use case is obviously collation.

Now, because we don’t really want an ur text, what we are moving towards is what we call a “base layer” or a “neutral layer”, which is a mere reference, not a version of the text per se. Then, based on this layer, all other editions are recorded as diffs (using Linked Data technology, here web annotations). So when you want to refer to a specific part of a text in general (when the original edition doesn’t matter), you can just point to some coordinates in the base layer.

This brings some interesting benefits: you can stack different layers on top of the base layer; users can even create their own (bookmarks, notes, etc.); and they can also choose which version of the text they want to see. We don’t choose one in particular (because that could only lead to partisan feuds). As I said, it’s a bit experimental, but all comments and questions are welcome!
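To make the base-layer idea a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not BDRC’s actual data model: the `Variant` class, the character-offset coordinates, the `render` and `resolve` helpers, and the toy English sentence are all hypothetical, and a real system would store the diffs as web annotations rather than in-memory objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One reading that differs from the base layer (hypothetical model)."""
    start: int    # inclusive character offset into the base layer
    end: int      # exclusive character offset into the base layer
    reading: str  # what this edition has in place of base[start:end]


def render(base: str, variants: list[Variant]) -> str:
    """Rebuild the full text of an edition from the base layer plus its diffs."""
    return resolve(base, variants, 0, len(base))


def resolve(base: str, variants: list[Variant], start: int, end: int) -> str:
    """Show the base-layer span [start, end) as it reads in one edition.

    Variants are assumed not to overlap each other. Pure insertions sitting
    exactly on the span boundary are ignored in this simplified sketch.
    """
    out, cursor = [], start
    for v in sorted(variants, key=lambda v: v.start):
        if v.end <= start or v.start >= end:
            continue  # variant lies outside the requested span
        out.append(base[cursor:max(cursor, v.start)])
        out.append(v.reading)
        cursor = max(cursor, v.end)
    out.append(base[cursor:end])
    return "".join(out)


# Toy data: a placeholder sentence, not a real reading from any edition.
base = "thus have I heard at one time"
edition_a = [Variant(21, 24, "a certain")]  # "one" -> "a certain"
edition_b = [Variant(0, 4, "so")]           # "thus" -> "so"

# A commentary quote is recorded once, as coordinates in the base layer.
quote = (18, 29)

print(render(base, edition_a))
print(resolve(base, edition_a, *quote))
print(resolve(base, edition_b, *quote))
```

The point of the design is that `quote` never mentions an edition: the same pair of coordinates can be resolved against any layer the reader chooses, and user layers such as notes or bookmarks can be anchored the same way.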


Thanks, I’ll check out the projects you mention.

Indeed, this is crucial. In the approach that I describe, the base reference would be provided simply by the segments. Because this is based on semantic divisions in the text itself (sentences, etc.) each text is keyed off this, and a reference to a segment is edition-agnostic. For us, the segments are derived from our edition, the Mahasangiti, but there’s no reason other editions shouldn’t be treated the same way.

So we can refer to, say, MN 1#4.5, which means “Majjhima Nikaya sutta number one, section 4, segment five”. The neat thing is that this reference is not just agnostic between editions of the Pali, but also between translations.
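As a rough sketch of how such a reference can be used as a key, assuming the `MN 1#4.5` notation exactly as written above (the `SegmentRef` type, the `parse_ref` and `lookup` helpers, and the placeholder segment texts are hypothetical, not our actual code or data):

```python
import re
from typing import NamedTuple


class SegmentRef(NamedTuple):
    collection: str  # e.g. "MN" for Majjhima Nikaya
    sutta: int
    section: int
    segment: int


# Matches references written like "MN 1#4.5" (space after the collection optional).
_REF = re.compile(r"^([A-Za-z]+)\s*(\d+)#(\d+)\.(\d+)$")


def parse_ref(ref: str) -> SegmentRef:
    """Turn a reference string into a structured, hashable key."""
    m = _REF.match(ref.strip())
    if m is None:
        raise ValueError(f"not a segment reference: {ref!r}")
    collection, sutta, section, segment = m.groups()
    return SegmentRef(collection.upper(), int(sutta), int(section), int(segment))


ref = parse_ref("MN 1#4.5")

# Every edition and translation is keyed by the same segment reference.
# The strings below are placeholders, not real text.
texts = {
    "pali-edition": {ref: "<Pali text of this segment>"},
    "english-translation": {ref: "<English translation of this segment>"},
}


def lookup(ref: SegmentRef) -> dict:
    """Collect the same segment from every text that has it."""
    return {name: segments[ref] for name, segments in texts.items() if ref in segments}


print(ref)
print(lookup(ref))
```

Because the key carries no edition or language, adding another edition or translation is just another entry in `texts`.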

Doing this is probably not too difficult in the Pali sphere, as the differences between editions are (in most cases) very minor. If the editions varied more substantially, it would get more complex. But still, there’s no insuperable difficulty.


I’m not convinced that producing the first critical edition of the Canon is a bad idea. It would be a bad idea if it became The Only critical edition (i.e. the established truth), but, as Wynne points out in his article, scholarship progresses by work being redone and earlier mistakes being corrected by more recent scholars. In the talk @Gabriel links to in post #22, Wynne says that they are putting the hundreds of texts they’ve collected online for free access and that they have been forced to choose a small selection to base their edition on. In this way they will contribute a small amount to existing knowledge about the canon. As Ven @brahmali says above, good translators need more than philological knowledge; this is another reason why multiple editions would be useful. Wynne is not closing the door on later attempts to extend and correct/replace their findings. This is how scholarship progresses. A corpus full of thousands of texts is great for scholars, but dilettantes like myself will never want to wallow among many unedited transcriptions of original texts and will find somebody else’s best efforts a boon.

I have a couple of questions:

  1. How does the OCBS depository of texts differ from your Github one? (I expect you’ll tell me that yours is more fully interactive, but I won’t really understand the benefits of this, as allowing open access for anyone to alter the transcription of historical texts sounds rather risky to me.)


Like I said:

Sounds great, except it hasn’t happened to the best of my knowledge. Maybe they will do it, but really, how long does it take to upload a bunch of images? They could have put it on Internet Archive in an afternoon.

Ahh, existence? I don’t think OCBS has any images. The project is run by Dhammakaya, who so far as I know have the images in-house and have not released any. (This from Mark Allon, who cannot view them even though he is on the board for the project.)

A bad faith actor can hack anything and change anything. It happens all the time. It happened in pre-digital times. In the sixties, a couple of Sri Lankan monks “translated” the English translation of the Chinese translation of the Vimuttimagga back into Pali and claimed they had discovered the original text. They were swiftly unmasked and their reputations ruined. That is how it works.

The fact that people can change things is not the point. The point is, can the change be detected.

Github is perhaps one of the most secure environments available today. The major tech companies (Google, Facebook, Microsoft, and so on) host open-source code on Github and use it to run critical applications every moment. Heck, you are running code from Github right now. What more secure way of hosting things is there than the place that all the tech companies choose to host their things? Who better to ensure security and safety in a digital medium than those guys, who are protecting their livelihood against much smarter and more dedicated hackers than we will ever face? Any attempt to make one’s own proprietary system for ensuring security will undoubtedly end up less secure.

Github retains an indelible record of every change to every file. If anyone claims to produce something and it differs from the original, it can trivially be checked and those who attempt the fraud will be ruined.
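For anyone curious why a change is so easy to detect: Git names every stored file by a hash of its contents, so altering even one character gives the file a different name. Here is a minimal Python sketch of that idea, assuming Git’s classic SHA-1 blob format (repositories can also be set up to use SHA-256); the `git_blob_id` helper and the sample text are made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID Git's SHA-1 object format assigns to a file's contents.

    Git hashes a short header ("blob <size>" plus a NUL byte) followed by
    the raw bytes of the file.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Placeholder transcription, not real text from any edition.
original = "a line of transcribed text\n".encode("utf-8")
tampered = "a line of transcribed test\n".encode("utf-8")  # one letter changed

original_id = git_blob_id(original)
tampered_id = git_blob_id(tampered)

print(original_id)
print(tampered_id)
print("change detected:", original_id != tampered_id)
```

Each commit in turn records the hashes of the files it contains and of the commit before it, which is why a quiet edit to an old version shows up as a mismatch.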


That explains why I couldn’t find anything then. :rofl:

I’m sorry I put you to the trouble of such a long reply. But thank you very much Bhante for spending time on it.

Unbelievable. :open_mouth:

Thank you for information about Github: obviously I am ignorant about these things including

And ultimately anicca rules.