A review of inferred parallels on SC

sujato · May 29, 2016, 2:53am

Background

The data SuttaCentral uses for Sutta parallels originated with sets of tables compiled by Rod Bucknell. These were from a Pali perspective, in that they listed sets of parallels for Pali texts. When we initially designed SC we wanted to be able to look up parallels from any direction, not just Pali. To make this possible we introduced the notion of “inferred” parallels. These are meant to create a list of relevant parallels for any Sutta. The inferred parallels are not kept in our data; they are generated programmatically.

Essentially this works by saying “a full parallel of a full parallel is inferred to be a full parallel”: A=B and B=C, therefore A=C.

For example, MA 98 is listed in our data as a parallel of DN 22. MN 10 is also listed as a parallel of DN 22. MA 98 is not listed as a parallel of MN 10, but it would appear as an inferred parallel.
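A minimal sketch of this rule in Python, to make the mechanics concrete. It is not SC's actual code: the function name `infer_full_parallels` and the lower-case uid strings are assumptions made for illustration, and it applies the rule only one step (two texts that share a listed full parallel), not a full transitive closure.

```python
from collections import defaultdict
from itertools import combinations


def infer_full_parallels(full_pairs):
    """Return pairs that share a listed full parallel but are not listed themselves."""
    # Map each text to the set of texts listed as its full parallels.
    neighbours = defaultdict(set)
    for a, b in full_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Pairs already in the data are not "inferred".
    listed = {frozenset(pair) for pair in full_pairs}

    inferred = set()
    for shared, texts in neighbours.items():
        # Any two texts that both list `shared` as a full parallel (A=B, B=C).
        for a, c in combinations(sorted(texts), 2):
            pair = frozenset((a, c))
            if pair not in listed:
                inferred.add(pair)
    return inferred


# The example from the text: both MA 98 and MN 10 are listed under DN 22.
listed_pairs = [("dn22", "ma98"), ("dn22", "mn10")]
inferred = infer_full_parallels(listed_pairs)
```

With this input the only inferred pair is MA 98 and MN 10. Note that the sketch treats every listed pair as a full parallel; the cases reviewed below show what happens when that assumption does not hold.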

Blake has generated a list of inferred parallels; there are 973 entries. Organized pair-wise, as the data currently is in our CSV files, there are 2100. (Note that the two sets of data are not identical; dn14, for example, appears in inferred.json but not in pairwise-inferred-parallels.json. Not sure why this is.)
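One way to track down such discrepancies would be to compare the uids in the two data sets. The sketch below is only a guess at the shapes involved: it assumes the per-text data is a mapping from a uid to a list of uids (as in the entries quoted later in this post) and that the pairwise data is a list of two-item pairs. The function name and the toy data are hypothetical.

```python
def uids_missing_from_pairwise(inferred, pairwise):
    """List uids that occur in the per-text mapping but in no pairwise entry."""
    # Every uid mentioned in the mapping, as a key or inside a value list.
    in_inferred = set(inferred)
    for parallels in inferred.values():
        in_inferred.update(parallels)

    # Every uid mentioned in any pair.
    in_pairwise = {uid for pair in pairwise for uid in pair}

    return sorted(in_inferred - in_pairwise)


# Toy data for illustration only, not the real file contents.
inferred_map = {"dn14": ["sn12.65"], "sn1.3": ["sn2.19"]}
pairwise_list = [["sn1.3", "sn2.19"]]
missing = uids_missing_from_pairwise(inferred_map, pairwise_list)
```

For real use, the two structures would first be loaded from the JSON files with `json.load`, adjusting the code to whatever layout the files actually have.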

Problems

“Parallel” doesn’t mean “identical”. So while such inferences are safe enough in maths, when it comes to sutta parallels a certain percentage will not be correct. It’s not easy to identify such cases.
The texts as listed in our data are not the same as what shows on our site. That means the portability and reusability of our data are poor.
There is no trivial way to correct errors, as they are produced by an algorithm rather than stored in the data.

Testing

How big is the problem? It’s not easy to say. Of the 973 entries the first 27 are unusual, in that they are from the Pali side. That is, they are inferred parallels for Pali texts. I have checked these, and the results are below; we can use these to make corrections. But here’s the summary:

wrong: 4
right: 11
half-right: 6 (This is usually where a text is inferred as full parallel where it should be partial.)
unknown: 1

This is not hugely reassuring.

However, this may be because this is not what the inferred parallels are meant for, which is for inferences regarding texts other than Pali. My ability to test this is limited due to my lack of Chinese. However, I can check the inferences from Sanskrit → Pali. I’ve done this with a fair number of the SF texts and have yet to find any mistakes.

So unless we get any better information, it seems our “inferred parallels” work for what they are meant for, to create inverse sets of parallels. We should obviously fix the mistakes identified below, but there seems to be no need to begin larger scale corrections.

Pali Inferred Texts

MN 9: AN 9.13 is inferred.

MN 9 is a full parallel for SA 344 and MA 29.
AN 9.13 is full for MA 29 and partial for SA 344.
However AN 9.13 and MN 9 are not parallels at all and have nothing in common.

MN 10: EA 12.1 and MA 98 are inferred.

The data is entered under DN 22. In this case it is correct.

MN 33: The Tibetan text is inferred, I cannot check it.

MN 36: MN 85 appears as both an inferred and a partial parallel. Obviously the inferred entry should not occur. The same thing happens in the reverse direction, i.e. MN 85 → MN 36.
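A case like this could be caught mechanically by dropping any inferred pair that is already listed as a partial parallel. The following is a sketch only; the function name and the representation of pairs as unordered sets of uids are assumptions, not how SC stores its data.

```python
def drop_listed_partials(inferred_pairs, partial_pairs):
    """Remove inferred pairs that the data already lists as partial parallels."""
    # Use frozensets so that (mn36, mn85) and (mn85, mn36) count as the same pair.
    listed = {frozenset(pair) for pair in partial_pairs}
    return {frozenset(pair) for pair in inferred_pairs} - listed


# MN 36 / MN 85 is listed as partial, so it should not survive as inferred.
inferred_pairs = [("mn36", "mn85"), ("mn10", "ma98")]
partial_pairs = [("mn85", "mn36")]
cleaned = drop_listed_partials(inferred_pairs, partial_pairs)
```

Because the pairs are unordered, the same check covers both MN 36 → MN 85 and MN 85 → MN 36.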

MN 42: There are two Chinese texts appearing as inferred. In this case, they are inferred from MN 41, and since MN 41 and MN 42 are, in fact, virtually identical this is a safe inference.

DN 14: SN 12.65 is inferred. This is, however, only a small portion of the text so should be a partial.

DN 20: SN 1.37 is inferred via shared Chinese texts. It is identical with the first portion of DN 20 and so should be a partial parallel.

DN 22: The SHT fragment is inferred from MN 10. I can’t check this, but since MN 10 and DN 22 are almost identical this should be safe.

DN 27: AN 7.66 is inferred. However, while it is thematically related, it is not a parallel, not even a partial one. The inference arises from two entries:

DA 5. This is listed as a full parallel to AN 7.66, but this appears to be a mistake. DA 5 is very similar to DN 27 and I can’t see any reason to justify treating it differently.
EA 40.1. This is a full parallel of AN 7.66. It should probably be removed from the parallels for DN 27.

DN 28: SN 47.12 is inferred. In this case the inference is correct. In fact DN 28 appears to be an expansion; because it is so much larger it should probably be partial.

DN 31: SN 45.141-148 is inferred via SF 100. However it is a completely different text, not even close to parallel. SF 100 is not on our system so I can’t check that.

SN 1.3: SN 2.19 is inferred, as both share the Chinese texts as full parallels. This is correct, they are identical sets of verses and should be full parallels.

The following cases are similar:

“sn1.12”: [“sn4.8”],
“sn1.15”: [“sn9.12”],
“sn1.21”: [“sn2.16”],
“sn1.26”: [“sn2.4”],
“sn1.29”: [“sn2.28”],

SN 1.34: although it is inferred as a full parallel to SN 1.36, in fact they share only one verse: sn1.34#5=sn1.36#4

“sn1.43”: [“sn2.23”], these share only three verses: SN 1.43=SN 2.23#2-4

“sn1.48”: [“sn2.20”], here the same verses are repeated with some narrative, so should qualify as a full parallel.

“sn1.50”: [“sn2.24”], correct.

“sn1.77”: [“sn1.79”], these share a similar style, but there is in fact no common text between them so they are not parallels at all.