Majjhima Nikāya parallel list

Aminah · February 18, 2016, 5:05pm

Hello,

On the https://suttacentral.net/mn page, under the Parallels column for MN 36 it reads “… MN 85, MN 100*, MN 85*…” I looked up what the asterisk means on the help page and saw that it signifies a partial parallel, whereas suttas with no asterisk are considered full parallels. I can’t quite figure out how MN 85 can be both a partial and a full parallel, so I suspect a small error has crept in and am flagging it for review.

Best.

sujato · February 19, 2016, 1:23am

Ha ha, yes, it is odd. Thanks for noticing; please let us know if you find any other cases. By sheer coincidence we discussed similar cases yesterday, although I wasn’t aware of this one.

The problem here is that in some cases we infer parallels. So if A=B and B=C, we infer that A=C. In this case, the inferred parallel is present even though the parallel is also explicitly listed as a partial parallel. This is a logical problem with our programming.

We are working on a new underlying system for representing parallels, and we will hopefully fix such problems.

@blake and @Vimala, take note! We need to ensure that inferred parallels are filtered out in such cases.
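A minimal sketch of the inference and the filter described above. The IDs `A`, `B`, `C`, the pair-based data layout, and the function name `infer_parallels` are all assumptions for illustration, not the actual SuttaCentral code or data.

```python
from itertools import combinations

def infer_parallels(full_pairs):
    """Return pairs implied by transitivity that are not explicitly listed."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two sides of every explicit full parallel.
    for a, b in full_pairs:
        parent[find(a)] = find(b)

    # Collect the members of each connected group.
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)

    explicit = {frozenset(p) for p in full_pairs}
    inferred = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            pair = frozenset((a, b))
            if pair not in explicit:
                inferred.add(pair)
    return inferred

# Hypothetical data: A=B and B=C are explicit full parallels,
# while A and C are explicitly listed as only a partial parallel.
full_pairs = [("A", "B"), ("B", "C")]
partial = {frozenset(p) for p in [("A", "C")]}

inferred = infer_parallels(full_pairs)
conflicts = inferred & partial   # inferred as full, but listed as partial
filtered = inferred - partial    # inferred parallels that are safe to show
```

Here `conflicts` holds exactly the kind of case reported in this thread, and `filtered` is what would remain if explicit partial parallels take priority over inferred ones.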

In fact, I think the whole notion of inferred parallels is dodgy, and although we have talked about improving this, may I suggest a better way:

1. Generate a list of inferred parallels.
2. Check them by hand.
3. If they are genuine parallels, add them to our tables. And then,
4. Abolish the whole concept. All parallels should be explicitly represented in the data.

Vimala · February 19, 2016, 7:33am

This should not happen any more once we use a JSON file, where [A, B, C] are simply linked in a set if they are all full parallels of each other. But yes, it will be a lot of work, because everything will need to be checked by hand.
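As a rough illustration of the set-based idea, here is a sketch that assumes the JSON file is simply a list of groups, each group being a list of IDs that are all full parallels of one another. The layout, the IDs, and the helper `full_parallels` are hypothetical; the real schema may well differ.

```python
import json

# Hypothetical file contents: each inner list is one set of full parallels.
raw = '[["A", "B", "C"], ["D", "E"]]'
groups = [set(group) for group in json.loads(raw)]

def full_parallels(uid, groups):
    """Return every other member of any group that contains uid."""
    result = set()
    for group in groups:
        if uid in group:
            result |= group - {uid}
    return sorted(result)

print(full_parallels("A", groups))  # ['B', 'C']
```

Because membership in a group is the only source of truth, nothing has to be inferred at lookup time.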

sujato · February 19, 2016, 8:00am

We don’t need to check the regular parallels, which would be a huge job.

In this case, it is only the inferred parallels, which are not in our data files but are produced by inference from our code. I think we just need to get it to spit out a list of inferred parallels and check that. I’m not sure how many there are, but I don’t think it’s that many.

blake · February 23, 2016, 9:30am

Hmmm, I remember in old discussions John or Rod made it very clear that inferred parallels are equal in status to the explicitly defined parallels, and the technique was used extensively to reduce the amount of data entry required.

As it happens, we do flag inferred parallels internally, so here is a JSON dump:
inferred.json.gz (8.4 KB)

edit: actually I’ll also upload a pair-wise dump

pairwise-inferred-parallels.json.gz (9.0 KB)

The pairwise dump contains the entries that would have had to be added to the CSV to fully represent all the inferred parallels. I’m not sure which is more useful, but the pairwise one is without duplicates.
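For anyone wanting to reproduce a duplicate-free pairwise list from grouped data, a sketch along these lines might do. The structure of the actual dumps is not shown in this thread, so the input format (lists of IDs) and the function name `pairwise` are assumptions.

```python
from itertools import combinations

def pairwise(groups):
    """Expand groups of IDs into unique, sorted (a, b) pairs with a < b."""
    pairs = set()
    for group in groups:
        # Sorting puts each pair in one canonical order, so (A, B) and
        # (B, A) collapse into a single entry.
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return sorted(pairs)

# Hypothetical input: the second group repeats the first in another order.
groups = [["A", "B", "C"], ["C", "B", "A"], ["D", "E"]]
rows = pairwise(groups)
# rows == [('A', 'B'), ('A', 'C'), ('B', 'C'), ('D', 'E')]
```

Each tuple in `rows` corresponds to one row that would need to be written out explicitly.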

sujato · February 23, 2016, 9:51am

Yes, Ayya subsequently pointed this out. I thought it was just a few cases, but that is not so. There are definitely problems with it.

In the case mentioned in this thread, it is both partial and inferred, and obviously this should be filtered out.

I have looked briefly at the data. There are about 2100 entries in the pairwise collection, so it’s a bit of a job to check them all. Still, it should probably be done at some point. Probably there are only a few mistakes. I hope!

Meanwhile, do you agree we should eliminate the category of inferred parallel, and instead spell all parallels out explicitly? I.e. we should integrate this list with our main parallels data.