New parallels to add

sujato · March 31, 2016, 9:56am

That’s right, yes. And hopefully that will make it easier to check them. In addition, they need to be resolved against the parallels listed in the csv files, as there will be some parallels found in both places.

Vimala · March 31, 2016, 9:59am

So I will extract them into one file.
While making the JSON file for the parallels I noticed that some are listed twice in different places. That should not be a real problem once we have this implemented, and it might occasionally happen anyway. The program will filter out the duplicates.
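A minimal sketch of what such a filter could look like, assuming each parallel group is simply a list of ids and that two groups count as the same when they contain the same ids in any order. The function name and the placeholder ids are hypothetical, not taken from the actual program.

```python
def drop_duplicate_groups(groups):
    """Return the groups with repeats removed, keeping first occurrences in order.

    Assumption: two groups are duplicates if they contain the same ids,
    regardless of the order in which the ids are listed.
    """
    seen = set()
    unique = []
    for group in groups:
        key = frozenset(group)  # order-insensitive identity of the group
        if key not in seen:
            seen.add(key)
            unique.append(group)
    return unique


# Placeholder ids, for illustration only.
groups = [
    ["a", "b"],
    ["b", "a"],  # same group, listed elsewhere in a different order
    ["a", "c"],
]
unique_groups = drop_duplicate_groups(groups)
```

Note that this only removes whole groups that repeat. It does not merge groups that merely overlap, and it does not say which copy is the correct one.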

sujato · March 31, 2016, 10:02am

Sure, because programs always work exactly as you want them to, and never create unexpected problems.

At the end of the day, it’s the data which is most important. We’re using parallel data that’s over a century old in some cases. Long after SuttaCentral the website has disappeared, people will, I hope, be using our data, expanding and improving it and using it in new ways. We should take the time and care to ensure that all of the data is consistent, clean, non-redundant, and reusable.

Vimala · April 1, 2016, 12:01pm

Herewith a dump of all the cross-references (the ones that refer to another page on the site), with duplicate groups (not duplicate entries within groups) filtered out of the list. This immediately shows some problems, because many of these references are obviously not correct.

For instance:

['an10.106#1', 'dn34#196']
['an10.106#1', 'dn34#224']

The first one links to the top of page dn34 (which should be sc-id = 1 but it is 196), while the second one links correctly.

So we have several problems here. The first is that the sc-ids don’t start at 1 on every page, as @Sujato already pointed out.
The second is that links often point to the top of a page rather than to the correct passage within the page.
In other cases they refer to the wrong page altogether, probably because the numbering of pages was changed later.
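One way to surface the second problem might be to flag every cross-reference whose target is the first sc-id of its page. The sketch below assumes references have the form `page#sc-id` as in the examples above, and that a table of first sc-ids per page is available. That table (`first_id_by_page`) and both function names are hypothetical; only the value for dn34 comes from this post.

```python
def split_ref(ref):
    """Split a reference such as 'dn34#196' into ('dn34', '196')."""
    page, _, sc_id = ref.partition("#")
    return page, sc_id


def links_to_page_top(ref, first_id_by_page):
    """True if ref points at the first sc-id of its page (a suspect link)."""
    page, sc_id = split_ref(ref)
    return first_id_by_page.get(page) == sc_id


cross_refs = [
    ["an10.106#1", "dn34#196"],
    ["an10.106#1", "dn34#224"],
]

# Hypothetical lookup table; dn34 starting at sc-id 196 is the case noted above.
first_id_by_page = {"dn34": "196"}

suspect = [
    pair for pair in cross_refs
    if links_to_page_top(pair[1], first_id_by_page)
]
```

A link to the top of a page is not necessarily wrong, so the result is only a list of candidates to check by hand.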

crossreferences.zip (86.0 KB)

sujato · April 1, 2016, 10:30pm

Thanks. Well, looks like we have a job in front of us.

Extra points for use of the word “herewith”!

sujato · April 3, 2016, 1:55am

AN 7.40 is listed as a cross-reference to AN 6.72. However, it is closer to AN 6.24, as noted in BB’s notes. Probably both should be partial parallels.

Note, however, that in this case the cross reference is in fact to the correct reference. Not all hope is lost!

sujato · April 8, 2016, 2:18am

SN 22.10 has a number of listed cross-references. To these we can add AN 7.71 as a partial parallel, as noted by Ven Bodhi.

Vimala · April 8, 2016, 11:47am

??? Are you sure you mean SN 22.10? I can’t see any parallels or cross-references. I also looked in the HTML, and no cross-references are mentioned there.

sujato · April 9, 2016, 2:37am

Sorry, it should be SN 22.101.

sujato · April 20, 2016, 12:42am

sn2.7#2 is full parallel with an9.42#2

sujato · April 28, 2016, 2:29am

There’s a confusion in the parallels for AN 10.45. Two separate sets of parallels have been conflated, and moreover at least one parallel is missing. Here are the correct sets of parallels:

AN 10.45, EA 46.6, pi-tv-bu-vb-pc83#5-14
SA 495, MA 48*, AN 5.168, AN 11.4-5.

sujato · May 4, 2016, 10:57am

Some that are mentioned by BB:

AN 10.119 = AN 10.167*
AN 10.120 = AN 10.168*

sujato · May 10, 2016, 11:02am

SN 22.90 and SA 262 are full parallels of each other.

In addition, they are both partial parallels of the set of full parallels to SN 12.15. In fact SN 12.15 is quoted in them.

Vimala · May 17, 2016, 7:50am

I have uploaded a new file with the Dhp parallels to GitHub, as well as its JSON equivalent. Note that these use the new numbering system, which is not yet on SuttaCentral but is on Staging. The file now contains over 3500 parallels between the various Dhammapada versions in Pali, Chinese, Sanskrit, Prakrit and Gandhari.

https://github.com/suttacentral/suttacentral-data/blob/fa7b08e089b2c2106633b0c50687dd4eee1d9247/table/newnumbering/paralleldhp.csv

dhp#1,pdhp#1,,
dhp#1,gdhp#201,,
dhp#1,uv31#31.23,,
dhp#1,t213.31#vagga-gatha-num-sc31.13,,
dhp#1,t212.32#vagga-gatha-num-sc32.13,,
dhp#1,t210.9#vagga-gatha-num-sc9.1,,
dhp#1,mkv#66,1,
dhp#1,ne37#10,,
dhp#1,pe2#6,,
dhp#2,pdhp#2,,
dhp#2,gdhp#202,,
dhp#2,uv31#31.24,,
dhp#2,t213.31#vagga-gatha-num-sc31.14,,
dhp#2,t212.32#vagga-gatha-num-sc32.14,,
dhp#2,t210.9#vagga-gatha-num-sc9.2,,
dhp#2,mkv#66,1,
dhp#2,ne37#65,,
dhp#2,pe8#47,,
dhp#2,pe2#6,,
dhp#3,mn128#6,,

This file has been truncated.
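For anyone wanting to work with the file, the rows above can be read with Python’s standard `csv` module. This is only a sketch: the meaning of the third column (empty or `1`) is not stated in this thread, so it is carried along as an opaque flag, and the function name `read_parallels` is made up for the example.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the excerpt above.
sample = """\
dhp#1,pdhp#1,,
dhp#1,gdhp#201,,
dhp#1,mkv#66,1,
dhp#2,pdhp#2,,
"""


def read_parallels(lines):
    """Map each first-column id to a list of (second-column id, flag) tuples.

    Assumption: the third column is kept as an uninterpreted string.
    """
    by_source = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 2 or not row[0]:
            continue  # skip blank or malformed lines
        source, target = row[0], row[1]
        flag = row[2] if len(row) > 2 else ""
        by_source[source].append((target, flag))
    return dict(by_source)


parallels = read_parallels(io.StringIO(sample))
```

To read the real file, pass an open file object instead of the `io.StringIO` wrapper.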

Vimala · May 17, 2016, 8:01am

I have a question about the inferred parallels that are on the site now. For some the relationship is easy to see, but for others it is not. For instance, our current file says:

an3.37-38,ea24.6,
an3.37-38,ma202,
an3.37-38,sa-2.46,
an3.37-38,sa1117,
an3.37-38,sht-sutta55,
an3.37-38,t87,1,
an3.37-38,t88,1,
an3.37-38,t89,1,

and

an3.70,ea24.6,
an3.70,ma202,
an3.70,sa861-863,
an3.70,t87,1,
an3.70,t88,1,
an3.70,t89,1,

So these are partly overlapping, but not full parallels. The system at the moment just groups all that together and makes an3.37-38 a full parallel of an3.70, which is incorrect. How should this be put into correct groups? I feel a little bit hampered in not being able to read the Agama texts so I cannot make a judgement on whether texts are full or partial parallels.

My initial feeling about this is to make ea24.6 and ma202 into partial parallels for both an3.37-38 and an3.70. But is that correct?
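To illustrate why the two sets collapse into one, here is a sketch of transitive grouping using a small union-find. It is a stand-in for the behaviour described above, not the site’s actual code; the function name is hypothetical and the third CSV column is ignored.

```python
from collections import defaultdict

# Some of the (first column, second column) pairs quoted above.
rows = [
    ("an3.37-38", "ea24.6"),
    ("an3.37-38", "ma202"),
    ("an3.37-38", "sa1117"),
    ("an3.70", "ea24.6"),
    ("an3.70", "ma202"),
    ("an3.70", "sa861-863"),
]


def merge_groups(pairs):
    """Group ids so that any two ids linked by a chain of pairs share a group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for item in parent:
        groups[find(item)].add(item)
    return list(groups.values())


groups = merge_groups(rows)
```

Because ea24.6 and ma202 appear under both an3.37-38 and an3.70, everything ends up in a single group, including sa1117 and sa861-863, which are each listed under only one of the two.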

sujato · May 17, 2016, 8:08am

Sadhu Sadhu! That’s fantastic work, Ayya, I hope we can get this implemented live ASAP.

Meanwhile, when you get a moment, could I ask you to make a separate post on Discourse to describe your sources and methods, the texts included, and so on? Maybe Git it as well, for posterity.

sujato · May 17, 2016, 8:11am

I don’t believe that there’s a programmatic way to solve this, which is one of the weaknesses of the current system.

One question: when the parallels that are currently inferred are implemented in the new system, are they marked as inferred? Or is that info lost? (As it is, for example, in our current display, where we don’t distinguish these things)

I would simply suggest making a list of all inferred parallels to be checked later. If most of them are unproblematic, just include the ones you have doubts about. When we have the final list, we can ask Rod to check them.

Vimala · May 17, 2016, 8:29am

I have a list of “doubles” and where they occur (see below). There are not too many, but I cannot make the judgement. It’s not a programmatic way to solve this that I am looking for, but some general guidelines on how I should restructure the current files to make them suitable for JSON. I can only do that if I know which ones are partial and which ones are full parallels. But of course it would make sense if Rod could do that.

The new system is a completely different way of representing parallels. There are groups that belong together, with A, B, C as full parallels and D, E, F as partial parallels to that group. That is basically the only info that is in our current csv files, apart from the fact that a given text can now appear multiple times in different groups as “full parallel”, which results in the two initial groups being merged.

So in the new system, when you click on A, B or C, it shows you all parallels of the group and its partials (so A-F). If you click on D, E or F, it only shows you A-C, because D, E and F might not necessarily have anything in common. If they do, they should be marked as such in a separate group.
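The lookup rule just described can be sketched as follows. The `full` and `partial` field names and the function name are hypothetical; the real JSON layout is not shown in this thread. The sketch also leaves the clicked text itself out of the result, which is a choice made for the example.

```python
def visible_parallels(uid, groups):
    """Return the ids shown when uid is clicked, under the rule described above.

    Full members see the whole group plus its partials;
    partial members see only the full members.
    """
    shown = set()
    for group in groups:
        full = set(group["full"])
        partial = set(group["partial"])
        if uid in full:
            shown |= full | partial
        elif uid in partial:
            shown |= full
    shown.discard(uid)  # do not list a text as its own parallel
    return shown


# Hypothetical group using the letters from the post.
groups = [{"full": ["A", "B", "C"], "partial": ["D", "E", "F"]}]

from_a = visible_parallels("A", groups)
from_d = visible_parallels("D", groups)
```

If D and E do turn out to be related, that relationship would go in a second group of its own, and the same function would then pick it up.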

['ea24.6', 'ma202', 'gf10', 'sa875-876', 'an4.8', 'an5.140', 't1536.15d', 't1536.16b', 't115', 'ma202', 'ma113', 't59', 'sa1250', 'sa-2.145', 'sa919', 'an4.162', 'an3.36', 'an11.17', 'ma211', 'ma29', 'sa-2.200', 'sa-2.201', 'sa-2.202', 'sa-2.203', 'sa-2.204', 'sa954', 'sa-2.6', 'sa1163', 'sa1163', 'sa1269', 'sa966', 'sa967', 'sa968', 'sa969', 'sa970', 'sa971', 'snp2.1', 'ja531', 'sa-2.284', 'sa1286', 'sa-2.105', 'sa1192', 'sa-2.290', 'sa1292', 'dq104', 'sn12.2', 'sa285', 'sa354', 'sa450', 'sa954', 'sn16.11', 'sa1316', 'sa-2.170', 'sa586', 'sa-2.138', 'sa1001', 'sa-2.187', 'sa593', 'sa-2.136', 'sa999', 'sa-2.189', 'sa595', 'sa-2.363', 'sa1343', 'sn2.26', 'sa-2.306', 'sa1307', 'sa-2.172', 'sa588', 'sa-2.309', 'sa1310', 'sa2', 'sa72', 'sa5', 'sa198', 'sa199', 'sa164', 'sa142', 'sa152', 'sa164', 'ea27.8', 'sa274', 'sa146', 'sn35.121', 'sa239', 'sa240', 'sa274', 'sa206', 't505', 'sa207', 'sa473', 'sa474', 'sa476', 'sa476', 'sa-2.142', 'sa1004', 'sa106', 'sa748', 'sn45.8', 'sa778-780', 'sa705', 'sa705', 'sa715', 'an5.193', 'sa747', 'sa747', 'da18', 'sa498', 'sn47.24', 'sa652', 'sa642', 'sa675', 'sa673', 'sa537', 'sn55.46', 'sa1125', 'sa417', 'sa-2.101', 'sa1188', 'sa-2.355', 'sa1335', 'sa-2.363', 'sa1343', 'sa17', 'sa9']

sujato · May 17, 2016, 8:37am

Okay, that’s good. Can you supply a list in a more meaningful form? I’m not sure how to make sense of that big cluster. If we have just a simple list, Rod can go through and mark them as full or partial, then we can proceed.

Vimala · May 17, 2016, 8:41am

Sorry, yes, this just means all the items that are mentioned twice or more in the correspondence.csv list we currently have. So if you have the first item: ea24.6, you just search in correspondence.csv and see where it appears several times. I can make a more readable list for Rod if you like.
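A more readable list could be produced along these lines. The sketch assumes that correspondence.csv uses the same layout as the rows quoted earlier in this thread (first-column id, second-column id, optional flag), and it takes “mentioned twice or more” to mean appearing under more than one first-column entry. Both are assumptions, and the function name is made up.

```python
import csv
import io
from collections import defaultdict

# Rows quoted earlier in the thread, standing in for correspondence.csv.
sample = """\
an3.37-38,ea24.6,
an3.37-38,ma202,
an3.37-38,sa1117,
an3.70,ea24.6,
an3.70,ma202,
an3.70,sa861-863,
"""


def repeated_ids(lines):
    """Map each second-column id to the first-column ids it appears under,
    keeping only ids that appear under more than one."""
    sources = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        source, target = row[0], row[1]
        if source not in sources[target]:
            sources[target].append(source)
    return {t: s for t, s in sources.items() if len(s) > 1}


report = repeated_ids(io.StringIO(sample))
for target, found_under in report.items():
    print(f"{target}: {', '.join(found_under)}")
```

Each line of the output names one repeated text and the entries it is listed under, which could then be marked by hand as full or partial.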