A More Systematic Approach to References on SC

sujato · May 14, 2018, 1:05am

Background

SC inherits tens of thousands of texts, with probably hundreds of different referencing systems. Dealing with these elegantly and predictably is one of the hardest problems that we face. We have discussed this many times over the years, and spent a great deal of time and effort during development to get it right. Nevertheless, the current system is still flawed. I’ve been trying to think of ways to improve it, and I think I have an idea.

The problem

We inherit multiple different numbering systems, and they need to be handled in various ways. Ensuring that our system does this correctly in all cases has turned out to be trickier than we thought. Perhaps the problem lies with an insufficient degree of abstraction. We have been doing things case-by-case, and it has not worked out as well as I had hoped.

A proposed solution

I suggest that we introduce a set of three classes of references. For each item, I give a tick or cross to indicate whether I think it is currently being handled properly. Of course I may be wrong! We can discuss each case in more detail, but this is simply to indicate what areas I think need attention.

First class references: Each text must have one—and only one—set of first-class references.
- Use for all internal SC coding, processes, and data. ☒
- Use as the basis for segmenting.
- Display as “Textual Information” both in unsegmented and segmented views.
- Use as unprefixed numbering in URLs, e.g. dn1/en/sujato#4.5. ☒
  - (Prefixed numbering, of course, should work as well, i.e. dn1/en/sujato#pts-cs4.5 = dn1/en/sujato#4.5.)
- Is the recommended reference system for users. ☒
Second class references: Each text may have zero or more sets of second class references.
- Display as “Textual Information” in unsegmented view only. ☒
- Has a URL with prefixed numbering.
- Maintained for backwards compatibility with older sources.
Third class references: Each text may have zero or more sets of third class references.
- Do not display in “Textual Information”.
- Maintained as inherited metadata in texts, but rarely if ever used for referencing.
- Users can only access them through the raw data, not the UI.
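
To make the three classes concrete, here is a minimal sketch in Python of how they might be modelled, and of how a prefixed and an unprefixed URL fragment could resolve to the same first-class reference. It is only an illustration, not how SC's code works: the names `RefClass`, `DN1_SYSTEMS`, `first_class_prefix`, and `resolve_fragment` are hypothetical, `pts-cs` is taken from the URL example above, and `pts-vp` and `ms` are placeholder prefixes. It also assumes that third-class references get no URL at all, which the list above implies but does not state.

```python
from enum import Enum


class RefClass(Enum):
    FIRST = 1   # canonical: used internally, for segmenting, and in unprefixed URLs
    SECOND = 2  # legacy: reachable only through a prefixed URL
    THIRD = 3   # inherited metadata: kept in the raw data, never shown in the UI


# Hypothetical assignment for one text. "pts-cs" comes from the URL example
# above; "pts-vp" and "ms" are placeholder prefixes for illustration.
DN1_SYSTEMS = {
    "pts-cs": RefClass.FIRST,
    "pts-vp": RefClass.SECOND,
    "ms": RefClass.THIRD,
}


def first_class_prefix(systems):
    """Return the single first-class prefix, enforcing 'one and only one'."""
    firsts = [p for p, c in systems.items() if c is RefClass.FIRST]
    if len(firsts) != 1:
        raise ValueError(f"expected exactly one first-class system, found {len(firsts)}")
    return firsts[0]


def resolve_fragment(fragment, systems):
    """Map a URL fragment such as '4.5' or 'pts-cs4.5' to (prefix, number)."""
    # Try longer prefixes first so that one prefix cannot shadow another.
    for prefix in sorted(systems, key=len, reverse=True):
        if fragment.startswith(prefix):
            if systems[prefix] is RefClass.THIRD:
                # Assumption: third-class references are not addressable by URL.
                raise LookupError("third-class references have no URL")
            return prefix, fragment[len(prefix):]
    # No known prefix: treat the fragment as a first-class number.
    return first_class_prefix(systems), fragment
```

With this, `resolve_fragment("4.5", DN1_SYSTEMS)` and `resolve_fragment("pts-cs4.5", DN1_SYSTEMS)` give the same result, which is the equivalence described in the URL bullet.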

How to do it

My understanding of how the coding works is very basic, so here are just some thoughts if they are useful.

Go through all the texts, identify all the referencing systems in them, and assign them to one of the three classes.
- Keep this in a new file, reference-classes.json?
On the front end, inject a CSS class for each reference as appropriate, e.g. first-class-ref.
Refactor the handling of references throughout the code, so that instead of dealing with each individual kind of reference tag, we simply deal with the three classes.
The only time the system needs to know the specifics of the individual references is for naming them.
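
As a rough illustration of the steps above, the sketch below shows one possible shape for reference-classes.json and how the CSS class could be derived from it. Everything here is an assumption for discussion: the file layout, the tag names (`pts-cs`, `pts-vp`, `ms`), and the helper names `build_tag_index` and `css_class_for` are all hypothetical. It also assumes a single site-wide assignment for simplicity; if the class of a reference system differs from text to text, the file would need another level keyed by text.

```python
import json

# Hypothetical contents of reference-classes.json: each reference tag is
# listed under the class it was assigned to. The tag names are placeholders.
REFERENCE_CLASSES_JSON = """
{
  "first": ["pts-cs"],
  "second": ["pts-vp"],
  "third": ["ms"]
}
"""


def build_tag_index(raw_json):
    """Invert the file into a tag -> class-name lookup, rejecting duplicates."""
    index = {}
    for class_name, tags in json.loads(raw_json).items():
        for tag in tags:
            if tag in index:
                raise ValueError(f"{tag!r} is assigned to more than one class")
            index[tag] = class_name
    return index


def css_class_for(tag, index):
    """Return the CSS class to inject for a reference tag."""
    return f"{index[tag]}-class-ref"


TAG_INDEX = build_tag_index(REFERENCE_CLASSES_JSON)
```

The rest of the code would then only need to ask for `css_class_for(tag, TAG_INDEX)` and style or hide references by class, rather than handling each kind of reference tag separately.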

UI

Make this system transparent to users, and allow them to choose how to interact with it via the “Textual Information” dialogue.

Replace the current toggle switch with radio buttons (like the text-view buttons). Give three options, with explanation. Note that this UI governs more than just references, so the explanation must reflect that. Here is how it might work:

View textual information

Display information such as reference numbers, variant readings, and other textual apparatus.

☐ None.
Just the text.

☐ Display, including first-class references.
Shows textual apparatus, including the standard reference numbers for SuttaCentral. Use this to refer or link to our texts.

☐ Include second-class references.
The same as the previous, but also displays various legacy reference numbers. Use this if you’re looking something up from a source that uses a different system, such as the PTS volume/page numbers.
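
The three options could be reduced to a small filter like the one sketched below. This is only a sketch under assumptions: `ViewMode`, `VISIBLE`, and `visible_refs` are hypothetical names, references are represented as simple (label, class) pairs, and the sample labels are invented. The `segmented` flag reflects the rule above that second-class references display in the unsegmented view only.

```python
from enum import Enum


class ViewMode(Enum):
    NONE = "none"      # just the text
    FIRST = "first"    # textual apparatus plus first-class references
    SECOND = "second"  # the same, plus legacy second-class references


# Which reference classes each option may show. Third-class references
# never appear, whichever option is chosen.
VISIBLE = {
    ViewMode.NONE: set(),
    ViewMode.FIRST: {"first"},
    ViewMode.SECOND: {"first", "second"},
}


def visible_refs(refs, mode, segmented=False):
    """Return the labels of the references to display.

    refs is a list of (label, ref_class) pairs, where ref_class is
    "first", "second", or "third".
    """
    allowed = set(VISIBLE[mode])
    if segmented:
        # Second-class references are shown in the unsegmented view only.
        allowed.discard("second")
    return [label for label, ref_class in refs if ref_class in allowed]


# Invented sample data for illustration.
SAMPLE_REFS = [("4.5", "first"), ("pts-vp 1.12", "second"), ("ms 7", "third")]
```

For example, `visible_refs(SAMPLE_REFS, ViewMode.SECOND)` returns both the first-class and the second-class label, while the same call with `segmented=True` returns only the first-class one.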

Pitfalls and problems

Well, when you refactor things, it’s always complicated. So there’s that!

Also, we should carefully consider the use cases, to ensure that the three classes will be sufficient to work in all cases.

Vimala · May 14, 2018, 5:36am

Just note that all parallels are referenced (unprefixed) to SC-paragraph numbers.

Bernat · October 28, 2018, 12:54pm

Perhaps it’s time the Buddhist scholarly community modernised a bit and ditched old systems of citation that depend on printed material, or as I call it, the sliced-tree version.

I came across this very interesting article: Academic citation practices need to be modernized – Advice for authoring a PhD or academic book – Medium

I find it annoying when a paper cites a sutta by PTS page, and I then have to resort to conversion tables or go through the whole index of a nikāya to find out which sutta it’s talking about, when it is so simple to type SN56.11 here on SC or on other sites. So when writing something academic, my instinct is to provide only the sutta number, but then I bend to the established guidelines and add the PTS page. The result is that any sutta I reference gets a long, ugly double citation, e.g. (SN 12.43, ii 73).

It’d be interesting to adapt some of the ideas raised in the article and move towards an online reference point (I propose SC), using its paragraph numbers, or even just providing links: SN 12.43. The article suggests replacing page references with quotations that identify the passage, which people can then find with a simple Control+F search, but the EBTs are too repetitive for that to work. The good thing is that PTS page numbers are still provided in SC’s textual information, although the PTS editions contain more on the variant readings. The alternative would be to digitise the PTS source texts in such a way that one can search by volume and page.

Does anyone know of a good scanned PDF of the PTS tipiṭaka?
Would it be possible to type PTS page references in SC URLs?

sujato · October 28, 2018, 8:23pm

Indeed, I couldn’t agree more. The Bible folks solved this hundreds of years ago: use a consistent set of numbered segments for all editions, based on the semantic divisions in the text. There’s no reason we can’t do the same.

I am currently working on some introductions for the nikayas, and in them I note:

You may encounter various other referencing systems. In academic works, texts are often referenced by volume and page of the Pali Text Society (PTS) edition of the original Pali. This is a regrettable and clumsy convention, since it binds references to a specific paper edition. I hope it is swiftly abandoned in favor of proper semantic references. However, the PTS volume/page numbers are displayed on SuttaCentral in case you need to look up a legacy reference.

And elsewhere:

One innovation that was not pursued consistently was the introduction of chapter and section numbers. These were added to the PTS Pali editions of the Dīgha Nikāya and the Vinaya, and are used in subsequent translations. However most of the PTS editions lack such sections, with the unfortunate consequence that academic referencing of Pali texts is still based on the volume and page of the PTS edition, a system that is neither practical nor precise.

Robbie · October 30, 2018, 11:40pm

The Dictionary of Pali Proper Names, the Concise Pali English Dictionary, and the PTS Pali English Dictionary on SuttaCentral use the PTS reference system. This makes it hard to find the suttaplexes they refer to. Do you think it would be a worthwhile undertaking to change them into SuttaCentral-style references? If so, I’d love to volunteer.

sujato · October 31, 2018, 7:50am

Thanks, that would be awesome. We have a long-term project to improve and enhance our reference handling by incorporating referencing between multiple editions.

Essentially we will end up with arrays that will match the reference from any one edition to any other. When this is done we can supply links for the various dictionaries on the fly; the references in them can stay as they are, but we can give links to the SC references. Or if we like, we can add double references or change the originals.

The process is briefly outlined here:

If you look at the bottom, there are 4 steps. In fact the first one has been done (mostly) already. If you’re interested in undertaking this, I’ll be happy to work with you on it.
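
To give a feel for the idea of arrays that match references between editions, here is a very small sketch in Python. It is an illustration under assumptions, not the planned implementation: `ALIGNED` and `convert` are hypothetical names, the edition keys `sc` and `pts` are placeholders, and the single sample row simply reuses the "(SN 12.43, ii 73)" citation quoted earlier in this thread without checking it against the printed edition.

```python
# Each row aligns the same passage across editions. The one row here is
# illustrative only, reusing the citation quoted earlier in this thread.
ALIGNED = [
    {"sc": "sn12.43", "pts": "SN ii 73"},
]


def convert(ref, source, target, rows=ALIGNED):
    """Translate a reference from one edition's system to another's.

    Returns None when no row contains the reference, so a caller can
    leave the original citation untouched instead of guessing.
    """
    for row in rows:
        if row.get(source) == ref and target in row:
            return row[target]
    return None
```

A dictionary entry citing a PTS page could then be given an SC link on the fly via `convert(ref, "pts", "sc")`, leaving the entry's own text as it is. A real table would be large, so it would more likely be indexed once per edition pair than scanned row by row.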

Robbie · October 31, 2018, 6:56pm

I’d feel honored to work on this issue.

I’ll mention right away that I don’t have experience with programming. I hope that is not a problem.

I just made a GitHub account, rpdejonge, in case that’s needed.

sujato · October 31, 2018, 10:35pm

Okay, well, that sounds promising. Let’s work together and see how it goes. If it turns out to be too much, well at least we tried!

May I ask, where are you based? We should try to talk via Hangouts, Skype, etc. if we can. For the next few weeks I’m travelling in Asia, then I’ll be in Sydney.

Robbie · November 1, 2018, 11:41am

I’m based in Rotterdam, the Netherlands, so CET (UTC+1).

Time and Date has a useful scheduling tool: after inputting two cities the Meeting Planner link below gives overlapping working hours.

Skype works for me. My Skype username is robbie_op_skype.

sujato · November 2, 2018, 10:29am

Okay, thanks Robbie. I’ll be busy these next few days, but after that I should have some free time. I’ll try to contact you then.