Vinaya translations handle missing Pali titles in a breaking way

Snowbird · June 8, 2022, 4:52am

So I just added Bhante @Brahmali’s Vinaya translations to the SC Light interface (sc.readingfaithfully.org) and noticed that there are segments with English titles that do not have a matching segment in the Pali segments. This causes code to break a bit.

For example: https://sc.readingfaithfully.org/?bupd2

See the “undefined” in the Pali title above “Origin Story”? This is what the html looks like:

And this from the translation json:

From the html json:

You can see in both there is a pd2:1.0 segment. However the Pali does not have this segment:

The opposite situation is quite common in the suttas, i.e. when there is a Pali segment that is not included in the translation. In this situation there is simply an empty segment. E.g.:

So, I would suggest if you are going to have official segments, then they should exist in the root texts. I’m sure, Bhante @Sujato, you have already considered this (among countless other considerations), but I thought I would point out a use case where not having the segments in the Pali causes a problem.

Of course I will fix the problem in my code to accommodate this. I just wanted to point it out.

sujato · June 8, 2022, 7:19am

The only entire set of segments is in the html. Often translation is missing, and sometimes, as here, the root.

The presence of extra headings in the root was really to accommodate Brahmali’s Vinaya translation. In the Suttas I don’t think it occurs.

I’m not planning to add the requisite empty segments to the root files, but if you want to do that, make a PR and I’ll merge it. It’ll avoid this kind of issue. It’s not, I think, unreasonable to expect that the root is always present.

Snowbird · June 8, 2022, 7:24am

I’m curious if Bilara automatically creates an empty segment, "" when a translator leaves it empty. If that’s not the case, then it may be better for people using the data to expect empty things.

It would be an interesting coding project to figure out how to add them to the existing root.

sabbamitta · June 8, 2022, 8:25am

Well, it depends.

When a translator just never types anything in a segment, it won’t exist in the translation file.

But if they first type something and later delete it, in this case you will see an empty segment, at least in “unpublished”. Possible that on the data’s way to sc-data, these empty segments get removed again. I think so.

Snowbird · June 8, 2022, 8:49am

The last screen shot from MN51 seems to show them as there with empty quote marks.

Snowbird · June 10, 2022, 9:56am

OK, I have done this. I did it programatically by comparing the html to the root. I only did it for bu-vb and bi-vb as those seemed to be the only ones that had this situation.

I started to do a PR, but then I became unsure.

In my script, I got the data from the published branch: raw.githubusercontent.com/suttacentral/bilara-data/published/root/pli/ms/vinaya.

Should I have been getting it from unpublished instead?

Because as I understand it now, I should be making a fork of unpublished, add my changes, and then make a pull request to merge my branch with with the unpublished.

Snowbird · June 11, 2022, 8:34am

OK, I did find something in the kd.

SuttaCentral this is a peyalla where there is an English translation for segments but there is no segment in the Pali:

Do you want these segments added as well?

sujato · June 13, 2022, 1:13am

sure, why not. go ahead.

Gabriel_L · June 13, 2022, 2:11am

Sometimes the translator may also enter a blank space as well, like at the end of the texts where you sometimes have mnemonic Pali verses quite hard to translate and not meaningfully useful to readers.

sabbamitta · June 13, 2022, 2:18pm

I normally don’t enter anything there.