Making DN 24 Easier to Parse

Dear Ven @Sujato,

I was attempting to read your excellent translation of DN 24 today and found myself getting mighty confused about who was speaking to whom about what.

It’s a bit of a layered, story-in-a-story thing and, to make matters worse, one character (not the Buddha) is repeatedly addressed as “Bhaggava”. :crazy_face: (It didn’t help that I was attempting to read the sutta from the middle, where the reference I was interested in is found, not from the beginning, where the scene is established.)

Looking at it again, I realized the biggest part of my confusion is that long, multi-paragraph quotes don’t get the starting quote marks that are standard in English:

Each of these paragraphs should have an opening quote mark at the beginning, for example.

And the problem gets even worse as we go down:

Here, there is a multi-paragraph quote-within-a-quote! You see the ” at the end of that paragraph but, Oh, no! That’s not the end of the Buddha’s monologue! That’s the end of a third-level quote! :crazy_face: :crazy_face:

It’s a bit like trying to read code with no indentation. Technically possible if you’re reading from the beginning, but quite a headache even then.

So, here’s my modest proposal: follow English convention and start these paragraphs with all the “ ‘ things!

It might also help to introduce “Bhaggavagotta” as “Bhaggava Gotta” so it’s not such a surprise when half the name is missing later… or, perhaps even translate the name (Mr. Potterson?) to avoid confusion with “Bhagavā” among the Pāḷi-Dyslexic such as myself.

Anyway, thanks as always for listening to us ungrateful rabble-rousers! :pray::blush:

3 Likes

This has long been something that bothered me, but I was too polite to complain. :wink:

Edit: Maybe I did complain at some point now that I think about it. Guess I’m not that polite.

I always assumed it had something to do with the segmentation.

I can appreciate that perhaps 90% of the suttapitaka would need to start with the open quote. And I also know that punctuation isn’t a thing in the texts themselves.

However… From a spiritual perspective, I really like to be reminded that this is all something spoken by the Buddha. So when the text doesn’t follow English convention it goes against my expectation that this is all being spoken by someone.

3 Likes

Deducing the speaker is even more difficult for Voice listeners, who cannot hear silent quotes.

Interestingly, the mind adapts on repeated listening. Ever so gradually, the meaning snaps into place and unfolds as the mind orients itself to the semantics of the interchange. Pali was transmitted as an oral tradition, so the contorted quote-within-quote is actually part of the canon, to be understood even by those who cannot read.

One might be inclined to conclude that the contortion was retained as a mnemonic or pedagogical device. As a learner informed by listening, I’ve noticed that the experience of listening is quite different than that of reading. The eye, with its glancing gloss, tends to miss subtle details only perceived with extended repetition. And the ear must make do without quotes.

:thinking:

I too would achieve easier comprehension if the suggestions of the OP could be adopted.

2 Likes

There are a couple of DN/DA sutras that get pretty crazy re: nested quotations. I actually changed to a different web font on Dharma Pearls because, in the Adobe font I was using, it wasn’t clear where one quotation mark ended and the next began. When there were three or four quotation marks beginning a paragraph, they all ran together. So, I switched to the super simple Noto font because it (1) could render all the diacritics and (2) separated quotation marks a bit.

I think the Janavṛṣabha Sutra was the one that made me change fonts. It does tempt a person to just forget it and blockquote or something instead.

The problem here is that paragraphs can’t be assumed to be stable.

In certain views, paragraphs are seen, in other views not. It’s also reasonable to expect that in different environments, paragraphs will be handled differently. For example, on the web, there’s no cost to adding paragraphs, but in print, there is, especially when there are many exchanges between speakers. I think Ven Bodhi doesn’t use paragraphs in this case, IIRC.

So there are compromises either way. At least this way, we have an unambiguous semantics: open quote mark opens speech, close quote mark closes it.
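To make that semantics concrete, here is a minimal Python sketch (not anything SuttaCentral actually runs) that tracks which quote marks are still open at the start of each paragraph, which is the information a view would need if it wanted to add the continuation marks proposed in the OP. The function names and the sample paragraphs are invented for illustration. It assumes quotes are written with “ ” and ‘ ’, and it guesses that a ’ sitting between two letters is an apostrophe rather than a closing mark, which will not hold for every text.

```python
OPEN_TO_CLOSE = {"“": "”", "‘": "’"}


def continuation_marks(paragraphs):
    """Return, for each paragraph, the quote marks still open when it starts."""
    stack = []   # currently open quote marks, outermost first
    result = []
    for para in paragraphs:
        result.append("".join(stack))
        for i, ch in enumerate(para):
            if ch in OPEN_TO_CLOSE:
                stack.append(ch)
            elif stack and ch == OPEN_TO_CLOSE[stack[-1]]:
                # Assumption: a ’ between two letters is an apostrophe ("Buddha’s").
                prev_alpha = i > 0 and para[i - 1].isalpha()
                next_alpha = i + 1 < len(para) and para[i + 1].isalpha()
                if ch == "’" and prev_alpha and next_alpha:
                    continue
                stack.pop()
    return result


def add_continuation_marks(paragraphs):
    """Prefix each paragraph with the marks left open by earlier paragraphs."""
    marks = continuation_marks(paragraphs)
    return [m + p for m, p in zip(marks, paragraphs)]


# Invented sample text, not a quotation from DN 24.
paragraphs = [
    "The teacher said: “Once I stayed in a town.",
    "A man came to me and said: ‘I am leaving.",
    "I will not come back.’",
    "That is what he said.”",
    "So the story ends.",
]
print(continuation_marks(paragraphs))  # ['', '“', '“‘', '“', '']
```

Because the marks are computed per paragraph at display time, a view that does not show paragraphs can simply skip this step, and the stored text keeps the plain open/close semantics.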

Quotes in suttas are hard, and this sutta is a good example why. Hey, at least we have quote marks!

I mean, it’s tempting to handle it entirely in markup and strip the quote marks from the sutta altogether. But that introduces another layer of abstraction, and abstractions always break.

This is actually what BDK has been doing with their print publications, but I think the source of the inspiration is Japanese formatting, which is to blockquote all quotes. :boom: No more quote marks!

I think I may experiment with blockquoting now that I think about it again. Three quote marks deep is just kind of grating when it’s repeated over and over. So maybe blockquote the story that goes there.

Isn’t that just kicking the can down the road, though?

How do you display multiple nested quotes? You can’t have that many levels of indent on a mobile (50% of users right there). And what about ereaders? Audio? There’s no real precedent for UAs that want to handle multiple nested layers of quotes. And you certainly can’t just expect that they’ll do it right.

Quotes can be added in CSS: p:before { content: "“ "; } They don’t have to be added in the back end.

I just did this: padding-left: min(5vw, 4em); It works well enough on both mobile and desktop. The indent doesn’t need to be big to be visually effective, especially with a bit of background-color: rgba(120,120,120, 0.1); or somesuch.

No, no. I would just blockquote the story that forces a third quote mark. BDK’s multiple levels of blockquote are just as annoying to a normal English reader; they do it because their audience (at least, originally) was largely bilingual Japanese American Buddhists. As far as audio goes, there’s nothing to be done other than perhaps hire a cast and use different voices.

None of this is cool. You want to know what’s cool?

Proper metadata to mark the speaker of each line in the canon. Now that’s cool!

Then we could:

  • give a tooltip to identify the speaker
  • color the different speaker backgrounds
  • automate speaking in different voices
  • other cool stuff

There are 136000 paragraphs on SC, so it’s a bit of work.

3 Likes

Another thing is, although starting each paragraph with a new quote mark may be standard in English, this is not necessarily the case in other languages. I would find it rather confusing to see such a thing, for example. So starting each paragraph with quote marks would perhaps add new challenges to internationalization.

1 Like

Well, it would only be in the English translation, wouldn’t it? I don’t think we are talking about punctuating the Pali. Or am I misunderstanding?

1 Like

This wasn’t entirely clear to me.

Oh, actually I’m not sure either now. @Khemarato.bhikkhu ?

I was only thinking about the translation since, as you say, different languages would have different punctuation systems.

1 Like

Yes, I was thinking, if something like this was introduced:

that would perhaps affect more than just one translation language. But I don’t know for sure.

(Not speaking of the fact that in different languages, quote marks look entirely different!)

Indeed.

Bhante Sujato did his usual, genius thing of generalizing the problem and trying to think up general solutions, but I, a simpleton, was just concerned with making Bhante’s otherwise-excellent English translation of this one sutta, DN 24, easier to parse. (As, indeed, the title of this thread states.)

Indeed it would, if added exactly as is to SC’s global CSS file. Obviously that’s not what I meant. In reality you’d scope it down using specific classes, etc.

My point was to gesture in the general direction of a technical solution to the problem brought up by Bhante in his reply. If I had a full solution, I wouldn’t be posting feedback here, I’d just open a PR on GitHub.

Exactly! We’re not speaking about that :rofl: (At least not in this thread. I’m happy to opine about the beauty of French and Chinese quote marks in a new thread, though! Just not this one. :stuck_out_tongue: )

2 Likes

I suppose at a certain point, a project has run its course. Software projects all have a lifecycle. At a certain point, adding features or changing data structures becomes too difficult to contemplate. Not sure what’s so difficult about adding a few tags to a single sutta and a little CSS to match, but reasons, I’m sure.

I think this is why I’ve kept Dharma Pearls simple and only code sparingly, when it becomes absolutely necessary. It keeps my options open (or delays the inevitable point where I throw up my hands and give up on a legacy app).

2 Likes

Indeed, the internationalization is a whole other ballgame.


I really would like to do the “speaker in metadata” thing, though. If you look at Brahmali’s Vinaya, it has an awesome set of semantic markup that structures the whole thing. The Suttas are less intricately structured, so much of that wouldn’t be applicable. But metadata for persons and places; perhaps general literary forms such as “background narrative”, “conversation”, “teachings”. Perhaps also for “doctrines”.

In other texts there are different requirements; for example, in Sanskrit we have things like “reconstructed”, “uncertain”, and so on.

I don’t have a detailed plan for this, so I’m just spitballing!

In the Vinaya, it’s currently maintained as classes in the HTML. But perhaps, in the spirit of Bilara, we should split it out and have a separate set of, let’s call it decorations.

There’s a set list of decorations:

  • narrative
  • speaker
  • genre

and so on. These are maintained as key-value pairs (speaker:buddha, place:rajagaha).

So we might have:

  "dn2:1": "speaker:compiler, genre:narrative, place:rajagaha",
  "dn2:2": "speaker:ajatasattu",
  "dn2:4": "place:ambavana",
  "dn2.7": "speaker:buddha, topic:conversation",
  "dn3.1": "speaker:buddha, topic:gradual-training",

Then you can do various things with this:

  • style it
  • supply extra info (eg. a map for the place, a bio link for the speaker, an explainer for the doctrine)
  • use it for extracting and analyzing language (eg. comparing narrative to teaching segments, this might help determine historical strata).
  • use different voices for different speakers in SC-Voice!
  • things I haven’t thought of.

Anyway, it’s a lot of work!

The good part is, it can be done gradually; it’s a progressive enhancement. And because it’s separated out in Bilara, it can be ignored by any UA that wants to ignore it, and applied across all texts and translations if desired.
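As a rough illustration of how a UA might consume such decorations, here is a minimal Python sketch. It assumes the comma-separated key:value strings from the example above; `parse_decorations` and `segments_where` are hypothetical helper names, not part of Bilara or SuttaCentral.

```python
def parse_decorations(raw):
    """Split 'key:value, key:value' strings into one dict per segment."""
    parsed = {}
    for segment_id, text in raw.items():
        pairs = {}
        for item in text.split(","):
            # partition splits on the first colon only
            key, _, value = item.strip().partition(":")
            pairs[key] = value
        parsed[segment_id] = pairs
    return parsed


def segments_where(parsed, key, value):
    """Return the ids of segments whose decoration `key` equals `value`."""
    return [seg for seg, pairs in parsed.items() if pairs.get(key) == value]


# A few entries copied from the example above.
decorations = {
    "dn2:1": "speaker:compiler, genre:narrative, place:rajagaha",
    "dn2:2": "speaker:ajatasattu",
    "dn2:4": "place:ambavana",
}

parsed = parse_decorations(decorations)
print(parsed["dn2:1"]["place"])                          # rajagaha
print(segments_where(parsed, "speaker", "ajatasattu"))   # ['dn2:2']
```

A UA that does not care about decorations never calls these functions, which is the progressive-enhancement property: the text itself is untouched, and segments without an entry simply have no decorations.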

I don’t see how this metadata, awesome as it would be, solves the problem in the OP. In this sutta, nearly the entire thing would be marked speaker:buddha, including the parts where the Buddha is quoting someone else (let alone where the Buddha is quoting that person quoting someone else).

1 Like