Are we talking at cross purposes here? I’m using “segment” to refer to the inline segments as done in pootle, and considering how to apply this to create a granular reference system site-wide.
Okay, this is a complex issue, and I’m not persuaded—yet. The advantage is that it makes the scope of elements explicit: an ID scope = the scope of that HTML element.
However, we run into problems when it comes to the nitty-gritty. Consider, say, vol/page IDs. These typically occur in the middle of a paragraph. If we are to explicitly define the scope of these it is a complex matter indeed, as we run up against the old “<p> tags are an absolute barrier to inline tags” problem in HTML. We would have to do something like:
<p>Some text. <span class="vp" id="vp1" some-other-identifying-thing="X"> Some more text. and some more text.</span></p>
<p><span class="vp" some-other-identifying-thing="X">Some more text. And more</span></p>
<p><span class="vp" some-other-identifying-thing="X">Some more text. And more</span></p>
<p><span class="vp" some-other-identifying-thing="X">Some more text.</span><span class="vp" id="vp2" some-other-identifying-thing="Y"> And more</span></p>
And so on. I don’t know about you, but I’m not feeling it. Or, we could do as we do now:
<p>Some text. <a class="vp" id="vp1"></a> Some more text. and some more text.</p>
<p>Some more text. And more.</p>
<p>Some more text. And more.</p>
<p>Some more text. <a class="vp" id="vp2"></a>And more</p>
Clean and simple. This is a common use case, but it is just one of the complexities that we would rapidly encounter.
We could, of course, adopt a hybrid system, with IDs defined by the scope of the HTML element in some cases, and in other cases not. Yeah, I don’t think so.
It seems to me that the current system works basically fine, and, importantly, it doesn’t fight against the nature of HTML. It is what it is. HTML simply doesn’t offer a native way of defining things across multiple paragraphs like that: that’s not how it works. You can define things as a block level, a span inside a block, or a point, and that’s about it. Of course you can do all sorts of complicated things to work around this, but you need a really good reason.
Think about how HTML normally defines document structure. A section of a document can be explicitly marked as such, but normally it is just by heading level, as I mention above. The scope of a section headed by <h2> is not defined by the </h2> but by the start of the next <h2>. Document structure is inferred, not explicit. Of course you can make it explicit, and the <section> tag is there for that, but it is not needed unless there is a special reason.
By defining IDs as points, we are working in a similar way. The scope of <a class="sc" id="sc1"></a> is inferred to be up to the beginning of <a class="sc" id="sc2"></a>. As long as this is understood consistently, it shouldn’t be a problem.
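To make the "scope is inferred" idea concrete, here is a rough sketch in Python of how a script could recover the text belonging to each point ID. It is only an illustration, not how SC actually processes texts: the class name `PointScopeParser` and the function `infer_scopes` are made up for this example, and it assumes the markers are empty `<a>` tags with exactly one class, as in the snippets above.

```python
from html.parser import HTMLParser


class PointScopeParser(HTMLParser):
    """Collect the text that follows each point marker, up to the next marker.

    Hypothetical helper: assumes markers look like <a class="sc" id="sc1"></a>.
    """

    def __init__(self, marker_class="sc"):
        super().__init__()
        self.marker_class = marker_class
        self.scopes = {}     # marker id -> list of text chunks in its scope
        self.current = None  # id of the most recent marker seen, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A new marker closes the previous scope simply by starting its own.
        if tag == "a" and attrs.get("class") == self.marker_class and "id" in attrs:
            self.current = attrs["id"]
            self.scopes[self.current] = []

    def handle_data(self, data):
        # Text before the first marker belongs to no scope and is skipped.
        text = data.strip()
        if self.current is not None and text:
            self.scopes[self.current].append(text)


def infer_scopes(html, marker_class="sc"):
    """Return {marker id: text in its inferred scope}, in document order."""
    parser = PointScopeParser(marker_class)
    parser.feed(html)
    parser.close()
    return {key: " ".join(chunks) for key, chunks in parser.scopes.items()}
```

Note that the paragraph boundaries play no part in the result: a scope runs straight across `</p><p>`, which is exactly the thing that is awkward to express with wrapping elements.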
Agreed, the potential for confusion is there, but if we’re clear and careful I can’t see a problem.
Note also that this is not just in line with how HTML document structure works, but also with how our source texts work. The scope of a page number is defined by the next page number. You don’t explicitly write “This page ends, new page begins”. There is a point, and the scope is inferred.
Actually this was something I was going to mention. The “sc” numbers are essentially a fallback. In general, they should only be used in the case where there are no suitable references in the source text. The idea is that we should avoid adding our own referencing system but should, so far as is possible, inherit the system developed by others. Occam’s principle of referencing: thou shalt not multiply reference systems unnecessarily. We haven’t always been as clear with this as we should, but that’s the general idea.
This is why, in my original post, I suggested we treat text segments as subsets of an existing reference system. This is something I’ve discussed with Blake previously. On Pootle, the segments are simply numbered sequentially from the beginning of the text. However, I am suggesting we restart them after each occurrence of our main reference in that particular text. That way, the reference is still meaningful even if the segments aren’t there.
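As a rough sketch of what restarting the segment numbers would look like, here is a small Python function. Everything specific in it is an assumption for the sake of illustration: the function name `number_segments`, the `("ref", ...)` / `("seg", ...)` input shape, the `ref.n` ID format, and the use of `"0"` as a prefix for segments that come before the first reference are not an agreed convention.

```python
def number_segments(items):
    """Assign segment IDs that restart after each main reference.

    items: ("ref", ref_id) or ("seg", text) tuples in document order
           (a hypothetical input shape for this sketch).
    Returns a list of (segment_id, text) pairs.
    """
    numbered = []
    current_ref = None  # most recent main reference, e.g. "vp1"
    counter = 0         # segment count since that reference

    for kind, value in items:
        if kind == "ref":
            # A new main reference resets the segment counter.
            current_ref = value
            counter = 0
        elif kind == "seg":
            counter += 1
            # Assumption: segments before any reference get the prefix "0".
            prefix = current_ref if current_ref is not None else "0"
            numbered.append((f"{prefix}.{counter}", value))
        else:
            raise ValueError(f"unknown item kind: {kind!r}")

    return numbered
```

The point of the design is that stripping everything after the dot still leaves a valid reference in the inherited system, so the ID degrades gracefully if the segments are ever removed.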
Our system needs to not only work in and of itself, but, so far as is possible, to be consistent and interoperable with other works.
With regard to the Dhammapadas, the IDs should be assigned a suitable class, and additional information given in the “title” attribute. For example, the Gdhp is from the Brough edition, so: <a class="brough" id="brough1" title="Verse number in the Brough edition."></a>
Absolutely, these are critical. We need to display ways of getting to the text. It is an awesome feature of SC: not only do we supply the information, we tell you what it means when you need it. No other site does this, so far as I know, and it makes searching for references much, much easier.
Again, absolutely! These were hand-added by Ven KB for SC, and they are awesome. If you’re reading the English translation, you can go straight to the Pali passage. Again, nowhere else does this, so far as I know. Also, these numbers are used in many of our other languages, so they are useful for more than English.
Again, yes, it certainly is. Consider, for example, the referencing used in the English translation itself. Even there, when IB Horner is referring to the same text she is translating, she uses multiple systems: the volume/page of the Pali, the volume/page of the English, the rule number, or the section number of the Pali. It is a nightmare. Our SC text smooths the path by defining each reference at each occurrence so you always know what is being referred to.
Yes, as a minimum, if a text does not have paragraph numbers, we should add them. (But as <a> tags, unless we decide to adopt Blake’s suggestion.)