SuttaCentral Ubiquitous Language

Hey folks,

I’m working on our data loading process: testing, refactoring and optimising. I’m also interested in Domain Driven Design and how it might be applied. Creating a ubiquitous language is one of the most important aspects of DDD. I’d like to use this thread to build this up in collaboration with our users.

So here’s a question:

We have the following:

  • Root texts in Pali and Chinese
  • Segmented texts: texts that have been broken up into segments. Only the Pali is complete at this point.
  • Aligned text: a translation of a root text into another language, where the segments of the root and the translation are mapped to one another. This is usually produced via Bilara.

Then there are legacy texts. All the translations that exist only as HTML and have not yet been segmented fall into this category.

My question then is: is a root text that has not been segmented a legacy text?

Cheers,

Ajahn J.R.

4 Likes

Summoning Ven @Snowbird and Ayya @sabbamitta as the power-users most likely to have an idea

2 Likes

I’ve never heard the term “legacy” applied to a root text, but I don’t know. I wasn’t among the creators of SuttaCentral, and perhaps they had something more specific in mind than just the fact that a text is not segmented. So I’d probably summon Bhante @sujato as well.

2 Likes

FWIW, @ausername called them “legacy roots” in this post

And @karl_lew used the term “non-Bilara texts” here

3 Likes

This one refers clearly to translations, not root.

4 Likes

No, I haven’t seen anyone do so.

3 Likes

Hmm… that tells me there isn’t a common term used in this case. Feel free to come to a consensus on what we should call them.

3 Likes

Yes, this has long been an annoyance to me. I know naming things is hard.

I am not sure what we call un-segmented root texts. @cdpatton?

I think segmented === aligned, btw.

I really don’t like the name “legacy”. Since we are still adding them it doesn’t make any sense. And it doesn’t really mean what they are. Just think about translating the term and you can imagine the problem.

That said, I have no idea what else to call them. Naming things is hard.

4 Likes

I guess they might be called ‘un-segmented root texts’? :wink:

5 Likes

When I first started reading SC, I thought “aligned” meant “aligned with the values / doctrine of the website” rather than meaning what it does… :grin:

How about SC / Sangha division? SC texts are written on Bilara, with segmented root texts. Sangha is texts from sangha at large. Though not sure if this implies SC is not Sangha.

:laughing:

1 Like

Speaking of naming things being hard… Here in EBT land, sangha means either the ariya sangha or the ordained sangha.

3 Likes

:man_shrugging: I just call them texts. My impression, which is even less informed than @sabbamitta’s, is that “legacy” refers to the older website format that’s still preserved in sc-data, which is deprecated in favor of the bilara data structure. In practice, Bilara is an international translation project for Sujato and Brahmali’s translations. The technical difficulties with importing other root texts continue.

3 Likes

Thankfully https://legacy.suttacentral.net is gone now :sweat_smile:

Hey! I built that you know?

4 Likes

That works for me. In the code I’ll use LegacyText and UnsegmentedRootText
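
A minimal sketch of how those two names might look as separate domain types in Python. Only the class names come from this thread; the fields (`uid`, `language`, `html`) are hypothetical placeholders, and storing an unsegmented root text as HTML is an assumption, not a description of the actual loader code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyText:
    """A translation that exists only as HTML and has not been segmented."""

    uid: str       # hypothetical identifier for the text
    language: str  # hypothetical language code of the translation
    html: str      # the unsegmented HTML body


@dataclass(frozen=True)
class UnsegmentedRootText:
    """A root text (Pali or Chinese) that has not been broken into segments."""

    uid: str       # hypothetical identifier for the text
    language: str  # hypothetical language code of the root text
    html: str      # assumption: stored as HTML, like legacy translations


# Keeping the two types unrelated means a root text can never be
# mistaken for a legacy text, which matches the answers given above.
root = UnsegmentedRootText(uid="example-root", language="lzh", html="<p>...</p>")
legacy = LegacyText(uid="example-translation", language="de", html="<p>...</p>")
```

Making them distinct classes rather than one class with a flag keeps the vocabulary in the code the same as the vocabulary in the conversation.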

2 Likes

One thing I struggle with is the type instance homonym “text”. It means both “one of our texts” and “the text in the file or on the screen”. The former is a domain concept, the latter an implementation detail. And yes, a translation of a Pali sutta can be “on the screen” too.

Next up: volpage

This is used a bit in the current code, and perhaps with the legacy site if I recall correctly. It’s a contraction of Volume and Page and is shown with the Pali, and also the Chinese. As far as I can tell volpage is simply a programming abbreviation and not a part of our domain.

Is this the case?

Is it correct to use the phrase Volume and Page for the Chinese?

Ti 421a12 is obviously some sort of reference, as is volume and page for the Pali.

1 Like

@cdpatton should perhaps know?

Not Ti, T i

Taisho Volume i Page 421 Column a Line 12
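
To make that breakdown concrete, here is a rough sketch of parsing such a reference in Python. The names `TaishoReference` and `parse_taisho_reference` are made up for illustration and are not from the SuttaCentral codebase. It assumes the volume is written as a lowercase Roman numeral (as in the example) and that columns are lettered a to c.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape: "T", a lowercase Roman-numeral volume, then page,
# column letter and line number run together, e.g. "T i 421a12".
_TAISHO_REF = re.compile(
    r"^T\s*(?P<volume>[ivxlcdm]+)\s+(?P<page>\d+)(?P<column>[abc])(?P<line>\d+)$"
)


@dataclass(frozen=True)
class TaishoReference:
    volume: str  # kept as written, e.g. "i"
    page: int
    column: str  # "a", "b" or "c" (assumed lettering)
    line: int


def parse_taisho_reference(ref: str) -> Optional[TaishoReference]:
    """Return the parts of a Taisho reference, or None if it does not match."""
    match = _TAISHO_REF.match(ref.strip())
    if match is None:
        return None
    return TaishoReference(
        volume=match["volume"],
        page=int(match["page"]),
        column=match["column"],
        line=int(match["line"]),
    )
```

With this sketch, `parse_taisho_reference("T i 421a12")` gives volume `"i"`, page `421`, column `"a"`, line `12`. The run-together form `"Ti 421a12"` is accepted too, since that is how it currently shows up.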

4 Likes

To give it context, below is the first page of the Taisho. There are three columns, and usually around 29 lines to each column, numbered from right to left in the references. The text is read traditionally, top to bottom, right to left.

You can look at scans of the Taisho at CBETA: CBETAonline 影像 (scanned images)

4 Likes