Automatically linking up citations to suttas on D&D

Why can’t you just use [:\.] in the regex?

2 Likes

This is because of the way capturing groups work.

Let’s say we have strings ranging from SC1.1 to SC9.9. Then the corresponding regex to match them would be

\bSC\s?([1-9]\.[1-9])\b

where

  • \b matches word boundary
  • \s matches whitespace character
    • ? repeated 0 or 1 times
  • () is a capturing group
    • [1-9] matches range of digits from 1–9
    • \. matches dot character (unescaped dot is a special character representing any character)

Capturing groups are enumerated (from left to right) and can be used in substitution rules using $ and group number, in this case $1.

This regex rule matches all strings from SC1.1 to SC9.9, and captures 1.1–9.9

The substitution rule can then be https://example.com/sc$1, which will produce https://example.com/sc1.1 to https://example.com/sc9.9
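To try this rule outside the forum, here is a minimal Python sketch. It is only an illustration, not how the D&D component is implemented: `link_citations` is a made-up helper name, and Python spells the group reference `\1` where the rule above uses `$1`.

```python
import re

# Same pattern as above: "SC", optional whitespace, one captured "digit.digit" group.
PATTERN = re.compile(r"\bSC\s?([1-9]\.[1-9])\b")


def link_citations(text: str) -> str:
    """Replace each matched citation with its link (hypothetical helper)."""
    # \1 is Python's spelling of the $1 group reference.
    return PATTERN.sub(r"https://example.com/sc\1", text)


print(link_citations("See SC1.1 and SC 9.9."))
# See https://example.com/sc1.1 and https://example.com/sc9.9.
```

Strings outside the range, such as SC0.1, are left untouched.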

If we want to match both . and : in SC1.1 to SC9.9 and SC1:1 to SC9:9, the regex to match all strings would be

\bSC\s?([1-9][\.:][1-9])\b

but the capture this time would be 1.1–9.9 and 1:1–9:9. We cannot use $1 for substitution this time, because the link would be wrong for captures with :.

We can adjust the regex like this (note the two capturing groups)

\bSC\s?([1-9])[\.:]([1-9])\b

and then construct the substitution rule as https://example.com/sc$1.$2 (we capture two separate digits and manually substitute the separator with a dot).
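The two-group version can be sketched the same way in Python. Again this is only an illustration with a made-up `link_citations` helper; `\1` and `\2` stand in for `$1` and `$2`, and inside a character class the dot needs no escape, so `[.:]` is equivalent to `[\.:]`.

```python
import re

# Two capturing groups; the separator is matched but not captured.
PATTERN = re.compile(r"\bSC\s?([1-9])[.:]([1-9])\b")


def link_citations(text: str) -> str:
    """Link citations written with either separator (hypothetical helper)."""
    # The dot between \1 and \2 is literal, so ":" never reaches the link.
    return PATTERN.sub(r"https://example.com/sc\1.\2", text)


print(link_citations("SC1:1 and SC 9.9"))
# https://example.com/sc1.1 and https://example.com/sc9.9
```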

This seemingly works, so why is it not working in our real-world case?

Note that all SC ranges in our example were of the same size (1.1–1.9, 2.1–2.9, …, 9.1–9.9). What if this was not so? Let’s say our ranges shrink progressively, for example 1.1–1.9, 2.1–2.8, …, 8.1–8.2 and 9.1. Now we cannot construct a simple matching regex rule if we only want to match these specific ranges. The rule is a bit more complex, but still a single rule

\bSC\s?(?:(1)[\.:]([1-9])|(2)[\.:]([1-8])|(3)[\.:]([1-7])|(4)[\.:]([1-6])|(5)[\.:]([1-5])|(6)[\.:]([1-4])|(7)[\.:]([1-3])|(8)[\.:]([12])|(9)[\.:](1))\b

where

  • (?:) is a non-capturing group (the outermost group)
  • | is the OR operator (in this case it helps make separate matches for 1.1–1.9, 2.1–2.8, …, 8.1–8.2 and 9.1)

But there are more than two capturing groups now (nine pairs) and they are all enumerated from left to right. Only the pair belonging to the alternative that matched is filled, which makes it impossible to construct the replacement rule as before: $1.$2 will only produce a link for 1.1–1.9, because for every other range those two groups are empty.

For this to work, the single rule must either be broken into nine separate rules, or we must accept a wider match that also captures some invalid $1.$2 combinations for the generated link.
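A small Python sketch may make the group numbering problem easier to see, together with the "nine separate rules" workaround. The names (`LIMITS`, `digit_class`, `RULES`, `link_citations`) are made up for this illustration, and the limits are the invented shrinking ranges from the example, not real sutta counts.

```python
import re

# Invented limits from the example: chapter n allows suttas 1..(10 - n).
LIMITS = {chapter: 10 - chapter for chapter in range(1, 10)}


def digit_class(last: int) -> str:
    """Build the character class for the digits 1..last."""
    return "1" if last == 1 else f"[1-{last}]"


# The single alternation rule: eighteen groups, one pair per alternative.
single = re.compile(
    r"\bSC\s?(?:"
    + "|".join(rf"({c})[.:]({digit_class(n)})" for c, n in LIMITS.items())
    + r")\b"
)

m = single.search("SC2:3")
# The match landed in groups 3 and 4; groups 1 and 2 stay empty.
print(m.group(1), m.group(2), m.group(3), m.group(4))  # None None 2 3

# Python 3.5+ expands unmatched groups to empty strings, so the old rule breaks.
broken = single.sub(r"https://example.com/sc\1.\2", "SC2:3")
print(broken)  # https://example.com/sc.

# Workaround: nine separate rules, each with its own \1 and \2.
RULES = [
    (re.compile(rf"\bSC\s?({c})[.:]({digit_class(n)})\b"),
     r"https://example.com/sc\1.\2")
    for c, n in LIMITS.items()
]


def link_citations(text: str) -> str:
    """Apply the nine rules one after another (hypothetical helper)."""
    for pattern, template in RULES:
        text = pattern.sub(template, text)
    return text


print(link_citations("SC2:3 and SC9.1, but not SC9.2"))
# https://example.com/sc2.3 and https://example.com/sc9.1, but not SC9.2
```

Running the rules one after another is safe here because the generated links contain lowercase "sc", which the case-sensitive patterns do not match again.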

2 Likes

Thank you sooo much for this reply. I appreciate all the time it must have taken to create.

I think I am starting to get it now. The core issue seems to be that while you could easily capture exclusively correct citations with a :, once captured you would not be able to use what you captured in a url, because the url would not work. If we weren’t restricted to regex only, then the solution would be as simple as doing a .replace(":",".")

Am I understanding correctly? And that if we were to create a regex for each of the chapters in the AN and SN that would be a technical solution (although possibly too much work for the software to do quickly enough?)

In my link up app, I’m not restricted to regex only. I just did that because I was lazy. Or maybe I could call it “getting a minimal viable product to market quickly.” :rofl:

In my Citation Helper app, since I am dealing with multiple websites, I have a basic structure object that defines what is allowable as a citation, and then I have an object for each website (e.g. SuttaCentral) that specifies what suttas are available and what, if any, are only available in a range.

Obviously that is way more complicated and not suitable for the situation here without writing a whole new plugin. Personally I do think that your regex solution is a perfectly good solution here.

Thanks! I’m so happy seeing all the linked up citations as I browse the forum. I wonder if this will lead to more people taking the time to click to the suttas being discussed.

4 Likes

Exactly. The component on D&D only accepts regex, but you have more freedom in your environment. The easiest way to have both notations work correctly would be to match and replace using the exact regex first, and then run .replace(":", ".") on the result.
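As a rough Python sketch of that idea (not the code of either app; `EXACT`, `to_link` and `link_citations` are made-up names, and the ranges are the invented ones from my example): match with the exact regex, then normalise the separator in the captured citation. Doing the replace on the captured part only, rather than on the finished link, leaves the colon in https:// alone.

```python
import re

# Exact pattern for the shrinking ranges, with the whole citation in one group.
EXACT = re.compile(
    r"\bSC\s?(1[.:][1-9]|2[.:][1-8]|3[.:][1-7]|4[.:][1-6]|5[.:][1-5]"
    r"|6[.:][1-4]|7[.:][1-3]|8[.:][12]|9[.:]1)\b"
)


def to_link(match: re.Match) -> str:
    """Build the link for one match (hypothetical helper)."""
    # Normalise ":" to "." inside the captured citation only.
    return "https://example.com/sc" + match.group(1).replace(":", ".")


def link_citations(text: str) -> str:
    """Replace every valid citation with its link (hypothetical helper)."""
    return EXACT.sub(to_link, text)


print(link_citations("SC2:8, SC 1.9 and SC2:9"))
# https://example.com/sc2.8, https://example.com/sc1.9 and SC2:9
```

With a function as the replacement there is only one capturing group to worry about, so the numbering problem from the single alternation rule does not come up.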

Too much work for the human :grin:, software can process heaps.

Laziness is the mother of invention.

Probably doable using regex too, but whatever works for you is OK. I must always remind myself there are different tools for different jobs (Law of the instrument - Wikipedia).

It seems to work nicely and I’m also happy that you nudged me to do it, D&D looks way cooler with all these links.

3 Likes

I concur! :partying_face: It really does make it much easier to double check people’s references and dive into the suttas themselves. Kudos!

2 Likes

Added support for Vinaya (long and shorthand, with or without dashes, and Pli or Pi in long notation; Pi is for backward compatibility only).

All links point to the translation by Bhante @Brahmali where present (some translations are not published yet?, e.g. Bi Sk, Bu Pm and Bi Pm).

Bhikkhu Vibhanga
Pli-Tv-Bu-Vb-Pj4 Bu Pj 4
Pli-Tv-Bu-Vb-Ss13 Bu Ss 13
Pli-Tv-Bu-Vb-Ay2 Bu Ay 2
Pli-Tv-Bu-Vb-Np30 Bu Np 30
Pli-Tv-Bu-Vb-Pc92 Bu Pc 92
Pli-Tv-Bu-Vb-Pd4 Bu Pd 4
Pli-Tv-Bu-Vb-Sk75 Bu Sk 75
Pli-Tv-Bu-Vb-As7 Bu As 7
Bhikkhuni Vibhanga
Pli-Tv-Bi-Vb-Pj8 Bi Pj 8
Pli-Tv-Bi-Vb-Ss13 Bi Ss 13
Pli-Tv-Bi-Vb-Np12 Bi Np 12
Pli-Tv-Bi-Vb-Pc96 Bi Pc 96
Pli-Tv-Bi-Vb-Pd8 Bi Pd 8
Pli-Tv-Bi-Vb-Sk75 Bi Sk 75
Pli-Tv-Bi-Vb-As7 Bi As 7
Khandhaka
Pli-Tv-Kd22 Kd 22
Parivara
Pli-Tv-Pvr1.16 Pvr 1.16
Pli-Tv-Pvr2.16 Pvr 2.16
Pli-Tv-Pvr21 Pvr 21
Patimokkha
Pli-Tv-Bu-Pm Bu Pm
Pli-Tv-Bi-Pm Bi Pm
4 Likes

Thanks, @Musiko. And yes, you are right, some of the pages are not showing on SuttaCentral. I think the problem occurs whenever a single page covers more than one rule. This is so for the bhikkhunī sekhiyas and paṭidesanīyas. Bhante @Sujato, are you able to have a look at this?

1 Like

This could also be related to this issue: Linking together the Bhikkhuni and Bhikkhu rules which are the same

FWIW, I rarely see people giving citations for Vinaya, but it’s great to have them linking when they do.

1 Like