Automatically linking up citations to suttas on D&D

What happens with quotes like this?

MN1:171.4: ‘Nandī dukkhassa mūlan’ti—
Because he has understood that relishing is the root of suffering,

This is something I do quite regularly.


Ah yes, I misunderstood, but good point. This component (not a plugin) only renders the cooked HTML of the post on the client using JavaScript, so it doesn’t actually change the raw post.

But there is still the issue of links entered in the raw post and this is where this part comes in

which means some magic will need to be performed directly on the server to replace direct links (https://suttacentral.net/*) as well as markdown []() and HTML <a href> ones.

Luckily this can be accomplished with some clever regex find and replace from the server console, which will have to be done manually.
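
To make the three link forms concrete, here is a minimal sketch of what such a find-and-replace would first have to recognise. It is plain JavaScript for illustration only, not the server console procedure itself; the names `SC_URL`, `PATTERNS` and `findSuttaCentralLinks` are made up for this example, and the URL character class is a simplification.

```javascript
// Hypothetical sketch: locate SuttaCentral links in a raw post.
// The URL character class below is a simplification chosen for this example.
const SC_URL = "https?://suttacentral\\.net/[^\\s\"'<>()\\[\\]]*";

const PATTERNS = {
  // [label](https://suttacentral.net/...)
  markdown: new RegExp(`\\[[^\\]]*\\]\\((${SC_URL})\\)`, "g"),
  // <a href="https://suttacentral.net/...">label</a>
  html: new RegExp(`<a\\s[^>]*href=["'](${SC_URL})["'][^>]*>.*?</a>`, "gi"),
  // Bare URL: not preceded by "(" or a quote, which belong to the forms above.
  bare: new RegExp(`(?<![("'])(${SC_URL})`, "g"),
};

// Returns every SuttaCentral link found, tagged with the form it was written in.
function findSuttaCentralLinks(raw) {
  const found = [];
  for (const [kind, pattern] of Object.entries(PATTERNS)) {
    for (const match of raw.matchAll(pattern)) {
      found.push({ kind, url: match[1] });
    }
  }
  return found;
}
```

Real posts are user input, so a pattern like this would miss or mis-handle edge cases such as a bare URL followed by sentence punctuation; it would need testing against actual posts before any replacement is run.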

Bhante @Sujato, for newly pasted links it would be ideal if SuttaCentral could provide the correct OpenGraph metadata for Title and Canonical. That would enable Discourse to onebox them correctly, like this: [Automatically linking up citations to suttas on D&D - #11 by Snowbird](https://discourse.suttacentral.net/t/automatically-linking-up-citations-to-suttas-on-d-d/25637/11), instead of this: [SuttaCentral](https://suttacentral.net/mn1/en/sujato).

Then we could have the best of both: short notation (DN 1 or dn1) entered directly into the raw post would be rendered as a linkified DN1 by the Linkify component, and pasted URLs would get proper titles:

  • suttacentral.net → SuttaCentral
  • suttacentral.net/dn → Digha Nikaya (or SuttaCentral—Digha Nikaya)
  • suttacentral.net/dn1 → DN 1 (or DN1 or dn1 or SuttaCentral—DN 1)
  • suttacentral.net/dn1/en/sujato → DN 1 by Sujato (or SuttaCentral—DN 1 by Sujato)

which would be linkified using native Discourse oneboxing.
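
The mapping in the list above can be sketched as a small function. This only illustrates the proposed titles; it is not how SuttaCentral generates its metadata. The function name `oneboxTitle` and the one-entry `COLLECTION_NAMES` table are assumptions for the example.

```javascript
// Hypothetical sketch of the proposed URL -> title mapping.
// Only the collection name mentioned in the post is included.
const COLLECTION_NAMES = { dn: "Digha Nikaya" };

function oneboxTitle(url) {
  const { hostname, pathname } = new URL(url);
  if (hostname !== "suttacentral.net") return null;

  // Path segments: /<uid>/<lang>/<author>, all optional.
  const [uid, , author] = pathname.split("/").filter(Boolean);
  if (!uid) return "SuttaCentral";

  const m = /^([a-z]+)(\d+(?:\.\d+)?)?$/.exec(uid);
  if (!m) return "SuttaCentral";
  const [, collection, number] = m;

  // Collection page, e.g. /dn
  if (!number) return COLLECTION_NAMES[collection] ?? collection.toUpperCase();

  // Sutta page, optionally with a translator, e.g. /dn1/en/sujato
  let title = `${collection.toUpperCase()} ${number}`;
  if (author) title += ` by ${author[0].toUpperCase()}${author.slice(1)}`;
  return title;
}
```

For example, `oneboxTitle("https://suttacentral.net/dn1/en/sujato")` gives `"DN 1 by Sujato"`.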

One benefit of fixing the OpenGraph metadata is that find/replace on the text of historical posts wouldn’t be required: they would fix themselves through a much simpler rebake.

Ideally this would also be fixed by setting the correct metadata:

https://suttacentral.net/mn1/en/sujato#mn1:171.4

and you wouldn’t even need to construct the markdown manually, because this link would automatically render as an inline onebox with the correct title, MN1:171.4.
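
For comparison, this is roughly what constructing the markdown by hand amounts to: take the segment reference from the URL fragment and use it as the link label. A minimal sketch; `segmentLink` is a made-up name.

```javascript
// Hypothetical sketch: build a markdown link whose label is the segment
// reference found in the URL fragment (the part after "#").
function segmentLink(url) {
  const fragment = new URL(url).hash.slice(1); // e.g. "mn1:171.4"
  if (!fragment) return null;
  return `[${fragment.toUpperCase()}](${url})`;
}
```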

We can of course choose which historical posts to fix or leave as is.

OpenGraph metadata can be checked with http://debug.iframely.com


This is the point: I don’t construct them manually.

When I enter mn1:171.4 -om in scv-bilara search, I get this output which I just copy to the post where I want it (the -om flag stands for “output markdown”):

> [MN1:171.4](https://suttacentral.net/mn1/en/sujato#mn1:171.4): ‘Nandī dukkhassa mūlan’ti—
> [MN1:171.4](https://suttacentral.net/mn1/en/sujato#mn1:171.4): Because he has understood that relishing is the root of suffering,

I only remove the second link that is in the translation line.
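
That manual step could be sketched like this, assuming the output format shown above (one `[ref](url): text` per line, optionally inside a blockquote). The function name `keepFirstLink` is made up and this is not part of scv-bilara.

```javascript
// Hypothetical sketch: keep the link on the first line only and strip the
// leading "[ref](url): " from every following line.
function keepFirstLink(lines) {
  const leadingLink = /^(>\s*)?\[[^\]]+\]\([^)]+\):\s*/;
  return lines.map((line, i) => (i === 0 ? line : line.replace(leadingLink, "$1")));
}
```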

The open graph data/onebox is really a separate issue, isn’t it? I thought this was about turning the citations that people typed into markdown links. Or are you talking about replacing the citations with links and then letting onebox do its magic?

I’m also quite skeptical of any regex wizardry in this situation. We are dealing with user input, remember.

But I’m keen to see what can be done. Thanks for looking into this.

This is more complex than it looks, unfortunately. We do hope to fix it, but it isn’t on the immediate roadmap (waiting for lit-ssr and developer time.)

I’d recommend keeping this plugin simple; don’t try to cover all use cases.


This is actually the easier part (already done), see regex rules here:

This has been implemented for DN, MN, Dhp and Iti (it works for all existing and new IDs inside posts and needs a browser refresh). Here are some samples of valid and invalid sutta IDs, automatically linkified to SC:

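| DN | MN | Dhp | Iti |
|---|---|---|---|
| DN0 | MN0 | DHP0 | ITI0 |
| Dn0 | Mn0 | Dhp0 | Iti0 |
| dn0 | mn0 | dhp0 | iti0 |
| DN1 | MN1 | DHP1 | ITI1 |
| Dn1 | Mn1 | Dhp1 | Iti1 |
| dn1 | mn1 | dhp1 | iti1 |
| DN 2 | MN 2 | DHP 2 | ITI 2 |
| Dn 2 | Mn 2 | Dhp 2 | Iti 2 |
| dn 2 | mn 2 | dhp 2 | iti 2 |
| DN 34 | MN 152 | DHP 423 | ITI 112 |
| Dn 34 | Mn 152 | Dhp 423 | Iti 112 |
| dn34 | mn152 | dhp423 | iti112 |
| DN35 | MN153 | DHP424 | ITI113 |
| Dn 35 | Mn 153 | Dhp 424 | Iti 113 |
| dn 35 | mn 153 | dhp 424 | iti 113 |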

But there are also several issues with the “Linkify words in post” theme component (from Discourse Meta):

  • the regex OR operator (|) is reserved as the separator between rules in the multi-rule input field, and this is currently hardcoded in core, so every compact regex rule must be expanded into multiple rules (doable, but harder to maintain)
  • the component can map spaces and non-breaking spaces, but not all whitespace via \s (this should cover most cases anyway)
  • the crucial showstopper is that the component delimits words at word boundaries, which include the dot (.). This makes it impossible to linkify AN, SN, Snp, Thag, Thig and Ud, whose IDs contain a dot.
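
The first and last issues can be illustrated with plain JavaScript. This is only an illustration of the effect, not the component’s actual code; the rule strings and the delimiter set are assumptions made for the example.

```javascript
// A compact rule that relies on the regex OR operator.
const compactRule = "(dn|mn)\\s?\\d+";

// If "|" separates rules in the settings input, the compact rule is cut in
// two, and neither half is the intended pattern.
const splitByInput = compactRule.split("|"); // ["(dn", "mn)\\s?\\d+"]

// The workaround: one expanded rule per collection.
const expandedRules = ["dn\\s?\\d+", "mn\\s?\\d+"];

// Assumed delimiter set for illustration: whitespace and the dot.
const WORD_DELIMITERS = /[\s.]+/;

// With "." acting as a delimiter, a dotted ID never reaches a rule whole.
const words = "an1.627".split(WORD_DELIMITERS); // ["an1", "627"]
```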

Fixing this shouldn’t be too hard, as this component is pure JavaScript. The best approach would be to fork the GitHub repository discourse/discourse-linkify-words and add this functionality:

  • add a configurable word boundary list instead of the hardcoded one
  • use a substitute character for | (configurable, defaulting to ¦) in the rules input and replace it with | in code
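
A rough sketch of how those two additions might look, under stated assumptions: the option names `orChar` and `boundaryChars`, the default boundary set, and both function names are invented for this example and do not come from the component.

```javascript
// Escape characters that are special inside a regex character class.
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]^-]/g, "\\$&");
}

// Hypothetical sketch: rules are still separated by "|" in the input, but each
// rule may use a substitute character (default "¦") where it means regex OR.
// Boundaries come from a configurable list that deliberately omits ".".
function compileRules(input, { orChar = "¦", boundaryChars = " \u00a0,;:()" } = {}) {
  const boundary = `[${escapeForCharClass(boundaryChars)}]`;
  return input.split("|").map((rule) => {
    const pattern = rule.split(orChar).join("|");
    return new RegExp(`(?<=^|${boundary})(?:${pattern})(?=$|${boundary})`, "gi");
  });
}

// Collect every match of every rule, in rule order.
function findMatches(text, rules) {
  return rules.flatMap((rule) => text.match(rule) ?? []);
}
```

With `compileRules("(dn¦mn)\\s?\\d+|an\\s?\\d+\\.\\d+")`, the text `See AN 1.1, then mn10` yields `mn10` and `AN 1.1`. One trade-off of dropping the dot from the boundary list: an ID directly followed by a sentence-ending period (`... mn10.`) no longer matches in this sketch and would need separate handling.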

Bhante @snowbird, what do you think?


Is it possible to link to a text in another translation language? Or how about just linking to the Suttaplex card?


It’s only possible to have one base link per ID, so it makes the most sense to link it directly to the most complete collection of translations.

This will cover the vast majority of cases, and for advanced users, the Suttaplex is only one click away.


What I had been afraid of is that explicit links to other places will get overwritten by this default, but this is not the case, as far as I can see. So my apprehension is dissolved! :smiley:


Yes, I forgot to mention this: existing links, whether entered manually via markdown or HTML or pasted directly into the composer, are not modified.

Only the specific ‘words’ in the cooked HTML of the post are linkified on the fly by the client’s browser (there is absolutely no modification of the raw post).
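
The behaviour described here can be sketched on a string of cooked HTML. This is an illustration only: the real component works in the browser, the `ID` pattern covers just DN and MN, the function name is made up, and the target URL pattern (`/<id>/en/sujato`) is an assumption based on the links quoted earlier in this thread.

```javascript
// Hypothetical sketch: linkify sutta IDs in cooked HTML, leaving text that is
// already inside an <a> element untouched. The input string is not mutated.
const ID = /\b(dn|mn)\s?(\d+)\b/gi; // only two collections, for brevity

function linkifyCooked(html) {
  let anchorDepth = 0;
  return html
    .split(/(<[^>]+>)/) // keep tags as separate parts
    .map((part) => {
      if (part.startsWith("<")) {
        if (/^<a[\s>]/i.test(part)) anchorDepth++;
        else if (/^<\/a>/i.test(part)) anchorDepth--;
        return part;
      }
      if (anchorDepth > 0) return part; // existing link: do not touch
      return part.replace(
        ID,
        (match, collection, number) =>
          // Assumed base link pattern.
          `<a href="https://suttacentral.net/${collection.toLowerCase()}${number}/en/sujato">${match}</a>`
      );
    })
    .join("");
}
```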


Ah, thank you, this has been the background of all my questions in this thread. Maybe I should have been more explicit.


I agree. It’s important to keep in mind that this is kind of a bonus feature. If people take the time to link to something specific, then great. Otherwise (if I’m understanding correctly) it’s a help to the readers of the post even if the person posting doesn’t make the extra effort to put in the link.

Thanks so much @musiko for working on this.


I love that it makes clear (visually) that these are automated links and not user-supplied links. A+


Ah, it actually wasn’t clear to me! I think dotted underlines usually indicate a definition is available if you hover, eh? From the standpoint of a general reader of posts, will it really matter if they were auto linked or not? I guess my point is that for that general reader, they would have no idea that dotted links were auto generated or even what auto generated links were at all.

I think for usability even the auto generated links should be styled in the same way as regular links, possibly with the addition of dotted underline.

As a side note, I’m an Accessibility Dark Theme user so for me the links are usually yellow, which I think is great. But to see what was going on, I tried out the Accessibility Light Theme and I find the gray color of the regular links to be awful. I don’t see how gray text could be considered accessible at all. Usually it indicates something is unavailable or possibly a visited link. I find it really hard to see where the links are. But I guess if no one else has complained then it’s not an issue? If it was really going to be an accessible theme, the links should be underlined for maximum usability. Anyway, just my $0.02.

@musiko, Is this live on the site now? Looks like it is. Amazing!

Edit: Just for completeness, I tried the legacy theme and it looks like auto generated and regular links are both styled the same. I’d also note that the very thin yellow underline on links doesn’t seem great for accessibility either.


Once an auto link is quoted, it appears like a regular link, which I think is just fine. :+1:t2:


This just in: all major collections are now supported. No need to copy and paste links to the suttas from the browser any more; simply type the abbreviation for the specific sutta in the post and let Discourse do its magic (may require a browser refresh once).

It’s best to use standard abbreviations: DN, MN, SN, AN, Dhp, Iti, Snp, Thag, Thig and Ud for consistency, but upper, proper and lowercase all work.

| DN | MN | Dhp | Iti |
|---|---|---|---|
| DN0 | MN0 | DHP0 | ITI0 |
| Dn0 | Mn0 | Dhp 0 | Iti 0 |
| dn 0 | mn 0 | dhp 0 | iti 0 |
| DN1 | MN1 | DHP1 | ITI1 |
| Dn1 | Mn1 | Dhp1 | Iti1 |
| dn1 | mn1 | dhp1 | iti1 |
| DN 34 | MN 152 | DHP 423 | ITI 112 |
| Dn 34 | Mn 152 | Dhp 423 | Iti 112 |
| dn34 | mn152 | dhp423 | iti112 |
| DN35 | MN153 | DHP424 | ITI113 |
| Dn35 | Mn153 | Dhp 424 | Iti 113 |
| dn 35 | mn 153 | dhp 424 | iti 113 |

| an | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | an 0.0 | AN0.1 | An0.2 | an0.9999 | an0.10000 |
| 1 | AN 1.0 | AN1.1 | An1.2 | an1.627 | an1.628 |
| 2 | An2.0 | AN2.1 | An2.2 | an2.479 | an 2.480 |
| 11 | an11.0 | AN 11.1 | An 11.2 | an 11.1151 | an 11.1152 |
| 12 | an12.0 | AN 12.1 | An 12.2 | an 12.9999 | an 12.10000 |

| sn | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | sn 0.0 | SN0.1 | Sn0.2 | sn0.9999 | sn0.10000 |
| 1 | SN 1.0 | SN1.1 | Sn1.2 | sn1.81 | sn1.82 |
| 2 | Sn2.0 | SN2.1 | Sn2.2 | sn2.30 | sn 2.31 |
| 56 | sn56.0 | SN 56.1 | Sn 56.2 | sn 56.131 | sn 56.132 |
| 57 | sn57.0 | SN 57.1 | Sn 57.2 | sn 57.9999 | sn 57.10000 |

| snp | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | snp 0.0 | SNP0.1 | Snp0.2 | snp0.9999 | snp0.10000 |
| 1 | SNP 1.0 | SNP1.1 | Snp1.2 | snp1.12 | snp1.13 |
| 2 | Snp2.0 | SNP2.1 | Snp2.2 | snp2.14 | snp 2.15 |
| 5 | snp5.0 | SNP 5.1 | Snp 5.2 | snp 5.19 | snp 5.20 |
| 6 | snp6.0 | SNP 6.1 | Snp 6.2 | snp 6.9999 | snp 6.10000 |

| thag | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | thag 0.0 | THAG0.1 | Thag0.2 | thag0.9999 | thag0.10000 |
| 1 | THAG 1.0 | THAG1.1 | Thag1.2 | thag1.120 | thag1.121 |
| 2 | Thag2.0 | THAG2.1 | Thag2.2 | thag2.49 | thag 2.50 |
| 9 | thag9.0 | THAG 9.1 | Thag 9.2 | thag 9.1 | thag 9.2 |
| 10 | thag10.0 | THAG 10.1 | Thag 10.2 | thag 10.9999 | thag 10.10000 |

| thig | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | thig 0.0 | THIG0.1 | Thig0.2 | thig0.9999 | thig0.10000 |
| 1 | THIG 1.0 | THIG1.1 | Thig1.2 | thig1.18 | thig1.19 |
| 2 | Thig2.0 | THIG2.1 | Thig2.2 | thig2.10 | thig 2.11 |
| 16 | thig16.0 | THIG 16.1 | Thig 16.2 | thig 16.1 | thig 16.2 |
| 17 | thig17.0 | THIG 17.1 | Thig 17.2 | thig 17.9999 | thig 17.10000 |

| ud | 0 | 1 | 2 | max | max+1 |
|---|---|---|---|---|---|
| 0 | ud 0.0 | UD0.1 | Ud0.2 | ud0.9999 | ud0.10000 |
| 1 | UD 1.0 | UD1.1 | Ud1.2 | ud1.10 | ud1.11 |
| 2 | Ud2.0 | UD2.1 | Ud2.2 | ud2.10 | ud 2.11 |
| 8 | ud8.0 | UD 8.1 | Ud 8.2 | ud 8.10 | ud 8.11 |
| 9 | ud9.0 | UD 9.1 | Ud 9.2 | ud 9.9999 | ud 9.10000 |

Sadhu sadhu!!!

Great work. It’s really impressive. This is a great enhancement to the forum.

BTW, for me some of the citations linked and some didn’t, but then when I did a hard refresh they all did. :+1:t2:


From my test cases, the only thing I’d ask for would be colon support:

  • SN 5:2

Some omissions were fixed; all suttas should now be automatically linkified (requires browser refresh). Here’s a list of all supported collections with corresponding last sutta IDs:

dhp dn iti mn
Dhp 423 DN 34 Iti 112 MN 152
an 8122
an1 AN 1.627
an2 AN 2.479
an3 AN 3.352
an4 AN 4.783
an5 AN 5.1152
an6 AN 6.649
an7 AN 7.1124
an8 AN 8.627
an9 AN 9.432
an10 AN 10.746
an11 AN 11.1151
sn 3024
sn1 SN 1.81
sn2 SN 2.30
sn3 SN 3.25
sn4 SN 4.25
sn5 SN 5.10
sn6 SN 6.15
sn7 SN 7.22
sn8 SN 8.12
sn9 SN 9.14
sn10 SN 10.12
sn11 SN 11.25
sn12 SN 12.213
sn13 SN 13.11
sn14 SN 14.39
sn15 SN 15.20
sn16 SN 16.13
sn17 SN 17.43
sn18 SN 18.22
sn19 SN 19.21
sn20 SN 20.12
sn21 SN 21.12
sn22 SN 22.159
sn23 SN 23.46
sn24 SN 24.96
sn25 SN 25.10
sn26 SN 26.10
sn27 SN 27.10
sn28 SN 28.10
sn29 SN 29.50
sn30 SN 30.46
sn31 SN 31.112
sn32 SN 32.57
sn33 SN 33.55
sn34 SN 34.55
sn35 SN 35.248
sn36 SN 36.31
sn37 SN 37.34
sn38 SN 38.16
sn39 SN 39.16
sn40 SN 40.11
sn41 SN 41.10
sn42 SN 42.13
sn43 SN 43.44
sn44 SN 44.11
sn45 SN 45.180
sn46 SN 46.184
sn47 SN 47.104
sn48 SN 48.178
sn49 SN 49.54
sn50 SN 50.108
sn51 SN 51.86
sn52 SN 52.24
sn53 SN 53.54
sn54 SN 54.20
sn55 SN 55.74
sn56 SN 56.131
snp 73
snp1 Snp 1.12
snp2 Snp 2.14
snp3 Snp 3.12
snp4 Snp 4.16
snp5 Snp 5.19
thag 264
thag1 Thag 1.120
thag2 Thag 2.49
thag3 Thag 3.16
thag4 Thag 4.12
thag5 Thag 5.12
thag6 Thag 6.14
thag7 Thag 7.5
thag8 Thag 8.3
thag9 Thag 9.1
thag10 Thag 10.7
thag11 Thag 11.1
thag12 Thag 12.2
thag13 Thag 13.1
thag14 Thag 14.2
thag15 Thag 15.2
thag16 Thag 16.10
thag17 Thag 17.3
thag18 Thag 18.1
thag19 Thag 19.1
thag20 Thag 20.1
thag21 Thag 21.1
thig 73
thig1 Thig 1.18
thig2 Thig 2.10
thig3 Thig 3.8
thig4 Thig 4.1
thig5 Thig 5.12
thig6 Thig 6.8
thig7 Thig 7.3
thig8 Thig 8.1
thig9 Thig 9.1
thig10 Thig 10.1
thig11 Thig 11.1
thig12 Thig 12.1
thig13 Thig 13.5
thig14 Thig 14.1
thig15 Thig 15.1
thig16 Thig 16.1
ud 80
ud1 Ud 1.10
ud2 Ud 2.10
ud3 Ud 3.10
ud4 Ud 4.10
ud5 Ud 5.10
ud6 Ud 6.10
ud7 Ud 7.10
ud8 Ud 8.10
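
A list like this lends itself to a range check: a citation is only linkified if its number falls between 1 and the last ID. The sketch below assumes that is the intent and uses only a handful of the values from the list above; the table and function names are made up and this is not the component’s rule set.

```javascript
// Last sutta number per collection (subset of the list above).
const LAST_ID = { dn: 34, mn: 152, dhp: 423, iti: 112 };

// Last sutta number per chapter (subset of the list above).
const LAST_IN_CHAPTER = {
  an: { 1: 627, 2: 479, 11: 1151 },
  sn: { 1: 81, 2: 30, 56: 131 },
  ud: { 1: 10, 8: 10 },
};

// Hypothetical sketch: return the normalized ID ("dn34", "an1.627") for a
// citation within the known ranges, or null otherwise.
function normalizeId(citation) {
  const m = /^([a-z]+)\s?(\d+)(?:\.(\d+))?$/i.exec(citation.trim());
  if (!m) return null;
  const collection = m[1].toLowerCase();
  const first = Number(m[2]);

  if (m[3] === undefined) {
    const max = LAST_ID[collection];
    return max !== undefined && first >= 1 && first <= max
      ? `${collection}${first}`
      : null;
  }

  const second = Number(m[3]);
  const max = LAST_IN_CHAPTER[collection]?.[first];
  return max !== undefined && second >= 1 && second <= max
    ? `${collection}${first}.${second}`
    : null;
}
```

So `normalizeId("DN 34")` gives `"dn34"`, while `normalizeId("dn35")` and `normalizeId("an 2.480")` give `null`.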

should work now (requires browser refresh).


It linkifies but sends me to a malformed URL. The : should be replaced with a . in the URL :slight_smile:
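
In other words, the visible text can keep the colon while the ID used in the URL gets a dot. A minimal sketch of that normalization, with a made-up function name, assuming the ID is then appended to the base link:

```javascript
// Hypothetical sketch: accept "." or ":" between chapter and sutta number,
// but always emit "." in the ID that goes into the URL.
function toUrlId(citation) {
  const m = /^([a-z]+)\s?(\d+)[.:](\d+)$/i.exec(citation.trim());
  if (!m) return null;
  return `${m[1].toLowerCase()}${m[2]}.${m[3]}`;
}
```

Both `toUrlId("SN 5:2")` and `toUrlId("sn5.2")` give `"sn5.2"`.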