Integrate SuttaCentral and Discourse search

@blake @vimokkha @Jhanarato

This is more of a long term aim, but I thought I’d mention it.

It would be sweet to use SuttaCentral’s elasticsearch for Discourse. Currently Discourse search is based on PostgreSQL, which is more limited. There has been some discussion about using a more advanced search for Discourse, but the devs don’t see it as a priority.

However, they do say that implementing elasticsearch is not that difficult.

For our use case, let’s assume someone on SuttaCentral is searching for a particular topic, say “cat”. They get references to cats in the English translations, probably showing up in some dictionaries, and so on. But they have no way of knowing that there has been this awesome discussion of cats in Early Buddhism here on Discourse.

So if elasticsearch had Discourse indexed, we could show those results as well.

A more modest implementation would be simply to add a “Search on Discourse” option to the results page, linking to a full-page search for that term on Discourse:

http://discourse.suttacentral.net/?search=cat
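
As a rough sketch of how that link could be generated, here is a small Python helper that URL-encodes the search term and appends it to the address above. The function name is made up for illustration, and the `?search=` parameter is simply taken from the example URL; it may change if Discourse gets proper full-page search.

```python
from urllib.parse import urlencode

# Base address taken from the example link above.
DISCOURSE_BASE = "http://discourse.suttacentral.net/"


def discourse_search_link(term: str) -> str:
    """Build a 'Search on Discourse' link for a query (hypothetical helper)."""
    # urlencode escapes spaces and other characters that are unsafe in a URL.
    return DISCOURSE_BASE + "?" + urlencode({"search": term})


if __name__ == "__main__":
    print(discourse_search_link("cat"))
```

The same approach would work in the other direction, given the address of the SuttaCentral search results page.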

Note that Discourse doesn’t properly support full page search results yet. There is a plugin:

https://meta.discourse.org/t/full-page-search-plugin-now-available/13791

But it seems likely the devs will add this as a part of core.

From the other side, searching on Discourse currently gives no results from SuttaCentral. So if I search for MN151 there are no results, since no one has mentioned this sutta here (until just now!). But someone looking for this may well want to know about the relevant info on SuttaCentral.

@blake’s awesome ID-conversion plugin already pulls in some information, so a similar thing could happen for search.

I don’t think, however, that you’d want to see identical results in Discourse and SuttaCentral; the results should be prioritized to those from the site you’re on.
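
To make the prioritizing idea a bit more concrete, here is a minimal Python sketch that merges hits from both sites and boosts the ones from the site you’re on. Everything in it is an assumption for illustration: the `Hit` structure, the site labels, the scores, and the boost factor are all invented, and scores from two different search backends would not necessarily be comparable without some normalization.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    site: str     # "suttacentral" or "discourse" (labels assumed for this sketch)
    title: str
    score: float  # relevance score from whichever backend produced the hit


def merge_hits(hits, current_site, boost=2.0):
    """Sort hits by score, multiplying the score of hits from current_site by boost."""
    def weighted(hit):
        return hit.score * (boost if hit.site == current_site else 1.0)

    # Highest weighted score first; the input list is left unchanged.
    return sorted(hits, key=weighted, reverse=True)


if __name__ == "__main__":
    hits = [
        Hit("suttacentral", "MN151", 1.0),
        Hit("discourse", "Cats in Early Buddhism", 0.8),
    ]
    for hit in merge_hits(hits, current_site="discourse"):
        print(hit.site, hit.title)
```

With a single index covering both sites, the same effect could presumably be achieved inside the search engine itself rather than after the fact.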

While we could no doubt go some way by tweaking PostgreSQL on Discourse, ultimately I think an integrated elasticsearch will be the best. One of elasticsearch’s main features is the swift indexing of user-generated content like this forum.

Those are some good thoughts. Obviously everything is easier on the Sutta Central side: if we wanted to include Discourse discussions in the Sutta Central search results, that would be quite doable.

One really simple hack on the Discourse end of things would be to add a ‘Search on Sutta Central’ link to the Discourse search, which just takes the user to the Sutta Central search results page for that query.

Tangential to development, I wouldn’t expect to see Elasticsearch integrated with Discourse. The reason, as I’ve mentioned a few times, is that Elasticsearch is a bit of a pig when it comes to resources. PostgreSQL is just so much more straightforward for the Discourse use case.
When the dev said it wouldn’t be that difficult to implement, the specific context was that it wouldn’t be difficult to include extra dependencies in the Docker container. However, that removes only one specific objection. There would still be the difficulty of writing the code that interacts with the new dependency, and also the added resource use.

Great, let’s do it.

Also sounds good.

Okay, fine. As far as the resources go, we can just add more power if we need it, yes? The real thing is the time for coding, and there would probably be a lot of work before we made something that was substantially better than what already exists. As we know, Discourse is evolving rapidly, and I wouldn’t be surprised to see other people wanting to do a similar thing, integrating Discourse search with their main sites. Perhaps a plugin will be developed to smooth the process.