Do any bored web devs feel like making a searchable version of GRETIL?

GRETIL has a great collection of Indic texts, and I use it all the time.

https://gretil.sub.uni-goettingen.de/gretil.html

The site is simple and reliable, but one major limitation is that there’s no search. I usually work with offline files and search locally, but it would be great to have an online searchable site.

Of course, I might have missed something; let me know if there is already a utility for this somewhere.

One trick will be to disambiguate the different versions of the same text, as GRETIL often has the same text in several markups.

Bonus for scraping relevant texts from TITUS and adding them!

You can do that search in BuddhaNexus. We have GRETIL, DSBC, and some texts from SuttaCentral. We’re working on a new version with more search options, so you don’t get all the Pali matches when you just want Sanskrit, etc.

Oh, cool, thanks for letting me know, I’ll check it out.

By the way, have you ever discussed the TITUS texts? It seems to me an absurd situation. While the TITUS texts are good, the TITUS website is the pinnacle of unusability, an ancient, user-hostile deterrent to doing any serious work. Yet they claim, without any legal foundation, copyright over their Sanskrit texts, so they are not hosted on GRETIL. They include a number of important texts which are unavailable elsewhere, and are as good as locked away in a cabinet forever. I’d love to liberate the TITUS texts, just put them on a plain page like GRETIL, so they can actually be read.

For some reason I never got a notification of this post and only just now saw it in my feed.

TITUS, yes, we didn’t bother. It’s a 1990s frame-style website, and the fonts don’t work easily.

I scraped the site but got stuff like this:

payodhar��candanapa�kacarcit�s
tu��ragaur�rpitah�ra�ekhar�� /
nitambade���calahemamekhal��
prakurvate kasya mano na sotsukam // 6 //

There are however a few Buddhist texts that are doable. Which ones would you be interested in for SC?
Something like this can be coded (I removed a lot of junk already).

<br>Ucchvasa: 1<a name="Asv._Bcar._1">
<br>Strophe in ed. EHJ: <a name="Asv._Bcar._1">
<br>Verses 1.1-24 sqq. have no equivalent in ed. Johnston.
<br>
<br>
<br>Strophe in ed. EBC: 1<a name="Asv._Bcar._1__1">
<br>Verse: a<a name="Asv._Bcar._1__1_a"> śriyaṃ parārdʰyāṃ vidadʰad vidʰātr̥jit /
<br> śriyam~ para-ardʰyām~ vidadʰat~ vidʰātr̥-jit /
<br>
<br>Verse: b<a name="Asv._Bcar._1__1_b"> tamo nirasyann abʰibʰūtabʰānubʰr̥t /
<br> tamaḥ~ nirasyan~ abʰibʰūta-bʰānu-bʰr̥t /
<br>
<br>Verse: c<a name="Asv._Bcar._1__1_c"> nudan nidāgʰaṃ jitacārucandramāḥ /
<br> nudan nidāgʰam~ jita-cāru-candra-māḥ /
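
To make "can be coded" concrete, here is a minimal sketch that pulls the pāda lines out of markup shaped like the sample above. It assumes every verse line looks exactly like the ones shown (`<br>Verse: X<a name="..."> text /`), which may not hold across the whole site. The function names `extract_padas` and `to_iast` are made up for illustration, and the transliteration mapping only covers the two conventions visible in the sample (ʰ for aspiration, r̥ for vocalic r).

```python
import re

# Two verse lines copied from the scraped sample, plus their analysed forms.
SAMPLE = """<br>Strophe in ed. EBC: 1<a name="Asv._Bcar._1__1">
<br>Verse: a<a name="Asv._Bcar._1__1_a"> śriyaṃ parārdʰyāṃ vidadʰad vidʰātr̥jit /
<br> śriyam~ para-ardʰyām~ vidadʰat~ vidʰātr̥-jit /
<br>
<br>Verse: b<a name="Asv._Bcar._1__1_b"> tamo nirasyann abʰibʰūtabʰānubʰr̥t /
<br> tamaḥ~ nirasyan~ abʰibʰūta-bʰānu-bʰr̥t /
"""

# Assumed line shape: <br>Verse: LABEL<a name="ANCHOR"> TEXT /
VERSE = re.compile(r'<br>Verse:\s*(\w+)<a name="([^"]+)">\s*(.*?)\s*/\s*$')


def to_iast(text: str) -> str:
    """Partial, assumed mapping from the sample's transliteration to IAST."""
    return text.replace("\u02b0", "h").replace("r\u0325", "\u1e5b")


def extract_padas(html: str) -> list[dict]:
    """Return one dict per 'Verse:' line; analysed lines and headers are skipped."""
    padas = []
    for line in html.splitlines():
        m = VERSE.match(line)
        if m:
            label, anchor, text = m.groups()
            padas.append({"pada": label, "anchor": anchor, "text": to_iast(text)})
    return padas


for pada in extract_padas(SAMPLE):
    print(pada["anchor"], pada["text"])
```

The anchor names are kept because they already encode text, chapter, verse and pāda, so they could serve as stable IDs on a static page.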

Hmm, must be some old encoding.
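
If that guess is right, the � characters would be what a decoder emits when it reads legacy bytes as UTF-8. Here is a rough sketch of the effect and of one way to probe for a better codec. Latin-4 (`iso8859_4`) is only a stand-in here; I don't know what encoding those pages really use, and it may well be a custom font encoding that no standard codec covers. The helper `decode_best` is a made-up name.

```python
REPLACEMENT = "\ufffd"  # the character rendered as a black diamond / question mark


def decode_best(raw: bytes, candidates: list[str]) -> tuple[str, str]:
    """Return (encoding, text) for the candidate yielding the fewest U+FFFD."""
    best = None
    for enc in candidates:
        text = raw.decode(enc, errors="replace")
        bad = text.count(REPLACEMENT)
        if best is None or bad < best[0]:
            best = (bad, enc, text)
    return best[1], best[2]


# Stand-in for legacy bytes: a word with macrons stored as Latin-4.
raw = "mahābhārata".encode("iso8859_4")

# Reading those bytes as UTF-8 produces replacement characters.
print(raw.decode("utf-8", errors="replace"))

# Trying several codecs and keeping the cleanest result.
print(decode_best(raw, ["utf-8", "iso8859_4"]))
```

One caveat: most single-byte codecs decode any byte without complaint, so a result with no U+FFFD only means "no errors", not "correct". The output still has to be checked by eye against a known verse.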

Okay, well at least that’s Unicode.

I think anything would be useful. In fact, I’m more interested in the non-Buddhist stuff; there are lots of Brahmanical texts that are absent from GRETIL. I think nothing fancy, just basic structured HTML on a static website. These days you can get a static website framework (e.g. Rocket, which I used a few years ago; not sure what is out there now), throw in HTML or Markdown, and it’ll automatically index it for client-side searching. So long as there isn’t a vast number of texts it should work fine (we put the whole of the English Nikayas in client-side).

I mean, even just plain static HTML pages would be something!
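
For anyone wondering what "index it for client-side searching" amounts to, here is a rough sketch of the idea, not of how Rocket or any particular framework does it. A build step turns the texts into a word-to-document map, ships it as JSON, and the browser looks words up in it. The document IDs and the functions `fold`, `build_index` and `search` are invented for the example. Folding diacritics is my own assumption about what would be convenient, so that a query typed without IAST marks still matches.

```python
import json
import re
import unicodedata
from collections import defaultdict


def fold(text: str) -> str:
    """Lowercase and drop combining marks, so 'śriyaṃ' becomes 'sriyam'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(docs: dict[str, str]) -> dict[str, list[str]]:
    """Map each folded word to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z]+", fold(text)):
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


def search(index: dict[str, list[str]], query: str) -> list[str]:
    """Return IDs of documents containing every word of the query."""
    tokens = re.findall(r"[a-z]+", fold(query))
    if not tokens:
        return []
    hits = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        hits &= set(index.get(token, []))
    return sorted(hits)


# Hypothetical document IDs; the texts are lines quoted earlier in this thread.
docs = {
    "text-1": "śriyaṃ parārdhyāṃ vidadhad vidhātṛjit",
    "text-2": "prakurvate kasya mano na sotsukam",
}
index = build_index(docs)
index_json = json.dumps(index, ensure_ascii=False)  # what a static site would ship

print(search(index, "Śriyaṃ"))
print(search(index, "kasya mano"))
```

A whole-word index like this won't find a word inside a sandhi compound, so a real Sanskrit search would probably want substring or n-gram matching on top.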

Have you also tried the TITUS Text Retrieval Engine for searching through texts?

A better search system for Sanskrit is also going to be part of the new Dharmamitra project. GRETIL is a modest collection with less than 250MB of text in IAST transliteration; Dharmamitra will host several GB of Sanskrit text.

When do you think it will be available for people to use?
As Bhante is mainly interested in the Brahmanical texts on TITUS that are not on GRETIL, I take it that those texts are included (maybe from a different source than TITUS)?