Summary:
- Single PDF of the whole suttapitaka available.
- Issues with searching the suttapitaka in general; possible fix.
I find it very useful to be able to search the whole suttapitaka for words, phrases or phrase variants. I have mainly been using the Pāli Reader for that, integrated into Firefox.
I have been advised to also use the Chaṭṭha Saṅgāyana edition as input by the Dhammakāya organisation, but I found its online search function dysfunctional. I am aware of the Chaṭṭha Saṅgāyana CD-ROM, but as far as I know that software does not work on a Mac, so I can’t use it.
So I downloaded the entire text from http://gretil.sub.uni-goettingen.de/#Pali and combined all of the files into a single PDF of the whole suttapitaka, so that it can be searched as one (rather large) document. I did the same for the vinaya. If these files would be useful to others, I am happy to have them put online somewhere. The one thing I could not do is create an Outline (the table of contents in the left column), which would be very useful. If anyone wants to add that, it would be great!
Problem:
The issue I have found is inconsistency in how the text was input. I would like to know whether the same problem exists when using the Chaṭṭha Saṅgāyana CD software; I suspect it may. Here is an example:
The word ceva is sometimes written as:
c’eva
c’ eva
c’; eva
I don’t know why they have done that - if anyone knows, please say! But if I search for a phrase which has ceva as a part of the phrase, I have to use all of the variants to do a proper search. Now that’s just one word - there must be many other such examples!
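As a stopgap for this one word, a regular expression can cover the spellings listed above in a single search. This is only a rough Python sketch: the names `CEVA` and `find_ceva` are my own, and I am assuming the text uses either the curly or the straight apostrophe. Each other word with variant spellings would need its own pattern.

```python
import re

# Hypothetical pattern covering the spellings listed above:
#   ceva, c’eva, c’ eva, c’; eva  (curly or straight apostrophe)
# \b keeps it from matching inside longer words such as "evaṃ".
CEVA = re.compile(r"\bc[’']?;?\s*eva\b")

def find_ceva(text):
    """Return every spelling of 'ceva' found in text, in order."""
    return [m.group(0) for m in CEVA.finditer(text)]

if __name__ == "__main__":
    sample = "so c’; eva kho ... te c’eva kho ... ceva"
    print(find_ceva(sample))
```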
A slightly different problem of the same type is the choice of where to put commas or full stops. An identical sentence may occur elsewhere, yet a search for it fails because the punctuation differs. That issue exists in the Pāli Reader as well.
Potential solution:
I do not know whether this has already been done (if so, please say!), but I would think the most functional text of the Canon would be one that can be searched without regard to punctuation or spaces between words. (Presumably that is how the Pāli was originally written?) Such a text would be inconvenient for reading, but should be far better for comprehensive searches, which are so important for research.
I could imagine software designed to ignore all punctuation and spacing and search only the letters, as if they were one continuous string, so that nobody would have to actually edit the whole Canon. Perhaps that has already been done, or someone has a method? I would love to hear about it!
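To make the idea concrete, here is a minimal Python sketch of such a search. It is not taken from any existing tool; the function names `letters_only` and `find_ignoring_punctuation` are made up for illustration. It assumes the text is available as plain Unicode text (not as a PDF) and that the query and the text encode diacritics in the same way. It strips everything except letters from both, searches the stripped strings, and then maps each hit back to the original passage so that it can still be read with its punctuation.

```python
import unicodedata

def letters_only(text):
    """Return (letters, positions): the lowercased letters of text and,
    for each one, the index of the character it came from."""
    letters, positions = [], []
    for i, ch in enumerate(text):
        # Keep letters (L*) and combining marks (M*); drop spaces,
        # punctuation, digits and everything else.
        if unicodedata.category(ch).startswith(("L", "M")):
            for low in ch.lower():
                letters.append(low)
                positions.append(i)
    return "".join(letters), positions

def find_ignoring_punctuation(text, query):
    """Return the original passages whose letters match the query's letters."""
    hay, pos = letters_only(text)
    needle, _ = letters_only(query)
    hits = []
    if not needle:
        return hits
    start = hay.find(needle)
    while start != -1:
        end = start + len(needle) - 1
        hits.append(text[pos[start]:pos[end] + 1])
        start = hay.find(needle, start + 1)
    return hits

if __name__ == "__main__":
    canon = "So c’; eva kho, bhikkhave. Te c’eva kho bhikkhave"
    for passage in find_ignoring_punctuation(canon, "ceva kho bhikkhave"):
        print(passage)
```

One trade-off to be aware of: because word boundaries are ignored, a query can also match letters that run across two unrelated words, so the hits still need to be checked by eye.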