Using the search on SC, I regularly find that there are certain quirks in Pali that a clever search engine should know about, but doesn’t. As @blake evolves the search, it would be good to include these when possible. They may well be useful for the Pali lookup as well. Sometimes these matches might be best done only if regular matching doesn’t work.
I’ll update this list as I come across new cases.
Meanwhile, if anyone has any other cases, please put them in the comments. Note that this topic is specifically for Pali spelling and sandhi quirks like the ones listed here. If you have other suggestions for search, best open a new topic.
- Initial vy- = by-. These commonly differ in different editions. Our dicts usually have vy- while the text has by-.
- ṅ = ṁ = ṃ.
- Final -ti. Obviously there’s a problem differentiating it from verbal endings. Still:
  - “long vowel-ti” → “short vowel iti” or “long vowel iti”
  - Final -nti → -ṃ iti (i.e. undo the sandhi)
  - ñeva and ññeva → yeva
  - Often our text includes quote marks, which makes this trivial: bhiyyo’ti, nirodhetun”ti
- Final -pi (= api) is similar to -ti, except less readily mistaken for a verbal ending. Examples:
  - cepi → ce api
  - -mpi → -ṃ api
- Final -ce and -ca. These commonly create a sandhi ñ, e.g. puthujjanānañcepi = puthujjanānaṃ ce api. So:
  - -ñce → -ṃ ce
  - -ñca → -ṃ ca
- Final -va can stand for iva or eva, and it’s usually not possible to differentiate, so match both. The usual sandhi issues apply, e.g. mayañceva = mayaṃ ce eva:
  - -āva → -a iva, -a eva, -ā iva, or -ā eva
- Initial a-: try stripping it and looking up the positive term. Sometimes a sandhi consonant intervenes:
  - asekha → a-sekha
  - aneka → a-(n)-eka
- Initial n-: check whether it is a fused na:
  - neva → na eva (although as it happens this example is found in our dicts already)
  - nāsaññā → na asaññā
- ṃ before s is inconsistent in some words: mahisa vs. mahiṃsa.
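
To make the list a bit more concrete, here is a rough Python sketch of how some of these rules could generate extra search candidates. It is only an illustration, not how SC search actually works: `normalize`, `rewrite`, `candidates`, and the `SUFFIX_RULES` table are names made up for this example. It covers the by-/vy- and ṅ/ṁ/ṃ folding, the -ti, -pi, -ce/-ca and -āva endings, and the initial a- and na- cases. It leaves out ñeva/yeva and the mahisa/mahiṃsa variation, and it makes no attempt to tell a real verbal ending from a sandhi.

```python
import re
import unicodedata
from collections.abc import Iterator

QUOTES = "‘’“”'\""

# (regex on the end of a word, replacement templates).
# Hypothetical table covering only some of the rules listed above.
SUFFIX_RULES = [
    (r"ñ(ce|ca)$", [r"ṃ \1"]),            # -ñce -> -ṃ ce, -ñca -> -ṃ ca
    (r"cepi$", ["ce api"]),               # cepi -> ce api
    (r"mpi$", ["ṃ api"]),                 # -mpi -> -ṃ api
    (r"nti$", ["ṃ iti"]),                 # -nti -> -ṃ iti
    (r"āti$", ["a iti", "ā iti"]),        # long vowel + ti
    (r"īti$", ["i iti", "ī iti"]),
    (r"ūti$", ["u iti", "ū iti"]),
    (r"([eo])ti$", [r"\1 iti"]),          # e and o have no short form
    (r"āva$", ["a iva", "a eva", "ā iva", "ā eva"]),
]


def normalize(term: str) -> str:
    """Fold edition-specific spellings onto a single form."""
    term = unicodedata.normalize("NFC", term.lower())
    for quote in QUOTES:                  # bhiyyo’ti -> bhiyyoti
        term = term.replace(quote, "")
    term = term.replace("ṁ", "ṃ").replace("ṅ", "ṃ")  # ṅ = ṁ = ṃ
    if term.startswith("by"):             # initial by- = vy-
        term = "vy" + term[2:]
    return term


def rewrite(word: str) -> Iterator[str]:
    """Yield every single-rule rewrite of one word."""
    for pattern, templates in SUFFIX_RULES:
        match = re.search(pattern, word)
        if match:
            for template in templates:
                yield word[:match.start()] + match.expand(template)
    if len(word) > 2 and word.startswith("ne"):   # neva -> na eva
        yield "na e" + word[2:]
    if len(word) > 2 and word.startswith("nā"):   # nāsaññā -> na asaññā
        yield "na a" + word[2:]
    if re.match(r"an[aāiīuūeo]", word):           # aneka -> eka
        yield word[2:]
    if len(word) > 1 and word.startswith("a"):    # asekha -> sekha
        yield word[1:]


def candidates(term: str) -> list[str]:
    """Return the normalized term followed by possible resolutions.

    Rules are re-applied to the first word of each result, so stacked
    sandhi is peeled off one particle at a time from the right.
    """
    start = normalize(term)
    found = [start]
    queue = [start]
    while queue:
        phrase = queue.pop(0)
        head, _, tail = phrase.partition(" ")
        for new_head in rewrite(head):
            new_phrase = f"{new_head} {tail}".strip()
            if new_phrase not in found:
                found.append(new_phrase)
                queue.append(new_phrase)
    return found


if __name__ == "__main__":
    for term in ("byāpāda", "nirodhetun”ti", "puthujjanānañcepi", "nāsaññā"):
        print(term, "->", candidates(term))
```

With this sketch, `candidates("puthujjanānañcepi")` includes `puthujjanānaṃ ce api`, reached in two steps via `puthujjanānañce api`. The original (normalized) term always comes first in the list, so the extra candidates could be tried only if regular matching doesn’t work. Expect plenty of false candidates, since ordinary verbs ending in -ti and ordinary words starting with a- get expanded too.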