Omissions from the Concise Pali English Dictionary

JDavid · April 23, 2021, 5:42pm

Here are the first entries of my promised omissions from the CPED. These are entries appearing in the CPED that are missing or inadequately covered in the SuttaCentral Dictionary.
David.

akataññū
akampiya
akaraṇa
akaraṇīya
akaronta
akamaka
akāla
akāsi
akiriyavāda [S=Secondary entry in CPED]
akiñcana
akilāsu
akuṭila
akutobhaya [I=Indirect entry only]
akusala
akovida
akkandi
akkandana
akkamati
akkami
akkocchi
akkosi
akkositvā

sujato · April 28, 2021, 12:05am

Thanks David.

We’d have to review such cases.

For example, the first one, akataññū, is not an omission. It is an adjective and thus may take either final ū or u depending on whether it qualifies a masculine or feminine noun. (See the Critical Pali Dictionary entry.) The NCPED is by design a simple dictionary and we do not attempt to qualify all such cases. We state that it is an adjective, and a Pali student should understand the implications.

This looks like it should be added.

Not sure what the problem is, these are in fact in NCPED.

Do you mean akāmaka? If so, that is also present.

Anyway it seems as if there’s some misunderstanding here. Are you sure you’re using the correct source?

github.com

suttacentral/sc-data/blob/master/dictionaries/simple/en/pli2en_ncped.json

[
  {
    "entry": "aṁsa",
    "grammar": "masculine",
    "definition": "shoulder"
  },
  {
    "entry": "aṁsakūṭa",
    "grammar": "masculine, neuter",
    "definition": "“shoulder prominence”; the shoulder",
    "xr": [
      "aṁsa",
      "kūṭa"
    ]
  },
  {
    "entry": "aṁsabaddha",
    "grammar": "masculine",
    "definition": "a shoulder strap"
  },

This file has been truncated.

Other words might be omitted if they are not attested in the Tipitaka, whereas CPED also includes commentaries, etc.

JDavid · April 28, 2021, 4:04pm

Hello Bhante @sujato,

It seems that I was offline when looking up the first few words. (It’s worth noting that SuttaCentral does not detect this, so cached information, perhaps from JavaScript, is used. In such a case one can’t distinguish between an omission and being offline.) I’ll recheck against the JSON.
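A recheck against the JSON could be scripted roughly as follows. This is only a sketch: it assumes the file is a flat list of objects with an `entry` key, as in the excerpt quoted above. The function name `missing_headwords` and the inline sample are illustrative; in practice the list would come from `json.load` on a local copy of `pli2en_ncped.json`.

```python
import json
import unicodedata

# Tiny stand-in for the NCPED file, shaped like the excerpt quoted above.
SAMPLE_JSON = """
[
  {"entry": "aṁsa", "grammar": "masculine", "definition": "shoulder"},
  {"entry": "aṁsabaddha", "grammar": "masculine", "definition": "a shoulder strap"}
]
"""

def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)

def missing_headwords(candidates, entries):
    """Return the candidates that have no exact headword match in entries."""
    headwords = {nfc(item["entry"]) for item in entries}
    return [word for word in candidates if nfc(word) not in headwords]

entries = json.loads(SAMPLE_JSON)
print(missing_headwords(["aṁsa", "akamaka"], entries))  # ['akamaka']
```

Working from a local copy of the file avoids the offline/cache ambiguity, since a failed read raises an error instead of silently returning stale results.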

I agree that the CPED’s use of different word forms from the PED causes problems. However, note that akataññū does not show akataññu under “Similar Spelling”, and similarly for akamaka vs. akāmaka. The problem in such cases would seem to be the metric for similarity.
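One possible similarity metric, sketched here and not taken from SuttaCentral's code, is to compare words with their diacritics folded away. The function names and the short headword list are assumptions for illustration only.

```python
import unicodedata

def fold_diacritics(word):
    """Strip combining marks so that ū/u, ā/a and ñ/n compare equal."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def similar_spellings(query, headwords):
    """Return headwords that match the query once diacritics are ignored."""
    key = fold_diacritics(query)
    return [h for h in headwords if h != query and fold_diacritics(h) == key]

# Illustrative headwords drawn from the words discussed in this thread.
headwords = ["akataññu", "akāmaka", "akāla"]
print(similar_spellings("akataññū", headwords))  # ['akataññu']
print(similar_spellings("akamaka", headwords))   # ['akāmaka']
```

Folding is deliberately coarse: it also merges distinct consonants such as ṭ/t and ṇ/n, so it suits a "Similar Spelling" suggestion list better than an exact lookup.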

Thanks to all for your good work on SuttaCentral!

David.

JDavid · May 1, 2021, 4:16pm

Hello Bhante @sujato and all.

It seems that the inclusion of the CPED along with the NCPED had two benefits. First, it helped readers of the EBT with forms that occur as headwords in the CPED but in neither the NCPED nor the PED. Second, it improved the automated lookup of such words (which now sometimes fails where, with the CPED included, it succeeded). The main drawback is that, as in the case of akataññū above, the CPED uses alternate spellings of some words.

Both benefits of including the CPED can be illustrated by the aorist form ahosi, “he, she, or it was”. Without the CPED the aorist form is no longer found. In its place is found a + hosi, “you (sing.) are not”. I suppose that the latter could occur, although I’m not aware that it does. If so, the automated lookup should, in theory, find both.
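A lookup that finds both readings could be sketched like this. The two entries are toy data based on the ahosi example above (the gloss for hosi is inferred from the a + hosi reading), and the `analyses` function is hypothetical, not the existing SuttaCentral lookup.

```python
# Toy entries based on the ahosi example in this thread; not real dictionary data.
FORMS = {
    "ahosi": "he, she, or it was (aorist)",
    "hosi": "you (sing.) are",
}

def analyses(word, forms=FORMS):
    """Collect every reading: the whole word, plus negative prefix a- + rest."""
    found = []
    if word in forms:
        found.append((word, forms[word]))
    rest = word[1:]
    if word.startswith("a") and rest in forms:
        found.append(("a + " + rest, "not: " + forms[rest]))
    return found

for reading in analyses("ahosi"):
    print(reading)
```

Returning a list instead of the first hit leaves the choice between readings to the reader, who has the context that the lookup lacks.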

A native Pali speaker would have discovered and resolved such local ambiguities subconsciously in a context-dependent way that cannot, I think, be easily modelled by lookup alone. In any event, optimal Pali lookup seems a long way off, especially considering the problems posed by sandhi.

In the short term, however, it might be worth considering adding a new lookup mechanism to replace what the CPED provided for word forms where the current lookup mechanism now fails or is incomplete. In many cases, the lookup failed before the CPED was removed. I suggest that the best way to do this is to have the additions of new word forms user-driven.
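To make the suggestion concrete, here is a minimal sketch of a user-driven overlay that maps word forms to existing headwords and is consulted only when the normal lookup fails. The class `UserForms`, its methods, and the one-entry dictionary are all hypothetical.

```python
class UserForms:
    """Hypothetical overlay mapping user-submitted word forms to headwords."""

    def __init__(self, dictionary):
        self.dictionary = dictionary  # headword -> definition
        self.user_forms = {}          # word form -> headword, added by users

    def add_form(self, form, headword):
        """Record a form, refusing headwords the dictionary does not contain."""
        if headword not in self.dictionary:
            raise KeyError(f"unknown headword: {headword}")
        self.user_forms[form] = headword

    def lookup(self, word):
        """Try the dictionary first, then fall back to user-submitted forms."""
        if word in self.dictionary:
            return self.dictionary[word]
        headword = self.user_forms.get(word)
        return self.dictionary[headword] if headword else None

# Toy data: a stand-in entry for hoti, with ahosi added as one of its forms.
lookup_table = UserForms({"hoti": "is"})
print(lookup_table.lookup("ahosi"))  # None
lookup_table.add_form("ahosi", "hoti")
print(lookup_table.lookup("ahosi"))  # is
```

Keeping user additions in a separate table means they can be reviewed or discarded without touching the dictionary data itself.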

David.

sujato · May 14, 2021, 12:29am

Indeed, that is a good point. Ultimately, what is needed is a dedicated digital Pali dictionary with proper handling of all these situations. The SC team doesn’t have the resources to do it all. But! Luckily there is a promising new project:

https://ippd.ovh/

My thoughts on this would be:

focus on doing something useful, and gradually improve it. As my dad said, if something’s worth doing, it’s worth doing even badly.
create a data API so that it can be selectively consumed for different contexts.

This is the hard issue. It seems that current technologies can get a lookup working with maybe 90% effectiveness. Short of manually tagging each word in the canon, I don’t see any real way to get it much better than that.

JDavid · May 16, 2021, 4:22pm

Yes. I agree with all you are saying, especially the following:

The current system was certainly worth doing, even though some lookups are bad, if rather funny at times, like automated translations. People, I think, can cope with the failed lookups, especially when translated texts are available.

I’ll look into this further and get back to you.

David.