Pāṭimokkha and paṭimukka

A nice summary of the various suggestions for the word pāṭimokkha can be found here:

I want to focus on just one element. I’m going to quote the relevant portion of Nyanatusita’s introduction. This is slightly different from the one posted above, but it arranges the material more conveniently:

[I.B. Horner] quotes the Pali English Dictionary, according to which pātimokkha is said to have the same meaning as paṭimokkha at J V 25: “that promise to be obliged has not been released” (taṃ saṅgaraṃ paṭimokkhaṃ na muttaṃ). A few other references also support the future passive participle etymology: J V 166: Yaṃ (bandhanaṃ) natthuto paṭimokkh’assa pāse: “(the bondage) which was tied through his nose (of the nāga) in a noose” and D I 12 & 181: osadhīnaṃ paṭimokkho: “the binding on of medicinal herbs” or, in accordance with the commentarial explanation of this, “removal of/releasing from [caustic] medicinal herbs.”

Zeroing in on the examples given, in the MS edition we have:

ja513:6.2: Taṁ saṅgaraṁ paṭimukkaṁ na muttaṁ
ja524:10.2: Yaṁ natthuto paṭimokkassa pāse

These clearly mean “binding”. Note that the spelling of the two words is different, and it also differs from the examples cited by Nyanatusita, presumably from the PTS editions. The most striking variation is the loss of the aspirate -h, but none of the changes affect the etymology, only the inferred grammatical form.

There is also:

dn1:1.27.2: osadhīnaṁ paṭimokkho

The latter appears thrice with ṭ and once with t. Again the spelling variants, though relevant to the argument, are not noted. Here the meaning is not as clear, but I think the sense “binding on of herbs” is much simpler and more obvious than the commentary’s explanation (which seems to me backformed from their explanation of the pāṭimokkha itself). It is a common procedure in traditional medicines to apply herbal salves and the like locally by binding them on with a bandage to the affected area; I have had this done to reduce knee swelling.

Missing is the following in the Thera-apadāna:

tha-ap25:5.3: Tena kammena sukatena,
tha-ap25:5.4: paṭimokkhāmi duggatiṁ
By means of that well-done deed
I am freed from bad rebirth.

Here the PTS reads parimokkhāmi, which is perhaps why it has been overlooked. But it cites paṭimukkhāmi as a Sinhalese variant, and since both MS and BJT have paṭimokkhāmi it is fairly well attested. The sense is clearly “release”, regardless of the chosen reading.

So far we have shown that:

  • there is a fair deal of variation in the relevant readings
  • the examples cited by Nyanatusita do not exhaust the Pali
  • there are clear attestations of the sense “binding” and less clear attestation of “release”.

Most interestingly, there is variation in the forms with and without aspirate. Why is this interesting? Because the form paṭimukka is well-attested in early Pali in the sense “binding”. It commonly occurs in contexts such as paṭimukkassa mārapāso, “caught in Māra’s trap”. Despite lacking the aspirate, it is from the same root as pāṭimokkha, or at least as the most obvious root, from muñc “free”, with the prefix paṭi creating an antonym.

  • (The variation in the aspirate can be explained by the Sanskrit form mokṣ. It is common linguistically for Sanskrit sibilants to weaken into aspirates in Pali. Compare Sanskrit skandha with Pali khandha.)

The two words have a straightforward relationship.

  • paṭimukka is “bound”, past participle with adjectival sense.
  • pāṭimokkha is “(that which is) binding”, “obligation”, future passive participle (“gerundive”), with the implicit affix -ya and a lengthened initial vowel after the style of, e.g., pāṭidesaniya.

The Compendious Grammar notes of this affix:

This is called a kicca affix but is included in the kita chapter of Kaccāyana (Kacc 545) – an affix of the future passive participle (Kacc 540).

Kaccāyana lists pāṭimokkha under the taddhita constructions where the initial vowel is lengthened, which argues against certain of the commentarial readings, and supports ours. (As usual, spelling is not standardized: the VRI edition has ṭ, while the excellent and super-helpful translation by Ashin Thitzana reads t.)

Despite the fact that the spelling with a dental t (pāti) is most prevalent in modern editions, I think this is probably due to normalization in line with the commentary. Elsewhere the prefix is usually spelled paṭi, and I see no reason why this should change with the lengthening of the initial vowel; again, cp. pāṭidesaniya.

In conclusion:

  • derivation is paṭi + muñc + ya
  • sense is “(that which is) binding”, “obligation”
  • spelling is pāṭimokkha

Etymology in Sanskrit:
The word corresponds to Sanskrit prātimokṣya, and is derived from prati+muc (meaning towards release). Prati is an upasarga (preverb) meaning towards. Mokṣa is the noun form of the verb muc (and means release, freedom etc). Prātimokṣya (with the lengthened initial vowel in the first syllable prā) makes it a vṛddha form. The ya at the end of the word is a kṛtya suffix called ṇyat in Pāṇinian grammar - and this kṛtya suffix ṇyat causes a lengthening of the initial vowel.

The meaning of prati changes depending on whether it is used as an upasarga (pre-verb), gati (not as a preverb) or as a separable preposition - so prātimokṣya & pratimukta need not be semantically similar. Pratimukta can therefore mean ‘bound’ while prātimokṣya means ‘leading towards release’.

Etymology in Pāli:
Paṭimukka is a lexical variant of paṭimutta (this word is different from pāṭimokkha). Since the underlying word is pratimukta, sometimes the k is duplicated (paṭimukka), other times the t is duplicated (paṭimutta) to arrive at the Pāli form. It is a past passive participle.

The dental ‘t’ in pati is normally retroflexed (to ṭ) in Pāli (and Gāndhāri) to show the presence of a preceding ‘r’ (of ‘pra’) in the underlying language that has now gone missing in the current (pāli/gāndhāri) form of the word.

The ya at the end also goes missing in Pāli as the complex syllables are simplified/geminated when the word becomes Pāli so (prātimo)kṣya becomes (pāṭimo)kkha.
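The steps described above can be sketched as a few ordered rewrite rules in Python. This is only a toy illustration of the changes mentioned in this thread (loss of r with retroflexion of the following t, kṣy/kṣ becoming kkh, and kt geminating to tt or kk); the names `RULES` and `palify` are made up for the example, and it is in no way a general Sanskrit-to-Pāli converter.

```python
import re

# Toy rewrite rules, applied in order. They cover only the changes
# discussed in this thread and nothing else.
RULES = [
    (r"pr([aā])t", r"p\1ṭ"),  # r of pra-/prā- is lost; the following t is retroflexed
    (r"kṣy", "kkh"),          # cluster kṣy is simplified and geminated (the ya disappears)
    (r"kṣ", "kkh"),           # kṣ alone also becomes kkh
]


def palify(sanskrit: str, kt: str = "tt") -> str:
    """Apply the toy rules; `kt` selects the outcome of the cluster kt ("tt" or "kk")."""
    word = sanskrit
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word.replace("kt", kt)


print(palify("prātimokṣya"))          # the vṛddha form with the ya suffix
print(palify("pratimukta"))           # past passive participle, t duplicated
print(palify("pratimukta", kt="kk"))  # same participle, k duplicated
```

The `kt` parameter is just a convenient way of showing that the same underlying pratimukta can surface with either consonant duplicated.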


But, but … What about the fact, brought out by Ñāṇatusita, that all the wordplays in the ancient commentaries and even the Abhidhamma only work when the spelling is Pātimokkha, that is, when the t is not retroflex? This seems like a significant piece of evidence. The Sanskrit tradition similarly does not seem to know the reading with a retroflex t.

This should not, however, affect the meaning.

The Sanskrit tradition doesn’t need the retroflexed t because the pra (in prati) is not simplified in Sanskrit to pa, so the orthographic custom of retroflexing the t serves no purpose. In Pali and other Prakritic variants, the t’s retroflexion serves to indicate that the word has a preceding r (in the underlying language), of which the Pali word is but an orthographic approximation.

The wordplays (in that case) would have worked in the pre-Pali version of the text (assuming there was one) where the prāti was spelled as prāti and not as pāṭi.

Yes, thanks, I understand your point. One problem, however, is that we don’t know the underlying language. The consensus among scholars seems to be that it is a Prakrit rather than Sanskrit, which means the underlying form may have been pati/paṭi rather than prati. Now if the underlying form were paṭi, then presumably the Sanskrit equivalent would be praṭi. Is this correct? If so, then a Sanskrit form that includes the retroflex is indeed supporting evidence that the Pali also should be written in this way.

Yes, but again this begs the question of what was the pre-Pali form, even if there was one. There are scholars who argue that the Buddha’s language was an early form of Pali or very similar. And even if it was not Pali it would have been a Prakrit, in which case the pr consonant cluster is unlikely.

Thanks for that!

This is an oversimplification. Cone lists the senses of paṭi:

as preverb to verbs and their derivatives expresses: towards, against, back, again, in return, counter.

They don’t need to be, but they can be. I already pointed out that we find both the senses “bind” and “release”.

The relevant Abhidhamma passage is here:

vb12:9.1: “Pātimokkhan”ti sīlaṁ patiṭṭhā ādi caraṇaṁ saṁyamo saṁvaro mokkhaṁ pāmokkhaṁ kusalānaṁ dhammānaṁ samāpattiyā.

Not sure how much weight I’d give the “wordplay” here; more significant is that it clearly points to the idea “release”, supporting srkris’ proposal “towards release”. On the other hand, we know that even Vibhanga explanations can misrepresent the suttas (e.g. muta).

Also I have to apologize to Nyanatusita, I was misreading his text. I was looking at the simplified version, and in the full version he does indeed cite the examples I thought were missing, including paṭimukka, of which he says:

Cf. the past participle paṭimukka “bound down/fastened,” e.g. S IV 91: paṭimukk’assa mārapāso.

I thought it was odd that he’d miss this!

I still think my conclusions stand, although that Vibhanga passage does give me pause. Regardless of what one might think the original spelling was, however, it is reasonable to spell it with t as that prevails in the manuscripts.

We can narrow down the underlying language, because the evidence limits our possible alternatives to just one language:

  1. Consider one core/nucleus language - In my understanding the Buddha spoke only one mainstream language (the Pali canon does not evidence the claim that he or his interlocutors changed languages based on location or audience, or needed to translate - such evidence of code-switching is not found in any co-eval Gandhari, Ashokan or Sanskrit text either, so paradoxical as it may seem, the evidence indicates that there was only one mainstream Indo-Aryan language spoken in the Buddha’s culture). Even if he spoke other fringe/minor/secondary languages, the core underlying language of the EBTs is unlikely to be one of those fringe/minor dialects. So the idea is to identify that core dialect, not all the other ones (even assuming others did exist).

  2. It was necessarily an Indo-Aryan language, i.e. a linguistic tradition that originates from Old-Indo-Aryan (Vedic and derivatives). This eliminates non-Indo-Aryan (i.e. traditions that lie outside of Old-Indo-Aryan origins) - such as independent prakritic linguistic or literary traditions that could have ostensibly existed/developed side-by-side with the Old-Indo-Aryan (Vedic/Sanskrit) tradition. Assuming a parallel development of a largely-unattested Indo-Aryan linguistic tradition independent of Sanskrit and Vedic would require the reconstruction of a new language or language family - which to my knowledge no philologist has so far shown interest in. Besides, Sanskrit and Vedic are capable of explaining the etymology of 99.5% of Pali vocabulary, and the rest are Dravidian/Munda/Greek/Iranic loanwords. So there is no need to invent a parallel origin theory for the underlying language of early Buddhism.

  3. It can only be a pre-existing language. This removes from our scope languages whose first attestations (or inferred existence from co-eval sources) post-date the time of the Buddha.

  4. It cannot have been a language that was limited/unique to early-Buddhism or to the EBTs but was a broader language (wider than Buddhism) of the wider lay society. This removes Pali itself from the list - which is a language restricted to the Pali canon. The same can be said of Ardha magadhi of the Jain canon. The right word to use for these languages is - ‘canonical registers’ as these languages are limited to the canonical texts. The Buddha’s language was therefore not merely a ‘Buddhist’ language.

  5. It would have been a language that is sufficiently close to the language of the EBTs. Here we don’t have a choice as (except the oldest forms of Vedic), most Sanskrit and early Prākrit is very close to the language of the EBTs, so that only removes the earliest forms of Vedic and later forms of prakrit from our scope.

  6. That language must be capable of being, linguistically, an etymological source for the language(s) used by the EBTs. Without this, it would be impossible to meaningfully preserve the buddhavacana word by word faithfully in a derivative register such as Pali, BHS or Gandhari.

  7. There would have to be a plausible reason why such a language could not be preserved in its original form in the EBTs (and had to be changed into the language of the EBTs).

  8. There are other limitations (too numerous to list here - from the internal evidences present in the EBTs, and external evidences from co-eval texts outside Buddhism, such as the language of the Ashokan edicts, the co-eval Indic words surviving in dravidian, greek & persian texts from the time of Alexander onwards etc) that further limit the possibilities of this nucleus language of proto-canonical Buddhism.

One example of internal evidence from the Pali canon - there is in Ud 5.6 a statement that Soṇa Kuṭikaṇṇa recited (abhaṇi) the aṭṭhakavagga “with svaras” (sarena) to the Buddha, and at the end of the recital with svaras (sara-bhañña-pariyosāne), the Buddha lavished praises on him for his clear and correct enunciation. But what exactly are the svaras? I take them to be the Vedic svaras (the tone accents) - he recited the verses with the tone accents. The accents (svaras) exist in classical Sanskrit (where they are optional) as well as Vedic (where they are compulsory), so what language other than Old-Indo-Aryan (Classical Sanskrit or Vedic) could Soṇa Kuṭikaṇṇa have recited the aṭṭhakavagga to the Buddha in? The Vedic svaras (tone accents) are inherited from Proto-Indo-European; see the Wikipedia articles “Proto-Indo-European accent” and “Vedic accent”.

“Evaṁ, bhante”ti kho āyasmā soṇo bhagavato paṭissutvā soḷasa aṭṭhakavaggikāni sabbāneva sarena abhaṇi. Atha kho bhagavā āyasmato soṇassa sarabhaññapariyosāne abbhanumodi: “sādhu sādhu, bhikkhu, suggahitāni te, bhikkhu, soḷasa aṭṭhakavaggikāni sumanasikatāni sūpadhāritāni, kalyāṇiyāsi vācāya samannāgato vissaṭṭhāya anelagaḷāya atthassa viññāpaniyā

So in summary - it was a mainstream Indo-Aryan language belonging to the same Indo-Aryan tradition as Vedic and Sanskrit, it was pre-existing at the time of the historical Buddha’s birth, a language not limited to Buddhist texts, but a language that early-Buddhists had inherited from their culture, a language temporally and linguistically close to the languages of the EBTs, a language that can possibly claim phonemic and morphological ancestry to the attested linguistic registers of the EBTs, a language in which writing was a novelty in the time of the Buddha, a language where there was an option for Soṇa Kuṭikaṇṇa to recite the aṭṭhakavagga with svara intonations, a language that Ashoka would have understood, a language that was spoken by Brahmins in many janapadas, and a language whose word-forms are potentially attested in Indo-Aryan and non-Indo-Aryan texts of that period.

I leave it to your imagination what language could have ticked so many boxes - but you probably see that most of these postulates above are commonsensical enough to be uncontroversial.

There is no Sanskrit form of prati (or prāti) that includes a retroflexed t. The retroflexion of the t is related to the disappearance of the preceding r - they happen together when the Sanskrit prāti is prakritized (or pali-fied) into pāṭi. Sometimes the retroflexion is skipped (because the writer fails to follow the above convention), resulting in a form such as pāti, but that would have been confusing to the reader. In the first 4 nikāyas, words starting with paṭi occur 4300 times, and words starting with pati occur only 379 times - showing that the retroflexed ṭ was clearly the norm. However in the words starting with pātimokkha, it is always spelt with a dental t rather than a retroflex ṭ - I don’t think it makes much of a difference either way.
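As a quick check on what "clearly the norm" amounts to, the proportion implied by those two counts can be worked out in a couple of lines of Python. The counts are the ones quoted above; the function name `retroflex_share` is just something made up for this sketch.

```python
def retroflex_share(retroflex_count: int, dental_count: int) -> float:
    """Percentage of prefix occurrences spelled with retroflex ṭ."""
    total = retroflex_count + dental_count
    return 100 * retroflex_count / total


# Counts quoted above for the first four nikāyas: paṭi- vs pati-
share = retroflex_share(4300, 379)
print(f"{share:.1f}% of these occurrences are spelled paṭi-")
```

So a little under 92% of the occurrences have the retroflex spelling, which makes the consistently dental pātimokkha stand out all the more.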


Just picking up this idea, it’s an interesting suggestion. Pali scholars have typically read it simply as “voice, sound”, but I think you’re right, in these contexts it does seem to have a more specific technical sense. For example, the sarabhañña that is allowed in the Vinaya is typically simply translated as “chanting”, but perhaps it has a more specific sense. I mean, surely the point is that certain kinds of chanting are allowed or disallowed. The compound would seem to literally mean “tones to be spoken”, i.e. “tonal recitation”.

When Sona chants sarena abhaṇi, sara can hardly mean “voice”, it must mean something like, “recited with tones”. Perhaps English “intoned” would serve better.

The reason Pali scholarship ignores this point is obviously because there are no tones in Pali. But this is a double-edged sword: if the text is referring to tones, then it pushes our understanding of the text’s original language a certain way; but if the original language was toneless, then it pushes the understanding of sara a certain way.

I’m reminded of chanting in Thailand, where the non-tonal Pali is recited by folks used to a tonal language, who then introduce a kind of quasi-tonal system into Pali. So in Thai chanting, saṁ is often recited high, even though it would be rising in Thai.

It seems to me that, given that Pali is non-tonal, and that even in the Sanskrit form of the time the tones were obsolete, yet the Pali texts, especially verse, draw heavily from then-archaic Vedic prototypes, the most likely inference is that they used a chanting style that mimicked or echoed the Vedic style. The fact that it is described as sarena could be taken as evidence that that is a distinguishing feature, i.e. that regular language is not sarena.

At DN 21:1.6.1 we have:

tantissaro gītassarena, gītassaro ca tantissarena

Where gītassara clearly means “singing”.

Elsewhere qualified with āyatakena gītassarena which is disallowed because it sounds just like lay folk singing.

@Brahmali what do you think?

Both gītassarena and tantissarena are clearly referring to music - the word tanti (Sanskrit: tantrī) refers to a “stringed” lute instrument such as the vīṇā, which is normally used in classical music performances.

Indian classical music is svara-based and rāga-based music - where svara means a solfa-note representing a particular pitch. It is said that this type of music originates in the musical-type enunciation of the sāmaveda, which is sung using 7 notes (as in music) rather than chanted. In any case, the svaras of music and the svaras of the vedas represent nearly the same thing conceptually, i.e. pitch-based tones. The Vedic language was itself pitch-based, and the classical music tradition where instruments and singing were used was also svara-based. Singing normally uses many more svaras than chanting, though - and the function of a pitch in music differs from chanting.

The pitch-based tones of the vedas are a semantically-relevant linguistic feature, i.e. one which could change the meanings of the words used (a feature also shared by other ancient Indo-European languages like Ancient Greek, Proto-Germanic (the ancestor of English) etc), but the pitch-based tones of music are purely a musical feature which do not impact semantics - the musical-svaras probably originate conceptually from the linguistic-svaras of Vedic, and the two are related to each other, but are not identical in usage. You can hear the 7 svaras of classical Indian music at Madālasā | SANSKRIT SONG from The Mārkaṇḍeya Purāṇa - YouTube (between 0:40 and 1:10).

That such a svara-based form of classical music tradition did exist in the vedic period is also evidenced in Vedic texts. This is an extract from the Gopatha Brāhmaṇa of the Atharva-veda:

om iti vyāhṛtīḥ svaraśamyanānātantrīḥ svaranṛtyagītavāditrāṇy anvabhavac…

In the above phrase you can see the words svara (pāli: sara), tantrī (pāli: tanti), gīta (pāli: gīta), nṛtya (pāli: nacca) etc where the discussion is about music and dancing.

The language of the other vedas (apart from the sāmaveda) and classical sanskrit, on the other hand - have only 3 svaras (udātta, anudātta and svarita) - and these 3 svaras are what Pāṇini (in circa the 4th century BCE) says are used in spoken classical sanskrit. Paninian grammar explains the svara system of classical sanskrit (and of vedic) in great detail and the evidence both from Pāṇini and from other literature from the time indicates that the svara enunciation in classical sanskrit was still in use (but not invariably used like in Vedic) in the 4th century BCE.

The svara enunciation in classical sanskrit has now gone out of use in speech, but in the 4th century BCE it was still employed optionally. There is no hard and fast rule demarcating classical sanskrit from vedic sanskrit, and vedic grammatical and other archaisms did continue in classical sanskrit even after Pāṇini (for example Patañjali, in his 2nd century BCE commentary on Pāṇini’s grammar, the Vyākaraṇa-Mahābhāṣya, written in classical sanskrit, uses the vedic ‘tavai’ infinitive affix – “tasmād brāhmaṇena na mlecchitavai nāpabhāṣitavai”).

Many other examples of the continuation of Vedic grammatical, phonetic & linguistic usages are available in a lot of BCE texts, including in the Pāli canon. So classical sanskrit in the time of Pāṇini is a broad term that simply means late-Vedic (as spoken in the 4th century BCE), it does not mean that it had rid itself of all grammatical/phonemic/lexical archaisms that characterise Vedic.

I agree there are no svaras in Pali, neither are there any svaras in any of the other prakrits.

However, chanting the aṭṭhakavagga with svara would not be the same as singing it. If it were singing, it would be described as singing (gītam); singing would also have involved a lot more than simply using svaras, i.e. it would probably have had to be set to a musical mode (rāga) first. Setting the aṭṭhakavagga to music is not evident to me from the quoted phrases.

Besides the Divyāvadāna also evidences the intoned recitation that I have postulated above – see “athāyuṣmāñ śroṇo bhagavatā kṛtāvakāśaḥ asmāt parāntikayā guptikayā udānāt pārāyaṇāt satyadṛṣṭaḥ śailagāthā munigāthā arthavargīyāṇi ca sūtrāṇi vistareṇa svareṇa svādhyāyaṃ karoti” - here, apart from the word svareṇa, the usage of the word svādhyāya is also significant. Svādhyāya in the vedic tradition refers to chanting the memorized vedic hymns periodically. Each brahmin clan is associated with a specific set of vedic verses handed down in an oral patrilineal tradition that goes back to the original vedic Ṛṣis who composed those verses. So svādhyāya means they periodically recite the memorized set of hymns that they had inherited from their ancestors with an intention to pass it down the line. In the buddhist sense, the above statement in the Divyāvadāna (and in the Pāli canon where too sajjhāya is mentioned as being done both by brahmins and buddhists of the buddha’s lifetime) would mean not an oral transmission across generations (unlike the vedas) but chanting the suttas of the arthavargīyāṇi (aṭṭhakavagga), śailagāthā (sela-sutta), the munigāthā, the satyadṛṣṭa, the verses of the pārāyaṇavarga, udānas etc as these early suttas would have been regarded as foundational texts of the buddhist canon, and memorizing / reciting them with their original accents was evidently considered a form of svādhyāya.

Yes, it was a distinguishing feature, but only in late-Vedic. A lot of people didn’t or couldn’t speak intoned Sanskrit (they spoke non-accented Sanskrit, which has since become the invariable norm), but evidently still understood intoned speech when spoken to back then, as some others could and did speak the language with accents. The svaras when employed incorrectly changed the meanings of words (expressed contrarian senses), and examples of such semantic shifts occurring as a result of incorrect use of the accents are discussed in Sanskrit grammatical texts.

However, since the Sanskrit tradition (from which the Divyāvadāna quotes) also refers to such an intoned recitation, the issue is not unique to Pāli.

Thanks so much @srkris for this very detailed response. I do intend to reply in a bit more detail, but for now I just wish to express my appreciation.

The registers of the EBTs are in Prakrit or hybrid Prakrit forms (BHS). Much scholarly work has been done on looking at underlying Prakrit word forms with phonemic flexibility leading to different word variations in the manuscripts/EBT registers, for example, the article on isipatana you shared in the other thread. It seems to me that the language of the Buddha would have been a form of koine Prakrit, something which also would have emerged from contact with various non-Indo-Aryan groups of the time and place (Dravidian, Munda, etc.). That is, Prakrit seems to be the more common-folk colloquial language in the time, whereas Vedic / Sanskrit was less widely spoken and accessible to all.

I tend to follow the Proto-Pāli hypothesis, for several of the reasons you listed above. That is, that the Buddha spoke in a Prakrit relatively close to what we now call Pāli, and that the Buddhist communities tried to preserve the oral tradition in this Prakrit. Naturally though, as it spread to various regions, the Prakrit shifted into more local-friendly forms with only minor variation. A lot of this amounts to mere pronunciation differences with perhaps some minor grammatical tweaks, etc. This all before it was “translated” to BHS and Classical Sanskrit from whatever Prakrit register(s) the texts happened to be in by the translators.

Are you suggesting though that it would have been a form of late Vedic/Sanskrit as spoken in the region of the time — generally without svaras etc. — rather than early Prakrit?


Pali is practically synonymous with the Pali canon. It has had no independent existence outside of the Pali canon and its derivative literature. Since the earliest times, it has not been used by anyone except Theravada monks, and they too have been using it only under the firm mistaken belief that it was the Buddha’s language. But the Buddha could have only used a language in which he was spoken to by the rest of his society - and not in a language peculiar to himself or to Buddhism, which necessarily therefore was not a Buddhist-only language like Pali.

So I don’t see with what logic/confidence Prof. Gombrich (and/or other scholars) would be able to claim that an exclusively-Buddhist language like Pali could have been the Buddha’s language.

Buddhist Hybrid Sanskrit (BHS) is not a Prakrit, but rather a form of Sanskrit restored from Gandhari texts. It is about 90% identical to standard-sanskrit, it shares its remaining 10% peculiarities with Gandhari & Pali. In my understanding, BHS represents the language of EBTs progressively restored from Gandhari to Sanskrit - and hence it contains many incorrectly sanskritized expressions (or hyper-sanskritisms, as they are called). Logically therefore it appears that those who were restoring from the Gandhari were not themselves sure what exact sense many of those words and expressions in the Gandhari EBTs made.

It is very likely that the Pali canon is also redacted from a Gandhari prototype, where the Gandhari manuscripts available were quite fragmentary (with words and phrases and even paragraphs missing here and there), which is how we usually find them today. This would explain why Pali suttas appear to have so many canned phrases (or ‘band-aid phrases’ as I like to call them) to make the suttas ‘whole’ again. It may have been a common technique also used by the BHS EBTs, but nowhere does the use of the canned phrases appear more prominent than in the Pali Suttas. Outside of the EBTs, such canned phrases are virtually unknown in other Indic prose literature (including Vedic prose).

As far as we can see, most extant Gandhari texts are also mainly (though not exclusively) Buddhist (and its use was geographically limited to the Gandhara area), so again it is unlikely to have been the Buddha’s language.

As I said, BHS is for the most part Sanskrit itself (and its peculiarities stem mainly from the fact of its restoration from Gandhari), and therefore doesn’t have much of an independent value.

That leaves us with late-Vedic, which, based on what I know, was a massive language spoken by nearly all Indo-Aryans in the Buddha’s time (and which in its 4th century BCE grammatically-standardized form is known as Classical Sanskrit) - and which was optionally spoken with svara intonations.

There remains the question - Why do Gandhari and Pali even exist in the first place if late-Vedic was spoken by almost all Indo-Aryans? It’s because (as I mentioned elsewhere):

  1. There were Achaemenid-era Iranic linguistic influences in India that gradually were causing phonetic and morphological confusions in late-Vedic. This must also have been the reason why the extremely thorough generative grammar of Panini had to be composed at this time specifically to validate grammatically correct usages.
  2. The introduction of writing in Aramaic script (which caused further phonemic confusions as the script was not designed for a phonetic representation of Late-Vedic Sanskrit), followed by marginally improved scripts like Kharoshthi and Brahmi, which were still not capable of phonetic representation until they had evolved sufficiently over the centuries, and became fully suitable for writing Sanskrit phonetically accurately only by the beginning of the common era.

Thank you for your very interesting and helpful ideas in this, and other threads.

You bring up a very good question here. I believe Prof. Gombrich’s idea (going by memory and not having looked at his book for a bit - please correct me if I am mistaken) is that Pali served as a somewhat ‘inside’ language for the Buddha and his close followers, a particular mode of expression used to express the Buddha’s unique and novel insights.
Perhaps the Buddha and followers spoke a more vernacular language routinely as well.

It has also been pointed out that there doesn’t seem to be any mention of a ‘level 0’ language in the Pali Canon, an original one that was used for translation. And certainly the translation project would have been a huge one considering the size of it.

I didn’t say Pāli; I said Proto-Pāli. That is, a dialect of early Prakrit that precedes modern-day Pāli akin to an earlier register perhaps than the language we find in the Ashokan edicts. This can be called ‘Proto-Pali’ for the sake of convenience, as it is just a Prakrit register that was transformed gradually into our current Pāli manuscripts.

Personally I find it quite unlikely that the Buddha spoke in Sanskrit, which was then completely lost and converted to Gandhari, which was then turned to BHS, which was then re-Sanskritized with several incorrect Sanskritizations, etc. I think it’s much more likely that the Buddha spoke in Prakrit — a language his very-likely non-Indo-Aryan people (the Sakyans) could have learned or spoken as a middle-ground between late-Vedic and native Munda/Dravidian languages. Prakrit would have been much more accessible to people who were from mixed-ethnicity communities and tribes. We see also that the older Svetambara Jain canon is preserved in a form of Prakrit as well (not that Ardha-Magadhi, etc. was the language of Mahāvīra).

Bryan Levman has done a fair bit of research on ‘Proto-Pāli’ (a Prakrit form) and the underlying Dravidian/Munda influences in the vocabulary and phonology. You might find some of his work interesting. There are other scholars besides Gombrich who have, IMO, provided more reasonable arguments for the Buddha speaking a Pali-like Proto-Prakrit closer to what we know today, e.g. Stefan Karpik in a couple of articles.

Have you looked at comparative research with the Chinese Āgamas of various early Buddhist sects? This is not unique to Pāli; it is a general feature of the EBTs. Mark Allon — a Gandhari expert — has shown that the structure and application of stock phrases in the EBTs shows a clear progression:
Pāli texts generally preserve the earlier phrases → Gandhari manuscripts are a middle ground with minor evolutions → Sanskrit editions continue the evolution with even more elaboration in the stock phrases. This is more data in support of Proto-Pali Prakrit EBTs as the core of the textual tradition.
Video Presentation - Mark Allon

There is no evidence AFAIK to claim that the Buddha identified a set of his close followers as forming part of his exclusive inner-circle, to teach them using a different language exclusive to that inner circle. Even if we presume it for argument’s sake, his inner circle wouldn’t have been a static circle - so he couldn’t have been continually teaching them a new language and expressions before (or as a part of) teaching them Buddhism. Hence being contrary to available evidence and reason, I’m unable to see the logic in this argument.

Pali is an exclusively Buddhist language beyond reasonable doubt - the wider society of the Buddha’s time (except his close followers) was almost totally pre-Buddhist. Pali, being a language exclusive to Buddhism, could not simultaneously have been a pre-Buddhist, widely spoken lingua franca / koine / ethnic language independent of Buddhism. O. von Hinüber calls it an artificial language.

When you say “translation” - there seems to be an a priori assumption here that Pali was a pre-existing language that could be translated into from another language, as in a normal translation project between two pre-existing languages, say from German to English. We cannot assume that Pali was a pre-existing language, because the evidence shows that the Pali language (and its linguistic uniqueness, including its ‘artificiality’ as O. von Hinüber puts it) originates with the Pali canon itself, i.e. the people who were redacting the Pali canon for the first time were in effect defining the linguistic character of the Pali language ab initio.

Therefore there was no ‘translation’ as such; there was merely a transliteration from the older (Kharoṣṭhī) script to the newer (Brāhmī) script, and the redactors were following a set of phonetic and morphological principles to uniformize the text into what we now call ‘canonical Pali’ but which they didn’t call by that name (as they didn’t see it as a separate language back then). It was still word-by-word the same as in the underlying language, so it was still ‘buddha-vacana’ (the words of the Buddha) but written in a new script and using new phonetic and morphological conventions.

The underlying language of the discourses contained in the Pali canon (assuming those discourses were really historically spoken) was however a real, widely spoken natural language, and that ‘level 0’ language must therefore necessarily have existed. Since that underlying language was not a Buddhist-only language like Pali, it cannot be called Proto-Pali, as that name would once again semantically funnel it into the ‘Pali’ tradition and make it sound exclusive to that tradition, which it was not.

Thank you again for your insightful comments.

I admit that I am not nearly knowledgeable enough about this topic to really present a hypothesis.

If you haven’t done so already, it might be worthwhile to have a look at Stefan Karpik’s paper, “The Buddha taught in Pali: A working hypothesis” (2019).

He refers to Pali as a “sociolect of the educated”.

The idea is that Pali would have “ensured mutual intelligibility within the Sangha.”

This is just a matter of conventional terminology; it has no bearing on the argument itself. Whether we call it ‘Madhyadesa Prakrit,’ ‘Proto-Pali,’ ‘Pre-Buddhist Koine Prakrit,’ or anything else, the hypothesis is the same: there was a Prakrit language underlying the EBTs that the Buddha spoke which precedes Pāli (a Buddhist canonical Prakrit) and naturally ‘evolved’ into Pāli.

One reason ‘Proto-Pali’ is preferable or useful is that Pāli is the only Indic Prakrit which preserves an entire canon of EBTs and has been the standard representative of this corpus in Buddhist studies for centuries. There is also evidence that Pāli is the closest register to that underlying EBT Prakrit we have documented, but this is uncertain.

Part of this evolution would be the orthographic conventions you have mentioned (among others). I believe there would also have been regional phonetic changes. For example, in the Ashokan edicts we sometimes see inconsistent orthographic conventions within the same edict, because the writers were trying to represent the phonetic manifestation of a Prakrit word while also seeming to maintain some vague idea of a standard or correct pronunciation. That is, a mix of regional pronunciation and a more ‘standard’ (i.e. from the capital?) pronunciation of the same cognates.

This also explains a lot of the variations we see in Pali texts. Sometimes it’s ‘tvam,’ sometimes ‘tuvam,’ sometimes one ending and sometimes a variant one. This seems to be a natural effect of mutually intelligible Prakrit registers passing through various regions of India/Sri Lanka and being written down or semi-standardized in different places and times.

All of that said, I do agree that a Pali-centric bias can be (or has been) somewhat problematic in studying Buddhist texts, and so it’s true that it may be more appropriate to use a more inclusive name to account for Gandhāri, etc. Maybe ‘Proto-Buddhist-Prakrit,’ i.e. a Prakrit register which precedes Buddhist-exclusive languages like Gandhari, Pali, and BHS, but which was not itself a Buddhist-exclusive Prakrit?

BTW, Stefan Karpik addresses the -tvā absolutive and other ‘anomalies’ in his article(s). Greiger and Levman have written fairly extensively on an underlying koine Prakrit to later Buddhist registers.



The Proto-Pali that you’ve described in the above 2 posts is an imaginary language that appears to be arbitrarily assigned most of the characteristics that Classical Sanskrit (i.e. late-Vedic) had historically.

As I said in point 2 of this post, there is no need to invent a parallel-origin theory for the underlying language of early Buddhism (and this is exactly what you are trying to do). I therefore don’t see the possibility of such a Proto-Pali being discovered or even philologically reconstructed anytime soon.

In the online manuscript catalogue of the National Mission for Manuscripts of India (which lists the surviving manuscripts in various national and private collections throughout India) I find that, out of a total collection of 5,091,431 manuscripts in all languages put together, 496 manuscripts are catalogued as being in Pali, 1,430 in Ardhamagadhi, 1,429 in Apabhramsa, 1,605 in Magadhi, 76,670 in other forms of Prakrit, and 2,396,252 in Sanskrit. That would give you an approximate idea of the relative linguistic footprints of those languages (going by the manuscript heritage).
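
For a rough sense of scale, the shares implied by those figures can be worked out in a few lines. This is only a sketch: the counts are copied from the figures quoted above, the names `counts`, `share_percent` and `unlisted` are made up for illustration, and the catalogue’s language labels are taken at face value.

```python
# Manuscript counts as quoted above from the National Mission for Manuscripts
# catalogue. All names below are illustrative, not part of any catalogue API.
TOTAL_MANUSCRIPTS = 5_091_431

counts = {
    "Pali": 496,
    "Ardhamagadhi": 1_430,
    "Apabhramsa": 1_429,
    "Magadhi": 1_605,
    "Other Prakrit": 76_670,
    "Sanskrit": 2_396_252,
}


def share_percent(count, total=TOTAL_MANUSCRIPTS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Manuscripts in languages not broken down in the quoted figures.
unlisted = TOTAL_MANUSCRIPTS - sum(counts.values())

if __name__ == "__main__":
    for language, count in counts.items():
        print(f"{language:<14} {count:>10,} {share_percent(count):8.4f}%")
    print(f"{'(unlisted)':<14} {unlisted:>10,} {share_percent(unlisted):8.4f}%")
```

Running it prints each language’s share of the catalogued total, plus the remainder that the figures above do not break down.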

It is among the languages for which historical records exist (or are known to have existed based on literary evidence) that we are most likely to find the dominant underlying language of the EBTs.

Let’s therefore talk about historically-attested languages or languages mentioned in historical literature instead, to remain anchored to what is real.

Perhaps you have misunderstood me (I didn’t say anything about Sanskrit itself getting completely lost). My claim is simply (for several commonsensical reasons) that the underlying language of the EBTs is late-Vedic Sanskrit. Earlier in the thread, I cited an incident recorded in both Pali and BHS sources of a monk called Soṇa Kuṭikaṇṇa reciting some very early EBTs to the Buddha with svaras (pitch-accent), and the Buddha appreciating him for the same. I said that this optional svara-recitation would have been possible only in late-Vedic Sanskrit, as accents are compulsory in the Chandas (i.e. Early-Vedic) and are non-existent in every known form of Middle-Indic. I also laid out some other common-sense criteria by which we could narrow down the language of the Buddha.

Gandhari (in early Kharoshthi script) & Pali (in early Brahmi script) in my understanding represent the earliest Buddhist “written” registers of that late-Vedic Sanskrit. Early Buddhists over the next few centuries eventually restored the written registers back to Sanskrit (but not all such restorations were 100% accurate; some inaccuracies remained, and the result has come to be known as BHS in the modern era). It is worth mentioning that Gandhari, Pali and BHS had no separate names in Ashoka’s era, as they were not historically viewed as independent languages separate from Sanskrit. The Pali commentaries (composed over 1,000 years after the Buddha) relied almost exclusively on Paninian Grammar to explain grammatical aspects of the canonical Pali.

The Śakyas (if by that you mean the historic Achaemenid Persians) were an Irano-Aryan clan, practically a sister language family to the Indo-Aryans; together the two formed the Indo-Iranian sub-family of Indo-European. In the Indian subcontinent, the Indo-Aryans formed the vast majority of the population (approximately 80%). Speakers of the Muṇḍa languages, about 0.5% of the population of the subcontinent, were/are restricted to the easternmost borderlands of India; their languages belong to the Austro-Asiatic family, whose native linguistic centre of gravity lies in Vietnam and Cambodia, well outside the Indian subcontinent. Speakers of the Dravidian languages (about 20% of the population of the subcontinent, and restricted almost entirely to South India) had no historical connections with the Śakyas, who never inhabited Southern India, so I don’t see the reality behind your assertions.

How exactly would that have been the case - and which mixed-ethnicity communities and tribes of the Buddha’s era and locations are you talking about? Where are those mixed-ethnicity communities attested in the Pali canon or other EBTs? It would be helpful if you substantiate your argument further with names, places and references so I can gain a reasonable sense of what you mean and on what basis you make such conclusions.

I don’t see how a widely spoken natural language can naturally evolve into a Buddhist-only language. How exactly does that ostensibly natural transition from being a widely spoken natural language into an exclusively Buddhist language take place? What happens to the rest of the speakers?

I don’t follow the logic. If you accept that the pre-existing underlying language of the EBTs was necessarily pre-Buddhist and non-Buddhist, then how does what the future of Buddhism eventually turned out to be matter for defining the identity of that language?

OK thanks - Geiger is long dead, but who is Greiger?