On the Language of the Buddha

Some time ago, I investigated what modern scholarship as well as the tradition take Pāḷi to be, but the results were included in a rather lengthy introduction to my Pāḷi grammar (Ṭhānuttamo Anāgārika, Māgadhabhāsā (Pāḷi) – A Compendious Grammar on the Language of Pāḷi Buddhism, Second Edition, Revised; available on Academia.edu). Given that material tends to get buried in such long-winded discussions, I thought of reposting the relevant parts here as well.

I think that this section of the introduction at least shows that there is quite a spectrum of opinions on the matter. Unfortunately, these opinions sometimes lose their profile with the passing of time, in everyday exchanges, and under the weight of the influence of what are oftentimes only a few selected scholars, so I would like to give them a voice here for easy access. What has also become clear during my survey is that the traditional narrative – namely that Pāḷi was an actual language spoken in Magadha (not to be fully equated with the Māgadhī dialect proper) and employed by the Buddha – should not be, and has not been, dismissed, at least not outright. Let me start with a short presentation of the modern literature before showcasing the account given in the Pāḷi tradition itself.

Pāḷi – What is it?

The Handbook of Comparative and Historical Indo-European Linguistics (Klein et al., 2017: 318) states: “It is generally accepted that Pāli as known from the Theravāda texts was a lingua franca, not a single individual language particular to one dialect area.” However, the scholarly discussions consulted on the subject are of course somewhat more nuanced than that generalizing depiction of the status quo. They state, more specifically, that Pāḷi is (a) some form of lingua franca, [f.n. 1] koine [f.n. 2] or standard dialect (Geiger, 1916/1956: 4–6; Karpik, 2019: 67; Oberlies, 2007: 183; Roth, 1980: 78; Wynne, 2019: 9–10), (b) some form of vernacular (Childers, 1875: xiv; Roth, 1980: 78; Warder, 1970/2000: 294) or (c) based upon one of these (Levman, 2019: 64–5, n. 1; Lüders, as quoted by Waldschmidt in Lüders, 1954: 8; Norman, 1980: 66; Rhys Davids, 1911: 53–4).

There is also disagreement as to whether Pāḷi predominantly constitutes an artificially crafted language (Gombrich, 2018: 84–5; [f.n. 3] Norman, 1980: 65; von Hinüber, 1996: 5 [f.n. 4]) or developed mainly by natural means (Pischel, 1957: 5). It also has to be noted that the first-mentioned views under (a) above presuppose some actually spoken basis underlying the Pāḷi language, one that was significantly morphed or superseded, at least in part, by contrived structures in the course of time, and that the second-mentioned view does not assume that the language was safe from any form of change through redaction, transmission errors etc. Not one text-critically involved scholar, as far as I am aware, is of the opinion that Pāḷi as we know it has undergone no changes whatsoever.

The traditional accounts, which report the language found in the texts of the Pāḷi Buddhist tradition to be māgadhabhāsā etc., are by and large considered incorrect by modern scholars. They adduce, inter alia, the peculiar features of the Māgadhī dialect proper as inferred from the Aśokan inscriptions and from the medieval descriptions of it by the Indian grammarians, and determine these features to be (a) l instead of r (e.g. lāja for rāja), (b) a-stems in e for o (e.g. lāje for rājo) and (c) palatal ś for dental s. However, based upon inscriptional and other evidence, Norman (1980: 68–9) demonstrated that these features were found merely within a relatively restricted area and that it is feasible to regard the home of Pāḷi as lying outside the region where the true Māgadhī was spoken but still within Magadha, somewhere in the center of the east-Indian region, not far from Kaliṅga. He considers it feasible that Māgadhī, as depicted within the aṭṭhakathā tradition as the language of the tipiṭaka, is a variant of the Māgadhī dialect proper and that the Buddhist tradition can thus be correct. Winternitz (1908/1981: 40), who saw the Māgadhī dialect proper at the base of Pāḷi, had already come to similar conclusions, as had Geiger (1916/1956: 4), to quote the latter:

A consensus of opinion regarding the home of the dialect on which Pāli is based has therefore not been achieved. Windisch therefore falls back on the old tradition—and I am also inclined to do the same—according to which Pāli should be regarded as a form of Māgadhī, the language in which Buddha himself had preached.

What emerges from the above is that the traditional narrative should not be and has not been dismissed outright.
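
The three diagnostic features of the Māgadhī dialect proper listed above (l for r, a-stem ending -e for -o, palatal ś for dental s) can be made concrete with a small toy sketch. It is only an illustration under loud assumptions: the function name `to_magadhi_like` is hypothetical, the input is assumed to be a single romanized word, and blanket string substitution is of course no model of actual historical phonology or of any attested text.

```python
import unicodedata


def to_magadhi_like(word: str) -> str:
    """Toy illustration: apply the three diagnostic substitutions to one romanized word.

    Assumption: `word` is a single lowercase word in standard romanization.
    """
    # Normalize to composed form so that letters such as 'ṛ' or 'ṣ'
    # are single code points and are not touched by the rules below.
    out = unicodedata.normalize("NFC", word)
    # (a) l instead of r
    out = out.replace("r", "l")
    # (c) palatal ś for dental s
    out = out.replace("s", "ś")
    # (b) a-stem ending in -e for -o (word-final position only)
    if out.endswith("o"):
        out = out[:-1] + "e"
    return out


if __name__ == "__main__":
    for form in ["rāja", "puriso", "dhammo"]:
        print(form, "->", to_magadhi_like(form))
```

Run on a form such as puriso, the sketch applies all three rules at once; a word containing none of the affected sounds, such as buddha, passes through unchanged, which is one way of seeing why the presence or absence of these few features carries so much weight in the discussion.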

Commentaries, Sub-Commentaries and Pāḷi Grammatical Literature

The aṭṭhakathā and ṭīkā traditions take the language of Magadha (māgadhabhāsā) to be a natural language – a delightful language indeed (Sv-pṭ: 6). The Samantapāsādikā vinaya aṭṭhakathā (Sp IV: 23) proffers an annotation of the phrase sakāya niruttiyā, which is used by two Brahmins in an incident recorded in the vinaya that is cardinal from a linguistic point of view. There, still attached to things Vedic, they complain about the way or language through whose adoption or use the Buddha’s teaching was being spoiled. The annotation reads: “[…] ‘own tongue’ means the common speech belonging to Magadha (māgadhiko vohāro) in the manner spoken (vuttappakāro) by the Perfectly Enlightened One.”

The 12th/13th-century CE Vimativinodanīṭīkā (Vmv: 125) interprets the relevant portion of the episode thus: “They ruin (dūsenti) the word of the Buddha with their own language (sakāya niruttiyā) as it relates to the canon (pāḷi): ‘Surely, those of inferior birth who have learned [memorized; i.e. the buddhavacana] corrupt the language of Magadha (māgadhabhāsāya) to be spoken by all with ease (sabbesaṃ vattuṃ sukaratāya)’ – this is the meaning.” The Vinayālaṅkāraṭīkā (Pālim-nṭ: 180) from the 1600s CE in turn glosses sakāya niruttiyā as succinctly as possible as māgadhabhāsā, the “language of Magadha.”

The Samantapāsādikā (Sp I: 94), on another occasion, seemingly equates māgadhabhāsā with the Aryan language as a whole, thereby possibly referring to a supra-regional language. The indigenous Pāḷi grammars basically concur with the above. The Padarūpasiddhi, for example, mentions explicitly that the Buddha spoke a tongue belonging to Magadha (māgadhika), as recorded in the tipiṭaka (Rūp: 32); for a detailed discussion of themes related to this last point, see Gornall (2014). The above is, as we have already seen at the beginning of this chapter, a sensible account of what language the Buddha employed, at least primarily.

In this connection, it appears relevant to mention that the aṭṭhakathā tradition is not just an alternative scholarly opinion but rather constitutes strong additional evidence (cf. Karpik, 2019: 74), as Norman (1983: 119) spelled it out:

[…] some parts of the commentaries are very old, perhaps even going back to the time of the Buddha, because they afford parallels with texts which are regarded as canonical by other sects, and must therefore pre-date the schisms between the sects. As has already been noted, some canonical texts include commentarial passages, while the existence of the Old Commentary in the Vinaya-piṭaka and the canonical status of the Niddesa prove that some sort of exegesis was felt to be needed at a very early stage of Buddhism.

Furthermore, Buddhaghosa’s Samantapāsādikā contains over 200 quotations of earlier material, according to the indigenous tradition harkening back in parts to the first council (paṭhamasaṅgīti) held shortly after the demise of the Buddha (von Hinüber, 1996: 104). Surely, Geiger (1916/1956: 4–6) must have based his deliberations to some extent upon the exegeses of the aṭṭhakathā, ṭīkā and grammatical traditions showcased throughout this section when he wrote:

[…] Pāli should be regarded as a form of Māgadhī […]. Such a lingua franca naturally contained elements of all the dialects […]. I am unable to endorse the view, which has apparently gained much currency at present, that the Pāli canon is translated from some other dialect (according to Lüders, from old Ardha-Māgadhī). The peculiarities of its language may be fully explained on the hypothesis of (a) a gradual development and integration of various elements from different parts of India, (b) a long oral tradition extending over several centuries, and (c) the fact that the texts were written down in a different country. I consider it wiser not to hastily reject the tradition altogether but rather to understand it to mean that Pāli was indeed no pure Māgadhī, but was yet a form of the popular speech which was based on Māgadhī and which was used by Buddha himself.

Whatever the case may be when it comes to the nature of Pāḷi, perhaps Bodhi (2020: 3) is right in suggesting: “If by some unexpected miracle transcripts of the original discourses should turn up in the exact language(s) in which they were delivered, one who knows Pāli well would be able to read them with perhaps 90 percent accuracy.” In this manner, the scope of modern scholarly assessments concerning the nature of Pāḷi partially extends […].


  1. Merriam-Webster (“Lingua franca,” n.d.): “[A]ny of various languages used as common or commercial tongues among peoples of diverse speech.”
  2. Merriam-Webster (“Koine,” n.d.): “[A] dialect or language of a region that has become the common or standard language of a larger area.”
  3. Gombrich holds that the Buddha was the progenitor of the Pāḷi language or at least a principal figure in its creation.
  4. Commenting on von Hinüber’s assessment of Pāḷi as an artificial language, Prof. Oberlies remarks: “The ‘artificial language’ of Mr. von Hinüber goes too far also for me” – “Die ‘Kunstsprache’ von Herrn von Hinüber geht auch mir zu weit” (personal communication, May 3, 2020).

Abbreviations/References (Primary)

  • Pālim-nṭ: Vinayālaṅkāraṭīkā
  • Rūp: Padarūpasiddhi
  • Sv-pṭ: Sumaṅgalavilāsinīpurāṇaṭīkā
  • Sp: Samantapāsādikā
  • Vmv: Vimativinodanīṭīkā

References (Secondary)

  • Bodhi (2020). Reading the Buddha’s discourses in Pali: A practical guide to the language of the ancient Buddhist canon. Wisdom Publications.
  • Childers, R. C. (1875). A dictionary of the Pali language. Trübner & Co.
  • Geiger, W. (1956). Pali literature and language (B. Ghosh, Trans.; 2nd ed.). University of Calcutta (original work published 1916).
  • Gombrich, R. F. (2018). Buddhism and Pali. Mud Pie Books.
  • Gornall, A. (2014). How many sounds are in Pāli? Schism, identity and ritual in the Theravāda saṅgha. Journal of Indian Philosophy, 42(5), 511–550.
  • von Hinüber, O. (1996). A handbook of Pāli literature. Walter de Gruyter.
  • Karpik, S. (2019). The Buddha taught in Pāli: A working hypothesis. The Journal of the Oxford Centre for Buddhist Studies, 16, 10–86.
  • Klein, J., Joseph, B. & Fritz, M. (Eds.) (2017). Handbook of comparative and historical Indo-European linguistics. De Gruyter Mouton.
  • Levman, B. G. (2019). The language the Buddha spoke. Journal of the Oxford Centre for Buddhist Studies, 17, 64–108.
  • Lüders, H. (1954). Beobachtungen über die Sprache des buddhistischen Urkanons (E. Waldschmidt, Ed.). Akademie Verlag.
  • Norman, K. R. (1980). The dialects in which the Buddha preached. In H. Bechert (Ed.), Die Sprache der ältesten buddhistischen Überlieferung – The language of the earliest Buddhist tradition (pp. 61–77). Vandenhoeck & Ruprecht.
  • Oberlies, T. (2007). Aśokan Prakrit and Pāli. In D. Jain & G. Cardona (Eds.), The Indo-Aryan languages (pp. 161–203). Routledge.
  • Pischel, R. (1957). Comparative grammar of the Prākrit languages (S. Jhā, Trans.). Motilal Banarsidass.
  • Roth, G. (1980). Particular features of the language of the Ārya-Mahāsāṃghika-Lokottaravādins and their importance for early Buddhist tradition. In H. Bechert (Ed.), Die Sprache der ältesten buddhistischen Überlieferung – The language of the earliest Buddhist tradition (pp. 78–100). Vandenhoeck & Ruprecht.
  • Rhys Davids, T. W. (1911). Buddhist India. T. Fisher Unwin.
  • Warder, A. K. (2000). Indian Buddhism. Motilal Banarsidass (original work published 1970).
  • Winternitz, M. (1981). A history of Indian literature (Vol. I) (V. S. Sarma, Trans.). Motilal Banarsidass (original work published 1908).
  • Wynne, A. (2019). Once more on the language of the Buddha. The Journal of the Oxford Centre for Buddhist Studies, 8–10.

Thanks for the clear summary of a complex issue.

Indeed, the difference between the language of the Pali canon and that of the Ashokan inscriptions is probably no greater than that between dialects of English as spoken in different parts of London.


So the rumour that it was originally a spoken form of Elvish, which was subsequently written down and simplified by none other than Bilbo “The Buddha” Baggins was not true then?

Sorry, I’ll show myself out …


Bhante, have you heard of the Hathigumpha inscriptions from around the 2nd cent. BCE? They are also remarkably close. I once tried to translate a passage from it with the help of a Pāḷi dictionary, and it actually worked.


Some are discussed in Karpik’s article, and yes, they’re just Pali.

Do you know if there is a digital text corpus of epigraphic Pali? Or epigraphs in general? A bunch of text files would be great …


Unfortunately not, bhante. What just came to mind is the work of Richard Salomon, which is, I guess, nothing new to you. I would be happy to know of such a corpus, in any form, if you find out more about it.
