Proofreading the Mahavastu

No problems, take your time. And I hope the meetings go well.

Bhante. I’m currently up to p. 80 of the Mvu. The verses are now starting to become more common. Looking at the remainder of Vol. 1, verses appear to make up between a quarter and a third of the entire text (perhaps more).

I’m having an issue with our original approach:

A lot of the ‘verse material’ (i.e., the text which is indented and italicised in the PDF) is starting to appear less poetic in structure and more like prose, and breaking long verse lines into shorter lines is, in places, starting to become a bit ‘unnatural’.

For example, following our convention, the following verse is broken into the shorter lines shown after it:

She went, and wandered forth with her women, roaming the forest, glad and happy and eager. While she paced the forest, she espied a tree bearing fresh creepers and shoots, and, in the rapture of perfect joy and gladness she grasped a branch of it, and playfully lingered there. As she held the branch she gave birth to the Conqueror of the unconquered mind, the great supreme seer.


She went, and wandered forth with her women,
roaming the forest, glad and happy and eager.
While she paced the forest, she espied a lumbini tree
bearing fresh creepers and shoots, and,
in the rapture of perfect joy and gladness
she grasped a branch of it, and playfully lingered there.
As she held the branch she gave birth
to the Conqueror of the unconquered mind,
the great supreme seer.

Not only does shortening the lines take quite a bit of time, I am not sure it is necessary. The original text does not display the verses the way we are doing it, and breaking the lines can sometimes alter the verses in ways that, I think, change the interpretation slightly (maybe not so much with that example, but for some other verses where I have shortened the lines, I have felt that I am changing the context slightly).

You said:

However, I have found that with <blockquote>, regardless of whether the lines are shortened or not, when viewed in the browser, the distance the left margin is indented remains the same. (But perhaps I’m missing a certain point here?)

So, I am recommending that for verses that are less poetic and more prose-like, we do not shorten the lines. In fact, I am wondering whether we should shorten any of the lines at all; leaving them unbroken would be in accordance with the original text and save quite a bit of time.

I recommend this cautiously however, as I may be missing something real obvious here that I overlooked or was not aware of.

Well, it’s up to you. I’d still prefer the shorter lines where possible, but don’t sweat it if it’s becoming a hassle.

This is the default behavior for <blockquote>. However, our verses use a CSS trick to center on the longest line, following a guideline for typography. (Originally based on this, but this is out of date and it’s much easier in modern browsers.)
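As a rough illustration of the idea, here is one way centering on the longest line can be done in modern browsers. This is a minimal sketch, not the site’s actual stylesheet: the class name `verse` and the page around it are assumptions made up for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Verse centered on its longest line</title>
<style>
  /* Hypothetical class name; the real stylesheet may differ. */
  blockquote.verse {
    width: fit-content;   /* shrink the box to the width of the longest line */
    margin-left: auto;    /* then center the shrunken box in the column */
    margin-right: auto;
    text-align: left;     /* lines stay flush left inside the box */
  }
</style>
</head>
<body>
<p>Prose paragraph before the verse.</p>
<blockquote class="verse">
  <p>She went, and wandered forth with her women,<br>
  roaming the forest, glad and happy and eager.</p>
</blockquote>
</body>
</html>
```

With a rule like this, the shortened lines matter: when a verse is left as one long unbroken line, the box fills the available width and the text simply wraps, so the centering makes no visible difference.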

Perhaps we could introduce a new form of verse markup, something like “unbroken-verse”, to cater for such situations. Some of our other translations are in a similar situation, for example the English Dhammapada.

The thing is, it doesn’t matter too much in an all-verse text. But in a mixed text, we want to be able to distinguish verse from prose.
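A rough sketch of what such an “unbroken-verse” variant could look like. The class name is just the one floated above and does not exist yet, and the styling (a fixed indent plus italics, mirroring how verse appears in the PDF) is an assumption for illustration only.

```html
<style>
  /* Sketch only: "unbroken-verse" is a proposed name, not an existing class. */
  blockquote.unbroken-verse {
    margin-left: 2em;     /* fixed indent instead of centering on the longest line */
    font-style: italic;   /* sets verse apart from the surrounding prose */
  }
</style>

<blockquote class="unbroken-verse">
  <p>She went, and wandered forth with her women, roaming the forest, glad and happy and eager.</p>
</blockquote>
```

The point of a separate class is that the verse stays distinguishable from prose in a mixed text without anyone having to decide where each line should break.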

Bhante. Update to html file. Up to line 1887. (620.9 KB)


Could I suggest for this one that what was probably meant was “attained the deathless”, as a synonym of nibbaana?

Stu, would you like any help with this project? We could divvy up the sections.

I was helping Bhante Sujato out with the Buddhist Mythology Wikipedia article but that’s a bit more intensive than I have time for at the moment (as far as I understand what needs to be done).

Or if you want to take this on your own I understand.

Hi Matt. I am more than happy for you to work on this.

The original HTML file we are editing may be found in the very first post in this discussion topic above.

I am currently working from Vol. 1 of the Mvu.

If you would like to pick up from Vol. 2 that would be great.

If you haven’t already, may I suggest you read the posts in this topic… there is some useful information there, especially about the Pali-English keyboard and the use of an HTML text editor such as Sublime Text.

Yep, I agree that would work better. However, after some discussion, Bhante Sujato and I decided that we would not alter the words of the original text. Even though we may be able to improve it, our goal at the moment is to complete the task of fixing up the HTML file so that there are no errors in it.

Once this is done, a new project to do a better translation of the Mvu could be done if anyone is interested in doing so.

Ok, nice. I’ve uploaded your latest revision to a server I’ll be working off of:

(Added an HTML id attribute to the section that starts Volume 2 for easy linking; this can be removed later. Also added HTML meta robots tags so that the page isn’t archived or indexed by search engines (probably not a big concern). To both of these changes I’ve appended an HTML comment with my name so that these, and any others that need to be revisited later, can be easily found.)
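For reference, changes of that kind might look roughly like this. The id value, the heading element, the page title, and the comment wording are illustrative assumptions, not copied from the file; `noindex` and `noarchive` are standard values for the robots meta tag.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Mahavastu (working copy)</title>
<!-- TEMP (Matt): keep the working copy out of search indexes and archives -->
<meta name="robots" content="noindex, noarchive">
</head>
<body>
<!-- TEMP (Matt): id added for easy linking to Volume 2; remove later -->
<h2 id="volume-2">Volume 2</h2>
</body>
</html>
```

Searching the file for the name in the comments then turns up every temporary change in one pass.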

Any changes I make to the file locally will be pushed to that server, so you should be able to keep track of progress by visiting the page I linked to.

Hi Matt. Is it possible to have a phone hook up to discuss some of this work? I live in Canberra, Australia.

PMed you.

What’s the standard operating procedure for lists? There’s a line that reads, “And, monks, whatever family a Bodhisattva is born in is endowed with sixty qualities.”; and goes on to list the sixty qualities.

I have not encountered lists yet but I think I would probably use <blockquote>.

<blockquote> should be reserved for verses, and extremely rare cases where a quoted insertion is actually present in the text. I only know of one such case in the Pali canon, the Bakkula Sutta, which has interjections attributed to the elders of the second council (!).

For lists, in most cases just treat them as an ordinary paragraph with comma separation. If you want to present them as a list, use the proper HTML list markup:

<ol>
<li>List item</li>
<li>Second list item</li>
</ol>

And so on. This is for numbered lists; for unordered lists use <ul> in place of <ol>.

However, I would recommend restricting this to a very limited set of cases. Essentially, lists should be used when the spatial presentation of data assists comprehension. This is not really the case in most lists, where you just read the items one after the other.

Where lists become useful is in the “wheels” (Pali cakka) which you find sometimes in the suttas, but very frequently in some parts of Vinaya and Abhidhamma. A typical example is structured like this:

  • A but not B
  • B but not A
  • Both A and B
  • Neither A nor B

In this kind of case it can be confusing to parse out the relations between the elements when they are simply presented as flowing text. However as a list the formalism becomes clear. Note that this only works in the case of fairly short items.
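In markup, a wheel like the one above could be written as an unordered list. This is only a sketch with the same placeholder items, not markup taken from an existing text:

```html
<ul>
  <li>A but not B</li>
  <li>B but not A</li>
  <li>Both A and B</li>
  <li>Neither A nor B</li>
</ul>
```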

If you’d like to use lists in any other kind of context, best let me know first and we can sort it out. Cheers!



Not completely clear on what should be considered a verse, but I suppose Stu can explain that when we talk.

I recognize that logic formula from Nāgārjuna, very cool to learn it has a basis in the early suttas!


The main thing is the proofreading; we can adjust the markup later, although it’s good to be clear from the start. I think Stuart is mostly not marking verse, in fact, as the original text doesn’t. I think the verse passages are indicated with italics in the original. It’s not such a big deal.

Oh, yes, very much so. We find it quite commonly in philosophical passages, but also in general use. There’s a sutta that talks about different kinds of marriage partners (AN 4.53):

Chavo chavāya saddhiṃ saṃvasati, chavo deviyā saddhiṃ saṃvasati, devo chavāya saddhiṃ saṃvasati, devo deviyā saddhiṃ saṃvasati.

Where chava means “corpse”. So I translate as:

  • A male zombie living with a female zombie;
  • a male zombie living with a goddess;
  • a god living with a female zombie;
  • a god living with a goddess.

That verse on marriage partners - lol! (621.8 KB)
Up to “The Tenth Bhūmi”.
Posted here as a back up.


Fantastic, thanks so much.

Hi Bhante. Checking in with an update. Up to “Apparitions”, p. 140. (622.7 KB)


Fantastic, @stuindhamma

@sujato, attached is an updated version of the Mvu HTML file. (622.7 KB)


This latest updated version contains:

  1. General corrections up to p. 164 of Vol. 1 (of 3) of the Mvu pdf files (used as the reference for corrections to the HTML file).

  2. Changes in format of single quotation marks and apostrophes. An explanation about this is below.


Double quotation marks appear in the original HTML file as a mixture of '' (double dashes) – as Courier would display them, or as “ (sixty-six) or ” (ninety-nine) – as Times New Roman would display them.

As I am going through correcting the HTML file, I am changing all '' (double dashes) to “ (sixty-six) or ” (ninety-nine).


When I started this project I was only changing double quotation marks, not single quotation marks or apostrophes, as these were all consistently represented as Courier-like dashes (') rather than Times New Roman-like sixes (‘) or nines (’).

However, to be consistent, I have now changed all the single quotation marks and apostrophes as well (from dashes to a so-called six or nine). There are over 2000 such punctuation marks in the HTML file and each one had to be inspected before being replaced, as sometimes a ' was not a quotation mark or apostrophe but simply the result of an OCR error.


All the previous HTML files posted here should no longer be relied upon, as the updates in this latest version contain back-changes to previously updated text. Additionally, whilst I was looking back over the HTML file I found some errors which I missed the first time. The lesson to take away from this is that even though many careful hours have been spent updating the HTML file, unfortunately, there will be errors that I miss. Although the errors were not huge or glaring and should not affect the reader’s understanding of the text, they are errors all the same. (Sigh!)
