Pootlization of the Vinaya

@Vimala has embarked on the difficult project of adapting @Brahmali’s Vinaya translation for Pootle. This thread is for us, with @blake, to discuss the various issues this raises.

Here’s Ayya’s first email.


I started working with these pootle files and have some questions.

First of all, I have segmented both the Pali and Ajahn Brahmali’s text, and I use the pali.po to add Ajahn Brahmali’s text as the “msgstr”.

So for instance, I then get something like this:

#. </h2><p class="namo"><a class="sc" id="1"></a>
msgctxt "pi-tv-bu-vb-intro:6.1"
msgid "Namo tassa Bhagavato Arahato Sammāsambuddhassa."
msgstr "Homage to the Master, the Perfected One, the fully Awakened One"

What do I do with coding elements in both files? In the Pali, most of the coding is added above the text (as above), apart from the occasional notes that stay inside the Pali “msgid”. But do I simply delete all the coding from Ajahn Brahmali’s text? And what do I do with quote marks and other typographic marks that are added in the English?

This example has both a <span> in the pali and an <a> tag in the English as well as a quotemark in the English.

msgctxt "pi-tv-bu-vb-intro:7.5"
msgid "itipi so bhagavā arahaṃ sammāsambuddho vijjācaraṇasampanno sugato lokavidū anuttaro purisadammasārathi satthā devamanussānaṃ buddho <span class=\"var\" title=\"bhagavāti (s1-3)\" id=\"note1\">bhagavā</span>’, so imaṃ lokaṃ sadevakaṃ samārakaṃ sabrahmakaṃ sassamaṇabrāhmaṇiṃ pajaṃ sadevamanussaṃ sayaṃ abhiññā sacchikatvā pavedeti, so dhammaṃ deseti ādikalyāṇaṃ majjhekalyāṇaṃ pariyosānakalyāṇaṃ sātthaṃ sabyañjanaṃ;"
msgstr "‘He is indeed a Master, a perfected one, fully awakened, complete in true knowledge and conduct, accomplished, knower of the world, unsurpassed trainer of tamable people, teacher of gods and humans, the Awakened <a class=\"pts-vp-en\" id=\"BD.1.2\" href=\"#BD.1.2\">BD.1.2</a> One, the Master. With his own insight he has realized this world with its gods, its lords of death, and its supreme beings, this population with its ascetics and brahmins, its gods and humans, and he makes it known to others. He expounds a Teaching that is good in the beginning, good in the middle, and good in the end, with the right meaning and phrasing."

In the above case, for instance, the Pali ends in “;” but continues on the next line. The example also has four sentences in the English, while the Pali is only part of a sentence (ending in “;”). So is it best to split the Pali to fit the English sentences, or to add the whole Pali sentence as well as the English translation to this item? Basically, the question is: how do I best split up the sentences?

And how about elements that Ajahn Brahmali has not translated like certain headings?

#. </p><p>
msgctxt "pi-tv-bu-vb-intro:3.1"
msgid "Pārājikakaṇḍa"
msgstr ""

#. </p><h1>
msgctxt "pi-tv-bu-vb-intro:4.1"
msgid "<span class=\"add\">Pārājika 1: </span> Paṭhamapārājikasikkhāpada"
msgstr ""

Do I just add the translation myself?

So nice to see! I’m really excited by this. But you know, in a calm and tranquil way.

Only if you can put it back. Keeping as much stuff as possible outside the text is good.

Having HTML code in the PO files is not a problem in and of itself. It is, however, a problem in the source text, as it tends to ruin the matching. I would suggest stripping variant readings and other such HTML from the source text. I’ve discussed this with Blake; I can’t recall now, were you part of that conversation? If not, I’ll forward it to you.

Whatever way is simple and works.

Having said which, it would be nice to keep consistency with the suttas. However, most of the Vinaya texts do not, in fact, have direct parallels in the suttas, so it doesn’t really matter. It would be nice, though, if the passages that are in common with the suttas (such as your example) are split the same way. Still, I will be revising the sutta segmentation, so we could match them up at that point.

Don’t worry about it. The punctuation in the Pali and English versions will be different (and the pali itself is not consistent). Generally we aim for shorter sentences than the pali, where possible. The main criterion is that the source text and translated text (more or less) match up. As a rough guide, it’s best that each segment be a shortish, meaningful, and potentially reusable string (but this can’t always be done). We use punctuation to aid in creating segments, but only if it’s useful.
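
To make the rough guide above concrete, here is a minimal Python sketch of a first-pass splitter. It is only an illustration, not the rule that sc-html2po.py applies: the break characters, the function names `split_segments` and `number_segments`, and the idea of numbering segments as `uid:paragraph.segment` (modelled on the msgctxt values in this thread) are all assumptions. Since punctuation is only a guide, the output would still need checking by hand.

```python
import re

# Assumed break rule: a segment ends at . ; ? or ! (plus any closing
# quote marks) when followed by whitespace. This is a hypothetical
# first pass, not the project's actual segmentation rule.
_BREAK = re.compile(r'([.;?!][’”"\']*)\s+')


def split_segments(text):
    """Split a paragraph into candidate segments at sentence punctuation."""
    # With one capture group, re.split returns
    # [chunk, punct, chunk, punct, ..., last_chunk].
    parts = _BREAK.split(text.strip())
    segments = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
    if parts[-1]:
        segments.append(parts[-1])
    return segments


def number_segments(uid, paragraph, segments):
    """Pair each segment with a context id shaped like 'uid:paragraph.n'."""
    return [(f"{uid}:{paragraph}.{n}", seg) for n, seg in enumerate(segments, 1)]


if __name__ == "__main__":
    pali = ("Na tvaṃ, tāta sudinna, kiñci dukkhassa jānāsi. "
            "Maraṇenapi mayaṃ te akāmakā vinā bhavissāma.")
    for ctx, seg in number_segments("pi-tv-bu-vb-pj1", 9, split_segments(pali)):
        print(ctx, seg)
```

Running the same splitter over the Pali and the English will usually give different segment counts, which is exactly where the manual matching comes in.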

(Count the caveats in that paragraph!)

Here is how I break up this passage:

‘That Blessed One is perfected, a fully awakened Buddha, accomplished in knowledge and conduct, holy, knower of the world, supreme trainer of people, teacher of gods and humans, awakened, blessed.’

He has realized with his own direct knowledge this world—with its gods, Māras and Brahmās, this population with its ascetics and brahmins, gods and humans—and he makes it known to others.

He teaches Dharma that’s good in the beginning, good in the middle, and good in the end, meaningful and well-phrased. And he explains a spiritual practice that’s entirely full and pure.

But like I said, I wouldn’t worry too much about it right now. The main thing is to get the basic matching done, we can always refine it later.

As for the <a> tag, this is a vol/page ref for the PTS English translation. This is valuable information, and we should preserve it. I would suggest shifting it to the comments, alongside the HTML pulled from the Pali text. It can be used to enrich the references for the Pali text as well. The translation also has “cs” numbers, which stand for “chapter/section”. These should also be retained. The Pali PTS v/p numbers, however, are already in the Pali, so these can be deleted.
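
As a rough illustration of shifting such a reference out of the translated string and into the comment, here is a small Python sketch. It assumes the string has already been unescaped (no `\"`), that every `<a>` element in the msgstr is a reference marker whose label should leave the text along with the tag, and that the comment is handled as plain text without the leading `#. `. The function name `move_refs_to_comment` is hypothetical.

```python
import re

# Matches a whole <a ...>label</a> element plus the whitespace around it.
_REF = re.compile(r'\s*<a\b[^>]*>.*?</a>\s*', re.DOTALL)


def move_refs_to_comment(msgstr, comment=""):
    """Return (clean_msgstr, new_comment) with <a> refs moved to the comment."""
    refs = [m.strip() for m in _REF.findall(msgstr)]
    # Each removed element is replaced by one space, so a ref sitting
    # inside a word would split that word; such cases need a manual check.
    clean = _REF.sub(" ", msgstr).strip()
    return clean, comment + "".join(refs)


if __name__ == "__main__":
    text = ('the Awakened <a class="pts-vp-en" id="BD.1.2" '
            'href="#BD.1.2">BD.1.2</a> One, the Master.')
    clean, comment = move_refs_to_comment(text, "</p><p>")
    print("#.", comment)
    print("msgstr", clean)
```

Because the reference ends up attached to the whole segment, its position inside the sentence is lost, which is the loss of granularity mentioned below.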

Note that by this method, all the refs will be applied at a segment level, which loses a little granularity (since they are currently inline). This is not critical, but it does remind us that it’s best for segments to be not too long. Basically, keep them as short as practical.

No, don’t add or change anything in the translation. If there is some issue, ask Ven Brahmali.

But in this case, if something is untranslated, leave it that way. (Pali texts are like Shakespeare: no-one really does the whole thing.)

Thanks. I will be able to get on with that.
Will send you a sample of the first chapter once I have done that to check.

pi-tv-bu-vb-intro.po.zip (15.7 KB)

Herewith the first intro chapter that Ajahn Brahmali sent me today. Can you see if this is OK or do I need to make adjustments?

Wow, it looks perfect! May I ask how you did it?

I ran python sc-html2po.py on both the Pali text and Ajahn Brahmali’s files, and then just merged them by hand.

Did you have to adjust the segments very much?

A little. The English tends to break a lot more often, partly because of the <a> tags in there. So I have to figure out which bit of text fits where.
There are a few things I’m a bit unsure about, such as whether I split the Pali correctly in 43.2 and 43.3.
And 35.1 to 38.2: these sections are repetitions of the same thing with just one word different, which Ajahn has put into one sentence. So I’m not sure if what I did is the best way to go about such cases.

It’s perfect.

There’s always marginal contexts like this. What you’ve done is great, I would do the same thing.

Occasionally I find it’s convenient to include all the terms in one segment, even if the Pali is split over several. Not perfect, but what to do? The texts are complicated!

I can’t wait to see these on Pootle. I’ve already changed one of my renderings, because Brahmali’s was better. I had “head split open” and he has “head explodes”. Much nicer!

So how about this one? Leave it as I’ve done it now?

#. </p><p><a class="sc" id="4"></a><a class="ms-pa" id="MS.44" href="#MS.44">MS.44</a>
msgctxt "pi-tv-bu-vb-pj1:9.1"
msgid "Dutiyampi kho sudinno kalandaputto mātāpitaro etadavoca—"
msgstr "Sudinna asked his parents a second"

msgctxt "pi-tv-bu-vb-pj1:9.2"
msgid "“ammatātā, yathā yathāhaṃ bhagavatā dhammaṃ desitaṃ ājānāmi, nayidaṃ sukaraṃ agāraṃ ajjhāvasatā ekanta­pari­puṇṇaṃ ekanta­pari­suddhaṃ saṅkhalikhitaṃ brahmacariyaṃ carituṃ;"
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.3"
msgid "icchāmahaṃ kesamassuṃ ohāretvā kāsāyāni vatthāni acchādetvā agārasmā anagāriyaṃ pabbajituṃ."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.4"
msgid "Anujānātha maṃ agārasmā anagāriyaṃ pabbajjāyā”ti."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.5"
msgid "Dutiyampi kho sudinnassa kalan­da­puttassa mātāpitaro sudinnaṃ kalandaputtaṃ etadavocuṃ—"
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.6"
msgid "“tvaṃ khosi, tāta sudinna, amhākaṃ ekaputtako piyo manāpo sukhedhito sukhaparihato."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.7"
msgid "Na tvaṃ, tāta sudinna, kiñci dukkhassa jānāsi."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:9.8"
msgid "Maraṇenapi mayaṃ te akāmakā vinā bhavissāma, kiṃ pana mayaṃ taṃ jīvantaṃ anujānissāma agārasmā anagāriyaṃ pabbajjāyā”ti."
msgstr ""

#. </p><p><a class="sc" id="5"></a>
msgctxt "pi-tv-bu-vb-pj1:10.1"
msgid "Tatiyampi kho sudinno kalandaputto mātāpitaro etadavoca—"
msgstr "and a third time, but got the same reply."

msgctxt "pi-tv-bu-vb-pj1:10.2"
msgid "“ammatātā, yathā yathāhaṃ bhagavatā dhammaṃ desitaṃ ājānāmi, nayidaṃ sukaraṃ agāraṃ ajjhāvasatā ekanta­pari­puṇṇaṃ ekanta­pari­suddhaṃ saṅkhalikhitaṃ brahmacariyaṃ carituṃ;"
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.3"
msgid "icchāmahaṃ kesamassuṃ ohāretvā kāsāyāni vatthāni acchādetvā agārasmā anagāriyaṃ pabbajituṃ."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.4"
msgid "Anujānātha maṃ agārasmā anagāriyaṃ pabbajjāyā”ti."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.5"
msgid "Tatiyampi kho sudinnassa kalan­da­puttassa mātāpitaro sudinnaṃ kalandaputtaṃ etadavocuṃ—"
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.6"
msgid "“tvaṃ khosi, tāta sudinna, amhākaṃ ekaputtako piyo manāpo sukhedhito sukhaparihato."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.7"
msgid "Na tvaṃ, tāta sudinna, kiñci dukkhassa jānāsi."
msgstr ""

msgctxt "pi-tv-bu-vb-pj1:10.8"
msgid "Maraṇenapi mayaṃ te akāmakā vinā bhavissāma, kiṃ pana mayaṃ taṃ jīvantaṃ anujānissāma agārasmā anagāriyaṃ pabbajjāyā”ti."
msgstr ""
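
For reviewing cases like this, where a condensed English sentence leaves many Pali segments without a msgstr, a short Python sketch can list the empty ones by their msgctxt. It assumes the simple layout shown above, with msgctxt, msgid and msgstr each on a single line in that order; it is not a general PO parser, and the name `empty_segments` is hypothetical.

```python
import re

# One entry = three consecutive single-line fields, as in the excerpt above.
_ENTRY = re.compile(
    r'^msgctxt "(?P<ctx>.*)"\nmsgid "(?P<msgid>.*)"\nmsgstr "(?P<msgstr>.*)"$',
    re.MULTILINE,
)


def empty_segments(po_text):
    """Return the msgctxt of every entry whose msgstr is empty."""
    return [m.group("ctx") for m in _ENTRY.finditer(po_text)
            if m.group("msgstr") == ""]


if __name__ == "__main__":
    sample = (
        'msgctxt "pi-tv-bu-vb-pj1:9.1"\n'
        'msgid "Dutiyampi kho sudinno kalandaputto mātāpitaro etadavoca—"\n'
        'msgstr "Sudinna asked his parents a second"\n'
        '\n'
        'msgctxt "pi-tv-bu-vb-pj1:9.4"\n'
        'msgid "Anujānātha maṃ agārasmā anagāriyaṃ pabbajjāyā”ti."\n'
        'msgstr ""\n'
    )
    print(empty_segments(sample))
```

The list only shows where the translation is blank; whether a blank is intentional, as in the repetitions here, is still a judgment call.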

On a separate note, I don’t know if you were thinking about uploading this to Pootle straight away but Ajahn Brahmali has asked to receive the finished files so he can still make changes to them when needed and upload it afterwards.
Also, would it not be best if I upload it myself so I don’t have to post 311 files here?

Yeah, that’s fine.

As for uploading, that depends on how you and Brahmali want to work. I think the issue is that he needs to work offline. Still, that doesn’t mean you can’t upload them. Once the file is prepared, it’s easy to up/download. Then I can use it; if Brahmali is making further changes offline, never mind, just upload again when he’s ready.

And yes, you should do it yourself.