SuttaCentral

Testing readability of sutta translations


#1

How important is readability?

A simpler text will be more widely read and better comprehended. Here are two classic studies, quoted from Wikipedia.

In 1947, Donald Murphy of Wallace’s Farmer used a split-run edition to study the effects of making text easier to read. They found that reducing from the 9th to the 6th-grade level increased readership 43% for an article on ‘nylon’. There was a gain of 42,000 readers in a circulation of 275,000. He found a 60% increase in readership for an article on ‘corn’. He also found a better response from people under 35.

In 1948, Bernard Feld did a study of every item and ad in the Birmingham News of 20 November 1947. He divided the items into those above the 8th-grade level and those at the 8th grade or below. He chose the 8th-grade breakpoint because that was the average reading level of adult readers. An 8th-grade text “will reach about 50 percent of all American grown-ups”, he wrote. Among the wire-service stories, the lower group got two-thirds more readers, and among local stories, 75 percent more readers.

George Klare’s studies showed that “an easier style of writing may result in (a) greater and more complete immediate retention, (b) a greater amount read in a given time”.

A recent study of web writing for low-literacy users showed that “Lower-literacy users exhibit very different reading behaviors than higher-literacy users: they plow text rather than scan it, and they miss page elements due to a narrower field of view.” This is not a marginal problem: in the US, it is estimated that 30% of web users are of lower literacy. User testing showed that when text was simplified and optimized, readers were much more likely to complete their tasks successfully, they took much less time to do so, and their satisfaction was much higher. While these results were most significant among readers with lower literacy, even high-literacy readers showed major improvements in all three areas. The study concluded that usability doubled a website’s ability to achieve its goals. Another study confirms that highly educated readers strongly prefer simpler texts.

Readability is, of course, only one criterion. It is crucial that a translation be accurate, and that it employ good style. A well-written text has a certain flavor that is impossible to define. The purpose of these tests is not to dismiss or downplay the relevance of these factors, but simply to introduce another way of looking, one which, to my knowledge, has not previously been applied to Buddhist texts.

Testing Readability of Sutta Translations

I ran a range of tests comparing my own SuttaCentral translation with the same text translated by Bhikkhu Bodhi. Bhikkhu Bodhi’s translations were chosen as the standard of comparison due to their excellent style, consistency, and widespread acceptance as highly readable and accurate translations.

Test results consistently rate the Sujato translation as significantly more readable. On the whole it is about two grade levels simpler.

Below is a table with the results. It goes without saying that such tests are only a rough approximation. Nonetheless, they are widely used in fields where comprehension and readability are critical. They strongly correlate with comprehension.

The Anguttara Fours is the test text. This is the most recent of the Bodhi translations.

For the test, footnotes and the like, as well as numbers, were removed. The last texts were left out, as they trail off into repetition series. Each service gives slightly different results, due to different ways of counting words, and so on.
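One reason the services disagree is that "sentence", "word", and "syllable" all need a working definition before anything can be counted. The following is a minimal sketch of one possible set of counting rules. It is not how CheckText, Readability Score, or Read-Able actually count; the function names and the rules themselves are assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(text):
    """Runs of letters; an internal apostrophe or hyphen keeps the word whole."""
    return re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", text)


def count_syllables(word):
    """Heuristic: count vowel groups and drop a silent final 'e'."""
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    if w.endswith("e") and not w.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)  # every word has at least one syllable


def text_stats(text):
    """Return the three raw counts that most readability formulas rely on."""
    words = split_words(text)
    return {
        "sentences": len(split_sentences(text)),
        "words": len(words),
        "syllables": sum(count_syllables(w) for w in words),
    }


# Made-up sample text, purely for demonstration.
sample = "The body breaks up. Then what? Nobody knows!"
print(text_stats(sample))
```

A different but equally defensible rule, such as counting "well-known" as two words or treating a semicolon as a sentence break, shifts every score built on these counts, which is enough to explain small differences between sites.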

The statistical results show that the Sujato translation is simpler in virtually every metric, including word length, number of unique words, number of complex words, and unusual words (i.e. not found in the Dale-Chall word list). However, the greatest difference is in sentence length. The average sentence length in the Sujato text is a little over three-quarters that of the Bodhi text. Sentence length has been consistently found to be one of the most important aspects of readability, and hence is weighted highly in most of the grade tests.

Not only is the Sujato text more readable, it is about 83% of the length. This is because it tends, on the whole, to omit more repetitions, and also to use less wordy phrasing.
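The two ratios quoted above can be checked directly from the CheckText figures in the table below. This is just the arithmetic written out; the variable names are illustrative.

```python
# Word counts and average sentence lengths as reported by CheckText
# in the table that follows.
sujato_words, bodhi_words = 64994, 78282
sujato_sentence_len, bodhi_sentence_len = 15.3, 19.5

length_ratio = sujato_words / bodhi_words
sentence_ratio = sujato_sentence_len / bodhi_sentence_len

print(f"Overall length:  {length_ratio:.1%}")    # 83.0%
print(f"Sentence length: {sentence_ratio:.1%}")  # 78.5%
```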

| | Sujato | Bodhi | Comments |
| --- | --- | --- | --- |
| Tested on AN 4.1–4.248 | | | |
| At CheckText | | | |
| Reading Ease | | | |
| Flesch Reading Ease | 68.8 | 60.0 | Higher is easier to read. |
| Grade Levels | | | |
| Flesch-Kincaid Grade Level | 7.5 | 9.7 | Average US student grade level. For all grade levels, lower is easier. |
| Gunning Fog Index | 10.7 | 12.5 | |
| Coleman-Liau Index | 9.1 | 9.9 | |
| SMOG | 10.7 | 11.8 | |
| Automated Readability Index | 8.9 | 11.3 | |
| Average Grade Level | 9.4 | 11.0 | |
| Statistics | | | |
| Word Count | 64994 | 78282 | This is about 83% the number of words to translate the same texts. |
| Character Count | 313170 | 380730 | |
| Lexical Density | 47.1% | 43.8% | This measures the ratio of content words to grammatical words. Lower is easier. However under 50% is quite low already. I suspect Bodhi’s is lower because he tends to use less direct phrasing, eg. “with the breakup of the body” vs. “when the body breaks up”. |
| Unique Words | 3866 | 4070 | |
| Complex Words | 7450 | 9213 | |
| Syllable Count | 94127 | 117475 | |
| Sentence Count | 4242 | 4005 | |
| Characters Per Word | 4.8 | 4.9 | |
| Syllables per Word | 1.4 | 1.5 | |
| Average Sentence Length | 15.3 | 19.5 | |
| At Readability Score | | | |
| Readability Indices | | | |
| Flesch-Kincaid Reading Ease | 70 | 63.9 | |
| Flesch-Kincaid Grade Level | 6.9 | 8.8 | |
| Gunning-Fog Score | 9.8 | 11.8 | |
| Coleman-Liau Index | 11 | 11.3 | |
| SMOG Index | 7.6 | 8.7 | |
| Automated Readability Index | 7 | 9.2 | |
| Average Grade Level | 8.5 | 10 | |
| Text Statistics | | | |
| Character Count | 295626 | 361929 | |
| Syllable Count | 94126 | 116150 | |
| Word Count | 64890 | 78756 | |
| Sentence Count | 4678 | 4395 | |
| Characters per Word | 4.6 | 4.6 | |
| Syllables per Word | 1.5 | 1.5 | |
| Words per Sentence | 13.9 | 17.9 | |
| At Read-Able | | | |
| Readability Indices | | | |
| Flesch Kincaid Reading Ease | 67.9 | 61.8 | |
| Flesch Kincaid Grade Level | 7.6 | 9.5 | |
| Gunning Fog Score | 10.5 | 12.5 | |
| SMOG Index | 8 | 9.1 | |
| Coleman Liau Index | 11.3 | 10.9 | The same test on two other sites rated the Sujato text as more readable, so I suspect this is a flaw in the testing program. |
| Automated Readability Index | 7.8 | 9.9 | |
| Text Statistics | | | |
| No. of sentences | 4244 | 4015 | |
| No. of words | 64225 | 79706 | |
| No. of complex words | 7837 | 9777 | |
| Percent of complex words | 12.20% | 12.27% | |
| Average words per sentence | 15.13 | 19.85 | |
| Average syllables per word | 1.46 | 1.47 | |
| Tested on AN 4.1 and AN 4.2 only | | | |
| At Readability Formulas | | | |
| Number of words NOT found on Dale-Chall Word List | 76 | 106 | This site only tests up to 600 words. However I included it as it was the only one to do the Dale-Chall test, which is among the most sophisticated and reliable of all tests. |
| Percent of words NOT found on Dale-Chall Word List | 13% | 26% | |
| Final Score: | 6.5 | 8.4 | |
| New Dale-Chall Grade Level | 7–8 | 11–12 | |
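
For readers who want to see how such scores are derived from the raw counts, here is a minimal sketch using the commonly published forms of five of the formulas, fed with the CheckText statistics from the table above. Two things are assumptions: that CheckText uses these textbook coefficients, and that its "Complex Words" figure means words of three or more syllables. The class and function names are made up for illustration. Coleman-Liau is left out because it depends on a letter count, which the table does not report directly.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Counts:
    """Raw counts as listed in the CheckText 'Statistics' rows."""
    words: int
    sentences: int
    syllables: int
    characters: int
    complex_words: int  # assumed to mean words of three or more syllables


def flesch_reading_ease(c):
    return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words)


def flesch_kincaid_grade(c):
    return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59


def gunning_fog(c):
    return 0.4 * (c.words / c.sentences + 100 * c.complex_words / c.words)


def smog(c):
    return 1.043 * sqrt(c.complex_words * 30 / c.sentences) + 3.1291


def automated_readability_index(c):
    return 4.71 * (c.characters / c.words) + 0.5 * (c.words / c.sentences) - 21.43


sujato = Counts(words=64994, sentences=4242, syllables=94127,
                characters=313170, complex_words=7450)
bodhi = Counts(words=78282, sentences=4005, syllables=117475,
               characters=380730, complex_words=9213)

formulas = [flesch_reading_ease, flesch_kincaid_grade, gunning_fog,
            smog, automated_readability_index]

for name, counts in (("Sujato", sujato), ("Bodhi", bodhi)):
    scores = ", ".join(f"{f.__name__}={f(counts):.2f}" for f in formulas)
    print(f"{name}: {scores}")
```

With these inputs, all five scores come out within about 0.1 of the CheckText values in the table. Every formula except SMOG contains a words-per-sentence term, and SMOG itself divides by the sentence count, which is why the gap in sentence length shows up so strongly in the grade levels.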

#2

Well done bhante. I look forward to reading it and have a feeling it will serve as an excellent source for simpler and more accessible translations in other languages as well.
:anjal:


#3

Can these tests be applied to languages other than English (e.g. Indonesian)?


#4

Bhante, thank you for all the hard work you put in. My main concern is understanding the overall message of the teaching. Take, for example, the word bhava. Is it existence or life? Existence is vague, since it carries connotations of being momentary. Life is the period between birth and death. Or is bhava a hybrid of both?
Therefore, translation should facilitate understanding the teaching without ambiguity. That is what I expect.
Thanks again Bhante,
With Metta


#5

I am not sure. The tests themselves are fairly simple, based on word length, sentence length, and so on, so it would not be difficult to adapt them to other languages. I guess it’s likely this has been done, but I can’t confirm it.

Indeed, it is one of the most difficult terms to get right. Both “existence” and “life” capture something of the term, as does, in some cases, “rebirth”. But it is important that the meaning be clear in each context. I’ve done my best!

I agree, especially when it comes to the Dhamma. The Buddha was an extraordinarily clear teacher, and unlike many spiritual teachers he did not rely heavily on ambiguity or mystery. Occasionally, it is true, in verses or a riddle here and there, there are passages that are meant to be puzzling, and sorting out the meaning is half the fun. But in almost all cases, the meaning is explicit and clear, and should be translated as such.


#6

Does this mean that Ven. Bodhi’s translations are PhD level, whereas yours are merely at Master’s? :grinning:


#7

As usual, you manage to get right to the essential point.

Be careful, though, I might just run the same tests on your translations, and then what?


#8

My two baht is that this is such a significant achievement. Making these important and critically needed teachings of the Buddha not just available globally on the internet, but comfortable and understandable, is a significant accomplishment. I’ll be the first to admit that I have had the Nikayas in Ven. Bodhi’s book form gathering dust on my bookshelves, at times. Many years ago, I’d thumb through ATI, but often was left adrift. With Sutta Central, the Suttas came alive for me. The discussions on D&D illuminated the translations and enlivened the practice. Whatever grade level they’re set at is a good fit for me.

The Buddha feared that one day his Dhamma would fade away. I do feel that had he known of the project called Sutta Central, he might have changed his prediction a bit.


#9

I deeply enjoy and benefit from the use of the simplest language which can well convey the thoughts.

Once I ate dictionaries, adorned my speech with technical jargon, to seek and cling to status. When I realized I only encumbered myself and others, placing barriers and hoops between myself and those with whom I wanted to communicate, I consciously chose to adopt better habits. This took time, but was an excellent investment.

Sometimes only Pali will do. Realizing this, I anticipate some concentrated effort to come.


#10

Ajahn @brahmali, would this mean B. Bodhi’s translations are only suitable for “Permanent head Damage”??? And Bhante Sujato’s would be a bit too easy for MAMA Brahmali? (Ajahn Brahmali’s nickname from Ajahn Brahm.) :joy:
Thank you for the laughter to start the year of the Dog.:dog::dog2:
:anjal:


#11

Try writing for year 4-5 children; that’s a real test of one’s understanding of the dhamma.

I had to cut many translations right down to the bone, and then some. There’s a girl with speech and language delay who has to be accommodated, and I struggle; sometimes I have just used visual images.

Well done for the high score, though: quite a remarkable achievement! I think every generation needs its own wave of translators, quite possibly.

with metta


#12

I agree that readability is very important. My personal impression is that Bhikkhu Bodhi’s translations are much clearer than both the older PTS translations and the vast majority of contemporary translations. And, judging from the Sujato translations I’ve seen, such as MN10, those are even more readable.

I suspect that some of the reservations about these new translations are not really about clear and simple sentences. I don’t see how a clearer sentence can ever be a negative! However, there are subtleties that the previous translations highlighted via footnotes. Bhante Sujato has argued compellingly that he wants to create texts that can be read by newcomers without footnotes. Since the Bodhi/Nanamoli/Walshe translations are available with copious footnotes, and the on-line versions will allow rapid comparison with the Pali texts, this seems like a good call.

I understand that there will be short essays on key technical terms, which will cover most of these issues:

Apart from discussion about technical translation issues, one of the purposes of footnotes in the PTS/Wisdom translations is to point out where certain issues, or people, are discussed elsewhere in the Tipitaka. I presume that some of that will become almost automatic with the new on-line system, and concordances could be rapidly developed.


#13

Definitely. See, for example, these random search results: https://scholar.google.com.au/scholar?q=readability+formulas+indonesia&hl=en&as_sdt=0&as_vis=1&oi=scholart