Statistical analysis of Early Buddhist Texts

We haven’t heard back from Animitto, so I’ll just make some general remarks.

  • If anyone wants to do statistical analysis of Pali texts, use the suttacentral/bilara-data repository on GitHub, which holds the content for the Bilara translation webapp.
    • Segmented translations in English are also found there, as well as a growing collection of other languages.
  • For remaining texts, use the html_text directory on the html-clean5 branch of the suttacentral/sc-data repository on GitHub.
  • If you want more precise or specialized information than a regular search engine provides, clone the git repo and search it locally using Sublime Text or some other tool.
  • To export texts into a spreadsheet, use Bilara i/o.
  • The main SC search uses Elasticsearch, SC-Voice uses ripgrep, and our translation webapp Bilara uses ArangoDB. Each of these has advantages and disadvantages, so you may get somewhat different results for the same query.
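
As a starting point for the kind of analysis mentioned above, here is a minimal Python sketch that counts word frequencies over a local clone of bilara-data. It assumes each text file is a JSON object mapping segment IDs to strings, and it skips anything that does not fit that shape. The function names and the tokenizing rule (runs of Unicode letters, lowercased) are my own choices for illustration, not something defined by the repo, so adjust them to your needs.

```python
import json
import re
import unicodedata
from collections import Counter
from pathlib import Path

# A "word" here is any run of Unicode letters (assumption: good enough
# for romanized Pali once the text is normalized to NFC).
WORD_RE = re.compile(r"[^\W\d_]+")


def iter_segments(root):
    """Yield (segment_id, text) pairs from every *.json file under root.

    Assumes each file is a flat JSON object of segment ID -> string.
    Files that cannot be parsed, or that have another shape, are skipped.
    """
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue
        if not isinstance(data, dict):
            continue
        for seg_id, text in data.items():
            if isinstance(text, str):
                yield seg_id, text


def word_frequencies(root):
    """Return a Counter of lowercased words across all segments under root."""
    counts = Counter()
    for _seg_id, text in iter_segments(root):
        # NFC keeps letters with diacritics as single code points.
        normalized = unicodedata.normalize("NFC", text)
        counts.update(w.lower() for w in WORD_RE.findall(normalized))
    return counts
```

To use it, point `word_frequencies` at whichever directory of your clone holds the texts you care about, then call `.most_common(20)` on the result to see the top words. Because it reads the files directly, the counts will not necessarily match what any of the search engines above report.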