From our discussion:
- Some files in the name folder have the following structure:
{
"<uid>": "<empty_string>"
}
but this uid is then looked up in super-name.json, and if it is found there, we use the name from there.
Example:
jp-name.json:
{
"jp": ""
}
super-name.json:
{
...
"jp": "Jñānaprasthāna", // This will be used as name in the database.
...
}
The same behavior is provided for files with the following structure:
{
"pr": {}
}
// and
{
"mvs": {
"t1545.98": "阿毘達磨大毘婆沙論"
}
}
We can find these uids in super-name.json and use their names:
{
...
"pr": "Prajnaptiśāstra",
...
"mvs": "Mahāvibhāṣā Śāstra",
...
}
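The lookup described in this point could be sketched roughly like this. It is only a sketch, not the actual loader: the helper name resolve_name is made up, and the rule that a non-empty string in the name file is used directly is my assumption. Only the fallback to super-name.json for empty strings and nested objects comes from our discussion.

```python
def resolve_name(uid, name_data, super_names):
    """Return the name to store for uid (hypothetical helper)."""
    value = name_data.get(uid)
    # Assumption: a non-empty string in the name file is used as is.
    if isinstance(value, str) and value:
        return value
    # Empty string, nested object ({} or {...}) or missing key:
    # fall back to the entry in super-name.json, if there is one.
    return super_names.get(uid)


# Values copied from the examples above.
super_names = {
    "jp": "Jñānaprasthāna",
    "pr": "Prajnaptiśāstra",
    "mvs": "Mahāvibhāṣā Śāstra",
}

print(resolve_name("jp", {"jp": ""}, super_names))
print(resolve_name("pr", {"pr": {}}, super_names))
print(resolve_name("mvs", {"mvs": {"t1545.98": "阿毘達磨大毘婆沙論"}}, super_names))
```

If the uid is not in super-name.json either, this sketch returns None, so the caller would still have to decide what to do in that case.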
- What should we do with the other-group document, which is a child of the sutta document in the super-tree.json file? This document prevents the tree from being created in the database, because there is no such key in the documents collection, and the links between the documents are made exactly by the keys.
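To see whether other-group is the only entry like this, a check along these lines might help. It is a rough sketch under assumptions: I assume the tree file is a nesting of objects, lists and strings in which every object key and every string is a uid, and the names iter_tree_uids and missing_uids as well as the sample data are invented for illustration.

```python
def iter_tree_uids(node):
    """Yield every uid found in a nested tree of dicts, lists and strings."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, list):
        for child in node:
            yield from iter_tree_uids(child)
    elif isinstance(node, dict):
        for uid, children in node.items():
            yield uid
            yield from iter_tree_uids(children)


def missing_uids(tree, known_uids):
    """Return the tree uids that have no document of their own, sorted."""
    return sorted(set(iter_tree_uids(tree)) - set(known_uids))


# Made-up sample: "long" is only a placeholder child.
tree = [{"sutta": ["long", "other-group"]}]
known = {"sutta", "long"}

print(missing_uids(tree, known))  # ['other-group']
```

Here known would be the set of keys that actually exist in the documents collection.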
- The file super_extra_info.json has several documents with spaces in the uid field. I personally found two of them: a document with uid: sutta and a document with uid: dk. You can find them with a file search. It's not a huge problem right now, but it can cause us trouble in general, so it would be cool if someone fixed it.
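Instead of a manual file search, something like the following could list all affected records. This is a sketch under assumptions: I assume the file holds a list of records that each have a uid field, the function name is made up, and the sample values are invented (where exactly the space sits in the real data is a guess).

```python
def uids_with_whitespace(records):
    """Return the uid values that contain any whitespace character."""
    return [
        record["uid"]
        for record in records
        if isinstance(record.get("uid"), str)
        and any(ch.isspace() for ch in record["uid"])
    ]


# Invented sample records; the trailing space is only an illustration.
records = [{"uid": "sutta "}, {"uid": "dk "}, {"uid": "vinaya"}]

print(uids_with_whitespace(records))  # ['sutta ', 'dk ']
```

With the real file, records would come from json.load, provided the structure matches the assumption above.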
- The logic of setting the fields acronym, volpage and biblio_uid
From what we have discussed, the following comes out:
To add these fields to the document we use the files super_extra_info.json and text_extra_info.json. First, in super_extra_info.json we look for a record where the value of the uid field is the same as in the currently processed document. If such a record is found, we take the required values from it. If there is no such record in super_extra_info.json, then we look for the necessary information in text_extra_info.json (the search logic is the same).
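As I understand it, the two-step search could look roughly like this. It is a minimal sketch, not the real implementation: I assume both files are lists of records with a uid field, the name find_extra_info is hypothetical, and the sample records are invented.

```python
FIELDS = ("acronym", "volpage", "biblio_uid")


def find_extra_info(uid, super_extra_info, text_extra_info):
    """Look uid up in super_extra_info first, then in text_extra_info."""
    for source in (super_extra_info, text_extra_info):
        for record in source:
            if record.get("uid") == uid:
                # Take only the three fields we need; absent ones become None.
                return {field: record.get(field) for field in FIELDS}
    return {}


# Invented sample data, only to show the order of the search.
super_extra = [{"uid": "x1", "acronym": "X 1"}]
text_extra = [
    {"uid": "x1", "acronym": "ignored"},
    {"uid": "x2", "acronym": "X 2", "volpage": "v. 1"},
]

print(find_extra_info("x1", super_extra, text_extra))
print(find_extra_info("x2", super_extra, text_extra))
```

Note that in this sketch a record found in super_extra_info.json wins completely: fields it lacks are not filled in from text_extra_info.json. Whether that is what we want is worth confirming.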
- language and root_lang fields
It would be a good idea to repeat what the purpose of each of these fields is and which of them we use in the new data collection.
Currently we use the root_lang field. The value for this field is taken from language.json.
What about the language field?
- Last but not least, what should we do with the queries that will break after the structure of documents in our data collection changes? Right now everything is tied to the root and root_edges collections (the tree links are defined there). The root_names collection is also occasionally used, but at this point it is not clear what for.
What I want to say is that as soon as we add an updated data collection based on the names files, build the connections from the tree files, and remove the root and root_edges collections, we will also have to adjust all the queries in the database that are currently designed for root and root_edges (and some for root_names).