Ah, very nice! That’s a numbers game of realizing that the total number of characters used in the texts is really not that much.
Okay, I was playing around with Noto Sans a bit, and I found that by setting the font face (TC or SC) and setting the language tags, glyphs are rendered differently.
The following is with “Noto Sans CJK TC”, for different language tags. The form with the broken radical at the top is the form used in Taiwan or Hong Kong.
When “Noto Sans CJK SC” is specified instead, the results are quite different:
Apparently, even when only the TC or the SC font is specified, and not the other at all, the language tags still influence which forms are selected. For example, in CSS I have “Noto Sans CJK SC” followed by “Unifont”, and it still picks out the Taiwan variant glyphs for “zh-tw”.
Basically, using TC will cause “lzh” text to have Taiwan variants, while using SC will cause “lzh” text to have Mainland variants (although note that all are traditional characters using the same Unicode code points).
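For anyone who wants to try the same thing, here is a rough sketch of the kind of CSS setup described above. The selectors and the “lzh” override rule are my own illustration, not the exact stylesheet I used, and the result depends on the browser honoring the language tag and on the installed font build carrying the regional glyph variants.

```css
/* Base stack as described above: one Noto CJK face, then Unifont as a fallback.
   The trailing generic sans-serif is an assumption added for safety. */
body {
  font-family: "Noto Sans CJK SC", "Unifont", sans-serif;
}

/* Hypothetical override: give Classical Chinese ("lzh") text the TC face,
   so it picks up Taiwan-style forms instead of Mainland ones. */
:lang(lzh) {
  font-family: "Noto Sans CJK TC", "Unifont", sans-serif;
}

/* Text tagged in the markup, e.g. <span lang="zh-TW">, needs no rule here:
   the language tag alone may be enough for the font to choose regional forms. */
```

Swapping the two family names in the rules above should flip which variants untagged and “lzh” text fall back to, which matches what I saw.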
These sorts of things are also potentially sensitive, because they are regional variants, and there are always some politics around Chinese characters in East Asia and the extent to which they should be unified (with “unification” usually revolving around the Chinese language and mainland China).
Neither variant is more “correct” than the other. After comparing a few characters and looking at the many variants used between countries, and in the Kangxi Dictionary, it’s apparent that there was always a bit of variation, and that no modern forms are exactly the same as the Kangxi forms.
In terms of numbers and demographics, the mainland has some 40 times the number of people in Taiwan and Hong Kong combined. This inevitably means that most readers around the world are more familiar with the SC forms, and most Chinese books will also use SC forms. Additionally, Chinese (“zh”) text on the Web seems to render in SC forms by default, including everything on CBETA. So my tendency is to use the SC forms (and also because they look more familiar to my eyes).
Nevertheless, Taiwan and Hong Kong have long histories of using the traditional characters, and still use them to this day for everything. Their forms are no less correct, and so it mostly comes down to personal preference.
Finally, here are a few dictionary entries that show different forms. In order, each shows the form for (1) mainland China, (2) Taiwan, (3) Hong Kong, (4) Japan, (5) Korea, and (6) the Kangxi Dictionary.
In general, it seems like mainland China, Japan, and Korea often use similar forms, while Hong Kong and Taiwan often agree about slightly different ones.