Optimizing SC for China

Continuing the discussion from Fonts for Buddhism:

I just tested SC at dotcom-monitor which gives results from multiple countries. China is way out there, 40-50 seconds to load. On inspection it seems it is blocking resources from google domains, even benign things like jQuery or the font loader.

Update: I just tested staging on dotcom-monitor, and the difference is astonishing. Basically staging loads in China in about 2 seconds, pretty much the same as anywhere else. The main site takes nearly 50 seconds. The difference is entirely with one file: jQuery. For some reason on the main site this is served from ajax.googleapis.com, whereas on staging it’s served from Cloudflare. Regardless of any other changes, we should fix this right away.

@blake, this seems like a serious issue that will majorly impact our users in mainland China. Can you check it out? It seems we should probably avoid serving anything from a google domain.

This is a well-known issue. The consensus is that it’s best to either provide a fallback to a local copy, or else just use a different CDN, such as MaxCDN.
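
For the fallback option, here is a minimal sketch of what the page markup could look like, generated with a small Python helper. The helper name `script_with_fallback`, the local path `/js/jquery.min.js`, and the jQuery version in the CDN URL are assumptions for illustration only, not what SC actually uses.

```python
from html import escape


def script_with_fallback(cdn_url, local_url, global_name):
    """Return HTML that loads a library from a CDN, then falls back to a local copy.

    The inline check runs after the CDN <script> has loaded or failed; if the
    library's global (e.g. ``jQuery``) is still undefined, it writes a second
    <script> tag pointing at the local copy.
    """
    cdn_tag = '<script src="{}"></script>'.format(escape(cdn_url, quote=True))
    # "<\/script>" keeps the HTML parser from ending the inline script early.
    fallback_tag = (
        "<script>window.{name} || "
        "document.write('<script src=\"{local}\"><\\/script>')</script>"
    ).format(name=global_name, local=escape(local_url, quote=True))
    return cdn_tag + "\n" + fallback_tag


# Hypothetical URLs, for illustration only.
print(script_with_fallback(
    "https://cdnjs.cloudflare.com/ajax/libs/jquery/1.12.4/jquery.min.js",
    "/js/jquery.min.js",
    "jQuery",
))
# <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.12.4/jquery.min.js"></script>
# <script>window.jQuery || document.write('<script src="/js/jquery.min.js"><\/script>')</script>
```

One caveat, as far as I understand it: the fallback only fires once the CDN request has actually failed. If a blocked connection hangs instead of failing quickly, the page still waits for it, so switching to a CDN that is reachable from China is probably the more robust fix.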

It seems to me that, regardless of the technical merits of one case or the other, the likelihood is that governments like China, and probably others, will do sporadic, unpredictable, and semi-rational stuff like this, so future-proofing by staying away from google’s servers seems like a prudent policy.
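
If we do adopt that policy, it could be checked automatically. Below is a rough sketch that scans a page’s HTML for resources loaded from Google-operated hosts. The suffix list is my own assumption and certainly incomplete, and the function names are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: an illustrative, incomplete list of Google-operated domains.
GOOGLE_SUFFIXES = (
    "googleapis.com",
    "gstatic.com",
    "google.com",
    "google-analytics.com",
)


class ResourceCollector(HTMLParser):
    """Collect src/href URLs from tags that load external resources."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "link", "img", "iframe"):
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.urls.append(value)


def is_google_host(host):
    """True if host equals, or is a subdomain of, one of the listed domains."""
    host = host.lower()
    return any(host == s or host.endswith("." + s) for s in GOOGLE_SUFFIXES)


def google_resources(html):
    """Return the resource URLs in `html` that point at a listed Google host."""
    collector = ResourceCollector()
    collector.feed(html)
    hits = []
    for url in collector.urls:
        host = urlparse(url).hostname  # None for relative URLs
        if host and is_google_host(host):
            hits.append(url)
    return hits
```

Calling `google_resources(page_html)` on each template’s rendered output would list anything still coming from those hosts, including protocol-relative URLs such as `//fonts.googleapis.com/...`. It only sees what is in the static HTML, so anything injected later by JavaScript would be missed.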

Curiously enough, Google Analytics is the only Google service that’s not blocked (or is it? see below). Why they see fit to block fonts but not a service whose whole point is to track usage, I cannot say. In any case, as I’ve said before, I’d rather get rid of Google Analytics anyway.

Note that this applies to maps, too, so our lovely map of early Buddhism will be borked in China. Since it’s not an essential feature, there’s probably no need to try to fix it, but it would be nice if at least it didn’t block the page.

Meanwhile, this article gives some nice tips for the web in China.

  1. Using a .cn domain is a good way to get better rankings on Baidu.
  2. Commenters have had mixed experience with Google Analytics, but at least some people have problems some of the time.
  3. Some people say using maps.google.cn works.
  4. According to one commenter: “forget about cloudflare. The great fire wall hates cloudflare. It’s a sure-fire way to slow down your website 10x.” However, Cloudflare is aware of the issues and has partnered with Baidu to provide China services: https://www.cloudflare.com/china/ Also see above, where the cloudflare jquery works fine.
  5. Similar problems exist elsewhere; Iran is often cited.

On staging I’ve set it to pull jQuery from cdnjs, which is Cloudflare’s CDN. Since the rest of our site is already delivered by Cloudflare and the load time is only as good as the weakest link it makes sense.

Since you mention Cloudflare, a quick google search indicates that Cloudflare has established limited operations in China as of 2015 (that article is from 2014). Since that’s a special Chinese-operated Cloudflare I don’t know how it works out for the global Cloudflare, but at least Cloudflare takes availability in China seriously.

Apparently HTTPS is generally a big negative for availability under privacy-unfriendly regimes, because it’s easy for them to adopt a policy of simply blocking anything that can’t be censored (or of whitelisting only specific HTTPS domains). I don’t think it’s so bad now in 2016, but in any case we seem to be committed to HTTPS, so it’s tough cookies for anyone who does live behind an HTTPS-blocking firewall.

See https://cpj.org/blog/2015/04/when-it-comes-to-great-firewall-attacks-https-is-g.php

And for giggles: https://www.theguardian.com/technology/2016/apr/06/great-firewall-of-china-blocked-fang-binxing

Indeed. Using HTTPS exclusively can be viewed as a kind of activism against censorship: the more sites that use HTTPS only, the more pressure there is not to block HTTPS.

Yes, indeed. The Wikipedia page on this mentions that some religious sites have been censored in China. So this is a possibility, although those are very likely Falun Gong or other politically sensitive material.

The GFW has changed over the years, and some years ago it would block more pages based on scanning page content, looking for certain strings. Then it would not allow you to connect back to the domain at all for about ten minutes. One day I was unexpectedly kicked off Wikipedia from a page about Buddhism. Then after experimenting, I found that “Maitreya” had been blocked, probably because of this long history of cults and violent uprisings. One of these even took down the Mongol Yuan dynasty.

These days, most Buddhism-related sites should be okay, but some of those related to a certain religious leader living in exile may be blocked. In general, the “old guard” in the Party dislikes all forms of religion, but as a whole, they have accepted that it is just something that exists in society, so they just regulate it. There is a government bureau that authorizes a small number of major religious traditions in the country and licenses all religious leaders and sites. Chinese Buddhism is the religion that has the best relationship with the government.

@blake, I just tested SC for China again, here:

http://www.websitepulse.com/help/testtools.china-test.html

With the following results.

This is not good. In fact, I’m going to go right out on a limb and say it is bad. But hey, maybe there’s no-one in China interested in Buddhism, right? Right?

I always liked WebPageTest.org for the nice data visualization. Here’s a waterfall view of a similar test:
https://www.webpagetest.org/result/170106_Z8_W9N/1/details/#waterfall_view_step1

Thanks, well, that’s even more depressing!

My guess would be they are censoring religious content.

I would think though that many Chinese, especially those interested in non-state-sanctioned religion, would know they should use a vpn/proxy to freely access such material.

I have no idea how the Great Firewall works. Maybe try a test with a .cn domain, put up some content, and see how long it takes to make the blacklist? They might additionally crawl for keywords, and if so, I would think it’s a lost cause from the content provider side.

It’s hard to say. The problem is that the GFW is so vast and complicated, and things change all the time. It is quite normal for sites to be very slow, just because of the re-routing of the data. I suspect they just want to make non-Chinese sites sluggish. But we have no evidence that anything has actually been censored.

I wonder if we should get a specialist developer inside China, who is familiar with the issues and can ensure we work as well as possible. @llt, @blake, @vimala, any thoughts on this?

If SuttaCentral were being blocked for religious reasons, then it would likely just result in an HTTP 404 error, and it would be completely inaccessible. But since it is accessible, and just slow, you probably haven’t made anyone angry. My guess is that the site is just using some CDN type hosting service that is being blindly throttled.

If it’s a hosting problem, as in they throttle all Cloudflare DNS/CDN websites that aren’t top-100 and government-approved, since they can’t see the origin server (Cloudflare acts as a reverse proxy), then I would suggest buying a cheap native Chinese VPS for testing.

Something like this:
https://www.qingcloud.com/pricing/plan

Other completely speculative possibilities are that they make ISPs block or throttle based on keywords at the domain level, for instance a domain that includes a word like “sutta”. That would be pretty easy to do, and we already know they do more sophisticated snooping:

The Chinese government uses Deep Packet Inspection to monitor and censor network traffic and content that it claims is harmful to Chinese citizens or state interests. This material includes pornography, information on religion, and political dissent.[24] Chinese network ISPs use DPI to see if there is any sensitive keyword going through their network. If so, the connection will be cut.

-from the DPI wiki

As SCMatt mentions, we use Cloudflare. They have developed a China service:

@blake, have we activated this? If not, please do so!

Thanks, but it’s not necessary. For testing I use http://www.vpngate.net/en/, which is awesome. It’s operated by Tsukuba University in Japan as an “academic experiment”, and offers free VPN via servers around the world.
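
With a VPN exit like that, the per-resource comparison can also be scripted rather than read off a testing site. Here is a rough sketch, assuming it is run from a machine whose traffic actually exits inside China; otherwise the numbers say nothing about the firewall. The function names and the 60-second timeout are arbitrary choices for this example.

```python
import time
import urllib.request


def fetch_bytes(url, timeout):
    """Download a URL and return the number of bytes read."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())


def time_resources(urls, fetch=fetch_bytes, timeout=60.0, clock=time.perf_counter):
    """Return (url, seconds) pairs, slowest first.

    A fetch that fails or times out gets seconds=None and sorts to the top.
    `fetch` and `clock` are injectable so the function can be tested offline.
    """
    results = []
    for url in urls:
        start = clock()
        try:
            fetch(url, timeout)
            elapsed = clock() - start
        except OSError:  # URLError and socket timeouts are OSError subclasses
            elapsed = None
        results.append((url, elapsed))
    # Treat failures as infinitely slow so they appear first.
    results.sort(key=lambda r: float("inf") if r[1] is None else r[1], reverse=True)
    return results
```

Feeding it the script, stylesheet, and font URLs from a page should make a single slow or blocked file stand out at the top of the list.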

It looks like an enterprise-only feature. Reading a little about the ICP license, I can understand why.

It doesn’t seem like a problem, it’s just a registration number issued by the Chinese govt; they are for both commercial and non-commercial sites. Cloudflare offers assistance with the application, they can tell us what kind is appropriate for a non-profit. For non-commercial what we need is:

  • ICP application form
  • Copy of personal ID
  • Forms to authenticate website information
  • Copy of domain certificate

Can you go ahead with the application? You can use my ID if you like.

We should also take advantage of the SSL configuration.

No, it’s not simple. The Chinese partner for Cloudflare just does the hard part, and also owns the licence and acts as the permanent contact with the Chinese government. The fact that this service is only available for enterprise customers, and not even business customers (check the plans page), indicates there’s probably significant rigmarole involved for Cloudflare.

One of the most important things, which is somewhat glossed over in the Cloudflare page (though hinted at when it starts by telling you to register a new domain with Cloudflare), is that the domain must have no content during the application process (presumably because it’s illegal to serve content from China without the licence). So this means you would create a new domain and get the ICP license to serve content on that domain from a Chinese host.

This looks like something that is for organizations which are serious about having a strong presence in China; it’s certainly not a simple optimization to improve performance in China. We can think about it one day (at least once we have full Chinese localization), but now is not the right time.

Okay, I didn’t realize that.

So fine, but that leaves us with the same problem: we are unacceptably slow in China.

Solutions?

@SCMatt suggested setting up a local VPS (and my apologies to him, I misread this as VPN).

But it seems to me the way to go would be to get in touch with a local developer and get them to look at our use case and work out a solution. Otherwise we’re just messing around in a complex area we know little about. Let me see if I can find someone.

As far as I can tell, a local mirror would be a good solution; you’d just need someone to handle the ICP side of things. By the way, I’d favor a mirror over a proxy, that is, running an entire new instance of the server instead of proxying requests to the main server. That way the traffic doesn’t cross the Great Firewall (or half the world) at all, ever.

Okay, well I’ll see if I can find someone on the inside. From everything I hear, it’s extremely complex and fragmented, different in different cities, and ever-changing. Ideally we would have a long-term assistant to keep tabs on things and ensure that it still works.