User login for suttacentral?

blake · February 23, 2016, 2:50pm

Since we are in the planning stage for the third major iteration of SuttaCentral, I think it’s time to consider the ability to log in.

Completely optional. The site is as functionally complete as possible without logging in.
Log in via any OpenID provider (such as Google), and perhaps Facebook as well. This means an account is linked, and the SuttaCentral server stores no passwords and does not perform authentication itself.

What kind of advantages and possibilities do I foresee?

Persistent site-wide preferences, such as localization language
Persistent text view preferences, such as whether textual information is shown and the lookup is enabled.
The ability to save custom lookup definitions or comments, and perhaps use those from others.
Maintaining a history, such as a reading history (I remember reading this passage… but what sutta was it again?)
The ability to like, favorite or bookmark texts and perhaps passages, and maybe have a “read later” list or create “read lists” which can be shared with others.
The ability to add annotations to texts (and perhaps have these visible to others, à la Medium)
All of the above can work cross-device and cross-browser

Thoughts?

sujato · February 24, 2016, 4:15am

If the benefits are genuine, sure.

I’d prefer to completely avoid mass-surveillance advertising providers like Google and Facebook. Originally I planned to enable them for Discourse, but I changed my mind, and we seem to be good, no complaints. It creates a small entry barrier, which I think is a good thing.

Could we hijack the same login as Discourse? It seems clumsy to require two separate logins. And it would be a way of raising the profile of Discourse.

Vimala · February 24, 2016, 6:40am

Sounds good. Agree with the point on using the same login as Discourse if possible.

Would it be good to first finish the localized pages, i.e. start translating static pages (maybe with Pootle), fix the Elasticsearch problem with languages (issue #134), add the choice of languages on the homepage, etc.?

sujato · February 24, 2016, 7:37am

For sure. I assume Blake was asking in terms of thinking about the overall architecture of the changes we’re working on.

mikenz66 · February 24, 2016, 10:15am

Sounds great if it allows customisation. Agree that it would be convenient to have the Discourse login.

felipe · February 25, 2016, 1:03pm

I agree with all the benefits listed and I think they would be nice to have.

I also agree with avoiding third-party login service providers; I avoid them myself all the time.

Having the same login system as Discourse would definitely be preferable. I think it would be tricky, since Discourse runs on Rails (I think) and SC on something else, but still, it sounds like a good thing to do.

Best regards and thank you @blake for the efforts.

sujato · February 26, 2016, 2:44am

Thanks for the support. You’re quite right about the different backend, but I think this is an issue that many Discourse users will face, so I’m hoping that there’s an API for it. Anyway, that’s Blake’s problem!

blake · February 29, 2016, 10:37am

There is an understandable bias here: Discourse users will naturally be happy with Discourse login :).

I’m leery of Facebook, but it is the most widely used social media app in the world. The OpenID protocol (which Facebook doesn’t implement) is itself neutral; Google is just the most widely used OpenID provider. You can get an OpenID from a security-paranoid provider, or if you want to go to extremes you can run your own OpenID provider on your own website. For example, if I were feeling bored and wanted to set it up, I could use SuttaCentral as my OpenID provider, have the SuttaCentral server validate me using the OpenID protocol, and then use that OpenID on any site supporting OpenID. So it’s actually a wonderful thing for the truly paranoid, because technically you don’t need to sign up anywhere at all. The very basis of OpenID is that authentication is offloaded onto a site you do trust, and you can then use that OpenID on sites you don’t trust to manage authentication (because, e.g., you doubt their competence).

I did think of using Discourse login - for the “advanced use” purposes, particularly annotations, it is a good solution. It can also be done fairly easily, as Discourse can be configured as a “Single Sign On” provider.
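
To give a feel for the handshake, here is a minimal sketch of the SuttaCentral side, assuming the scheme I understand Discourse to use when it acts as the provider: a base64-encoded query string carrying a nonce and a return URL, signed with HMAC-SHA256 using a shared secret. The secret, both URLs, the endpoint path and the returned field names are assumptions to check against the Discourse documentation, not something taken from our setup.

```python
import base64
import hashlib
import hmac
import secrets
from urllib.parse import parse_qs, urlencode

# Hypothetical values: the secret is whatever is configured on the Discourse
# side, and both URLs are placeholders.
SSO_SECRET = b"change-me"
DISCOURSE_URL = "https://discourse.example.org"
RETURN_URL = "https://example.org/sso/callback"


def new_nonce() -> str:
    """Random one-time value, stored in the user's session until the callback."""
    return secrets.token_hex(16)


def sign(payload_b64: str, secret: bytes = SSO_SECRET) -> str:
    """Hex HMAC-SHA256 of the base64 payload."""
    return hmac.new(secret, payload_b64.encode("ascii"), hashlib.sha256).hexdigest()


def build_login_redirect(nonce: str) -> str:
    """URL that sends the browser to Discourse to authenticate."""
    raw = urlencode({"nonce": nonce, "return_sso_url": RETURN_URL})
    payload_b64 = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    query = urlencode({"sso": payload_b64, "sig": sign(payload_b64)})
    # Assumed endpoint path; verify against the Discourse docs.
    return f"{DISCOURSE_URL}/session/sso_provider?{query}"


def verify_response(sso: str, sig: str, expected_nonce: str) -> dict:
    """Check the signature and nonce on the callback, then return the user fields."""
    if not hmac.compare_digest(sign(sso), sig):
        raise ValueError("bad signature")
    decoded = base64.b64decode(sso).decode("utf-8")
    fields = {key: values[0] for key, values in parse_qs(decoded).items()}
    if fields.get("nonce") != expected_nonce:
        raise ValueError("nonce mismatch")
    return fields
```

The point of the design is that no password ever reaches the SuttaCentral server: it only checks a signature and reads back whatever profile fields Discourse chooses to send.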

It does provide a significant barrier to entry for anyone who doesn’t have or doesn’t want a discourse.suttacentral account, which is fine if it’s used for the more interactive possibilities like annotations or pali lookup custom definitions, but for simple things like being able to set a custom locale a barrier to entry isn’t good. Which doesn’t mean it has to be a problem - it just informs aspects of the site design. I think we should probably aspire to model the site after wikipedia.org - where user login is as optional as humanly possible and login is mainly used by those who want to be more interactive.

sujato · February 29, 2016, 11:14am

So let me get this straight. If I log in to a site using my Google ID, this is in fact just using OpenID, which happens to be shared with the ID I use on Google? Google knows nothing about it? Or is it sharing information, at least potentially, with Google, about who logs in when and where?

If Google truly knows nothing about it, then have at it.

But TBH, I really think a minimal barrier, entering an email and password to set up an account, is fine, in fact I think it’s better. We’re not a news site or fashion blog or youtube, we don’t want any old flyby crap. If someone can’t take a minute to set up an account, I don’t think they’re interested enough to really contribute.

As far as doing more interactive stuff, say annotations, we might want a more granular approach. But on the face of it, I’m not convinced. Annotations would have similar feedback mechanisms and so on as Discourse does, so we can always discourage, edit, or ban annotations as appropriate.

Pootle is, of course, a different case, and there’s no need to integrate that.

blake · February 29, 2016, 11:49am

Basically, with OpenID you have to trust your OpenID provider. If you log into, say, Stack Overflow using your Google account, then Stack Overflow doesn’t authenticate you. Instead it asks Google to authenticate you and redirects you to the OpenID URL on Google. If you successfully authenticate with Google (maybe enter your password, or just hit confirm), then Google responds to Stack Overflow that you are who you claim to be, and you can then use your Google identity on Stack Overflow (or at least use it to log in). The point is that Stack Overflow doesn’t need to store your password; it is offloading authentication.
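
As a toy model of that division of labour (not the real OpenID wire protocol), the sketch below has an identity provider that alone checks the password and hands back a signed assertion, and a relying site that only verifies the signature. The class names, the shared-key setup and the message format are all made up for illustration, and the plain SHA-256 password hash is a stand-in for proper password storage.

```python
import hashlib
import hmac
import secrets


class IdentityProvider:
    """Stands in for the provider (e.g. Google): the only party that sees passwords."""

    def __init__(self):
        self._password_hashes = {}
        self._shared_keys = {}
        self.login_log = []  # the provider learns who logged in to which site

    def register(self, user, password):
        # Toy hashing only; a real provider would use a salted, slow hash.
        self._password_hashes[user] = hashlib.sha256(password.encode()).hexdigest()

    def associate(self, site):
        """Agree on a signing key with a relying site (hypothetical setup step)."""
        key = secrets.token_bytes(32)
        self._shared_keys[site] = key
        return key

    def authenticate(self, site, user, password, nonce):
        """Return a signed assertion if the password is right, else None."""
        hashed = hashlib.sha256(password.encode()).hexdigest()
        if not hmac.compare_digest(self._password_hashes.get(user, ""), hashed):
            return None
        self.login_log.append((user, site))
        message = f"{user}|{site}|{nonce}".encode()
        return hmac.new(self._shared_keys[site], message, hashlib.sha256).hexdigest()


class RelyingParty:
    """Stands in for the site being logged in to: it stores no passwords."""

    def __init__(self, name, provider):
        self.name = name
        self._key = provider.associate(name)

    def new_nonce(self):
        return secrets.token_hex(16)

    def accept(self, user, nonce, assertion):
        """Trust the login only if the provider's signature checks out."""
        if assertion is None:
            return False
        message = f"{user}|{self.name}|{nonce}".encode()
        expected = hmac.new(self._key, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, assertion)
```

Note that `login_log` fills up on the provider side, which is the trade-off under discussion: the site never sees a password, but the provider does see which site requested the login.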

Naturally Google knows it is authenticating you on behalf of Stack Overflow, but I think once it has performed the authentication step, its responsibility for, and knowledge of, your actions on Stack Overflow end. From the wiki:

The Identity Provider does, however, get a log of your OpenID logins; they know when you logged into what website

My understanding is that Facebook abandoned OpenID a few years ago because Facebook wanted to have a higher degree of integration with the other website, but the newer versions of OpenID are built with consensual data sharing in mind, so there is probably no real difference now. Although it bears mentioning - as a website designer it’s up to you how much information you send back to Facebook or another identity provider. Merely authenticating with a third party doesn’t let them “pull” anything from your site unless you additionally run code provided by them (for example, the Google Analytics code could do anything it liked, including knowing when a Google user is logged into the site and what they’re doing - it might not do that for privacy reasons, but it could in principle).

felipe · February 29, 2016, 4:58pm

I think this is the correct approach and it is supported by this:

Some sites just provide as many login methods as they can in order to have more and more users, but that is not the case here; the people who arrive here and want to be a part of it will take the time to do it. Some might argue that some people may have trouble setting up a new account, but then again, if they have one with Google or Facebook, then they are capable of going through this process.

One more thing: I have no Google or Facebook account, so in my case I would first be forced to get one of those!

sujato · February 29, 2016, 11:05pm

Look, I’d really rather just avoid the issue. If we can provide authentication on Discourse independent of any of the big corporations, I’d be much happier. I don’t think it’s paranoid to think that if Google has access to info, they will take it, and if they take it, it will be bought and sold and eventually leaked.

And I am conscious that just to visit SC is, in several countries of the world, a capital offence, or at least could count as evidence towards such a charge.

To be honest, I’ve also grown less comfortable with using Google Analytics. How about we get rid of it and just use New Relic?

sujato · February 29, 2016, 11:36pm

Don’t worry, we’re not eliminating Discourse login. This is about logging in to SuttaCentral itself, for various things that are in the pipeline.

felipe · March 1, 2016, 12:16am

I use Piwik, which does not seem to provide the same amount of information that New Relic does, and I am sure it is also far from Google Analytics, but it works for my needs and it is open source.

sujato · March 1, 2016, 12:53am

You know, it’s kind of crazy, but I never even thought of this. Obviously we should be using open source analytics. Piwik looks great! We’ll definitely put this on the to-do list.

blake · March 1, 2016, 10:25am

Unfortunately there’s not too much we can do about that. Whenever a packet, even an encrypted one, travels across the internet it needs a destination, and it needs to tell every router along the way its destination. So the packet is like “hi-ho, hi-ho, it’s off to suttacentral.net I go”. With encrypted packets it’s not possible to see exactly which URLs are being visited, only the domain, but you can tell how long a session is active for and (approximately) how much data is transferred.
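
A small sketch of that split, assuming a direct HTTPS connection with no proxy in between: the host and port are what someone watching the wire can learn, while the path and query travel inside the encrypted stream. The function name and the example URL path are made up for illustration.

```python
from urllib.parse import urlsplit


def observer_view(url):
    """Split a URL into what an on-path observer can and cannot see."""
    parts = urlsplit(url)
    encrypted = parts.scheme == "https"
    # The destination is always visible: packets have to be routed somewhere.
    visible = {"host": parts.hostname, "port": parts.port or (443 if encrypted else 80)}
    request = {"path": parts.path, "query": parts.query}
    if encrypted:
        return {"visible": visible, "hidden": request}
    # Without encryption the full request is readable too.
    visible.update(request)
    return {"visible": visible, "hidden": {}}


# Hypothetical page address, used only to show the split.
view = observer_view("https://suttacentral.net/some/text?lang=en")
print(view["visible"])
print(view["hidden"])
```

Session length and rough data volume are not modelled here, but as noted they remain observable either way.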

Users can use a proxy server to get around that. It is then the proxy server which establishes a connection with suttacentral.net, so a surveillance agency can only know that the user is connecting to a proxy server.

I’m not sure exactly how it works with Cloudflare, as it is a transparent proxy, but I think it works like this. First there is a DNS lookup for suttacentral.net (anyone can see this and we can’t get around that, but a user can use a proxy server), which resolves to a Cloudflare server. Future traffic is then encrypted and sent to/from that Cloudflare server, and surveillance at a router level would not be able to distinguish between traffic to SuttaCentral and traffic to other sites using Cloudflare (so at best there would be suggestive patterns, possibly strongly suggestive, but nothing as concrete as if Cloudflare were not in the loop). My understanding could however be wrong.

Cloudflare is a double-edged sword since obviously Cloudflare can be compromised, but I think it’s probably a net positive in terms of security.

sujato · March 1, 2016, 11:33am

Sure, we can’t solve the problem completely, but we can avoid adding new sources for leaks.

Of course, we should have a .onion mirror, but that’s another story …