When we talk about authentication for REST APIs, almost everyone tends to think of oauth1 or oauth2 and the variants defined by service providers. It’s true that there are also other auth systems such as token, openid, etc., but they are not as widely used as oauth.
What do you think about them? Are they all truly stateless?
How does authentication work?
Before I start, I’d like to explain how standard (http session) authentication works and compare it to oauth2.
Basic steps on how standard http session based authentication works:
- Client sends its credentials to server.
- Server authenticates them and generates a fixed-length token.
- Server stores the generated token in some storage together with the user identifier.
- Server sends the generated token to the client in a new cookie, “sessionid=here-the-random-token”.
- Now, the client sends that token in each request using the Cookie header.
- On each request, the server extracts the session id token from the incoming cookie, looks up the user identifier in its key/value storage and queries the database to obtain the user information.
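The steps above can be sketched in a few lines of Python. This is only an illustration: the dictionaries `SESSIONS`, `USERS` and `CREDENTIALS` are hypothetical in-memory stand-ins for the token storage, the user database and the credential check (a real server would never keep plain-text passwords). The point to notice is that `SESSIONS` is state that lives on the server.

```python
import secrets

# Hypothetical in-memory stand-ins, for illustration only.
SESSIONS = {}                            # session token -> user id (the server-side state)
USERS = {1: {"name": "alice"}}           # user id -> user record ("the database")
CREDENTIALS = {("alice", "s3cret"): 1}   # (username, password) -> user id


def login(username, password):
    """Authenticate the credentials and return a new session token, or None."""
    user_id = CREDENTIALS.get((username, password))
    if user_id is None:
        return None
    token = secrets.token_hex(16)        # fixed-length random token (32 hex chars)
    SESSIONS[token] = user_id            # the server has to remember the token
    return token


def current_user(cookie_header):
    """Resolve the user from a 'sessionid=<token>' cookie value."""
    name, _, token = cookie_header.partition("=")
    if name != "sessionid":
        return None
    user_id = SESSIONS.get(token)        # lookup 1: token storage
    return USERS.get(user_id)            # lookup 2: user database
```

Every request costs two lookups, and every server that handles requests needs access to the same `SESSIONS` storage.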
Basic steps on how oauth2 works:
- Client sends its credentials to server.
- Server authenticates them and generates a fixed-length token.
- Server stores the generated token in some storage together with the user identifier.
- Server sends the generated token to the client in the response body (usually in json format).
- Now, the client sends that token in each request using the “Authorization” header.
- On each request, the server extracts the token from the authorization header, looks up the user identifier in its storage and queries the database to obtain the user information.
This is a very superficial overview of the oauth2 steps, so if you want to know more about it please check this article: OAuth2 Simplified
Are they truly stateless?
Now, having seen the comparison, you can easily see that oauth2 does not differ that much from standard http session authentication. The big difference is that the token is sent using a different header. If you are of the opinion that both methods are stateless, you are wrong; here is why.
Stateless means without state, but as we already saw, http session, oauth, etc. have a (small) state: the token storage. This is not a bad solution per se, but it has several disadvantages:
- It requires shared storage if you want to scale the number of your servers.
- With hundreds of thousands of clients, you are forced to maintain hundreds of thousands of tokens.
- With hundreds of thousands of clients (and each client allowed to have more than one token), token storage can be very expensive.
What is stateless authentication?
Again, stateless means without state. But how can we identify a user from a token without having any state on the server? Surprisingly, it’s very easy! Just send all the data to the client.
So what would you send to the client over the network? The most trivial example is an access token. Access tokens usually have a unique ID, an expiration date and the ID of the client that created it. To send this, you would just put this data into a JSON object and encode it using base64.
Now, having a self-contained token, you will need to make sure that nobody can manipulate the data. For this you should sign it using a MAC algorithm or any other digital signature method available.
This approach has great advantages:
- The biggest one is that your storage needs are zero, because you are not storing anything.
- If an application forgets its access token, nothing has to be cleaned up on the server: the token simply expires on its own.
- Systems can be entirely decoupled from each other, thanks to no more shared token storage.
How can I use it with my programming language?
This is the simplest step, because almost all modern languages have good cryptographic signature libraries and utilities to work with both json and base64.
Let’s see an example in Python. You can use itsdangerous:
    from itsdangerous import JSONWebSignatureSerializer
    s = JSONWebSignatureSerializer('secret-key')
    s.dumps({'x': 42})
    # Will output:
    # "eyJhbGciOiJIUzI1NiJ9.eyJ4Ijo0Mn0.ZdTn1YyGz9Yx5B5wNpWRL2..."
Or, even more compactly, in Clojure you can use buddy:
    (buddy.sign.jws/sign {:x 1} "secret-key")
    ;; Will output:
    ;; "eyJ0eXAiOiJKV1MiLCJhbGciOiJIUzI1NiJ9.eyJ4IjoxfQ.-hx2Os..."
Security flaw myths
Before writing this short article, I went to several people and talked about this. In almost all situations I ended up receiving this comment:
“I don’t trust it, because it sends data to the client”
If you do not trust the proposed approach, you are indirectly stating that you don’t trust the widely used MAC algorithms (they are used in almost every piece of security software). This criticism comes from a lack of broader knowledge.
“It’s vulnerable to man-in-the-middle attack“
This is a different type of criticism. Yes, it is vulnerable… like any other authentication system mentioned here. The most standard ways of authentication are also vulnerable to that attack. SSL should be used to prevent it. This criticism is also very weak in the sense that it is not particular to a stateless authentication system.
Summary
A signed, self-contained token is a nice way to avoid using databases/storage. It allows decoupling and better system scaling, and it lets you write different parts of your system in different programming languages.
It’s nothing new; people have been doing this for ages.
Related article: My Favorite Database is the Network
Thank you very much! Your solution is simple, but great!
Have you currently implemented this solution anywhere? If so, what challenges have you faced?
Hi Joe
We have used it in the http://taiga.io service for a few years, and we continue to use it there. We are starting to use it in other projects and we are very happy, because we can completely forget about session storage when we need to scale horizontally.
The big challenge is how to clear one concrete session. That is not possible as is, because the session does not exist: there is no state on the server that can be deleted for a concrete user… But it is not a big problem, and it has a few solutions. If your secret key is compromised, you can replace it with a new secret, and that automatically closes the “session” for all users.
Another solution is to store a random salt in the user table, transmit it as part of the token data together with the user_id, and check it during the authentication process. If you want to close the sessions of some user, you only have to generate a new random salt for that user (the salt can be very small; 8 or 16 bytes of random data should be enough).
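A rough sketch of the per-user salt idea, again with the Python standard library only. The `USERS` dictionary is a hypothetical stand-in for the user table, the field name `token_salt` and the function names are invented for this example, and the token layout is an ad hoc one rather than a standard format.

```python
import base64
import hashlib
import hmac
import json
import secrets

SECRET_KEY = b"secret-key"  # placeholder key
# Hypothetical user table: each user row carries a small random salt.
USERS = {7: {"name": "alice", "token_salt": secrets.token_hex(8)}}


def _sign(body: bytes) -> str:
    return hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()


def issue_token(user_id):
    """Put the user's current salt into the signed token data."""
    payload = {"user_id": user_id, "salt": USERS[user_id]["token_salt"]}
    body = base64.urlsafe_b64encode(json.dumps(payload).encode("utf-8"))
    return body.decode("ascii") + "." + _sign(body)


def authenticate(token):
    """Verify the signature, then compare the token salt with the stored one."""
    body, _, sig = token.partition(".")
    body_bytes = body.encode("utf-8")
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(body_bytes).encode("ascii")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body_bytes))
    user = USERS.get(payload["user_id"])
    if user is None or user["token_salt"] != payload["salt"]:
        return None  # unknown user, or the salt was rotated since the token was issued
    return user


def close_sessions(user_id):
    """Invalidate every token issued so far for this user by rotating the salt."""
    USERS[user_id]["token_salt"] = secrets.token_hex(8)  # 8 random bytes
```

The salt check rides along with the user lookup that happens on each request anyway, so no separate token storage is introduced.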
In terms of CPU costs is it better than query for the token in mongo?
In terms of CPU costs / blocking time is this solution better than querying a token in a mongo DB database?
Yes, this solution is much better because the cryptographic primitives used for generating/verifying tokens are much faster than IO latency.
But that is not the only benefit. It also saves you from having to store and maintain tokens in your storage, where they may occupy a lot of space in your database, and even more so if you are using something like mongodb, which is very space inefficient.
Sorry, I do not fully understand it.
Does it mean that on the client side we use a key to encrypt the access token, then send the encrypted token to the server side? And on the server side, we use the same key to encrypt the access token saved in the DB, and compare it with the encrypted token sent by the client?
So do we still need to keep an access token on the server side? Then is it truly stateless?
Thanks.
HI,
I am with Tony Fung.. In your example, the access token contains all the details. How is this different than say Bearer where the token contains the details?
More so, I still think there is some state stored. For example, if you pass in the user_id in the token, that user_id is typically tied to a user in a database. Sure, you’re not storing the generated token in a lookup table that has that user_id as a column as well, then making a separate call to get the user_id, so in that sense you’re not creating state for the token itself. But there is still a lookup needed to get the user details on every request. Or am I misunderstanding that as being state when it is not? I am guessing your use of “State” in this case is purely the creation/storage of the token as state.
I am also still a little unsure about the security aspects. I will ALWAYS assume everyone will use SSL for API requests. It just makes sense to avoid man-in-the-middle attacks. What I am a little unsure of is if the client side has the ability to decode the token or not? I would think if they do, that would be a vulnerability, especially since people can view source code of web sites and could very well see the code used, even inline some code and decode a token in real time and see the details of that token. So I am assuming then, clients would NEVER have the secret used to decode the token, only the server, which generates the token and encodes it, passes it back, to then be sent on subsequent requests. However, there is some argument around sending a request with a token that may have just expired, hence clients would want some way to check if the token is expired first..if so, renew, then send the request with a valid token only. For that they would either need a separate bit of data on every response that includes how much time is left for the token (probably a response header would provide this info), or they would need the ability to decode the token to get that information.
@Tony Fung
We have 2 kinds of tokens:
1) The ones you save in a look-up table and must be retrieved and cleaned from the db at expiration.
2) Self-contained: this kind doesn’t need to be retrieved from the db, as they can be decrypted to get all the info contained in the token.
Type 1 tokens are easier to create but harder to retrieve and maintain. Type 2 tokens are harder to create, but easy to retrieve and maintain.
@justin
After a few weeks of study, I have got some understanding of it and can share it with you.
1. Is it stateless? I think yes. When you need to access some protected resources, you need to provide a valid token to access them. The lookup for user info is just part of the resources you need to access. The token is generated on the server side when you authenticate the user, and the server sends it to the client in the response.
2. I think the client side doesn’t need to check the expiry date. As the token is used in each request, when the token has expired, the server will tell the client. So, the server generates the token and decrypts the token itself; the client side only needs to store the token and resend it to the server in each request.
@Diego Lavia
Thanks Diego,
I finally got the concept after reading about JSON web tokens (http://jwt.io/).
It is an implementation of a self-signed token which doesn’t need to be looked up in a database and can be decoded and verified directly.
I see one major limitation of the stateless token you describe.
You cannot revoke it. Imagine you are offering a REST service such as Mandrill, or any other that is used by other backends.
So you are giving away an API token with a very long TTL. If the API token is stolen and has to be revoked, you need to create a store with all the tokens that are revoked and check against it on each request.
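The revocation store described in this comment could look roughly like the sketch below. It assumes the token payload has already had its signature verified and carries a unique `id` and an `exp` timestamp; `REVOKED`, `revoke` and `is_revoked` are names invented for this illustration, and a real deployment would need a store shared by all servers.

```python
import time

REVOKED = {}  # token id -> expiry timestamp of the revoked token


def revoke(payload):
    """Remember a token as revoked until it would have expired anyway."""
    REVOKED[payload["id"]] = payload["exp"]


def is_revoked(payload, now=None):
    """Check a verified token payload against the revocation store."""
    now = time.time() if now is None else now
    # Expired tokens are rejected by the expiry check, so their entries can go.
    for token_id in [t for t, exp in REVOKED.items() if exp < now]:
        del REVOKED[token_id]
    return payload["id"] in REVOKED
```

This brings some state back, but only for the revoked tokens rather than for every token ever issued.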
Awesome article! It would also be interesting to know what your take is on refresh_token (https://rundis.github.io/blog/2015/buddy_auth_part3.html). What do you think of the approach where it uses a token store to manage the refresh token while keeping the access token stateless?
Hi,
Do you have a pattern for revoking a single token?
Let’s say I’ve authorized StackOverflow on my google account and now I want to revoke this. I can go into my google dashboard and revoke the permission I gave StackOverflow and only them.
As far as I understand what you’re outlining this can’t be done without some state.
I’d love to be wrong since I don’t want to go the storage way but I don’t see a real alternative given the above requirement.
Andrey.. We have implemented the same approach.. but we used cookies to store the encrypted token. Is using cookies bad compared to Oauth?
Hi Andrei,
just wondering what your ideas are about signing out with stateless authentication? How would you make sure all the tokens issued to the client are invalidated when a user hits the log out button?
thanks,
arturs