How does a web session work?

Finally illustrated with diagrams

Not long ago I had to investigate a session reset bug, which forced me to do some research on sessions. Since I didn’t find the subject well covered, I thought I would share what helped me solve my bug, so that it can help you too.

Here I will describe sessions in web applications, my area of practice.

What is a session?

“Session” is one of those computing terms that refers to seemingly different things: a shell session, a TCP session, a login session, a desktop session, a browser session, a server session, etc. This makes it confusing to understand what exactly a session is. The same goes for “cache”, another confusing term (database cache, browser cache, framework cache, network cache...). But what actually defines these terms is the use they describe.

So, the first thing to understand for sessions is their use.

Use of a session

Generally, you should understand a session as the different states of an application during the time a user is interacting with it; it is specific to the user. I would even say a session is an instance of a user’s interaction with an application, but I’m not sure that clarifies the matter. More specifically, a web session is a data structure that an application uses to store temporary data that is useful only while a user is interacting with the application. It is also specific to the user.

For example, you could save the user’s name in the session so that you don’t have to query the database every time you need it, or you could store data in the session to save state between pages (between the pages of a payment process, for example).
Think of it as quickly accessible volatile memory that is allocated to each user of the application and destroyed when the user leaves.

This is the general concept; the storage mechanism and its implementation are then specific to the application. This temporary storage could be on the file system in text files, in a database, or in the internal memory of the program executing the application.

The second thing to understand is the structure of a session.

Structure of a session

The session is a key-value data structure. Think of it as a hash table where each user gets a key under which to store their data. This key is the “session id”. A session data structure would look like this:

[Diagram: session data structure]

And when you say “my session”, you are referring to your entry in the session object. Every user can access only their own session. The session can be stored on the server or on the client. If it’s on the client, it is stored by the browser, most likely in cookies. If it is stored on the server, the session ids are created and managed by the server. So if there are a million users connected to the server, there will also be a million session ids for those users on the server.
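As a minimal sketch of this structure, assuming the sessions are kept in the memory of the program, here is what such a store could look like in Python. The names `session_store`, `create_session` and `get_session` are made up for the illustration; they do not come from any particular server.

```python
import secrets

# Hypothetical in-memory session store: session id -> that user's data.
session_store = {}

def create_session():
    """Create an empty session and return its newly generated id."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess id
    session_store[session_id] = {}
    return session_id

def get_session(session_id):
    """Return the data stored under this session id, or None if the id is unknown."""
    return session_store.get(session_id)

# Two users get two separate entries in the same store.
alice_id = create_session()
bob_id = create_session()
get_session(alice_id)["user_name"] = "alice"

print(get_session(alice_id))  # only the data stored under alice's id
print(get_session(bob_id))    # bob's entry is separate and still empty
```

Each user only ever presents their own session id, so they only ever reach their own entry.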

From now on, I will focus only on server-side sessions.

How does a session work?

So how exactly do users access their session?
A single-user application, like a desktop application, has only one user and therefore only one session, so it is not difficult for the application to connect the user to their session data. A web server, however, has multiple clients. How does it know which session is yours? That’s where the session id comes into play.

The general principle is that you, as the client, give the server your session id, and in return the server grants you access to your session data if it finds your session id in its session datastore. The session structure is like a set of data lockers for users: the session id is the key to your locker, and the server is the attendant who shows you which locker is yours.

Let’s look in more detail at how it works:

[Diagram: how does a session work?]

Let’s start from the moment you land on a webpage. When you receive a webpage from the server, along with the page content itself, the server sends you (generally in a cookie) the session id that it has set to identify your connection among all the requests that it gets.

Try the experiment: open your browser console and check the cookies. You will see something that looks like this:

[Screenshot: PHP session id cookie]

Here the back end is PHP. JSP would send you JSESSIONID, and ASP would send ASPSESSIONID.
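To make the cookie exchange more concrete, here is a small sketch in Python using the standard `http.cookies` module. It assumes the cookie is named PHPSESSID (PHP’s default name); the two helper functions are hypothetical and only illustrate what the server writes in its response and what it reads back from your next request.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "PHPSESSID"  # assumed cookie name (PHP's default)

def build_set_cookie_header(session_id):
    """Header line the server adds to its response to hand out the session id."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = session_id
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie.output()

def read_session_id(cookie_header):
    """Extract the session id from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None

new_id = secrets.token_hex(16)
print(build_set_cookie_header(new_id))

# On the next request the browser sends the cookie back:
print(read_session_id("PHPSESSID=" + new_id + "; theme=dark"))
```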

After you log in, the application validates your login and password and saves your user id in the session, so that you won’t have to log in again every time you make a request (this is detailed later).

Now let’s review the diagram above to understand what happens when you make another request to get more data. For example, let’s say you landed on your Gmail inbox after you logged in, and now you want to navigate to your drafts page.

1 – You send an HTTP request to the server asking for the drafts page. Along with this HTTP request you send your session id to tell the server “hey, it’s me from before, give me my drafts page now”. The session id is usually sent in cookies, but it can also be sent in GET or POST parameters. Whatever the technique, the session id just needs to reach the server.

2 – The server receives your request. Before it gives you your drafts page, it checks your session id by looking it up in its session datastore. It finds 5, your session id, so it makes the data in entry 5 available to the code engine (PHP, Java, Ruby...).

3 – The server then executes the code corresponding to your request “give me the drafts page”.

4 – The code starts by getting your user id from the session made available by the server earlier, then uses it to ask the database “give me the drafts of the user who has this user id”.

5 – Finally, once the code has your drafts from the database, it creates an HTML page, puts your drafts in it, and hands it to the server.

6 – The server sends you your drafts page, along with your session id.
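The six steps can be sketched roughly as follows in Python. This is only an illustration under simple assumptions: two dictionaries stand in for the session datastore and the database, the session id is 5 as in the walkthrough, and the user id 42, the draft titles and the function name `handle_drafts_request` are all invented.

```python
# Hypothetical stand-ins for the server's session datastore and the database.
session_store = {"5": {"user_id": 42}}
drafts_table = {42: ["Draft: hello", "Draft: budget"]}

def handle_drafts_request(session_id):
    """Return (status, body) for a request to the drafts page."""
    # Step 2: look the session id up in the session datastore.
    session = session_store.get(session_id)
    if session is None or "user_id" not in session:
        # Unknown session id: the server does not know who is asking.
        return 401, "Please log in"

    # Step 4: read the user id from the session, then query the "database".
    user_id = session["user_id"]
    drafts = drafts_table.get(user_id, [])

    # Step 5: build the HTML page with the drafts in it.
    body = "<ul>" + "".join(f"<li>{d}</li>" for d in drafts) + "</ul>"

    # Step 6: hand the page back so it can be sent to the client.
    return 200, body

print(handle_drafts_request("5"))    # known session id: the drafts page
print(handle_drafts_request("999"))  # unknown session id: asked to log in
```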

Logged in state 

In this exchange, you could have just sent your user id in your request and told the server you want the drafts of this user id. But that would mean that anybody who knows your user id could also get your drafts, and you don’t want that for your private data. You want the application to send this data only to you. So to protect your data, the application makes you log in first to make sure the person asking for the data is really you. And normally, for any request for private data, it should first ask who you are.

If there were no session id during this exchange, when you asked for your drafts page the server would not know whose drafts you are asking for, and it would ask you to log in first. HTTP is a stateless protocol, so it doesn’t record the fact that you are already logged in. At each request, HTTP knows nothing about what happened before; it just carries the request. So for any request for private data you would have to log in again so that the application knows it is really you. This would be very annoying.

That’s the problem that sessions solve. To avoid logging in again and again, sessions keep you logged in while you are connected to the server. Basically, after you log in the first time, the server remembers in the session that this is really you and lets you ask for more data without asking again who you are. That’s what makes sessions so useful.

So if we zoom out from the preceding diagram, we can observe how user connections are identified and maintained by the server thanks to session ids:

[Diagram: simultaneous identified connections]

Session management is a feature of the server that you need to activate. For a static website that doesn’t serve private data, for example, you would not need to activate session management on your server.

Keeping you logged in is one main use of the session, but sessions can also be used to save temporary data that is completely independent of the logged-in state. You could decide to put some data in the session just because it is quicker to access.

Also, as a side note, from the server’s point of view: one connection = one session id. Here “connection” means a client such as a browser, not a single network connection. So if you connect from two different browsers, the server will create two session ids. Remember that the session id only identifies your connection to the server. All the user identification logic is handled by the application.

Debugging session problems

The bug I was trying to resolve was a session reset. I couldn’t explain why the user was suddenly logged out. This bug drove me totally crazy and I couldn’t find any useful help on the Internet because there are so many different kinds of sessions and no one describes the web session as we know it. So I thought I would spare you the pain of searching in vain by summing up what you should look for when debugging a session problem.

From the preceding sections, you have seen that the logged-in state is maintained by the presence of the session id on the server. So if you are logged out, it means the server doesn’t know who you are anymore because it didn’t find your session id in its datastore. Most likely your session id never reached the server, and you need to find where it got lost; or it is simply no longer on the server, and you need to find out why.

Why would a page reset a session? Reasons for session reset?

Here are some scenarios that will help you narrow down your problem.

1. Bad session replication

[Diagram: load balancing session reset]

To debug this, you need to know your architecture well. In big architectures, there are several front-end servers arranged in clusters, and users are load balanced to one of them at each request. You need to make sure that session data is shared correctly across those servers. Otherwise, your session id may have been set by server A, but when you’re load balanced to server B, server B has never seen you as far as it knows, so it doesn’t have the session id you’re presenting and will reset your session.
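Here is a toy illustration of this scenario in Python. The `Server` class is hypothetical, and a plain dictionary stands in for whatever shared storage (a database or a cache, for example) a real cluster would use.

```python
class Server:
    """Toy front-end server that keeps sessions in the store it is given."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # session id -> session data

    def login(self, session_id, user_id):
        self.store[session_id] = {"user_id": user_id}

    def knows_session(self, session_id):
        return session_id in self.store

# Case 1: each server has its own private session store.
server_a = Server("A", {})
server_b = Server("B", {})
server_a.login("5", 42)
print(server_b.knows_session("5"))  # False: B has never seen this session id

# Case 2: both servers read and write the same shared store.
shared_store = {}
shared_a = Server("A", shared_store)
shared_b = Server("B", shared_store)
shared_a.login("5", 42)
print(shared_b.knows_session("5"))  # True: the session survives load balancing
```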

2. Transmission problem

[Diagram: session id is lost]

The main reason for a server to reset your session is that it doesn’t recognize the session id you’re presenting (or receives none) and therefore creates a new one to identify your connection. If the session id is missing from your request, you need to find where it got lost.

On the way from the browser to the server, there are several places where the session id could have gotten lost:

- in the browser: check how the session id is transmitted. If it’s in cookies, check that your browser transmits them correctly; maybe cookies are disabled in the browser settings, for example. Also, be careful not to have http links on an https website with secure cookies. This was my bug. It seems like a classic now, but it was not that easy to detect. Because the cookies were secure, the browser only allowed them to be transmitted over https and never sent them over the http connection, hence the session reset. I advise you to look out for https, secure cookies and redirects; they will be good pointers to the cause of your bug.

- on the network: check that you don’t have a CDN or other proxy that strips your cookies on the way.

- on the server: check that the server reads and writes session ids correctly.
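The secure cookie case from the browser item can be modelled in a few lines of Python. This is a deliberately simplified sketch: it only looks at the Secure flag and the URL scheme, and ignores the other rules browsers apply (domain, path and so on). The function names and the example.com URLs are made up.

```python
from urllib.parse import urlsplit

def browser_sends_cookie(cookie_is_secure, url):
    """Simplified rule: a Secure cookie is only sent with https requests."""
    scheme = urlsplit(url).scheme
    return (not cookie_is_secure) or scheme == "https"

def insecure_links(links):
    """Return the links that would drop a Secure session cookie (plain http)."""
    return [link for link in links if urlsplit(link).scheme == "http"]

print(browser_sends_cookie(True, "https://example.com/drafts"))  # True
print(browser_sends_cookie(True, "http://example.com/drafts"))   # False: session id never arrives

page_links = ["https://example.com/inbox", "http://example.com/drafts"]
print(insecure_links(page_links))  # ['http://example.com/drafts']
```

Running a check like `insecure_links` over the links your pages display is one way to spot the kind of http link that caused my bug.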

3. Session reset in code

[Diagram: session reset in code]

Of course, you should not rule out this possibility. Maybe some code is intentionally asking your server to reset the session, as when you call $user->logout(). This one should not be very difficult to find in the code.

4. Session expired

[Diagram: session expires]

Usually your session cookie lasts until you close your browser. On the server side, most servers also destroy the session data after a period of inactivity (a timeout), which can happen while your browser is still open. Check your server’s session lifetime settings.
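A timeout like this could be sketched as follows in Python. The 30-minute value, the `last_seen` field and the function names are assumptions made for the example; real servers have their own settings and mechanisms for this.

```python
import time

SESSION_TIMEOUT_SECONDS = 1800  # assumed inactivity timeout (30 minutes)

# Hypothetical in-memory store: session id -> session data.
session_store = {}

def touch(session_id, now=None):
    """Create the session if needed and record when it was last used."""
    now = time.time() if now is None else now
    session_store.setdefault(session_id, {})["last_seen"] = now

def get_active_session(session_id, now=None):
    """Return the session if it is still alive, otherwise destroy it and return None."""
    now = time.time() if now is None else now
    session = session_store.get(session_id)
    if session is None:
        return None
    if now - session["last_seen"] > SESSION_TIMEOUT_SECONDS:
        del session_store[session_id]  # expired: the user is logged out
        return None
    session["last_seen"] = now  # activity keeps the session alive
    return session

touch("5", now=0)
print(get_active_session("5", now=100) is not None)  # True: still within the timeout
print(get_active_session("5", now=100 + 1801))       # None: inactive for too long
```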

***

So this is what I wish I had found when I was investigating my bug; here it is now. These are the main reasons I can find for a session reset. I tried most of them before finding out about the secure cookies… Go ahead and smile. You can always read the full story of my bug here: Rings, bells and victory.

***

Links :

I wrote this article with Apache in mind for the server
http://httpd.apache.org/docs/current/mod/mod_session.html
http://en.wikipedia.org/wiki/Session_(computer_science)

 


  • ck

    very informative read. Excellent for beginners

  • Kush Goyal

    Thanks for sharing, hard to get by information.

  • apoorvgaur

    incredible article man.. absolutely top notch, i was having a lot of trouble with sessions, and 10 minutes here blow away all my doubts. cheers.

    • eloone

      Thank you !! i am a woman though :p

  • Jasper

    very helpful and awesome article!!! But I still confused about your bug, “be careful to not have http links on a https website with secure cookies.”. What does this mean? Can you explain more detail? Thanks again!

    • eloone

      Thank you :)

      I meant that if you set your website to accept only secure cookies through your .htaccess for example, then make sure every internal link that you display on your website is in https and not http, otherwise the cookies won’t be transmitted over simple http once the user clicks on the link and it will reset the session.

      My bug was that we had set up the whole website to only transmit cookies over https (only secure cookies) and a webservice was sending us links to display on the website that were in http, so every time we clicked on those links, the cookies were not transmitted and the session was reset. Hope that clears it for you.

  • Gitahi

    Very well explained. Thanks.

  • Sandy

    Very well explained. Thanks!

  • fdas

    great work, thanks for this article, was very usefull, continue your work with other usefull articles

  • Ngwanevic

    Awesome article. Very informative. Thanks

  • Markus

    Great article! Helped a lot!

  • Md Shariful Islam

    Very good article

  • Matthew Bilton

    Concise and well organised. You taught me something. Thanks!

    • http://machinesaredigging.com/ Elodie

      I’m glad to hear that ! Thanks !

  • broke_it_guy

    I am gathering knowledge to write my first Python login app and I really like your article. Your programming perspective with diagrams are really helpful. I never knew the interactions between stuffs that cause your bug and I never seen other articles describe this. It is a gem and probably save me (and others) a lot of (troubleshooting) time. Many kudos….

    • http://machinesaredigging.com/ Elodie

      Thanks a lot for your comment, it encourages me to use graphics in my posts. Indeed I don’t see many posts with good diagrams whereas it’s a vital piece of information when you want to understand some concepts of programming that are too abstract.

  • Juin Chiu

    very helpful, thank a lot!

  • Andrew

    This was a definitive explanation that helped me to more fully understand sessions. Thanks.

  • Rajesh M D

    Awesome dude.. :) very very helpful. Got complete picture of session. Wish i could’v got article this earlier…

    • http://machinesaredigging.com/ Elodie

      Thanks. However I’m a dudess (???) :)

  • etic

    I love the conceptual explanation. Thank you.

  • سامان

    thank so much
    in persian :
    خیلی خیلی ممنون

  • http://vikas-tiwari.appspot.com vikas

    awesum…thnks 4 writing this topic…………

  • Oleksii

    Thanks so much, Elodie! This is the best article on sessions I’ve found so far!

  • 连培培

    Feel like I have to say something expressing my excitement for this article and for your organization of this article.Millions of thanks!!!

  • Mahesh Eu

    Great article. Thanks for that

  • Aravind Kumar

    Great!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Thanks…………

  • Nice dude

    Incredible article. Very helpful. Explains everything, each and every detail. Thank you for providing such a nice information.

  • yevhen

    Thanks a lot!

  • http://chuntaolu.com/ Chuntao Lu

    What a nice writeup! Thank you!

  • Rehan

    Salute You sir!! really appreciate

  • Sanket Gandhi

    Awsome Explanation.

  • http://machinesaredigging.com/ Elodie

    Hi Aaron,

    When twitter redirects to your page you need to associate the id from twitter with the session on your website. session = website_session + twitter_id (later user). So 1. your website session is website_session, on your server session = website_session, 2. you send authentication request to twitter, 3. twitter replies with twitter_id, 4. on your server, usually if the user does not exist in your db, you create a new user with the twitter_id, if the user exists, you fetch the user from your db with that particular twitter_id, then you do session = website_session + user, and you can save the session to a session database for example. Then every time you need session data you can query “session”. The link is done by saving the twitter id to your database “users” table, and when twitter replies with an id you use that id to fetch the user and put their data in the existing session.

    Hope that helps.

  • mvsagar

    Excellent article!

  • Rabia Naz Khan

    The best article I have every read! What a finely organized research! Really really helped me out! Thanks!! #Thumbs_up!

  • Neil

    One of the most clearly written articles on sessions i’ve come across.

    Can you also write on on different type of auth too?

    • http://machinesaredigging.com/ Elodie

      Thanks! what kind of auth type are you thinking about? the only other auth type I was thinking to write about is ssh connection as it’s also not very well covered.

  • Nikhil Fadnis

    excellent article, will me marry me?

  • metatron

    Great post. Thanks for sharing.

  • Felipe Perestrelo

    Awesome work! I really learned a lot about sessions! Thanks!

  • johnz

    the best article about session i get find on the internet. thank you so much for sharing this. it has saved me tons of time and effort!

  • Tapash

    Excellent explanation. One of the best so far I have seen. Well done and Thanks.

  • Jason

    Well written article… Thank You !

  • Yue Wang

    I am not a native English speaker and really weak in English. So I must make many grammar and vocabulary mistakes.
    I came across your blog when I typed ‘web session’ in google. I really like the style of your blog(its content is wonderful as well), especially the images. Can I call it ‘scratching style’? I was stimulated to make my blog more useful and beautiful.Thanks.

    • http://machinesaredigging.com/ Elodie

      What’s wonderful is your comment, thank you so much : ) I will post more! Scratching/scrapping style, I think I see what you mean ;) Share your blog so I can see the result ^^

  • Wuji Xu

    Thank you,super woman !!!! Hard to find such a good article like this on the Internet nowadays.

  • John A.

    Very clear explanation. Bravo!

  • Jon B

    Just thought I would say how helpful I found this blog about sessions. I am fairly new to all this, and this is about the most concise and clear explanation that I have found after doing a lot of searching. I also thoroughly enjoyed your Rings, Bells and Victory post – a fascinating detective story with a triumphant solution.

    I also think your layout is really good – love the font you’ve used for your menu! Nice work.

  • Zubin Teherani

    Really solid. I’ve read a couple of resources on sessions/cookies and they have been confusing. The diagrams make it so clear

  • Aman Thukral

    Awesome Explanation. I really felt the urge to applaud this great article.

  • Jackson M

    Hi, I have a question, in the second image where you are showing the session flow, you are showing in the step 1 that client is passing the session to Server which does not seem correct to me. It is the server which assigns a session to a client. In response to the very first call, server sends a response header “set cookie”, the value of which is the newly generated session id. From then onwards, client sends the same value as session to server in every call. Somehow, if server does not get the session id from the client in any request, it generates a new session id and attaches with the response headers and thus treats the client as a new client. Kindly let me know where I am wrong, it would be very helpful.

    • http://machinesaredigging.com/ Elodie

      Hi, you are correct, the flow is as you describe. The diagram details the steps after the user logged in, so already has a session id, and asks for user personalized data. The very initial step is described below the diagram: “When you receive a webpage from the server, along with the page content itself, the server sent you (in general, in a cookie) the session id that it set to identify your connection among all the requests that it gets.” Note that the session id can come in a GET parameter for example and not necessarily in a cookie. Hope this helps.

      • Jackson M

        Got it. Thanks for responding. And, I hereby devote myself to your fan list. :) Excellent article…

        • http://machinesaredigging.com/ Elodie

          Thanks and welcome ^^

  • Subhasish Bhattacharjee

    really nice article.. however I am a beginner and I have a doubt. In all the pictures above there is a session id attached to every request. What happens when the browser requests for the first time? Who creates this session id?

    • http://machinesaredigging.com/ Elodie

      The session id is created and sent by the server for the first request.

      • Subhasish Bhattacharjee

        Thanks for the reply.

  • swayam raina

    Really nice article…
    I have doubt though, When sessions are created they are saved as cookies which are stored in files on user’s computer. So a hacker can get the session id and use it. How we can prevent that.

    • http://machinesaredigging.com/ Elodie

      Just to clarify, sessions are not saved in cookies, they are saved on the server, I mean here the session data. I think you mean the session ID can be saved in cookies and a hacker can get hold of it. Indeed, the session ID can be saved in cookies and a hacker can use it, but then it is a security problem the application needs to handle, and it is a big subject that I can’t cover in a comment, but that fortunately is already well covered on the web.

      There are various ways to prevent such attack, if you mean how we can prevent this attack as developers, you should search for how to prevent cross-site scripting (XSS), or browse the OWASP website https://www.owasp.org/index.php/Session_hijacking_attack, or https://www.owasp.org/index.php/OWASP_Application_Security_FAQ which is a reference website about how to handle security issues. Session is a sensitive area that the application has to protect rigorously, and many strategies are possible. Some applications might protect some parts by asking again the user password for example, some might send the session ID both in a cookie and a parameter to control the connection integrity, some expire the session frequently, some only allow a single session per user, there are several strategies recommended that are the responsibility of the application.

      If you mean how we can prevent that as users, you can protect yourself, by clearing the session yourself in the browser, or by logging out after you use an application, especially when you are not on your computer. As a good practice you also need to learn how to evaluate the security level of an application to decide whether or not you should use it at all. Get to know the basic security measures an application needs to implement (as described on the OWASP website) and see if the application implements them. When you see an application is not secure and serious with your data, DO NOT USE IT. Regularly the press also communicates on security breaches in applications, so check that for the application you use.

      • swayam raina

        Thank you for your guidance

        • http://machinesaredigging.com/ Elodie

          You’re welcome!

  • http://machinesaredigging.com/ Elodie

    I don’t see the difference you make between web server and application server, for me it is the same, maybe you can clarify? Session management is always done server side. I’ve mostly used Apache so I find it does the job for session management, I’ve never encountered errors in session writes with it. But I can only answer for standard use because I don’t know your specific needs. Also session management is a small brick in what will make your application performant, reliable and secure. Any good webserver can enable sessions, but once it’s enabled it’s your job to create an architecture around it that makes the application performant, reliable and secure as those will happen in several layers. So I’m going to sound generic, but you can only choose according to your needs, decide what point is important for your needs and study how each server handles that point. If scalability is a big deal for example, you should particularly study that, maybe you will need to share sessions in no sql dbs for high read access, maybe you will need an nginx load balancer to route the traffic and keep the sessions. Performance/reliability/security are not the tasks of only the server that handles the sessions, you just need a server that does the job in managing sessions, but most of the work will be done by other layers.

  • mintCollector

    Very informative and fun to read article. Thank you!

  • Jeyaraj

    Really nice

  • CHYGO

    The best article regarding sessions

  • Rajdeep Hazarika

    Very well written and liked the balance of making someone understand from basics yet keeping the technical aspect of it . I would like to read more of your stuff.