Session understanding - not a "write it for me" post

mrwoo · July 12, 2012

Greetings.

I have been tasked with converting all of my stand-alone applications, mostly DB frontend apps for Access using ADO, into browser based applications. I have avoided web based coding for years because frankly I could do so much more with good old VB.

I also have a fairly extensive amount of scripting under my belt in a few different languages, so I guess I am not a classic "newbie". Either way, I am now learning a lot at once, so the learning curve is pretty steep.

Anyway, one aspect that I have to tackle is both an intranet and internet site, and I need some security. The idea is simple enough: create a set of users and store their credentials and any other info in a MySQL table. That part is not so bad, using PDO as the connection.

Now, some data will be read only, some will be read-write. I obviously need to maintain integrity of the data, and I understand that I cannot trust user input, I must verify everything. I don't know all the PHP tricks for that part yet (like htmlentities and escaping characters, not to mention bound parameters in PDO), but I think I can find examples and tips easily enough.

The part I am left without a full understanding of, after a ton of research, is the sessions. I have a basic grasp of XSS and CSRF, as well as fixation and MITM and even session poisoning. This will be on a shared server, and I will have the ability to use an SSL cert (although that too is something I need to understand better), but either way, it is the PHP sessions I need to understand.

My assumptions are, based on what I have been reading, that:

1. I will have to have clients enable cookies to use sessions with POST, and stay away from any GET work except on very low-level pages. This is just a deterrent, not a solution.

2. I could add a hidden field on forms, and check the data within against a server side session variable - again, just one more layer to make it more difficult. This verifies the form data originated from my form, or that it was spoofed successfully, either way, another easy check.

3. I could assume if the user_agent and remote_addr are present in a request, that they would remain on subsequent requests, so I could use those as well, by hashing them (and maybe padding them) so that I could check if the client data remained the same. Yet again, not a solution but another layer.

These seem to be some common steps I could take to help with security. I also plan to use HTTPS on logins, and know to escape user data to stop sql injection or verify data or in general make sure user data is not some form of code. I don't know how yet as I said, but I understand I must do it.

I could regenerate an ID on every page, but this seems to be viewed as both good and bad, depending on what you read. I understand that it makes it harder for an attacker to get the session ID, but I don't really know which route is best. Any advice on this topic?

Finally, I am struggling to fully comprehend just how the session ID is used and what I must do to ensure it is valid.

For example, I know that when I use session_start() it creates a file in tmp, which is sess_<ID OF SESSION>. I have moved that directory to a top level directory via php.ini.

So, I know the SID exists, I know I have a global array to utilize for this session. What I don't understand is, how do I actually check that this SID is correct? Do I have to set a variable in the session array? Is that variable passed to the client and returned in every POST or GET?

The common theme I see is to verify if a user is logged in, by creating some variable in $_SESSION like "logged_in" = 1 and then checking that to make sure it is. So does that mean that every session variable I set, it is only on server side, or does it get passed to the client as well? So if I do this:

$_SESSION["logged_in"] = "yes";

$_SESSION["var_x"] = "no";

$_SESSION["var_y"] = 88;

Then those variables are passed to the client, and the client in return passes them back? Am I checking if they have been tampered with?

I understand on POST, if I have a form input that is hidden and it contains a hashed value of some number I generated, that I should expect that value to come back from the page in the POST, and I can check on that. I just haven't found much else about sessions other than a LOT of very similar examples of how to start the session, how to set variables in the session and how to check those session variables.

What is really happening here, and if I am concerned that a session might be stolen, or just in general, do I need to check that the session is valid?

I know, probably a very common set of questions. Honestly though, I have opened probably close to 1000 different web pages looking to inform myself, and I am still left with these questions.

Anyone care to educate me further?

Thanks.

mrwoo.

ManiacDan · July 12, 2012

First of all, I really love this post. It was really well written and I appreciate someone who tries not to ask for handouts.

1. I will have to have clients enable cookies to use sessions with POST, and stay away from any GET work except on very low-level pages. This is just a deterrent, not a solution.

2. I could add a hidden field on forms, and check the data within against a server side session variable - again, just one more layer to make it more difficult. This verifies the form data originated from my form, or that it was spoofed successfully, either way, another easy check.

3. I could assume if the user_agent and remote_addr are present in a request, that they would remain on subsequent requests, so I could use those as well, by hashing them (and maybe padding them) so that I could check if the client data remained the same. Yet again, not a solution but another layer.

This is all fine. Overkill, actually. Good for you.

know to escape user data to stop sql injection or verify data or in general make sure user data is not some form of code. I don't know how yet as I said, but I understand I must do it.

SQL injection is prevented through either PDO bound parameters, or by wrapping all string inputs in mysql_real_escape_string and all numeric inputs in floatval. There's more to the theory, but that's the baseline answer.
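
A minimal sketch of the bound-parameter approach, assuming PHP 7.1 or later. The `users` table, its `id` and `username` columns, and the helper name `findUserByName` are hypothetical and not from this thread. The point is that the value travels separately from the SQL text, so quotes in the input cannot change the query.

```php
<?php
// Hypothetical helper: look up one user by name with a bound parameter.
// Table and column names are assumptions made for illustration.
function findUserByName(PDO $pdo, string $username): ?array
{
    // The :username placeholder is filled in by the driver, never by
    // string concatenation, so the input is treated purely as data.
    $stmt = $pdo->prepare('SELECT id, username FROM users WHERE username = :username');
    $stmt->execute([':username' => $username]);

    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

A call such as findUserByName($pdo, $_POST['username']) needs no manual escaping of the input.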

Finally, I am struggling to fully comprehend just how the session ID is used and what I must do to ensure it is valid.

The session ID is always valid as long as the temporary file still exists. The security problem is usually session hijacking. You can prevent that by either going full SSL (like facebook did) or by storing user-agent and IP in the session and validating it every time. Rotating the session ID on every page load is a tactic used by the super-paranoid who believe man-in-the-middle attacks are likely and not just possible. I'm not saying it's a bad idea, but it feels like overkill.

So does that mean that every session variable I set, it is only on server side, or does it get passed to the client as well?

The session works like this:

- You call session_start for the first time

- A temp file with a random name is created

- A cookie called PHPSESSID (or the cookiename set in php.ini) is added to the header, with the value of that cookie being the name of the temp file

- Anything you add to $_SESSION is serialized in a custom format and stored in the file

- That file is unserialized every time session_start is called thereafter, and placed into $_SESSION. session_start checks for the existence of the session id cookie, and will either create a new session, or load an existing one

No data in the session is passed to the user, it is completely safe unless you purposely echo session values (or the server is compromised, obviously).
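
To make the list above concrete, here is a sketch of a login gate, assuming PHP 7 or later. The helper name `isLoggedIn`, the `logged_in` key and the `/login.php` path are illustrative choices, not anything PHP defines. Only the session ID cookie travels to the browser; the array contents stay in the server-side file.

```php
<?php
// Hypothetical helper: decide whether a session array represents a
// logged-in user. It takes the array as an argument so it can be
// exercised without a live session.
function isLoggedIn(array $session): bool
{
    // A brand-new or expired session is simply an empty array, so a
    // missing key means "not logged in".
    return isset($session['logged_in']) && $session['logged_in'] === true;
}

// Typical use at the top of a protected page (web request):
//   session_start();                     // loads or creates the session file
//   if (!isLoggedIn($_SESSION)) {
//       header('Location: /login.php');  // hypothetical login page
//       exit;
//   }
```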

I understand on POST, if I have a form input that is hidden and it contains a hashed value of some number I generated, that I should expect that value to come back from the page in the POST, and I can check on that.

You can rely on POST data up to a point. It CAN be tampered with. If you really want to secure your post forms, do this:

- Hash the microtime and username of the current user

- Put that in a hidden form field

- Put it in the session

- When the form is submitted, compare them

Note that this tactic will stop your site from working in multiple tabs at once, as the session is global for the whole browser window.
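
The four steps above can be sketched roughly like this, assuming PHP 7 or later. Note the swap: instead of hashing microtime and username, this version uses `random_bytes()`, which is harder to guess, and compares with `hash_equals()`. The function names and the `form_token` session key are hypothetical.

```php
<?php
// Create a token, remember it in the session, and return it so the page
// can place it in a hidden form field.
function issueFormToken(array &$session): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters
    $session['form_token'] = $token;
    return $token;
}

// Compare the submitted value with the stored one. The token is single
// use: it is cleared whether or not the comparison succeeds.
function checkFormToken(array &$session, string $submitted): bool
{
    $expected = $session['form_token'] ?? '';
    unset($session['form_token']);
    return $expected !== '' && hash_equals($expected, $submitted);
}

// In a page:  $token = issueFormToken($_SESSION);
//             echo '<input type="hidden" name="token" value="' . $token . '">';
// On submit:  checkFormToken($_SESSION, $_POST['token'] ?? '');
```

Because only one token is stored, a second form drawn in the same session overwrites the first, which is the multiple-tab limitation mentioned above.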

What is really happening here, and if I am concerned that a session might be stolen, or just in general, do I need to check that the session is valid?

If you're really this concerned about your sessions, simply switching to straight SSL for the whole site would solve all your problems. The rest is just thought exercises, and none of them are 100% secure. It's like home security. Installing steel plates over your windows is 100% secure. The rest is just locks on a piece of glass. But steel plates are expensive and ugly.

If anything else is still confusing, let me know.

mrwoo · July 12, 2012

First of all, I really love this post. It was really well written and I appreciate someone who tries not to ask for handouts.

Thank you. I appreciate the same thing on other forums I use.

The session ID is always valid as long as the temporary file still exists. The security problem is usually session hijacking. You can prevent that by either going full SSL (like facebook did) or by storing user-agent and IP in the session and validating it every time. Rotating the session ID on every page load is a tactic used by the super-paranoid who believe man-in-the-middle attacks are likely and not just possible. I'm not saying it's a bad idea, but it feels like overkill.

Regarding MITM attacks, I am not so overly paranoid about that. I am more concerned that a typical user would log in, then go to some other site (maybe to collect info for something on the form) and they could visit a compromised website, which could, theoretically, misuse their current "logged in" process to my site. I don't know if that is a valid reality, only one that I can imagine based on what I have read.

I have thought about full SSL, but two things I face are

1. I don't fully understand yet what I must do in creating pages to make this work. I 'think' all I have to do is make sure everything is not relative, but fully qualified with https:// . I was trying to modify some .conf files and .htaccess files, but honestly they are not the easiest to mess with. I was seeking to enforce a certain directory to always process as https, which would allow me to use relative paths.

2. I have read about the overhead of SSL. I am not experienced enough in web related applications to know how that will really work. I do know that I will be doing a lot of queries to the DB, so I have assumed that SSL would not be so hot during the entire session because of that.

Finally, you mention the session ID is always VALID. I guess I understand that, but... I will explain further down on this

The session works like this:

- You call session_start for the first time

- A temp file with a random name is created

- A cookie called PHPSESSID (or the cookiename set in php.ini) is added to the header, with the value of that cookie being the name of the temp file

- Anything you add to $_SESSION is serialized in a custom format and stored in the file

- That file is unserialized every time session_start is called thereafter, and placed into $_SESSION. session_start checks for the existence of the session id cookie, and will either create a new session, or load an existing one

No data in the session is passed to the user, it is completely safe unless you purposely echo session values (or the server is compromised, obviously).

Ok, that is a great synopsis. So I am correct that the client must accept cookies, as this is really just a cookie, a temporary one for the browser session only, but still a cookie with nothing but the session ID in it. So the session ID will always be included in POST or GET because of the cookie? True?

I have a generic idea of how the serialization works. Not important at this point, but I do sort of get what is happening, and why it exists in a file format and how it is called upon later.

My other question is still coming further down...

You can rely on POST data up to a point. It CAN be tampered with. If you really want to secure your post forms, do this:

- Hash the microtime and username of the current user

- Put that in a hidden form field

- Put it in the session

- When the form is submitted, compare them

Note that this tactic will stop your site from working in multiple tabs at once, as the session is global for the whole browser window.

I get this. POST can be easily modified, even using simple browser and add-on. I get the hidden field and storing that in session array to be brought up later. And I think what you mean is that since the TIME is part of the hash, only one page will have the TIME stored in the session array, thus multiple tabs, only ONE can possibly have the exact time. Makes sense.

What would be a common way to handle if the client duplicated tabs? Is this a request to the server and a reply to render, or is it generated from cache? Would the timestamp field be correct?

Also, how would you handle if the user had 3 tabs open, but closed the wrong one, and submitted one without the correct hash field? Would it be best to challenge user with credential check to make sure they are still "them", even at inconvenience to user? I would assume so.

And finally, the question I have been formulating and am not certain how to get it written down correctly.

If the server assigns a session ID when the client first visits, it is set in a cookie on client side (hopefully) or it is passed in the GET which is plainly visible (?phpsessid=<blah blah blah>). Either way, when the client then performs another request, and includes in GET or POST the session ID, what do you do with it? I know you said it is always VALID. But do I have to check it is the same?

Or, (this is a hard question/concept to try and describe) is it that once I (the client) get a session ID, every page I request, YOU (the server) just know who I am because I included the session ID in my request. So now YOU just assume I am really me, no questions asked. (I know, statelessness.. so annoying) Since I included a proper session ID, YOU can refer to YOUR session array, and find out what data YOU are storing for ME. Maybe my preferences or my account credentials. Either way, YOU simply assume that since I supplied the correct session ID, that I am ME.

Now, assume the expiration was reached, or YOU wanted to periodically regenerate my session ID. YOU do that, pass the new ID on to ME, then I simply use that on subsequent requests. As long as YOU keep track of what new ID is mine, whenever I request and give my current session ID, YOU are able to see what is in the array for that session.

What happens if someone pretending to be ME sends a request without MY session ID? Is this where the standard check of "does the session array have logged in = 1" comes in? So every page that has elevated privileges checks to make sure MY session array has "logged in = 1". Then if someone comes in pretending to be ME, but does not have my session ID, but maybe an older session ID, like from a bookmark or maybe from an old referrer log somewhere - this session ID might not exist or just not have "logged in = 1" so they would be thrown over to the login screen?

I guess I am left wondering really how the actual ID comes into play, whether I (as the one coding the pages) need to do anything with it. And if it is transparent to me, as the coder, what implications might I need to know about if the session ID did not come back as valid.

I apologize for my rather poor attempt to describe my question. I am sort of an anal person and really like to fully understand things rather than just coding willy-nilly. Maybe you might get the general idea of what I am trying to convey. Hopefully.

If anything else is still confusing, let me know.

Thank you so much. I appreciate the time you have taken to help.

Again, thank you for your time.

mrwoo.

mrwoo · July 12, 2012

I forgot to mention, I am developing this using XAMPP and testing on a TurnKey LAMP machine. I also have a test site available on a private host for live tests.

mrwoo.

kicken · July 12, 2012

2. I have read about the overhead of SSL. I am not experienced enough in web related applications to know how that will really work. I do know that I will be doing a lot of queries to the DB, so I have assumed that SSL would not be so hot during the entire session because of that.

The overhead of SSL is mainly just extra CPU usage from all the data having to be encrypted/decrypted by the server at each request, and extra bandwidth costs because the encryption inflates the size some. It doesn't really affect the actual processing of the page, such as your SQL queries and whatnot, as they are not involved with the SSL process. Basically SSL only affects data you echo out, and the incoming request data.

Ok, that is a great synopsis. So I am correct that the client must accept cookies, as this is really just a cookie, a temporary one for the browser session only, but still a cookie with nothing but the session ID in it. So the session ID will always be included in POST or GET because of the cookie? True?

Pretty much. It is possible to pass the session ID through the URL, and there is even an option in the php.ini file to automatically re-write links and forms to include the ID, but this is generally considered bad practice as it makes it more likely for the user to inadvertently pass their session to someone else (sharing a link) and might cause fixation problems if a search engine stores your site with a particular ID and references that ID in results.

Because of those issues it is generally recommended that you require the user to use cookies to pass the ID, and only accept a session ID from a cookie and not the URL. I believe PHP is configured this way by default these days.
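
A short sketch of forcing that configuration from code rather than trusting the server defaults. `session.use_only_cookies` and `session.use_trans_sid` are real php.ini directives; the function name is hypothetical, and the call has to happen before session_start().

```php
<?php
// Hypothetical helper: accept session IDs from cookies only.
// Must run before session_start() and before any output is sent.
function useCookieOnlySessions(): void
{
    ini_set('session.use_only_cookies', '1'); // ignore session IDs in the URL
    ini_set('session.use_trans_sid', '0');    // do not rewrite links and forms
}

useCookieOnlySessions();
// session_start() would follow here in a real page.
```

On a shared server where php.ini is out of reach, this is one way to make sure an ID in a GET parameter is never honoured.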

Either way, when the client then performs another request, and includes in GET or POST the session ID, what do you do with it? I know you said it is always VALID. But do I have to check it is the same?

You don't have to do anything with it if you don't want. PHP handles it all behind the scenes when you call the session_start() function. For requests without a session ID it generates a new one and creates a new empty session. For requests that include an ID it will load that session from the temp file matching that ID. If by some chance that file does not exist, it acts as if it's a new empty session.

Either way, YOU simply assume that since I supplied the correct session ID, that I am ME.

Basically. This is where some of the other protection efforts might come in (checking the user-agent, IP address, etc.) if you want to make the extra effort. Generally though it's assumed that if you have an ID then you are who that ID says you are.
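
For the extra-effort route, a sketch of a user-agent and IP check, assuming PHP 7 or later. The function names, the `fingerprint` session key and the `$secret` value are all hypothetical; `$secret` stands for an application-specific string you choose yourself.

```php
<?php
// Hypothetical helper: derive a fingerprint from request details that
// should stay stable for one client.
function clientFingerprint(array $server, string $secret): string
{
    $agent = $server['HTTP_USER_AGENT'] ?? '';
    $addr  = $server['REMOTE_ADDR'] ?? '';
    return hash_hmac('sha256', $agent . '|' . $addr, $secret);
}

// Store the fingerprint on first sight, then require a match afterwards.
function sessionMatchesClient(array &$session, array $server, string $secret): bool
{
    $current = clientFingerprint($server, $secret);
    if (!isset($session['fingerprint'])) {
        $session['fingerprint'] = $current;
        return true;
    }
    return hash_equals($session['fingerprint'], $current);
}

// In a page, after session_start():
//   if (!sessionMatchesClient($_SESSION, $_SERVER, $secret)) { /* force a new login */ }
```

This is a layer, not a guarantee: a user-agent is easy to copy, and a legitimate user's address can change behind a proxy, so a mismatch is better handled by asking for a fresh login than by treating it as proof of an attack.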

What happens if someone pretending to be ME sends a request without MY session ID?

Someone can't really pretend to be you and not have your session id. They need that in order for the server to consider them you. If by chance the site re-generates IDs periodically and they manage to get one of your older IDs, then what happens depends on how the regeneration is done. If in the process of re-generating the previous ID has its data file deleted, then the user would just end up starting a new blank session and not be able to impersonate you. If on the other hand the old ID's data file is still intact and still contains your information then they could impersonate you using the old ID.

As such, if you want to re-generate IDs for your users just make sure that you delete the old ID's data file as part of the process.
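
A sketch of periodic regeneration that removes the old data, assuming PHP 7 or later. The `rotated_at` key, the function names and the 300-second interval are hypothetical. The relevant detail is the `true` argument to `session_regenerate_id()`, which deletes the old session's data.

```php
<?php
// Decide whether the session ID is due for rotation.
function rotationDue(array $session, int $now, int $intervalSeconds): bool
{
    $last = $session['rotated_at'] ?? 0;
    return ($now - $last) >= $intervalSeconds;
}

// Call after session_start(). Passing true to session_regenerate_id()
// deletes the old session's data, so the previous ID stops working.
function rotateIfDue(int $intervalSeconds = 300): void
{
    if (rotationDue($_SESSION, time(), $intervalSeconds)) {
        session_regenerate_id(true);
        $_SESSION['rotated_at'] = time();
    }
}
```

Rotating right after a successful login, in addition to the timer, is a common place to call the same function.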

mrwoo · July 12, 2012

The overhead of SSL is mainly just extra CPU usage from all the data having to be encrypted/decrypted by the server at each request, and extra bandwidth costs because the encryption inflates the size some. It doesn't really affect the actual processing of the page, such as your SQL queries and whatnot, as they are not involved with the SSL process. Basically SSL only affects data you echo out, and the incoming request data.

I guess I did not explain that properly. I do understand that SELECT statements are processed server side and are not really in the realm of SSL. Properly stated then, assuming that I have a large data set to return, wouldn't the encryption then put a stress on the process?

Maybe I should ask this as well. I am used to returning data into disconnected recordsets and then displaying that in a grid of some size. There is usually a vertical scroll, and the user just uses a mouse wheel to view the grid. I would put in filtering to get the grid to a smaller size and would include a preference of visible fields through an .ini file or something. But in web applications, would I be safe to assume I am only going to return a sub-set of the resultant data set? So my select statements won't really return all matches, just the first 20 or next 20, etc. In this way the navigation of the data displayed is used to basically fill up one "screen" worth of data. So, in that respect, the data sent from server to client, even over a more intensive mechanism like SSL would only be sending a small segment of the data, thus resource consumption would be minimized.

Did I explain that correctly? Please forgive me, as the limitations of web centric stuff are still new to me.

Pretty much. It is possible to pass the session ID through the URL, and there is even an option in the php.ini file to automatically re-write links and forms to include the ID, but this is generally considered bad practice as it makes it more likely for the user to inadvertently pass their session to someone else (sharing a link) and might cause fixation problems if a search engine stores your site with a particular ID and references that ID in results.

Because of those issues it is generally recommended that you require the user to use cookies to pass the ID, and only accept a session ID from a cookie and not the URL. I believe PHP is configured this way by default these days.

I was considering that since GET shows the ID (I presume it does anyway) I would probably be needing POST more often, just to make the barrier a little harder, and as you say, to help with fixation. What I am also confused about is just how and when the ID is transferred. If the client has no cookie support (turned off) then the ID is put in the GET? And if cookies are supported, then the ID is put in the header? If this is true, then I would need a mechanism to only work with clients with cookies enabled? I realize this might limit who uses my site, but right now it is intended to be used by specific people, so I think forcing them to use cookies would be acceptable.

I guess I am not well versed in how a header works and why I have seen so much mention of the GET having the session ID.

You don't have to do anything with it if you don't want. PHP handles it all behind the scenes when you call the session_start() function. For requests without a session ID it generates a new one and creates a new empty session. For requests that include an ID it will load that session from the temp file matching that ID. If by some chance that file does not exist, it acts as if it's a new empty session.

Oh. So each session is already "verified" that it exists. As long as the session has not been hijacked somehow, then when I pull my session variables, they apply only to the owner of that session. Obviously then if someone hijacked a session ID and posted it from their browser, there is no way to know this, unless I use other methods like hidden fields which the hijacker might not have noticed or checking against user_agent or remote_addr. (I realize remote_addr may not be the best because of proxies and ISPs changing IP frequently)

Someone can't really pretend to be you and not have your session id.

lol, very true point. (I got quite a chuckle out of this)

They need that in order for the server to consider them you. If by chance the site re-generates IDs periodically and they manage to get one of your older IDs, then what happens depends on how the regeneration is done. If in the process of re-generating the previous ID has its data file deleted, then the user would just end up starting a new blank session and not be able to impersonate you. If on the other hand the old ID's data file is still intact and still contains your information then they could impersonate you using the old ID.

As such, if you want to re-generate IDs for your users just make sure that you delete the old ID's data file as part of the process.

Yes, I guess I meant if they had an id from a referrer log, or they were sniffing the packets of my network, but in between the time they crafted a spoofed request and when they sent it, a different ID had been developed.

But it is a good point about old ID remaining. That is why you develop garbage routines then, to make sure old IDs are destroyed?

Thanks a lot for the time to reply. I am slowly getting the larger picture.

mrwoo.

mrwoo · July 12, 2012

So when you call session_start() on a given page, it creates a cookie for the client itself.

If the client then visits another page with session_start(), and the header includes the session ID, then it automatically uses that session?

I ask because I am reading a document about sessions, and they state

If a call is made to session_start( ), and the request contains the PHPSESSID cookie, PHP attempts to find the session file and initialize the associated session variables as discussed in the next section. However, if the identified session file can't be found, session_start( ) creates an empty session file.

This means that you don't have to pass anything to session_start() to indicate that a session ID exists, it is coded into that function to look if one exists?

mrwoo.

ManiacDan · July 12, 2012

Regarding MITM attacks, I am not so overly paranoid about that. I am more concerned that a typical user would log in, then go to some other site (maybe to collect info for something on the form) and they could visit a compromised website, which could, theoretically, misuse their current "logged in" process to my site. I don't know if that is a valid reality, only one that I can imagine based on what I have read.

Not a valid reality, unless there's a very serious security hole in the browser itself. If the browser itself is compromised, your site is just as vulnerable as everyone else's.

1. I don't fully understand yet what I must do in creating pages to make this work. I 'think' all I have to do is make sure everything is not relative, but fully qualified with https:// . I was trying to modify some .conf files and .htaccess files, but honestly they are not the easiest to mess with. I was seeking to enforce a certain directory to always process as https, which would allow me to use relative paths.

Both methods are valid. Note that absolute paths are probably always preferred anyway, so in case you move a page to a different "depth" in your site, all the links still work.

2. I have read about the overhead of SSL.

Kicken addressed this, but I wanted to back him up: Negligible.

Ok, that is a great synopsis. So I am correct that the client must accept cookies, as this is really just a cookie, a temporary one for the browser session only, but still a cookie with nothing but the session ID in it. So the session ID will always be included in POST or GET because of the cookie? True?

All true. You can use php.ini to make the session cookie last longer than the current browsing session, but that's rarely done.

I have a generic idea of how the serialization works. Not important at this point, but I do sort of get what is happening, and why it exists in a file format and how it is called upon later.

The serialization format is not the same as the one for serialize(). It's very similar, but not the same. Nobody knows why, it's annoying and stupid.

What would be a common way to handle if the client duplicated tabs? Is this a request to the server and a reply to render, or is it generated from cache? Would the timestamp field be correct?

The reason why I mentioned tabbing is because if you use $_SESSION['formCheckHash'] to store this form validation hash, what happens if you draw two forms back to back in the same session? There will be two hashes, and only one value. One form will never be valid. I don't bother using form validation like this, it's too much of a pain in the ass. Rely on your other security methods to secure the session as a whole. PHP (and the web in general) is stateless. There is no "flow". Don't try to arbitrarily enforce one, it ends with this kind of unsolvable problem.

As for the whole "what is the session ID" question, which is a few paragraphs:

You don't ever have to know what the session ID is. Until you start writing your own session handler or changing fundamental behaviors of the session in general, you don't care about the session ID. You don't have to validate it, you don't have to pass it anywhere, you don't have to manually set it, you just trust that session_start() always establishes a session, using an existing one if available. Sessions are maintained by cookies, so while technically "they are passed with GET or POST" is true, it's actually part of the communication header itself, and is processed by the web server. The session comes through on all communication from the browser, including ajax calls.

Then if someone comes in pretending to be ME, but does not have my session ID, but maybe an older session ID, like from a bookmark or maybe from an old referrer log somewhere - this session ID might not exist or just not have "logged in = 1" so they would be thrown over to the login screen?

Yes, the sessions are automatically destroyed by the OS and/or by PHP, so if you close your browser, and someone else uses your session ID two hours later, it's possible the session file was destroyed. However, session cookies are never stored on your hard drive, so it would be difficult, if not impossible, for someone to hijack them like that.

But in web applications, would I be safe to assume I am only going to return a sub-set of the resultant data set? So my select statements won't really return all matches, just the first 20 or next 20, etc

Almost always. Some database technologies make pagination really unreasonably difficult, but for the most part as long as you use LIMIT clauses, you're fine. SSL has nothing to do with it.
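
A sketch of that kind of LIMIT-based paging with PDO, assuming PHP 7 or later. The `items` table, its `id` and `name` columns, and the function name are hypothetical; `$page` counts from 1.

```php
<?php
// Fetch one "screen" of rows from a hypothetical items table.
function fetchPage(PDO $pdo, int $page, int $perPage = 20): array
{
    $offset = max(0, $page - 1) * $perPage;

    $stmt = $pdo->prepare('SELECT id, name FROM items ORDER BY id LIMIT :limit OFFSET :offset');
    // Bind as integers: a quoted string is not valid in a MySQL LIMIT clause.
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

With the default page size, fetchPage($pdo, 2) returns rows 21 through 40, so only one screen of data crosses the wire per request.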

I was considering that since GET shows the ID (I presume it does anyway) I would probably be needing POST more often, just to make the barrier a little harder, and as you say, to help with fixation. What I am also confused about is just how and when the ID is transferred.

GET does not show the session ID. Anything you see with ?sessionid= in the URL is 5+ years old and should be disregarded. It's no longer a recommended method. The session ID is included in the communications headers alongside the message, it is not part of the message itself. Use POST for POSTing data, and GET for links and GETting data.

Yes, I guess I meant if they had an id from a referrer log, or they were sniffing the packets of my network, but in between the time they crafted a spoofed request and when they sent it, a different ID had been developed.

This is the most commonly invented "solution" to man in the middle attacks, but it doesn't think about the asynchronous nature of the web. What if you make a page which contains ajax calls, and the user makes two simultaneous ajax calls AND submits a form at the same time? You'd log them out.

But it is a good point about old ID remaining. That is why you develop garbage routines then, to make sure old IDs are destroyed?

Session garbage collection is built into PHP and you don't have to worry about it. If you move your session directory back to /tmp, then the OS will also clean up old sessions.

This means that you don't have to pass anything to session_start() to indicate that a session ID exists, it is coded into that function to look if one exists?

Yes

xyph · July 12, 2012

HOLY CRAP THIS IS A WALL OF A POST.

I reiterate ManiacDan... this is well written, and to the point. Thank you!

Now, using SSL pretty much solves 99% of your security issues.

You can automatically redirect http:// requests to https:// requests through Apache

http://wiki.apache.org/httpd/RedirectSSL

You can enforce this behaviour in your PHP application directly by checking $_SERVER['HTTPS']. As a backup or alternative, you can also check if $_SERVER['SERVER_PORT'] == 443, or whichever port you've defined.
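
Something along these lines would do the check. This is only a sketch: the function names are made up, and it assumes the host name the browser sent (HTTP_HOST) is one you are happy to redirect to.

```php
<?php
// Returns true when a $_SERVER-style array describes an HTTPS request.
function isHttpsRequest(array $server, $securePort = 443)
{
    // Some servers set HTTPS to "off" for plain requests, so check the value too.
    if (!empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off') {
        return true;
    }
    // Fallback: compare the port against the one you serve SSL on.
    return isset($server['SERVER_PORT'])
        && (int) $server['SERVER_PORT'] === (int) $securePort;
}

// Hypothetical helper: builds the https:// URL for the same page.
function httpsUrlFor(array $server)
{
    return 'https://' . $server['HTTP_HOST'] . $server['REQUEST_URI'];
}

// Typical use at the top of a page, before any output:
// if (!isHttpsRequest($_SERVER)) {
//     header('Location: ' . httpsUrlFor($_SERVER), true, 301);
//     exit;
// }
```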

Now, I'm a little paranoid, and because of that, I dislike PHP's default session handler... mostly the way it generates IDs. It's considered pretty weak from a security standpoint. This can be changed by forcing it to use /dev/urandom in php.ini

session.entropy_file = /dev/urandom
session.entropy_length = 512

Personally, I much prefer designing my own custom session handler, and moving session data over to the DB for easier management. It's fairly straightforward to do, and worth it if you like the extra control. There's tons of information about it as well (bad and good, you'll have to use a little logic to help pull only the good stuff from the bad articles... as usual).

As far as default session behaviour goes, it's designed to be stupid easy to use. You just call session_start(), and if the user has a valid cookie, your $_SESSION superglobal array is automatically populated. Any changes you make within the script will be applied to the file after execution. Garbage collection is done for you, and the settings can be changed via php.ini session.gc_probability, session.gc_divisor, session.gc_maxlifetime. It's designed to 'just work.'
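
To put a number on those settings: as I understand it, the chance that any one session_start() call kicks off garbage collection is gc_probability divided by gc_divisor. A small sketch (function names are mine, not PHP's):

```php
<?php
// Chance that a single session_start() call triggers garbage collection.
function gcChancePerRequest($probability, $divisor)
{
    $divisor = (int) $divisor;
    if ($divisor <= 0) {
        return 0.0; // this sketch treats a nonsensical divisor as "never"
    }
    // Clamp to the 0..1 range and always hand back a float.
    return (float) min(1.0, max(0, (int) $probability) / $divisor);
}

// Reads the live settings; ini_get() returns strings, hence the casts.
function currentGcSettings()
{
    return array(
        'chance' => gcChancePerRequest(
            ini_get('session.gc_probability'),
            ini_get('session.gc_divisor')
        ),
        'maxlifetime_seconds' => (int) ini_get('session.gc_maxlifetime'),
    );
}
```

With a probability of 1 and a divisor of 100, roughly one request in a hundred does the cleanup, and only session files older than gc_maxlifetime seconds are eligible.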

About SSL and stress... it's pretty simple. If you are getting bottlenecked at the CPU level (huge traffic), throw more hardware at it. If you're getting bottlenecked at the network level, buy a bigger pipe. Even when sending large amounts of data, the user probably won't notice the slowdown caused by SSL unless they're using a 56k modem. The initial handshake can take longer than the actual data transfer, and sending the data in one large chunk is more efficient than several smaller chunks. Again though, unless you're pushing your CPU/pipe to its max already, and your users are on <500kbit connections, the overhead from SSL is pretty much unnoticeable.

As an addition, you may want to check out Suhosin - http://www.hardened-php.net/suhosin/ It's designed to lock-down your PHP install by default, making potentially dangerous behaviour impossible (though this sometimes screws with scripts that safely perform this potentially dangerous behaviour).

Hope this adds to what's been said so far. Tons of great information in this thread!

mrwoo · July 12, 2012

Ah. Those remarks clear a lot up. Well, not so much exactly the underlying "how" so much as that it is all handled, and I really don't need to ensure any integrity on my part for much of it.

GET does not show the session ID. Anything you see with ?sessionid= in the URL is 5+ years old and should be disregarded. It's no longer a recommended method. The session ID is included in the communications headers alongside the message, it is not part of the message itself. Use POST for POSTing data, and GET for links and GETing data.

I am still confused about this part. Why do so many tutorials and examples show the evil GET with something like ?PHPSESSID=12345, and say that it is bad because, since it is visible text, anyone can just use it in their own request? Am I just reading old material?

And if I understand this correctly, you are saying that now the ID is part of the request or header, visible from globals like the $_SERVER global would show the user_agent or $_POST global some assigned variable, and that it is not in the GET or POST at all. It is very possible I was confused about that in the first place. I should know better than to try and learn html/css/php/ajax/jscript/jquery and server configs and methods all at the same time. Seems like one question leads into so many different directions with this stuff. There is a LOT more to it than just getting a server and creating content, if you want to do it correctly.

I have looked at more topics than I care to remember, many of them gone except for the link I saved. In wandering around, I was trying to find a good article that, without getting into the technical specs, would give a good explanation of the various steps involved with the web in general. Such as, what is a header, how is it created, what data is there that is not anywhere else. Those sort of things, the underlying foundation that would be really helpful to comprehend, just for questions like I am having. However, it seems all the sources I've looked at are either very technical or very generic. In fact, most things I seem to find are just repeats of similar articles.

Do you know of any good articles/resources for someone like myself, who is maybe a little more detail oriented without having to search through full blown specs of the model?

And again, I am most thankful for the time taken here to help me understand.

mrwoo.

xyph · July 12, 2012

Because a lot of tutorials are out of date, or flat out wrong. The issue isn't necessarily that it's visible (unless you're worried about reading-over-the-shoulder attacks), it's how easy a user might accidentally copy and paste a link with their session ID in it. When you bury the ID in a place the average user doesn't see (the headers), it makes it much more difficult for them to accidentally give it away.

POST data travels in the same request, in the body that follows the headers. There's lots of information about HTTP headers, what they contain, and how they work. Overall, it's a pretty straight-forward concept. It's the 'metadata' of the request, information that both the client and server software use and implement, but the end user doesn't really care about. That's an over-simplification (I think), but it gives the general idea. Start off learning the basics of the HTTP protocol... it's information that will help you understand everything you learn further down the line.

Here's a good article on HTTP headers.

http://net.tutsplus.com/tutorials/other/http-headers-for-dummies/

I skimmed through it, and most of it seemed accurate and to the point. It might be a little over-simplified, but unless you plan on manually crafting HTTP requests or responses, that's not that big of a deal.

Generally, think of a session ID as a temporary password for that user. We use that in place of a password because it's mutable. Even if compromised, the second the session ID expires, the attacker no longer has access to the information protected by that temporary ID.

Also worth checking out is this directive - http://php.net/session.cookie-secure It forces cookies to only be sent over a secure connection. This can muck things up though, if you want sessions to persist between http and https connections. In this case, a custom session handler would be ideal, that can handle multiple session instances (one session for secure data only, and another session for insecure data)
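
For what it's worth, here is a sketch of setting that directive (plus two related ones) from code rather than php.ini. The function names are invented; the three ini directives are real, and they have to be set before session_start() and before any output.

```php
<?php
// Returns the session cookie ini settings this sketch would apply.
function sessionCookieSettings($httpsOnly)
{
    return array(
        // Only send the session cookie over HTTPS when asked to.
        'session.cookie_secure'    => $httpsOnly ? '1' : '0',
        // Keep the cookie out of reach of JavaScript.
        'session.cookie_httponly'  => '1',
        // Ignore session IDs passed in the URL.
        'session.use_only_cookies' => '1',
    );
}

// Applies the settings, then starts the session.
function startSessionWithCookieSettings(array $settings)
{
    foreach ($settings as $name => $value) {
        ini_set($name, $value);
    }
    return session_start();
}

// Typical use at the very top of a page served over SSL:
// startSessionWithCookieSettings(sessionCookieSettings(true));
```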

mrwoo · July 12, 2012

HOLY CRAP THIS IS A WALL OF A POST.

I reiterate ManiacDan... this is well written, and to the point. Thank you!

You are most welcome. Wow. One of the very few places I have ever been where I am complimented on the detail I give. lol. Most people don't even reply to such detailed posts, probably because they are so long. But I digress..

Now, using SSL pretty much solves 99% of your security issues.

You can automatically redirect http:// requests to https:// requests through Apache

http://wiki.apache.org/httpd/RedirectSSL

Yes, I was assuming so. I have looked at many of the apache config articles, and many of the .htaccess/mod_rewrite articles. I understand them to a degree, but wow it is a lot to digest at one time. My first thought, after understanding more intimately how sessions are working and what I need to be aware of to handle them correctly, was to create a simple test page and try out some redirects to force HTTPS. Oh, and some routines to force cookies as well.

You can enforce this behaviour in your PHP application directly by checking $_SERVER['HTTPS']. As a backup or alternative, you can also check if $_SERVER['SERVER_PORT'] == 443, or whichever port you've defined.

I have seen those many times, but did not fully understand their use. Funny how a couple sentences can take abstract ideas and make them much more concrete. Thanks.

Now, I'm a little paranoid....<snip>

Hmm. Was not aware of much of this. I have seen reference to the ID needing to be x bytes large, I think in an OWASP article.

Garbage collection is done for you, and the settings can be changed via php.ini session.gc_probability, session.gc_divisor, session.gc_maxlifetime. It's designed to 'just work.'

ManiacDan mentioned setting the directory back to /tmp for the session files. I do know there are settings for the /tmp file to be cleaned out, which is why I assume he mentioned this. Now you mention garbage collection is done automatically. Let me ask, is this only true for the /tmp directory, or are you referencing areas of memory where the arrays etc are kept. I am at a loss I guess now for just what gets cleaned up, and what I have to do to make sure it is cleaned up. I think I understand the importance, from a security standpoint, of why it needs to be cleaned up. I had read as well that it was advised to change the /tmp directory because it was a possible breach point.

Honestly, so many articles are available, but you don't know just how reliable any of them are. At least in a forum you are highly likely to get people answering from experience, and others to verify such answers. It just takes more time in a forum.

About SSL and stress... <snip>

Points duly noted. Thank you.

Regarding Suhosin, I have seen mention of that quite a bit. However, I am going to have to wait on that - just too much off the beaten track right now I think. Heck the .htaccess stuff is going down a rabbit trail away from my focus enough already lol.

Hope this adds to what's been said so far. Tons of great information in this thread!

Absolutely! I'm not too proud to admit I know jack. School of hard knocks has taught me enough already, might as well glean what I can off those who don't mind sharing.

mrwoo.

ManiacDan · July 12, 2012

I am still confused about this part. Why do so many tutorials and examples show the evil GET with something like ?PHPSESSID=12345, and say that it is bad because, since it is visible text, anyone can just use it in their own request? Am I just reading old material?

PHP is a young language. Tutorials which are 5-6 years old are no longer valid, yet they still exist because that's really not that long ago. Using the session in the URL is wrong. Stick to books and tutorials written since the release of PHP 5.3.

And if I understand this correctly, you are saying that now the ID is part of the request or header, visible from globals like the $_SERVER global would show the user_agent or $_POST global some assigned variable, and that it is not in the GET or POST at all.

It's part of the request header, visible from $_COOKIE since it's a cookie. I don't believe it's in $_SERVER, that's another part of the header. It is not in GET or POST
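
A tiny sketch of what that lookup amounts to. The helper name is made up; PHPSESSID is PHP's default session cookie name, and session_name() returns whatever name is actually configured.

```php
<?php
// Looks up the session ID in a $_COOKIE-style array.
// Returns null when the browser sent no session cookie.
function sessionIdFromCookies(array $cookies, $cookieName = 'PHPSESSID')
{
    return isset($cookies[$cookieName]) ? (string) $cookies[$cookieName] : null;
}

// In a real page you would pass the live values:
// $id = sessionIdFromCookies($_COOKIE, session_name());
```

You rarely need to do this by hand, since session_start() performs the same lookup for you.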

I should know better than to try and learn html/css/php/ajax/jscript/jquery and server configs and methods on all at the same time.

Welcome to web programming. 5 distinct languages, 10 technologies, and an entire theory of programming. Good luck.

Such as, what is a header, how is it created, what data is there that is not anywhere else. Those sort of things, the underlying foundation that would be really helpful to comprehend, just for questions like I am having. However, seems all the source I've looked at is either very technical or very generic. In fact, most things I seem to find are just repeats of similar articles.

The problem you're having is twofold:

1) Most of what you're saying isn't part of "web programming" at all, it's part of "internet communications." The REMOTE_ADDR that you see in PHP's $_SERVER array is not special to PHP, apache, or the web. It comes from the TCP/IP connection itself. Cookies, on the other hand, live in the HTTP request header, one layer up, and are unique to web development.

2) It's difficult to answer such a deep question without getting too technical. To give you half the answer is disingenuous, because then you end up thinking you have the whole answer when you don't have much at all.

Do you know of any good articles/resources for someone like myself, who is maybe a little more detail oriented without having to search through full blown specs of the model?

Everything is built in layers. Honestly, if you're that interested, start at the OSI Model. That's the actual "stack" that governs internet communication as a whole. Understanding that allows you to realize that Apache and PHP sit at the uppermost layer (layer 7, application). Once you know how internet communication works and how actual messages are constructed and what constitutes a "connection" (which is really just random packets), then you understand why client-server programming is stateless (and it's silly to try to maintain form post state, like we discussed before).

A good top-level summary of PHP is:

0) The user clicks a link, which instructs his computer to send a GET (or POST) request to your server. The browser formulates the base request headers.

1) Other things (apache, the OS, the network card itself) manage communications speeds, getting data to and from the client, and handling communication metadata (IP address, user-agent, cookies, forwarding headers, proxying, cache control, and dozens of other pieces of information that come through with every request). POST data comes through in the same request, in the body right after the headers.

2) PHP receives a request and $_SERVER, $_COOKIE, $_GET, and $_POST already exist. PHP scripts are, for the most part, safe to assume everything in those arrays is correct (correct, not valid).

3) PHP processes its data, performs queries, reads files, etc. While the session is unique to each user, and $_POST and $_GET (as well as all other script variables) are unique to each request, the files and database are unique only to each server. It's confusing, and must be thought through extensively before you get good at it.

4) PHP spits out (usually) an HTML formatted string, which may or may not contain references to other documents, CSS, JavaScript, etc.

5) That message is transmitted to the user, along with all the header information relevant to that request. The headers that the server sends include the content type, cache control, location headers, status codes, redirects, cookies, and more. This is the HTTP response header, not the HTML <head> tag. Do not get them confused.

6) The user receives the packets from the server and constructs them into a displayable message, rendering the HTML and executing the javascript on the user's end. Javascript is executed on the user's computer and has no access to anything but (usually) the current page contents. If any additional documents were linked to (like <img> or <script> tags), they are fetched as well.

7) A request has been completed, from click to display.
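
Steps 2 through 5 above can be sketched roughly like this. It is a toy, not a framework: both function names and the name parameter are invented for illustration.

```php
<?php
// Steps 2-4: take the already-populated request data and build a response.
function handleRequest(array $get)
{
    // Never trust input: fall back to a default unless we got a plain string.
    $name = (isset($get['name']) && is_string($get['name'])) ? $get['name'] : 'guest';
    return array(
        // Response headers, sent separately from the HTML itself.
        'headers' => array('Content-Type: text/html; charset=UTF-8'),
        // Escape user input before it goes into the HTML.
        'body' => '<p>Hello, ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</p>',
    );
}

// Step 5: hand the headers and the body back to the web server.
function sendResponse(array $response)
{
    foreach ($response['headers'] as $line) {
        header($line); // must happen before any output
    }
    echo $response['body'];
}

// A real page would end with:
// sendResponse(handleRequest($_GET));
```

Keeping the headers and the body as separate pieces mirrors the point in step 5: they travel together, but they are not the same thing.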

ManiacDan mentioned setting the directory back to /tmp for the session files. I do know there are settings for the /tmp file to be cleaned out, which is why I assume he mentioned this. Now you mention garbage collection is done automatically. Let me ask, is this only true for the /tmp directory, or are you referencing areas of memory where the arrays etc are kept.

PHP clears out memory entirely when your script dies. Each PHP "program" runs for less than a second apiece (if you do it right). PHP also clears out physical session files periodically based on the php.ini settings. Additionally, your operating system clears out /tmp periodically based on file age, since that's what /tmp is for. You are not responsible for garbage collection on any level. The only exception to this is if you store garbage in your database on purpose (like if you store sessions in the database) or if your scripts are using so much memory that they need to be cleaned mid-execution.

xyph · July 12, 2012

The nice part about Suhosin is that you don't really need to know anything about it. You just install, and it gives you the 'default best.' It's no replacement for a good sysadmin with a black-belt in PHP, but overall it's considered 'good enough.'

On session IDs - The larger the bit-size, the more entropy, generally speaking. I think it's just as important to consider the source of data though... pseudo-random data is NOT cryptographically strong, and that's what PHP uses by default for SID generation. For most though, it's 'good enough.' For how easy it is to use /dev/urandom as a source (one line in php.ini), I don't see why you wouldn't (assuming access to /dev/urandom).

Session files contain sensitive information. With the Suhosin patch, these files are transparently encrypted for you, so it's less of an issue with that installed. Because they're sensitive, I see /tmp/ as a bad place for them. It's not a magic place to put anything that needs to be deleted later. /tmp/ should have pretty universal access, and that's why I don't really like it for session data storage. I'm not an experienced *nix system admin though, so I might get corrected here. PHP has its own garbage collection system though, and how often it cleans up can be changed using the settings I posted above. The php.ini file describes what each does via comments, and it's also posted in the manual.

Sorry I'm not quoting properly, or grammar, or ideas kinda being everywhere. I'm slapping this together while trying to get some work done.

mrwoo · July 12, 2012

Hmm. I will have to look into Suhosin a bit more. If it is that easy, it might not take a lot to absorb the initial aspects anyway.

I will inspect the php.ini file a bit on those values. Certainly can't hurt to shove a little more knowledge in the old melon.

I am no *nix guru either, maybe a windows near-guru, but that is altogether different.

Thanks for taking the time. That link on HTTP headers was good, I learned a lot from that. Sometimes learning makes one a little embarrassed about earlier questions, but that is how you learn I guess.

Thanks again.

mrwoo.
