Monday, April 4, 2011

An Introduction to Cookies



You might have heard about cookies, but what exactly are they and what can we actually do with them? In this tutorial, we will focus on the basics of cookies, and learn about their functionality in various web application and site environments. We will also learn how to use them within our PHP and JavaScript projects, while paying particular attention to security issues that might arise when using them. After reading this, you’ll have acquired the necessary skill set to implement cookies safely into your own web applications.
Step 1. Understanding Cookies
The first step on our journey is to discovered just what these cookies actually are! Even if you have already worked with them, you might still find this part quite useful – so stay with me!
Abstract
You can most easily think of cookies as text files, which are saved to your computer. On the request of a webserver, your browser creates such a file. After this happens, the webserver can read and write content from and to this file. Although this seems like a dangerous feature – after all, no one likes other people writing files to their computer, there are a few restrictions in place to make this process as safe as possible.

  • Web servers can only access cookies which are set to their own domain. This domain is set by the browser when a new cookie is requested by the web server, and can only be the domain or a subdomain of the web server (the web server can choose a subdomain if it wants to). This means that cookies which were set by, for example, google.com can’t be read by mozilla.com, and vice versa.
  • According to the HTTP protocol, cookies can’t be larger than 4096 Bytes (4KB) each.
  • There is a limit to the number of cookies per domain. The number differs per browser, however, the generally used limit is twenty cookies. This is to prevent a single domain from hogging the disk space of the client.
  • There is a limit to the total number of cookies on the client’s harddrive. This number also differs per browser, but is usually limited to around three hundred cookies. When this number is exceeded, an older cookie is deleted before a new one is created.
Cookies have an expiration date. This date is set so the browser can delete old cookies when they are no longer needed by the webserver. If the expiration date is empty, the cookie will be deleted when the connection with the server is closed. This occurs when the site’s window or tab is closed by the user, or when the user closes the entire browser. These cookies, sometimes called session cookies, are mostly used for storing temporary settings.
Technical
Let’s find out what these things look like on a technical level. Cookies are transfered via the HTTP protocol. This is the protocol used by browsers to retrieve and send files to the server. After a cookie has been requested, it is sent to the server every time a new item on the web page is fetched by the browser. Below, we can see a snippet of a server requesting a new cookie (this snippet is a part of a HTTP response).
Set-Cookie: Name=content data; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.example.net

Now don’t get scared, it’s all very understandable!
  • Set-Cookie: is to let the browser know that the server would like to create a new cookie.
  • Name is the name of the cookie. Each cookie in a domain must have a different name, so the browser can keep all the cookies apart. After the name comes the =content data where ‘content data’ is the data which is to be contained in the cookie. This data can be a text string or a number, and, as said, can be up to 4KB in size.
  • expires= is the command for the expiration date. The expiration date is in the “Wdy, DD-Mon-YYYY HH:MMS GMT” format (Don’t ask me why it was defined to this rediculous format, because I don’t know either. No user ever sees the expiration date, so why waste memory, hard disc space, and bandwith on long dates?). Don’t worry about it though, because most programming languages have easy to use functions available to you. The browser automatically deletes cookies with an expiration date in the past.
  • The domain and path require some deeper explanation. The domain is the domain in which the cookie will be active. If the domain is ‘ads.google.com,’ the cookie will only be sent to the server of that domain, and if the domain is ‘google.com,’ the cookie will be sent to any server of any of the subdomains of Google, including google.com itself.
  • The path is the path of the domain to which the cookie is sent. This means that, if the path is set to ‘/images/,’ and the domain is set to ‘ads.google.com,’ the cookie will only be sent to the server if the browser requests a file from ‘ads.google.com/images/’. If the path is set to ‘/’, the cookie will be sent to the server regardless of the location of the requested file on the server.
In the next step, we’ll review how these properties can be used in programming languages.

Step 2. How to Create and Read Cookies
Cookies can be created in many ways, but, for the purposes of this tutorial, we will focus on PHP and JavaScript.
PHP
The most important thing to remember, when creating a cookie in PHP, is that you must set all cookies before you send any data to the browser. This means that you should always initialise new cookies before any output. This includes echo() or print() commands, and the or tags. Of course, there are some exceptions, but this is a general rule of thumb.

<?php
/***Creating a cookie***/
$name = 'clientname';
$value = 'Peter Griffin';
//time() gives current time in seconds, and we add 60 seconds * 30 = 30 minutes
//so this cookie expires in 30 minutes.
//You may notice that the expire date is in seconds, PHP translates this to
//the correct format internally!
$expireDate = time() + 60 * 30;
$path = '/example/';
$domain = 'test.envato.com';
$secure = false; //only transmit the cookie if a HTTPS connection is established
$httponly = true; //make cookie available only for the HTTP protocol (and not for JavaScript)
setcookie( $name, $value, $expireDate, $path, $domain, $secure, $httponly);

<html>
.... //all content etc goes here
?>


This should seem familiar by now, except for $secure and $httponly. The ‘secure’ is to force the cookie to only be sent if an HTTPS connection has been established, if set to true, and should normally be set to false. The ‘httponly’ makes the cookie only available through the HTTP protocol, meaning that client-side languages, like JavaScript and VBscript, can’t access the cookie. This helps to prevent nasty stuff, such as Cross Site Scripting, and should be set to true if you have no intentions of editing the cookies client-sided with a language like JavaScript. Also, to prevent misconceptions, “httponly” does not mean that cookies can’t be sent over HTTPS, because they still can, in fact. However, please do note that the above snippet can be made quite smaller (and should be):


<?php
setcookie( 'clientname', 'Peter Griffin', time()+60*30, '/example/', 'test.envato.com', false,true);
?>

Great! Now we can create cookies, but we have to be able to read them as well. Luckily for us, PHP makes this very easy once a cookie has already been created. In PHP, there’s an environment variable called $_COOKIE[], which can be used for extracting the value of the cookie. To use it, just insert the name of the cookie inside the brackets [] like so:
view plaincopy to clipboardprint?

<?php
$cookieValue = $_COOKIE['name of the cookie'];
?>

This environtment variable can be used like any other. Just like $_GET[] and $_POST[], it can be treated directly as a normal variable (once you have checked if the cookie does indeed exist ofcourse) if you want to.
JavaScript
Cookies can be read and written client-sided as well. Even though JavaScript doesn’t offer a nice solution to read and write cookies, it is possible and widely used. JavaScript uses the **************** object for cookie manipulation, as shown in the following snippet:
//get current date 
var expiredate = new Date();
//increase date by 5 hours
expiredate.setHours( expiredate.getHours() + 5);
**************** = 'cookiename=cookievalue; expires=' + expiredate.toUTCString() + 'path=/example/; domain=test.envato.com';

As you might have noticed, this syntax is quite similar to the HTTP protocol notation. This has the advantage of being more in control, but also introduces some potential problems. Below is the snippet for reading a cookie.

var cookieName = 'testcookiename'; 
var textArray = ****************.split(';'); //put all the parts of the string in an array
for(var i = 0; i < textArray.length; i++){ // loop though all string pieces var textPiece = textArray[i]; //contains 1 string piece //filter beginning spaces while(textPiece(0)==' ') textPiece = textPiece.substring(1,textPiece.length); //if the textpiece contains our cookies name if (textPiece.indexOf(cookieName)== 0){ //return whats after the cookies name return textPiece.substring(cookieName.length,c.length); } }

I know, I know; this is a pain. Luckily for you guys, I’m posting some pre-written functions below (you might want to make your own functions for learning purposes though, and you should!).
Please do bear in mind that these snippets don’t contain any error checking.

Step 3. What to do with Cookies
Quote:
Did you know? -
Cookies were invented by Netscape, which wanted to use them for creating a shopping cart for an online shop. Thanks to cookies people were able to keep items their cart, even after disconnecting from the shop.
Nowadays, we use cookies for almost every purpose you can think of. You can use them for saving user settings like name, language, location or screen size. This can improve the quality of the service you want to provide for a client, because you can optimize the service for a client and remember this optimization in the future. For instance, you could save the client’s preferred language to a cookie, and, afterward, show your site’s content in the preferred language every time the client visits your site.
Of course, there are plenty more fun things to do with cookies than this! In the next step, I’ll show you an example of a cool code snippet.

Step 4. Writing Cool Stuff
Finally! Now we can start writing some awesome code! Below is a bonus snippet, which uses cookies to create a relogin mechanism.
“Remember me” login snippet
Before we begin, this snippet contains some MySQL code. If you’re not familiar with MySQL, don’t panic. Even though this snippet is a bit difficult, it should be understandable with a bit of basic PHP and cookie knowledge.
To create a “remember me” implementation, we must have a few things. Firstly, we need a database table containing a username, password, and identification field. Secondly, we need a unique string or number to identify clients safely through cookies (this is the identification in the database table). In this snippet, we’ll use an SHA-1 digest, which is just a string, as an identifier. When used properly, this provides excellent security.
Quote:
Most people just insert a username and password in the cookie, and send it to the server to automatically. This should be avoided at all times! Cookies are usually sent through a non-secure connection, so the content could easily be seen by any potential attackers.


<?php

//this assumes that the user has just logged in
/****Creating an identification string****/

$username; //normally the username would be known after login

//create a digest from two random values and the username
$digest = sha1(strval(rand(0,microtime(true)) + $username + strval(microtime(true));

//save to database (assuming connection is already made)
mysql_query('UPDATE users SET reloginDigest="'.$digest.'" WHERE username="'.$username.'"');

//set the cookie
setcookie( 'reloginID', $digest, time()+60*60*24*7,'/', 'test.example.com', false, true);

//this assumes that the user is logged out and cookie is set
/****Verifying users through the cookie****/

$digest = $_COOKIE['reloginID'];
$digest = mysql_real_escape_string($digest); //filter any malicious content

//check database for digest
$result = mysql_query('SELECT username FROM users WHERE reloginDigest="'.$digest.'"');
//check if a digest was found
if(mysql_num_rows($result) == 1){
$userdata = mysql_fetch_object($result);
$username = $userdata->username;

//here you should set a new digest for the next relogin using the above code!

echo 'You have successfully logged in, '.$username;

} else{
//digest didn't exist (or more of the same digests were found, but that's not going to happen)
echo "failed to login!";
}

?>

By using a digest like we did, the chances of getting two of the same digest is miniscule. A digest is a forty character string, which, in theory, should always provide a complete random ouput if the input is changed. In practice, you should add a time limit in the serverside code, so that the digest isn’t valid after X minutes. This prevents attackers from copying someone’s cookies and using them to login.
Step 5. Best Practices
We’ve almost reached the end of this tutorial. As a conclusion I would like to sum up some best practices:
  • Never insert sensitive data into a cookie. A client could be browsing on a public computer, so don’t leave any personal information behind.
  • Never trust data coming from cookies. Always filter strings and numbers! An attacker could write malicious data to the cookie in order to do something you don’t want your service to do.
  • Try to estimate how long the cookie should be valid, and set the expiration date accordingly. You don’t want to hog the client’s computer with old cookies which are set to expire in a hundred years.
  • Always set the secure and httponly to meet your application demands. If your application doesn’t edit the cookies with JavaScript, enable httponly. If you always have an HTTPS connection, enable secure. This improves the data’s integrity and confidentiality.

0 comments:

Post a Comment

Share

Twitter Delicious Facebook Digg Stumbleupon Favorites More