A study of Session State

This month we examine one of the most common problems in designing a web site that offers a little more than static pages. Whilst many of the examples here are for Microsoft technologies, the principles apply to any web server. The problem is one of keeping track of variables between web pages. So what does that mean in English? Imagine three web pages, which we will imaginatively call page1.html, page2.html and page3.html.

If you have a variable on page1.html, say the user's input into a text box on a form, then when that form is submitted and loads page2.html, this page can obviously use the data from the form on page1.html. However, if the user now moves to page3.html, this page will know nothing about the previously entered information. So the business of keeping track of variables through a web site can be tricky. The most common occurrence of this is keeping track of the user's id in an e-commerce site so that a shopping basket can be maintained. If the web site loses track of who the user is then their shopping basket cannot be displayed, as this is usually fed from a database with each item marked with the user id.

The easy way to lose information

There are several ways to maintain variable information between web pages, each with its benefits and each suited to different web scenarios. The first and most obvious is to use cookies. These are small text files that the web server sends out and the browser stores on the local hard disc. Many people are paranoid about these and will set their browser to refuse cookies. The Session Object that Microsoft encourages developers to use in Active Server Pages makes storing these all-important variables easy, but it uses a cookie to perform the task. There are also some inherent limitations to cookies themselves which could restrict certain applications. RFC (Request For Comments) 2109 (http://www.rfc-editor.org/rfcsearch.html), the document that defines how cookies should be implemented on the internet, sets out the minimum that a user agent should support, and a site cannot safely rely on more. Firstly, a browser need only handle 4096 bytes per cookie. Secondly, it need only accept 20 cookies from a single domain. And thirdly, it need only keep 300 cookies in total from all the web sites visited. A quick check on Mark's machine showed nearly 700 cookies, so Microsoft evidently goes well beyond that minimum, but you cannot count on every browser doing so. Mark is currently developing an e-commerce site where the client has stated that they do not want cookies used, not the first time this has been requested and achieved. So cookies are out; what else could you use?
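To make the cookie mechanics a little more concrete, here is a minimal sketch in Python rather than ASP, using only the standard library. The function names and the userid value are our own inventions for illustration, and the size check is a rough approximation of the 4096-byte figure from the RFC, not an exact implementation of its wording.

```python
from http.cookies import SimpleCookie

# Per-cookie size from RFC 2109; treated here as the most we should rely on.
MAX_COOKIE_BYTES = 4096


def make_set_cookie_header(name, value):
    """Build the header line a web server sends to ask the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"  # send the cookie back for every page on the site
    return cookie.output()


def read_cookie(cookie_header, name):
    """Pick one value out of the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def fits_in_cookie(name, value):
    """Rough check that a name=value pair stays within the 4096-byte figure."""
    return len(f"{name}={value}".encode("utf-8")) <= MAX_COOKIE_BYTES


if __name__ == "__main__":
    print(make_set_cookie_header("userid", "1234"))
    print(read_cookie("userid=1234", "userid"))
```

A user id fits easily, but anything the size of a whole shopping basket would need the size check before being trusted to a cookie.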

Request For Comments 2109

Obviously the ideal solution for an internet-facing web site has to work on a variety of browsers. A solution presents itself in the URL of the page. A normal page request is something along the lines of http://www.mysite.com/firstpage.html. However, we can include extra information in the URL, which the web server will do nothing with except pass it on to the page. So if we wanted the next page to know the userid that the first page generated, we could change the link on the first page to read http://www.mysite.com/secondpage.html?userid=1234, and if we had more than one value to pass then we could use an ampersand to add to the URL. So in this case we could have http://www.mysite.com/secondpage.html?userid=1234&username=Fred. In fact this is just what an HTML form on a page does if it uses the 'GET' method of sending data to the next page.
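Building such a URL by hand is simple enough, but values containing spaces or ampersands need encoding. As a minimal sketch, written in Python for neutrality, the helper below (a name of our own choosing) appends variables to a page address; the target page secondpage.html is assumed for illustration.

```python
from urllib.parse import urlencode


def link_with_variables(page_url, variables):
    """Append variables to a page URL as a query string: ?name=value&name=value."""
    return page_url + "?" + urlencode(variables)


if __name__ == "__main__":
    # The userid and username values are made up for the example.
    print(link_with_variables(
        "http://www.mysite.com/secondpage.html",
        {"userid": "1234", "username": "Fred"},
    ))
```

Note that urlencode also takes care of awkward characters, so a username of Fred Smith would travel as Fred+Smith.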

The syntax for this is something like <form action='secondpage.html' method='GET'>. When you use a form, the values from all the form elements, including the hidden ones, are returned in the URL string in the format just described. You have probably seen these enormous and cryptic URLs whilst browsing and, if you are that way inclined, you can alter these URLs manually and see what happens to the resulting pages. As this information is very visible you should obviously not send confidential data this way, but it certainly is an easy way of passing variables between pages.

To read these variables on an ASP page you simply use request.querystring("userid"). You don't have to use forms, but can code the variables into a normal HREF as we have described above. If, however, you are using forms and change the method in the FORM tag to 'POST' then the variables do not appear in the URL. So what has happened here? The variables are still passed to the second page, but they are sent in the body of the HTTP request rather than in the URL. This is part of the HTTP message that is not visible on a browser page, but you can still access these variables with the ASP code request.form("userid").
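To see where the variables actually travel in each case, here is a minimal sketch in Python rather than ASP. The function names and the hand-written example request are ours, for illustration only; a real web server does this parsing for you before your page code runs.

```python
from urllib.parse import urlsplit, parse_qs


def first_values(encoded):
    """Decode name=value&name=value text, keeping the first value for each name."""
    return {name: values[0] for name, values in parse_qs(encoded).items()}


def get_variables(url):
    """With GET, the variables sit in the URL after the question mark."""
    return first_values(urlsplit(url).query)


def post_variables(raw_request):
    """With POST, the variables sit in the request body, after the blank line."""
    _headers, _, body = raw_request.partition("\r\n\r\n")
    return first_values(body)


# A hand-written example of what a browser sends for a POSTed form.
EXAMPLE_POST = (
    "POST /secondpage.html HTTP/1.1\r\n"
    "Host: www.mysite.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 11\r\n"
    "\r\n"
    "userid=1234"
)

if __name__ == "__main__":
    print(get_variables("http://www.mysite.com/secondpage.html?userid=1234"))
    print(post_variables(EXAMPLE_POST))
```

Both calls yield the same userid; the only difference is whether the user can see it in the address bar.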

Now it would be useful during development to see these variables in the URL and then, when the site goes live, to switch the form method to POST so that they are hidden from the user. If you plan to do this then in your code, instead of the previous syntax, you can use the simpler request("userid") and ASP will sort out which type of form method has sent the data, looking in the query string first and then in the form data. Easy, isn't it? Well, yes and no. There is a practical limit of about 240 characters to a 'GET' string, and it can be quite a struggle keeping track of all the variables on a site. The alternative is to store the variables in a database and do a lookup each time, hardly a scalable solution. It is a problem which has dogged internet web development.
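The idea behind the shorter request("userid") form can be sketched in a few lines. This is a Python illustration of the lookup order only, with a function name of our own; the real ASP Request object goes on to search further collections, such as cookies, which we leave out here.

```python
def request_value(name, querystring, form):
    """Return a variable from the GET data if present, otherwise from the POST data."""
    for collection in (querystring, form):
        if name in collection:
            return collection[name]
    return None  # the variable was not sent by either method


if __name__ == "__main__":
    # During development the form uses GET, so the value arrives in the query string.
    print(request_value("userid", {"userid": "1234"}, {}))
    # On the live site the form uses POST, and the same call still finds it.
    print(request_value("userid", {}, {"userid": "1234"}))
```

The price of the convenience is that a value in the URL takes precedence over one in the form, which is worth remembering if the same name is ever sent both ways.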

However, on an intranet where you can specify the browser, or if you have an IE5 version of your web site, you can use an extremely neat technology built into IE5 called Persistence Behaviors. These are applied through Cascading Style Sheets (CSS). The first behaviour we will look at is saveSnapshot, which allows a user to save a form to their local hard disc with the form containing the filled-in data. In real terms this appears of limited use to us. To achieve this, your HTML page with one text input box on it would look something like this:

<HTML>
<HEAD>
<META NAME="save" CONTENT="snapshot">
<STYLE>
.saveSnapshot {behavior:url(#default#saveSnapshot);}
</STYLE>
</HEAD>
<BODY>
<INPUT class=saveSnapshot type=text id=oPersist >
</BODY>
</HTML>

The saved file would have the text that the user typed in the text box stored in the value property of the INPUT tag, so if for instance the user typed Fred Smith then the HTML would have changed to <INPUT class=saveSnapshot type=text id=oPersist value="Fred Smith">. Two more behaviours are saveFavorite and saveHistory.

SaveFavorite will store any form variables that were on the page when the user adds that page to their favourites, and these variables will be returned to the page whenever the user clicks on that favourite link. SaveHistory is more useful in that it stores the variables in memory during a browsing session, so that if you return to a page with a form on it then the objects on that form contain the details that you had just put in. We have all experienced the frustration of filling in a form, only to return to it later and find all the boxes empty, and so having to repeat the laborious process of entering the information all over again. Finally, the most useful of the behaviours is userData, which allows you to save arbitrary XML data to disk.

This data is stored on the user's hard disc in their profile under Application Data\Microsoft\Internet Explorer\UserData. Go on, have a look... OK, back are we? Good. Now userData allows structured XML data to be stored, which makes it a much more flexible method than simple cookies. The limits for this data are 64K per page and 640K per domain. So it is possible to use these to store the entire details of a shopping basket or a share portfolio without having to keep making calls back to the web server. This would speed things up considerably and take a considerable load off the server. As we said at the beginning, these behaviours are only for IE5 at present, and as such lend themselves more to intranets than to the internet. So for many of us it's back to the old way of doing things, but we can dream of the day when this sort of stuff becomes easy.
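To get a feel for how much basket fits into 64K, here is a minimal sketch in Python. It does not use the IE5 userData behaviour itself; it only builds the kind of XML you might store and measures it. The element and attribute names (basket, item, product, quantity) and the sample products are our own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

PAGE_LIMIT_BYTES = 64 * 1024  # the 64K per page figure quoted above


def basket_to_xml(userid, items):
    """Turn a list of (product, quantity) pairs into a small XML document."""
    basket = ET.Element("basket", {"userid": userid})
    for product, quantity in items:
        ET.SubElement(basket, "item", {"product": product, "quantity": str(quantity)})
    return ET.tostring(basket, encoding="unicode")


def basket_from_xml(xml_text):
    """Read the (product, quantity) pairs back out of the stored XML."""
    basket = ET.fromstring(xml_text)
    return [(item.get("product"), int(item.get("quantity")))
            for item in basket.findall("item")]


def fits_on_page(xml_text):
    """Check the XML against the per-page limit, measured in bytes."""
    return len(xml_text.encode("utf-8")) <= PAGE_LIMIT_BYTES


if __name__ == "__main__":
    xml_text = basket_to_xml("1234", [("Widget", 2), ("Gadget", 1)])
    print(xml_text)
    print(fits_on_page(xml_text))
```

Even a basket of several hundred items described this way stays comfortably inside the per-page limit, which a single cookie could not manage.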