
Wednesday, February 18, 2009

Introduction

This article describes a way to use ASP.NET Routing to avoid 404 Not Found errors when changing folder structure or folder names in a website.

What to do with obsolete links to your website?

Having a website means spending some time and effort promoting the site on the Internet, making sure search engines index all the pages, and trying to get exposure through blogs or discussion boards.

Then you get new ideas and need to restructure the site: change some folder names, move some pages, and so on. What happens to all those third-party links to your site that you were so proud of? Do you want to lose them?

Route old URLs to new site structure

With the arrival of the .NET Framework 3.5 SP1, we got an elegant way of solving this problem: ASP.NET Routing. It was initially part of ASP.NET MVC Preview 2 and is now part of the framework itself.

The idea is to add special "Routes" to the site whose only purpose is to process requests for pages that are no longer present on the site. In its simplest form, the processing can happen in a single ASPX page responsible for handling those requests. Here is an example:

The attached project contains all the parts you'll need: WebFormRouteHandler, an IRouteHandler implementation created by Chris Cavanagh; a Global.asax file registering your Routes; a web.config file where you register the WebFormRouteHandler; and a Default.aspx page responsible for the actual request processing.
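
For readers who do not have the attached project at hand, here is a minimal sketch of what a route handler along these lines might look like. It is an assumption based on the behavior described in this article (route values copied into HttpContext.Items, one ASPX page serving the request), not a copy of Chris Cavanagh's code, and the class in the attached project may differ.

```csharp
using System.Web;
using System.Web.Compilation;
using System.Web.Routing;
using System.Web.UI;

// Illustrative sketch only; not the class shipped in the attached project.
public class WebFormRouteHandler : IRouteHandler
{
    public WebFormRouteHandler(string virtualPath)
    {
        VirtualPath = virtualPath;
    }

    // Application-relative path of the page that serves the request,
    // for example "~/default.aspx".
    public string VirtualPath { get; private set; }

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        // Copy the route values ({folderName}, {pageName}, ...) into
        // HttpContext.Items so the page can read them.
        foreach (var pair in requestContext.RouteData.Values)
        {
            requestContext.HttpContext.Items[pair.Key] = pair.Value;
        }

        // Instantiate the compiled page; System.Web.UI.Page implements IHttpHandler.
        return (IHttpHandler)BuildManager.CreateInstanceFromVirtualPath(
            VirtualPath, typeof(Page));
    }
}
```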

Let's take a look at the Global.asax:


void Application_Start(object sender, EventArgs e)
{
    // runs on application startup
    RegisterMyRoutes(System.Web.Routing.RouteTable.Routes);
}
 
private void RegisterMyRoutes(System.Web.Routing.RouteCollection routes)
{
    // reference IRouteHandler implementation
    // (example created by Chris Cavanagh)
    // see http://chriscavanagh.wordpress.com/
    //            2008/03/11/aspnet-routing-goodbye-url-rewriting/
    var startPageRouteHandler = new WebFormRouteHandler("~/default.aspx");
 
    // exclude .axd and .asmx requests (web resources, AJAX, web services)
    // so they are not checked against the remaining routes
    // see http://msdn.microsoft.com/en-us/library/
    //            system.web.routing.stoproutinghandler.aspx
    routes.Add(new System.Web.Routing.Route("{resource}.axd/{*pathInfo}", 
               new System.Web.Routing.StopRoutingHandler()));
    routes.Add(new System.Web.Routing.Route("{service}.asmx/{*path}", 
               new System.Web.Routing.StopRoutingHandler()));
 
    // mapping:
    // extracts folder name and page name as items in HttpContext.Items
    routes.Add(new System.Web.Routing.Route("{folderName}/", 
               startPageRouteHandler));
    routes.Add(new System.Web.Routing.Route("{folderName}/{pageName}", 
               startPageRouteHandler));
}

Here we define a single route handler, which points at default.aspx, and the routing rules.

Rule #1:


routes.Add(new System.Web.Routing.Route("{folderName}/", startPageRouteHandler));

states that all requests to a URL with the structure "http://mysite.com/something" will be processed by the default.aspx page, provided no actual "something" exists on the site. By default, routing skips URLs that map to an existing file. For example, a RealPage.aspx page is present on the site, so requests to http://mysite.com/RealPage.aspx are processed by that page.

But if the client requests RealPage2.aspx, which does not exist, that request will be processed by the default.aspx page according to rule #1. Note that the client is not redirected to default.aspx; the web server simply runs the code in default.aspx in response to the request. For the client, the response comes from RealPage2.aspx.

You can add as many routes as you want, for example, rule #2:


routes.Add(new System.Web.Routing.Route("{folderName}/{pageName}", startPageRouteHandler));

stating that all requests to a URL with the structure "http://mysite.com/somefolder/somethingelse" will be processed by the default.aspx page if there is no actual "somefolder/somethingelse" found on the site.

The code-behind of default.aspx shows how to extract those parts of the request. As you can see, they are placed in the HttpContext.Items collection under the names used in the route pattern.


lblFolder.Text = Context.Items["folderName"] as string;
lblPage.Text = Context.Items["pageName"] as string;
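
Once the folder name is available, the page can decide what to serve for an obsolete URL. One possible approach, sketched below, is a small lookup from old folder names to current ones. The FolderMap class and the folder names in it are hypothetical and exist only for illustration; they are not part of the attached project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps folder names from the old site structure
// to the names used in the current one.
public static class FolderMap
{
    private static readonly Dictionary<string, string> Renamed =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "products", "catalog" },  // made-up example names
            { "aboutus", "about" }
        };

    // Returns the current folder name, or the input unchanged
    // when the folder was never renamed.
    public static string Resolve(string folderName)
    {
        if (string.IsNullOrEmpty(folderName))
        {
            return folderName;
        }

        string current;
        return Renamed.TryGetValue(folderName, out current) ? current : folderName;
    }
}

// Possible use in the code-behind of default.aspx:
//   lblFolder.Text = FolderMap.Resolve(Context.Items["folderName"] as string);
```

The lookup is case-insensitive because old links may have been published with different casing.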

How it works in real life

Here is a real-life website that uses this technique: Digitsy Global Store. Besides handling obsolete URLs, ASP.NET Routing is used there to handle multiple languages on the site, switching CultureInfo on the fly:


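// Requires: using System.Globalization; using System.Threading; using System.Web;
protected void Page_PreInit(object sender, EventArgs e)
{
    // switch the culture for this request before the page is initialized
    CultureInfo lang = new CultureInfo(getCurrentLanguage());
    Thread.CurrentThread.CurrentCulture = lang;
    Thread.CurrentThread.CurrentUICulture = lang;
}

// maps the {language} route value, stored in HttpContext.Items, to a culture name
private static string getCurrentLanguage()
{
    string lang = HttpContext.Current.Items["language"] as string;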
    switch (lang)
    {
        case "france":
            return "fr-FR";
        case "canada":
            return "en-CA";
        case "germany":
            return "de-DE";
        case "japan":
            return "ja-JP";
        case "uk":
            return "en-GB";
        case "russia":
            return "ru-RU";
        default:
            return "en-US";
    }
}

As you can see, the default language is English, United States: "en-US". In internal links, the site uses the structure http://{sitename}/{language}/…other things…

So, http://digitsy.com/us/ gives you the US version and http://digitsy.com/japan/ gives you the Japanese one. If you try http://digitsy.com/whatever, you will not get a 404 error; you'll get the US version again, because any unrecognized value falls through to the default case.

ASP.NET Routing makes restructuring the site really easy. The folder structure "{language}/{index}/category/{categoryID}" was recently replaced by "{language}/{index}/shopping/{categoryID}", so the site structure no longer has a "category" folder. But because both routes point to the same handling page, both "category" and "shopping" URLs return valid responses.

Trying http://digitsy.com/us/Electronics/shopping/541966 will use the rule:


routes.Add(new System.Web.Routing.Route("{language}/{index}/shopping/{categoryID}", 
           categoryRouteHandler));

while trying http://digitsy.com/us/Electronics/category/541966 will use:


routes.Add(new System.Web.Routing.Route("{language}/{index}/category/{categoryID}", 
          categoryRouteHandler));

and both will resolve to the same route handling page.
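
Since the two rules differ only in one path segment, they can also be registered in a loop. The sketch below is one way to do that; the CategoryRoutes class name is an assumption made for this example, while the route patterns and the categoryRouteHandler parameter follow the rules shown above.

```csharp
using System.Web.Routing;

// Hypothetical helper class; only the route patterns come from the article.
public static class CategoryRoutes
{
    // Registers the current segment name and the retired one
    // against the same route handler.
    public static void Register(RouteCollection routes, IRouteHandler categoryRouteHandler)
    {
        // "shopping" is the current name; "category" is kept only for old links.
        foreach (string segment in new[] { "shopping", "category" })
        {
            routes.Add(new Route(
                "{language}/{index}/" + segment + "/{categoryID}",
                categoryRouteHandler));
        }
    }
}
```

When a segment is renamed again, the old name only needs to be added to the array.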

Things to remember

This technique is simple, but you should be aware of some implications. Check out Phil Haack's post discussing "one subtle potential security issue to be aware of when using routing with URL Authorization."
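
In short, the concern is that URL authorization is evaluated against the requested URL, so rules that protect the physical page may not apply when that page is reached through a route. One possible mitigation is to re-check access to the physical page inside the route handler, as sketched below. The SecuredWebFormRouteHandler name is hypothetical and this is not the code from the attached project; consult Phil Haack's post for his full treatment.

```csharp
using System.Security;
using System.Web;
using System.Web.Compilation;
using System.Web.Routing;
using System.Web.Security;
using System.Web.UI;

// Hypothetical variant of a Web Forms route handler that re-checks
// URL authorization against the physical page before serving it.
public class SecuredWebFormRouteHandler : IRouteHandler
{
    private readonly string virtualPath;

    public SecuredWebFormRouteHandler(string virtualPath)
    {
        this.virtualPath = virtualPath;
    }

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        HttpContextBase context = requestContext.HttpContext;

        // Authorization already ran for the requested URL; repeat the check
        // for the page that will actually execute.
        if (!UrlAuthorizationModule.CheckUrlAccessForPrincipal(
                virtualPath, context.User, context.Request.HttpMethod))
        {
            throw new SecurityException("Access to " + virtualPath + " is denied.");
        }

        // Make the route values available to the page through HttpContext.Items.
        foreach (var pair in requestContext.RouteData.Values)
        {
            context.Items[pair.Key] = pair.Value;
        }

        return (IHttpHandler)BuildManager.CreateInstanceFromVirtualPath(
            virtualPath, typeof(Page));
    }
}
```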

You should also verify that your hosting provider supports .NET Framework 3.5 SP1. Many hosting providers still don't have SP1 installed on their servers because of incompatibility with some older software.
