Saturday, 20 March 2010

ASP.NET and Custom Error Pages, an SEO nightmare?

It’s a (conscientious) developer’s worst fear:

You’ve slaved long and hard to produce a top-notch, blisteringly fast website that fully shows off your coding prowess. You unveil it to critical acclaim and universal client approval (imagine the cheering crowds), but then, out of hours, the database server fails!  Yerrk!

All the developers are out celebrating a successful launch, so no one notices the log file growing bigger and bigger, screaming to be heard... and then the client notices and it really hits the fan!

After a frantic few hours the DB hardware’s restored and the website returns to its former glory... crisis over.

Or is it?

You’re a good web developer, of course you are, so you’ve enabled custom error pages in ASP.NET (you’ve probably even set the server’s deployment setting to retail to make doubly sure).  Every visitor to the site during your downtime will have seen a nice, friendly message informing them that there’s a problem with the site and that it will be up again shortly.
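
For reference, those two settings might look something like the fragment below. It’s only a sketch: the page names (Error.aspx and NotFound.aspx) are made-up placeholders rather than anything from a real site, and the retail switch lives in machine.config rather than in the site’s web.config.

```xml
<!-- web.config: switch on custom error pages.
     /Error.aspx and /NotFound.aspx are hypothetical page names. -->
<configuration>
  <system.web>
    <customErrors mode="On" defaultRedirect="/Error.aspx">
      <error statusCode="404" redirect="/NotFound.aspx" />
    </customErrors>
  </system.web>
</configuration>

<!-- machine.config (server-wide), inside <system.web>:
       <deployment retail="true" />
     This turns off debug and trace output and stops detailed
     error pages being shown to remote users, for every site on the box. -->
```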

Maybe you’ve even reduced the potential impact by adding some marketing blurb about how great your products are: nice, meaningful copy to make your visitors want to return when the site’s back up.  In fact, just the sort of copy that search engines love. D’oh!  If your site has had issues whilst being indexed, that error message could very well come back to haunt you!

Now in fairness, it’s unlikely that your error page will be ranked as the #1 authority for a particular subject. Search engines have gotten pretty good at spotting these gaffes, but it won’t be doing your page rankings any favours either.

Here’s why: when your ASP.NET application throws an unhandled exception, it naturally wants to return a 500 status code (Internal Server Error). If you don’t have custom errors switched on, that’s exactly what you get, along with the Yellow Screen of Death!  But with custom error handling switched on, ASP.NET’s built-in error handling catches the failure and turns it into a redirect (302 status code) to your nice custom error page.  That page then renders successfully (there’ll be no errors with your error page, after all!) and returns a 200 status (OK).

For a visitor (human or web crawler), this looks exactly the same as a normal redirect.

Clearly then what we need is a smarter custom error HttpModule to selectively redirect the visitor based on whether or not they’re a search engine.  In fact, one just like this:

using System;
using System.Web;
using System.Net;
using System.Collections.Generic;
using System.Configuration;
using System.Web.Configuration;


namespace MartinOnDotNet.Website.Support
{
    /// <summary>
    /// Handles errors in an SEO friendly manner
    /// </summary>
    public class SeoErrorLoggingModule : IHttpModule
    {

        /// <summary>
        /// Logs unhandled errors, sets the real HTTP status code and only
        /// redirects non-crawler visitors to the custom error page.
        /// </summary>
        /// <param name="sender">The sender.</param>
        /// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>
        protected virtual void OnError(object sender, EventArgs e)
        {
            HttpApplication application = (HttpApplication)sender;
            HttpContext context = application.Context;
            if (context == null) return; // nothing to log or respond to

            if (context.AllErrors != null)
            {
                foreach (Exception ex in context.AllErrors)
                {
                    ex.Data["RawUrl"] = context.Request.RawUrl;
                    HttpException hex = ex as HttpException;
                    // Logging.Logger is a separate logging helper that isn't shown here
                    if (hex != null && hex.GetHttpCode() == (int)HttpStatusCode.NotFound)
                    {
                        // Missing files are only worth a warning
                        Logging.Logger.LogWarning(string.Format(System.Globalization.CultureInfo.InvariantCulture, "Requested File Not Found {0} ({1})", context.Request.RawUrl, context.Request.Url));
                    }
                    else
                    {
                        Logging.Logger.Log(ex);
                    }
                }
            }

            // Report the real status code (404, 500, ...) instead of hiding it behind a redirect
            HttpException httpException = context.Error as HttpException;
            context.Response.Clear();
            if (httpException != null)
                context.Response.StatusCode = httpException.GetHttpCode();
            else
                context.Response.StatusCode = (int)HttpStatusCode.InternalServerError;

            // Only human visitors get redirected to the friendly page, and never
            // from an error page itself (that would cause a redirect loop)
            if (context.IsCustomErrorEnabled
                && !context.Request.Browser.Crawler
                && !IsAnErrorPage(context.Request.RawUrl))
            {
                context.ClearError();
                string path = GetPathForError(context, (HttpStatusCode)context.Response.StatusCode);
                if (!string.IsNullOrEmpty(path))
                {
                    context.Response.Redirect(path, true);
                }
            }
        }

        /// <summary>
        /// Gets the custom error page configured for the given status code,
        /// falling back to the default redirect.
        /// </summary>
        /// <param name="current">The current.</param>
        /// <param name="status">The status.</param>
        /// <returns>The redirect path, or null if none is configured.</returns>
        protected virtual string GetPathForError(HttpContext current, HttpStatusCode status)
        {
            CustomErrorsSection customErrors = WebConfigurationManager.GetSection("system.web/customErrors") as CustomErrorsSection;
            if (customErrors == null) return null;
            foreach (CustomError ce in customErrors.Errors)
            {
                if (ce.StatusCode == (int)status) return ce.Redirect;
            }
            return customErrors.DefaultRedirect;
        }

        /// <summary>
        /// Determines whether the given path (RawUrl) is an error page itself
        /// </summary>
        /// <param name="path">The path.</param>
        /// <returns>
        ///     <c>true</c> if the path is one of the configured error pages; otherwise, <c>false</c>.
        /// </returns>
        protected virtual bool IsAnErrorPage(string path)
        {
            if (ErrorPages != null)
            {
                foreach (string s in ErrorPages)
                {
                    // Skip unset redirects: a null would throw and an empty string would match every URL
                    if (!string.IsNullOrEmpty(s)
                        && path.IndexOf(s, StringComparison.OrdinalIgnoreCase) > -1) return true;
                }
            }
            return false;
        }

        /// <summary>
        /// Gets the error pages.
        /// </summary>
        /// <value>The error pages.</value>
        protected virtual IEnumerable<string> ErrorPages
        {
            get
            {
                CustomErrorsSection customErrors = WebConfigurationManager.GetSection("system.web/customErrors") as CustomErrorsSection;
                if (customErrors == null) yield break;
                foreach (CustomError ce in customErrors.Errors)
                {
                    yield return ce.Redirect;
                }
                yield return customErrors.DefaultRedirect;
            }
        }

        /// <summary>
        /// Disposes of the resources (other than memory) used by the module that implements <see cref="T:System.Web.IHttpModule"/>.
        /// </summary>
        public void Dispose()
        {
            // Nothing to clean up.
        }

        /// <summary>
        /// Initializes a module and prepares it to handle requests.
        /// </summary>
        /// <param name="context">An <see cref="T:System.Web.HttpApplication"/> that provides access to the methods, properties, and events common to all application objects within an ASP.NET application</param>
        public void Init(HttpApplication context)
        {
            // Hook the application's Error event so every unhandled exception passes through OnError
            context.Error += new EventHandler(OnError);
        }

    }
}

You’ll notice that this module also handles logging the error and differentiates between real exceptions (500) and file not found (404), allowing a custom page to be displayed for either.

To use the code, simply register the module in the system.web/httpModules section (for IIS 6) or the system.webServer/modules section (for IIS 7) of your web.config and you’re good to go.  For best effect, it will probably be worthwhile putting an up-to-date .browser file in your App_Browsers directory as well, since the module relies on Request.Browser.Crawler to spot search engines.
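
As a rough guide, the registration might look like the sketch below. The type name comes from the listing above, but the assembly name after the comma (MartinOnDotNet.Website) is an assumption, so swap in whatever your own project compiles to.

```xml
<configuration>
  <!-- IIS 6 (or IIS 7 running in Classic mode) -->
  <system.web>
    <httpModules>
      <!-- Assembly name after the comma is assumed; use your own -->
      <add name="SeoErrorLoggingModule"
           type="MartinOnDotNet.Website.Support.SeoErrorLoggingModule, MartinOnDotNet.Website" />
    </httpModules>
  </system.web>

  <!-- IIS 7 running in Integrated mode -->
  <system.webServer>
    <modules>
      <add name="SeoErrorLoggingModule"
           type="MartinOnDotNet.Website.Support.SeoErrorLoggingModule, MartinOnDotNet.Website" />
    </modules>
  </system.webServer>
</configuration>
```

You would normally keep only the section that matches your server. If you leave both in place under IIS 7 Integrated mode, you will probably also need validateIntegratedModeConfiguration="false" on the system.webServer/validation element to stop IIS complaining about the legacy httpModules entry.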

7 comments:

  1. Nice idea, I have learned the hard way that a very low percentage of people will click on the #1 listing in Google if the title is "Unhandled Exception Occurred" :-)

  2. hi.., myself is azad singh, i have just started to learn .net few times, so i am have no idea about this, but after read your post, i feel, it is really informative information. thanks

  3. thanks for sharing such valuable information in regards of custom error page in asp.net, really helped to me, may be beneficial to other also.

  4. nice post, really have informative information, thanks for publishing this post.

  5. Thanks for the post. This is an interesting example and could be useful, but why not instead just always add a "noindex,noarchive" header to the custom error redirect page?

  6. Excellent point and does work (provided the spider respects metatags/headers,etc) but does mean that the tags need to be added on every error page.

    The module provided is generic and reusable, so once configured no further thinking is required per specialised error page.
