Plenty of sites cover returning a Google sitemap from .NET pages, but most of them need you to adjust the IIS settings (yes, this is about Windows hosting).

There are also some that deal with creating a sitemap on the fly from the web.sitemap file in your project. Here I’ve included code that returns an XML sitemap conforming to the Sitemap protocol, which you can submit to Google without modifying IIS. That should interest those of you who are on shared hosting.

The ZIP download is available at the bottom of this article.

Basically, if you create a blank ASPX page and clear out all the HTML elements, you will be left with just the <%@ Page %> directive. Below is an example of the only line that needs to be in the front file (.ASPX).

For your purposes, just add the ContentType="text/xml" attribute. It may NOT be necessary once you have read through the code-behind, but I’ve left it in as it doesn’t hurt.

Example:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="XMLSiteMap.aspx.cs" Inherits="XMLSiteMap" ContentType="text/xml" %>

Next you will need to put the GSiteMap.cs file in your App_Code folder.

In the code-behind, you can then simply call the class and all the work is done for you. The code uses the filesystem (whether it is running locally or on a remote server) to generate the sitemap. It will also return the correct protocol type (http or https) and the port number if not on port 80.
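As a rough illustration of the protocol and port handling described above, here is a minimal sketch. It is not the code from GSiteMap.cs: the class and method names (SitemapUrlSketch, GetBaseUrl) are made up for this example, and it assumes that port 443 should also be omitted for https, which goes slightly beyond the "not on port 80" rule.

```csharp
using System;

// Hypothetical helper, not part of the downloadable GSiteMap.cs.
public static class SitemapUrlSketch
{
    // Builds "scheme://host[:port]" from the URL of the current request.
    // The port is left out when it is the default for the scheme
    // (80 for http; 443 for https is an assumption of this sketch).
    public static string GetBaseUrl(Uri requestUrl)
    {
        string baseUrl = requestUrl.Scheme + "://" + requestUrl.Host;
        if (!requestUrl.IsDefaultPort)
        {
            baseUrl += ":" + requestUrl.Port;
        }
        return baseUrl;
    }
}
```

Inside a page, the argument would most likely be Request.Url, so a local test site on port 8080 and the live site on port 80 each produce their own correct links.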

I have used this method before to generate an XML file in the filesystem, but since my hosting provider doesn’t allow ASPNET to write to the root directory of the site, returning the sitemap on the fly is the only truly automated method for this.

In the code-behind’s Page_Load event handler:

protected void Page_Load(object sender, EventArgs e)
{
    GSitemap _siteMap = new GSitemap();
    _siteMap.ProcessRequestFS(Context);
}

This simply passes the current HttpContext to the sitemapping class, allowing it to replace the Response with your pure XML sitemap.
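To give an idea of what "pure XML sitemap" means here, the following is a minimal sketch of writing a Sitemap-protocol document to any TextWriter. It is an assumption about the general approach, not the actual body of ProcessRequestFS; the names SitemapXmlSketch and WriteSitemap are invented for the example, and only the required loc tag is written.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the downloadable GSiteMap.cs.
public static class SitemapXmlSketch
{
    // Namespace defined by the Sitemap protocol (sitemaps.org).
    private const string SitemapNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Writes a <urlset> document with one <url><loc>...</loc></url> per entry.
    public static void WriteSitemap(TextWriter output, IEnumerable<string> urls)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", SitemapNamespace);
            foreach (string url in urls)
            {
                writer.WriteStartElement("url", SitemapNamespace);
                // XmlWriter escapes characters such as & in the URL.
                writer.WriteElementString("loc", SitemapNamespace, url);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

In a page, the writer would presumably be Context.Response.Output, after clearing anything already buffered in the response, so that nothing but the XML reaches the crawler.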

I won’t go into the full code at this point because you can read through it yourself from the download. It’s worth pointing out the following however:

private string[] _Allowed_Extensions = { ".aspx", ".php", ".asp", ".htm", ".html", ".txt", ".doc", ".pdf", ".jpg", ".gif", ".xml" };
private string[] _Restricted_Directories = { "App_Data", "App_Code", "admin" };

1. Put any extensions you want to be indexed in the “Allowed Extensions” array.

2. Put any directories you don’t want indexed in the “Restricted Directories” array.

Where the code pulls a list of files from each directory, I initially used a file pattern, i.e.:

"*." + Extention

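but found that some files were being indexed twice. This is because of a quirk in the framework’s pattern matching: a pattern with a three-character extension also matches longer extensions that start with the same three characters, so asking for .ASP files returns .ASPX files as well. For this reason I re-worked the code so that it no longer relies on the pattern. It’s less efficient this way but it’s guaranteed to work.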
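One way to avoid the pattern problem is to list every file and compare its extension exactly. The sketch below shows that idea; it is a guess at the general technique rather than the code in the download, and the names ExtensionFilterSketch, IsAllowed and GetAllowedFiles are hypothetical. Treating extensions as case-insensitive is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the downloadable GSiteMap.cs.
public static class ExtensionFilterSketch
{
    // True only when the file's extension is exactly one of the allowed
    // ones, so "Default.aspx" does not match ".asp".
    public static bool IsAllowed(string fileName, string[] allowedExtensions)
    {
        string extension = Path.GetExtension(fileName);
        foreach (string allowed in allowedExtensions)
        {
            if (string.Equals(extension, allowed, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }

    // Lists every file in the directory once, then filters in code
    // instead of calling GetFiles once per "*.ext" pattern.
    public static List<string> GetAllowedFiles(string directory, string[] allowedExtensions)
    {
        List<string> result = new List<string>();
        foreach (string file in Directory.GetFiles(directory))
        {
            if (IsAllowed(file, allowedExtensions))
            {
                result.Add(file);
            }
        }
        return result;
    }
}
```

Because each file is examined exactly once, it cannot end up in the sitemap twice, at the cost of reading the names of files that will be thrown away.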

The call to “ProcessRequestFS” iteratively goes through each directory, adding files to the sitemap. If a directory is blocked by the “Restricted Directories” array, then all sub-directories of that directory are also blocked.
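The directory walk described above could look something like the following sketch, which uses an explicit stack rather than recursion. It is only an illustration under assumptions: the names DirectoryWalkSketch, IsRestricted and GetIndexableDirectories are invented, and directory names are compared case-insensitively, which the original code may or may not do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the downloadable GSiteMap.cs.
public static class DirectoryWalkSketch
{
    // Compares only the last segment of the path (e.g. "admin")
    // against the restricted names.
    public static bool IsRestricted(string directoryPath, string[] restrictedDirectories)
    {
        string trimmed = directoryPath.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        string name = Path.GetFileName(trimmed);
        foreach (string restricted in restrictedDirectories)
        {
            if (string.Equals(name, restricted, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }

    // Returns the root and every directory below it that may be indexed.
    // A restricted directory is never pushed onto the stack, so its
    // sub-directories are never visited either.
    public static List<string> GetIndexableDirectories(string root, string[] restrictedDirectories)
    {
        List<string> result = new List<string>();
        Stack<string> pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            result.Add(current);
            foreach (string child in Directory.GetDirectories(current))
            {
                if (!IsRestricted(child, restrictedDirectories))
                {
                    pending.Push(child);
                }
            }
        }
        return result;
    }
}
```

Skipping a directory before it is queued is what makes the block apply to everything underneath it without any extra checks.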

You can see an example of the output of this code by visiting: (not currently available)

On my site you may notice that I have temporarily removed the optional tags from the sitemap. They are, however, created in the version available for download.

In particular, the priority value is automatically lowered for each directory level further down the path that the script has to descend.
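A depth-based priority could be computed along these lines. The exact numbers used in the download are not given here, so the step of 0.1 per level and the floor of 0.1 are assumptions of this sketch, as are the names PrioritySketch and GetPriority. The Sitemap protocol itself only requires a value between 0.0 and 1.0.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the downloadable GSiteMap.cs.
public static class PrioritySketch
{
    // depth 0 is the site root. Each level down lowers the priority
    // by 0.1 (assumed step), never going below 0.1 (assumed floor).
    public static string GetPriority(int depth)
    {
        decimal priority = Math.Max(0.1m, 1.0m - 0.1m * depth);
        // Invariant culture keeps the decimal point as "." on any server.
        return priority.ToString("0.0", CultureInfo.InvariantCulture);
    }
}
```

Using decimal rather than double avoids values such as 0.7000000000000001 creeping into the output.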

There is no real error handling in this version but you can add that as necessary.

I checked with Google and Yahoo! and as far as I can see they have no problem with you adding a sitemap with the .ASPX extension.

The full code can be downloaded here: http://www.aaronreynolds.co.uk/page/Code.aspx

The full code is unavailable at the moment and will be online again soon.

If you have any problems using the code, please let me know.

AR
