PageRank tool walkthrough
- June 18th, 2009
I’ve added a video on YouTube to walk through using the PageRank checking DLL you can download from my website.
(by the way, there is no audio).
Archive for the ‘Google’ Category
Going back a few years I downloaded a UK postcode database that mapped Outcodes (the first part of the postcode) to approximate Lat/Long co-ordinates. When used correctly you could use the data to find which postcodes were closest (distance matching) or simply the distance between two places by comparing the Lat/Long co-ordinates with a bit of maths. The main drawback was the need for a database that was only accurate to within a few km – no good for showing actual locations on a map or for local services – as the co-ordinates returned were for the centre of the Outcode.
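The "bit of maths" for the distance between two Lat/Long pairs is commonly the haversine formula. The following is a minimal sketch of that formula, not the code used with the original postcode database; the class and method names are made up for illustration, and it assumes a spherical Earth with a mean radius of 6371 km.

```csharp
using System;

public static class GeoDistance
{
    // Assumption: mean Earth radius in kilometres (spherical approximation).
    private const double EarthRadiusKm = 6371.0;

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Great-circle distance in km between two Lat/Long points (haversine formula).
    public static double HaversineKm(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);

        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);

        double c = 2 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1 - a));
        return EarthRadiusKm * c;
    }
}
```

For example, GeoDistance.HaversineKm(51.5074, -0.1278, 53.4808, -2.2426) (approximate co-ordinates for central London and central Manchester) comes out at roughly 260 km. With Outcode centres as input, the result is only as accurate as those centres.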
Nowadays, however, Google’s geocoding service allows you to calculate distances between two addresses or postcodes effortlessly. You can then plot these directly onto a Google map and, best of all, it’s all free to use.
My latest project provided an opportunity to make use of this facility. I had initially intended to use UpMyStreet or Google Local Search to furnish the information my client needed. Upon inspection however, neither service had a comprehensive enough list of local services.
The decision was made to go back a step and manually enter what the client deemed the most important local landmarks or services. This was done simply by providing a postcode or partial address and from this I was able to provide the distance to the service and plot it on a map.
There are many useful applications for this and I’ll be adding some of these facilities to this website over the next few weeks I hope. As I add facilities I’ll be sure to add source code to demonstrate how simple these operations have now become.
A few days ago I discovered through Google Webmaster Tools that some of my sites were unreachable by Google. In particular Google reported my homepage as unreachable and many of my sub-pages returned the same 403-6 error.
After checking my website with various browsers and with response grabbers located around the world, I determined that the problem was definitely not with the way I had designed the website or configured the hosting.
The HTTP 403-6 error (IIS writes it as 403.6, “IP address rejected”) means that an incoming request has been denied (Forbidden) because the IP address is banned or rejected in some way. Initially I tried to determine if this was because of some response error on my part; this was not the case.
I checked my server logs and could see the same errors showing up on every request that came from the GoogleBot (incidentally all requests were on the same IP). I notified my hosting provider and eventually the problem was rectified and the IP address used by GoogleBot at that point in time was unblocked.
I re-submitted my sitemaps and shortly afterwards the errors started to disappear from Google’s Webmaster Tools portal. I hoped that I had got to the root of the problem in time.
The next day while routinely checking websites I discovered that the homepages had disappeared from Google’s search results. Initially I thought that the sites had dropped on their keyword matches but “site:URL” checks showed the actual pages had been dropped.
It has taken 2-3 days for these pages to reappear in the search results and I am still waiting for some pages to come back in. Personally I found the timing to be very bad as I am trying to build Google’s confidence in my websites.
This all points to one of my main tips for SEO: nothing else matters if Google and the other search engines cannot see your website. Choose your hosting company carefully.
AR
There are many sites that deal with returning a Google sitemap from .NET pages. Most of these need you to adjust the IIS settings (yes this is about Windows hosting).
There are also some that create a sitemap on-the-fly from the web.sitemap file in your project. Here I’ve included code that returns an XML sitemap conforming to the Sitemap protocol, which you can submit to Google without modifying IIS – something that should interest those of you on shared hosting.
The ZIP download is available at the bottom of this article.
Basically if you create a blank ASPX page and clear out all the HTML elements from the ASPX page you will just be left with the <% @Page %> definition. Below is an example of the only line that needs to be in the front file (.ASPX).
For your purposes, just add the ContentType="text/xml" section. It may NOT be necessary once you read through the page-behind code, but I’ve left it in as it doesn’t hurt.
Example:
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="XMLSiteMap.aspx.cs" Inherits="XMLSiteMap" ContentType="text/xml" %>
Next you will need to put the GSiteMap.cs file in your App_Code folder.
In the page-behind code, you can then simply call the class and all the work is done for you. The code uses the filesystem (whether it is running locally or on a remote server) to generate the sitemap. It will also return the correct protocol type (http or https) and the port number if not on port 80.
I have used this method before to generate an XML file in the filesystem, but since my hosting provider doesn’t allow ASPNET to write to the root directory of the site, returning the sitemap on-the-fly is the only truly automated method for this.
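On the point about the protocol type and port number: one way to get a base URL that keeps "http" or "https" and only includes the port when it is not the default is Uri.GetLeftPart. This is a sketch of that idea, not the code from GSiteMap.cs; the SiteBaseUrl class and its method name are hypothetical.

```csharp
using System;

public static class SiteBaseUrl
{
    // Returns scheme + host, plus the port only when it is not the
    // default for that scheme (80 for http, 443 for https).
    public static string FromRequestUrl(Uri requestUrl)
    {
        return requestUrl.GetLeftPart(UriPartial.Authority);
    }
}
```

In a page-behind you would typically pass in Request.Url, then append each file's relative path to the result to build the sitemap entries.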
In the page-behind’s Page_Load event handler:
protected void Page_Load(object sender, EventArgs e)
{
    GSitemap _siteMap = new GSitemap();
    _siteMap.ProcessRequestFS(Context);
}
This simply passes the current HttpContext to the sitemapping class, allowing it to replace the Response with your pure XML sitemap.
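To give an idea of what "pure XML" means here, the sketch below writes a bare-bones sitemap in the Sitemap protocol's namespace to any stream. It is an illustration only and not the contents of GSiteMap.cs; SitemapXml and Write are made-up names, and only the required loc element is written.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class SitemapXml
{
    // Namespace defined by the Sitemap protocol (sitemaps.org).
    public const string Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Writes <urlset><url><loc>...</loc></url>...</urlset> to the given stream.
    public static void Write(Stream output, IEnumerable<string> urls)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark
        settings.Indent = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", Namespace);
            foreach (string url in urls)
            {
                writer.WriteStartElement("url", Namespace);
                // The writer escapes characters such as & in the URL.
                writer.WriteElementString("loc", Namespace, url);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Inside a page you would clear the response, set Response.ContentType to "text/xml" and pass Response.OutputStream as the output stream.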
I won’t go into the full code at this point because you can read through it yourself from the download. It’s worth pointing out the following however:
private string[] _Allowed_Extensions = { ".aspx", ".php", ".asp", ".htm", ".html", ".txt", ".doc", ".pdf", ".jpg", ".gif", ".xml" };
private string[] _Restricted_Directories = { "App_Data", "App_Code", "admin" };
1. Put any extensions you want to be indexed in the “Allowed Extensions” array.
2. Put any directories you don’t want indexed in the “Restricted Directories” array.
Where the code pulls a list of files from each directory I initially used a file pattern, i.e.:
"*." + Extention
but found that some files were being indexed twice – this is because of a quirk in the framework’s file search: ask for a three-letter extension such as .ASP and it will also return files whose extension merely starts with those letters, such as .ASPX. For this reason I re-worked the code. It’s less efficient this way but it’s guaranteed to work.
The call to “ProcessRequestFS” iteratively goes through each directory adding files to the sitemap. If a directory is blocked by the “Restricted Directories” array then all sub-directories of that directory are also blocked.
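The two ideas above (exact extension matching instead of a search pattern, and skipping a restricted directory together with everything beneath it) can be sketched as follows. This is not the ProcessRequestFS code from the download; SitemapFileWalker and its methods are hypothetical names, and a recursive walk is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SitemapFileWalker
{
    // Exact, case-insensitive extension check, so ".asp" never matches ".aspx".
    public static bool HasAllowedExtension(string path, string[] allowedExtensions)
    {
        string ext = Path.GetExtension(path);
        foreach (string allowed in allowedExtensions)
        {
            if (string.Equals(ext, allowed, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }

    public static List<string> CollectFiles(string root, string[] allowedExtensions,
                                            string[] restrictedDirectories)
    {
        List<string> results = new List<string>();
        Walk(root, allowedExtensions, restrictedDirectories, results);
        return results;
    }

    private static void Walk(string dir, string[] allowed, string[] restricted,
                             List<string> results)
    {
        // List every file once, then filter by extension ourselves.
        foreach (string file in Directory.GetFiles(dir))
        {
            if (HasAllowedExtension(file, allowed))
                results.Add(file);
        }

        foreach (string sub in Directory.GetDirectories(dir))
        {
            string name = Path.GetFileName(sub);
            bool blocked = false;
            foreach (string r in restricted)
            {
                if (string.Equals(name, r, StringComparison.OrdinalIgnoreCase))
                    blocked = true;
            }
            // Not descending into a blocked directory also blocks its sub-directories.
            if (!blocked)
                Walk(sub, allowed, restricted, results);
        }
    }
}
```

Listing every file and filtering afterwards does more work than one search per extension, but each file can only be added once.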
You can see an example of the output of this code by visiting: (not currently available)
On my site you may notice that I have temporarily removed the optional tags from the sitemap. They are however created in the version available for download.
In particular, the priority tag is automatically down-graded for each directory further down the path that the script has to look.
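As an illustration of down-grading the priority by depth, here is one possible scheme. The numbers are assumptions rather than the values used in the download: start at 1.0 for the site root, subtract 0.1 per directory level, and never go below 0.1 (the Sitemap protocol allows priorities from 0.0 to 1.0). SitemapPriority is a made-up name.

```csharp
using System;
using System.Globalization;

public static class SitemapPriority
{
    // Assumed scheme: 1.0 at the root, minus 0.1 per directory level, floor of 0.1.
    public static double ForDepth(int depth)
    {
        int level = Math.Max(0, depth);
        double priority = Math.Round(1.0 - 0.1 * level, 1);
        return Math.Max(0.1, priority);
    }

    // Formats with a dot as the decimal separator regardless of server culture.
    public static string Format(double priority)
    {
        return priority.ToString("0.0", CultureInfo.InvariantCulture);
    }
}
```

Formatting with the invariant culture matters on servers whose regional settings use a comma as the decimal separator, since the sitemap expects a dot.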
There is no real error handling in this version but you can add that as necessary.
I checked with Google and Yahoo! and as far as I can see they have no problem with you adding a sitemap with the .ASPX extension.
The full code can be downloaded here: http://www.aaronreynolds.co.uk/page/Code.aspx
The full code is unavailable at the moment and will be online again soon.
If you have any problems using the code, please let me know.
AR
A couple of weeks ago another chap named Aaron asked me if I had a copy of the PageRank code, but this time in VB.Net.
I said I’d take a look because the versions out on the Internet that he had seen didn’t work. Well this evening I found the time to do the VB.Net version.
I started by manually converting my C# code to VB.Net – bear in mind I haven’t worked in VB.Net for over 2 years now. When the conversion was complete I found that VB.Net was converting and handling integer types slightly differently to C#.
When my code failed to work I resorted to using a free online C# to VB.Net conversion utility on devfusion. The conversion was pretty much word for word what I had put and still didn’t work.
I then went through the process of stepping through each iteration and for-next loop until I found the cause of the differences in the values being returned. It turned out in the end that I simply needed to use the “integer division” operator in place of the standard division operator.
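The difference can be shown from the C# side. In C#, "/" between two integers truncates, which is what VB.Net's "\" operator does. VB.Net's "/" always produces a Double, and converting that back with CInt rounds to the nearest integer (halves go to the even number). The sketch below emulates both behaviours in C#; it is an illustration with made-up names, not part of the PageRank code.

```csharp
using System;

public static class DivisionDemo
{
    // C# '/' on two ints truncates toward zero (equivalent to VB.Net's '\').
    public static int CSharpSlash(int a, int b)
    {
        return a / b;
    }

    // Emulates VB.Net's CInt(a / b): floating-point division, then
    // rounding to the nearest integer with halves going to even.
    public static int VbSlashThenCInt(int a, int b)
    {
        return (int)Math.Round((double)a / b, MidpointRounding.ToEven);
    }
}
```

With 7 and 2 the first method gives 3 and the second gives 4, which is the kind of difference that only shows up when stepping through the values.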
So here it is: go to download pagerank tool code in VB.net and C#
Also bear in mind this is .Net v2 code – I haven’t tested it on previous versions of VB.Net
If you use this in your code or on your website, I’d appreciate you linking back to my website.
Enjoy
I’ve uploaded a copy of the code in a demo website for the pagerank tool in C#.
You can download it from: Full PageRank code with demo site
I’ve tried out a few websites and URLs that people have posted to me and got mixed results. I will be updating it if I can find fixes. In particular the hash seems to fail on some sub-directories.
Download it and try it out for yourself. Feedback is always welcome and drop a comment while you’re here.
You can also check out my online demo at (not currently live)
I’ve had some feedback about the Google PR checker I have written in C#.
Basically I will be adding a property that allows you to select the datacentre to query and I will be randomising the PR requests where this property is not set.
Additionally I will be adding an automatic double-check facility so if a site ending in a slash doesn’t return a PR it will check again without the slash and vice-versa.
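The double-check could look something like the sketch below. It is only an outline of the idea, not the planned update itself: PageRankRetry, RankLookup and the method names are hypothetical, and it assumes a lookup that returns null or an empty string when no PR comes back.

```csharp
using System;

public static class PageRankRetry
{
    // Hypothetical delegate standing in for whatever performs the real lookup.
    public delegate string RankLookup(string url);

    // Adds a trailing slash if there is none, removes it if there is one.
    public static string ToggleTrailingSlash(string url)
    {
        if (url.EndsWith("/", StringComparison.Ordinal))
            return url.Substring(0, url.Length - 1);
        return url + "/";
    }

    // Tries the URL as given; if nothing comes back, tries the other form once.
    public static string LookupWithFallback(string url, RankLookup lookup)
    {
        string result = lookup(url);
        if (!string.IsNullOrEmpty(result))
            return result;
        return lookup(ToggleTrailingSlash(url));
    }
}
```

The lookup is passed in as a delegate so the retry logic can be tried out without making any real requests.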
I will post again once the updates are made and post a link to the download.
For those of you wanting to download the current version, you can find it here: PageRank tool in C#
Thanks,
AR
After much trawling of the Internet all I could find for automated PageRank gathering was a PHP script or two.
Now since I don’t like to write in PHP, I wrote a version in C#. I haven’t included the source code but I have created a downloadable ZIP containing the C# dll and a test project.
In order to use the dll, simply reference it in your project as follows:
using PageRank;
...
private void btnPR_Click(object sender, EventArgs e)
{
    try
    {
        // Create the checker and look up the PageRank for the URL typed into the text box.
        TGooglePR _pageRank = new TGooglePR();
        string _PR = _pageRank.ReturnPageRank(txtSiteURL.Text);
        lblPageRank.Text = "PageRank: " + _PR;
    }
    catch (Exception)
    {
        // Any failure is reported on the label.
        lblPageRank.Text = "Fault Occurred!";
    }
}
…
Have fun with it!