19 Sep 09

Many permutations lead to Rome

While reconstituting this blog, and porting it from Drupal to WordPress, I gave some thought to the (age old) issue of content duplication and what we have come to know and love as ‘canonical URLs’. To give a really simple explanation: multiple URLs can resolve to the same actual content page, e.g. stevenimmons.org, www.stevenimmons.org, stevenimmons.org/index.php (and so on)…

In other words, many permutations of the URL lead to the same content. This can confuse a search engine over what is (and is not) duplicate content, because in a very rudimentary sense each URL is treated by the search engine as a separate page. If you watch the video below you will discover there are nuances and several ways to manage this issue, including Google Sitemaps and Webmaster Tools, where a preferred canonical URL can be specified.
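
To make the 'many permutations' point concrete, here is a minimal Python sketch that collapses the three example URLs above into a single form. The rules it applies (drop a leading www., strip index.php or index.html, assume http when no scheme is given) are my own assumptions for illustration, not something a search engine is known to do.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of "default document" names; a real site picks its own.
DEFAULT_DOCUMENTS = ("index.php", "index.html")


def normalise(url: str) -> str:
    """Collapse common variants of a URL into one canonical form.

    Ports and user info are ignored in this sketch.
    """
    # Accept scheme-less input such as "www.stevenimmons.org".
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Host names are case-insensitive; drop a leading "www." (an assumption).
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # An empty path and "/" mean the same page; strip a default document.
    path = parts.path or "/"
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break

    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


variants = [
    "stevenimmons.org",
    "www.stevenimmons.org",
    "stevenimmons.org/index.php",
]
# All three variants collapse to the same string.
print({normalise(u) for u in variants})  # {'http://stevenimmons.org/'}
```

A search engine sees the three inputs as three pages unless something (the CMS, a redirect, or a canonical hint) tells it they are one.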

Your options

  • The best way to avoid this problem entirely is to make the Content Management System present consistent / normalised URLs and to keep your internal links consistent (i.e. not linking to variants, although of course that doesn’t help with inbound links, which are out of your control). I am aware of the irony of having done this very thing in the examples I gave above! (smile)
  • Next up is a 301 redirect, which can be used to ‘route’ from the variant URL to the destination you wish to make your unique page. Watch your case sensitivity in paths, as this can also be goofy with Microsoft IIS (there is a good illustrated example in the video).
  • Watch out for how session IDs, breadcrumbs etc. are managed, since they can end up in the URL and create yet more variants.
  • Canonical Link Element
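
As a rough illustration of the 301 option in the list above, here is a small Python WSGI sketch. The preferred host, the all-lowercase path rule and the index.php handling are assumptions made for the example, and the query string is ignored; lowercasing paths is only safe if your site really does use lowercase paths throughout.

```python
CANONICAL_HOST = "stevenimmons.org"  # assumed preferred host


def canonical_location(host: str, path: str):
    """Return the URL to 301-redirect to, or None if already canonical."""
    wanted_path = path.lower()  # assumption: the site uses lowercase paths
    if wanted_path.endswith("/index.php"):
        wanted_path = wanted_path[: -len("index.php")]
    if host.lower() == CANONICAL_HOST and path == wanted_path:
        return None
    return f"http://{CANONICAL_HOST}{wanted_path}"


def app(environ, start_response):
    """Minimal WSGI app: redirect variant URLs, serve the canonical one."""
    host = environ.get("HTTP_HOST", CANONICAL_HOST)
    path = environ.get("PATH_INFO") or "/"
    location = canonical_location(host, path)
    if location is not None:
        # A permanent redirect tells crawlers which URL is the real one.
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"canonical page\n"]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library should do. In practice this job is more often done in the web server or CMS configuration than in application code.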

The Canonical Link Element has been around for 6+ months. If you’re interested in SEO and like to dip below the covers, give this video from Matt Cutts at Google a play. It should be obvious by the time you reach the end that Google don’t see this as a panacea, but it’s certainly useful to have ‘in the back pocket’. It really works like a 301 redirect limited to a single domain, and is a useful way to present ‘pretty URLs’. There are a number of corner cases to consider, so pay attention to the sections on ‘don’t shoot yourself in the foot’…
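
For the curious, the element itself is just a link tag with rel="canonical" placed in the page's head. The Python sketch below builds one after dropping session-style query parameters; the parameter names (sessionid, sid) and the example URLs are hypothetical, chosen only to illustrate the idea.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sid"}


def canonical_url(url: str) -> str:
    """Drop query parameters that only create duplicate URLs."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_link_tag("http://stevenimmons.org/post?sessionid=abc123"))
# <link rel="canonical" href="http://stevenimmons.org/post" />
```

The tag is only a hint to the search engine, which is why the redirect and consistent-linking options above are still worth doing first.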

Filed under: Web Technology


