In June 2005 Google announced “Google Sitemaps”, a new way for webmasters to communicate with search engines and help them index their sites via machine-readable (XML) sitemaps. Google’s website describes the service:
The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.
Google has made the sitemap protocol open, so it can be used by other search engines and incorporated into webserver and content management software.
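As a rough illustration of the file format described above, the following Python sketch builds a small sitemap using only the standard library. The `build_sitemap` helper, the page list and the example.com URLs are hypothetical. The namespace is also an assumption: it is the one later published at sitemaps.org (version 0.9), whereas Google's original 2005 schema used its own namespace, so pass whichever value the target search engine documents.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace later published at sitemaps.org (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries, namespace=SITEMAP_NS):
    """Return a sitemap XML string for a list of entry dicts.

    Each entry needs a "loc" key; "lastmod", "changefreq" and "priority"
    are optional and are written only when present.
    """
    urlset = ET.Element("urlset", {"xmlns": namespace})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on example.com, used purely for illustration.
pages = [
    {"loc": "http://example.com/", "lastmod": "2005-06-05",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "http://example.com/catalogue/widget?id=7&lang=en",
     "changefreq": "monthly", "priority": 0.3},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Building the document with `ElementTree` rather than string concatenation means the `&` in the second URL is escaped as `&amp;` automatically, which keeps the output well-formed XML.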
Why use a Google Sitemap
- Does Google misjudge the importance of pages on your website?
Perhaps you have a product page in your catalogue that is linked to from lots of sites on the web and has a high PageRank. This page may often appear above your home page in the search results, even for searches about your company generally rather than that specific page. The sitemap protocol enables you to indicate the relative importance of the pages on your site. Like all such changes, a sitemap will be only one of a range of factors which influence a page’s position in a search engine’s results page.
- Do you have dynamic content that is not indexed by search engines?
There are many ways of tackling this problem; the sitemap protocol is a new tool to help ensure that your site is indexed in depth. As the sitemap protocol is new, it should not be relied upon to ensure deep indexing. Those with sitemaps registered with Google can access a diagnostics page showing the effectiveness of Google’s crawling.
- Do search engines crawl your site too much?
The sitemap protocol enables webmasters to suggest to search engine robots how often particular pages should be indexed. This could potentially reduce the bandwidth used by search engine robots on dynamic sites.
It is worth noting that much of this functionality has been available for a number of years: META tags can be used to indicate the frequency with which a page is expected to change, and RSS and RDF formats have provided lists of deep links to search engines. Google’s “sitemap” webpages note RSS, and even a plain text file with URLs on each line, as examples of ways of providing some of the information which can now be presented in a sitemap file to the search engines.
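The plain text alternative mentioned above is simple enough to sketch in a few lines of Python. The `build_text_sitemap` helper is hypothetical, and the clean-up rules it applies (trimming whitespace, dropping duplicates, rejecting relative URLs) are assumptions about what makes a tidy list rather than documented requirements.

```python
def build_text_sitemap(urls):
    """Return a plain-text URL list with one absolute URL per line."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue  # skip blanks and duplicates
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute URL: {url!r}")
        seen.add(url)
        lines.append(url)
    return "".join(line + "\n" for line in lines)


# Hypothetical URLs, for illustration only.
text_sitemap = build_text_sitemap([
    "http://example.com/",
    " http://example.com/about ",
    "http://example.com/",
])
print(text_sitemap, end="")
```

A list like this carries none of the optional hints (last modified date, change frequency, priority) that the XML format allows.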
Sitemap Resources
- https://www.google.com/webmasters/sitemaps - Google’s official site on the protocol.
- Google’s official sitemaps blog
- Google Groups - discussion on the sitemap protocol.
- Code for producing sitemaps:
There are a number of sources of information from which a sitemap can be generated. The simplest is the webserver’s filesystem; however, this is not appropriate for most modern dynamic, database-driven sites, for which generating the sitemap from the database will probably be the most appropriate course of action. A further source of information is webserver logs: generating sitemaps from server logs opens up the possibility of assigning priority to particular pages based on their popularity.
- Sci7’s Sitemap generator for WordPress (Plugin).
- Sci7’s Sitemap generator for Mercuryboard.
- Sitemap generator for vBulletin.
- Sitemap generator for Serendipity sites.
- Google’s Sitemap generator - written in Python - can extract data from log files or a directory structure.
- Sitemap for Movable Type, and another with better handling of the priority and last-modified fields.
- A Perl module establishing a Sitemap object. While this doesn’t provide much functionality at the moment, its use will aid the interoperability of scripts producing sitemaps compliant with the new standard.
- An ASP script for producing a Google sitemap based on the structure of the filesystem.
- Online tool for turning a list of links into an XML Google sitemap. As Google accepts a plain text file containing URLs as a sitemap, this is of questionable usefulness.
- Sitemap Generator for Xoops. (Note: installation required creating a copy of xml google.php in the Xoops root.)
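The idea of deriving priorities from server logs, mentioned at the top of this list, could be sketched as follows in Python. This is not how any of the tools listed above work. The log format (Common Log Format), the `priorities_from_log` helper, the sample log lines and the linear scaling rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches the request and status of a Common Log Format line,
# e.g. "GET /widget HTTP/1.1" 200 (the log format is an assumption).
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def priorities_from_log(lines, floor=0.1):
    """Map each successfully served path to a priority in [floor, 1.0]."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            hits[match.group(1)] += 1
    if not hits:
        return {}
    busiest = max(hits.values())
    # Hypothetical rule: scale linearly against the busiest page.
    return {path: max(floor, round(count / busiest, 1))
            for path, count in hits.items()}


# Made-up log lines using documentation-only IP addresses.
sample_log = [
    '192.0.2.1 - - [05/Jun/2005:16:46:00 +0000] "GET / HTTP/1.1" 200 5120',
    '192.0.2.1 - - [05/Jun/2005:16:46:02 +0000] "GET /widget HTTP/1.1" 200 2048',
    '192.0.2.2 - - [05/Jun/2005:16:47:10 +0000] "GET /widget HTTP/1.1" 200 2048',
    '192.0.2.2 - - [05/Jun/2005:16:48:05 +0000] "GET /missing HTTP/1.1" 404 512',
]
print(priorities_from_log(sample_log))
```

Requests that did not return a 200 status are ignored so that broken links never earn a place in the sitemap. A real script would probably also want to filter out requests made by the search engine robots themselves.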
Telling search engines about a sitemap
Currently Google is the only search engine with a mechanism for allowing webmasters to request that their sitemaps be read. Two options are provided. You can sign up for a Google account and add your sitemaps manually via this form on Google’s site; this option lets you monitor how often your sitemap is being reviewed by Google and view error notifications. Another option, designed for automated “pinging”, is available via:
http://www.google.com/webmasters/sitemaps/ping?sitemap=http://example.com/sitemap.xml
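A script that issues this ping would first need to build the request URL. The Python sketch below does only that; it sends nothing over the network. The endpoint is copied from the example above, the `build_ping_url` helper is hypothetical, and percent-encoding the sitemap address is an assumption on my part: it is the conventional way to pass one URL as a query parameter of another, and it matters once the sitemap URL contains its own `?` or `&`.

```python
from urllib.parse import quote

# Endpoint taken from the example in the text above.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url, endpoint=PING_ENDPOINT):
    """Return the ping URL with the sitemap address percent-encoded."""
    return f"{endpoint}?sitemap={quote(sitemap_url, safe='')}"


ping_url = build_ping_url("http://example.com/sitemap.xml")
print(ping_url)
```

A content management system could then fetch the resulting URL with any HTTP client each time it regenerates the sitemap.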
Sci7 is able to rapidly specify and deploy sitemaps complying with Google’s standard and can incorporate automated methods for updating the sitemaps and issuing notifications into existing content management systems. Sitemaps can also be used as part of a strategy for opening up deep content, such as that usually accessible only via a user’s search results, so that it can be indexed by all search engines. If you would like to talk about optimising your website for both users and search engines, please feel free to get in touch.
June 5th, 2005 at 4:46 pm
I have made a very simple explanation and form to create a sitemap. Would love your comments. No PHP or scripting knowledge is required.
Ben
http://www.googlesitemap.info
June 5th, 2005 at 5:32 pm
Sitemap_gen.asp
A simple ASP script (Using File System Object or database) to automatically produce sitemaps for a webserver, in the Google Sitemap Protocol (GSP)
http://www.iteam5.net/francesco/sitemap_gen/
June 6th, 2005 at 4:52 pm
Google SiteMaps for Movable Type - now with correct Last Modified dates
The two other stabs at creating Sitemap templates for Movable Type don’t take into account that you might want to assign different priorities and scan frequencies to your entries (as described here), as well as correcting ‘Last Modified date’ to inc…
June 10th, 2005 at 11:56 am
Hi,
I’m using Google Sitemap Generator for Wordpress. Very easy to install + use.
Benn
March 31st, 2006 at 12:23 pm
[…] t discusses Google sitemaps Arne Brachholds Google Sitemap Generator Plugin for WordPress Google sitemap generator for […]
May 26th, 2006 at 10:03 am
Using Google Sitemaps
If you’re launching a new site, using Google Sitemaps is a must. The reason is simple - it alerts Google about your existence. Including all your pages in the XML file ensures that Googlebot will have an easier time crawling
May 27th, 2006 at 12:06 pm
[…] ge, etc.". Fair enough that a sitemap should do some good to your blog. Learn how to generate a Google sitemaps and submit it to Google to help boosting your indexing po […]