4/21/2007 Sitemap auto discovery via the robots.txt protocol
By: Melanie Prough

All the crawlers currently recognize the robots.txt protocol, so auto discovery was the natural evolution. The top 3 engines, Yahoo, Google, and Ask.com, have announced their support of the sitemap inclusion protocol. So supposedly there is no more submitting sitemaps manually, but I would still submit new sitemaps by hand for a few months to be safe. Vanessa Fox's 4/11/07 post covers the development and the protocol.

I played around with this for several hours, and to my dismay could not validate the robots file after adding the sitemap. After much searching, posting, and reading I found some help and suggestions. Putting everything I read into practice, below is how to add your sitemap without a syntax error.

Sitemap: http://www.your_domain.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/

OK, first thing: if your robots file starts with a title comment such as

# Robots.txt file for www.your_domain.com

then leave a blank line under it before adding the sitemap line. The sitemap line above is accurate for the sitemaps.org protocol. If you do not leave a blank line between the title and the sitemap directive, the file will not validate in Google's Webmaster Tools. To avoid any other possible syntax issues, I also left a blank line after the sitemap directive. In theory the blank lines mean nothing to a robot.

I went ahead and got on board with this, and I will keep this article up to date as the stats change in either direction. Going forward at this early stage is a risk, but also an opportunity for a lower-PR site to get a leg up.

Melanie Prough [Federation of Webmasters]
Feel free to reprint as long as credit & links remain intact.
© 2007-2008 Webchronic.com
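
For anyone who wants to generate the file rather than edit it by hand, here is a minimal Python sketch of the layout described in the article: title comment, blank line, sitemap directive, blank line, then the crawler rules. The helper name build_robots_txt and its parameters are my own illustration, not part of any protocol. The check at the end uses Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later), so it only shows that one parser reads the file as intended; it says nothing about how Google's Webmaster Tools validator will behave.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(domain, sitemap_path="/sitemap.xml", disallow=("/cgi-bin/",)):
    """Build robots.txt text using the spacing described in the article.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    lines = [
        f"# Robots.txt file for {domain}",          # title comment
        "",                                         # blank line under the title
        f"Sitemap: http://{domain}{sitemap_path}",  # sitemap directive
        "",                                         # blank line after the directive
        "User-agent: *",
    ]
    lines += [f"Disallow: {path}" for path in disallow]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt("www.your_domain.com")
print(robots_txt)

# Sanity check with the standard-library parser (not Google's validator).
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.site_maps())
print(parser.can_fetch("*", "http://www.your_domain.com/cgi-bin/test"))
```

Save the printed text as robots.txt in the root of your site. If the parser check does not list your sitemap URL, recheck the directive line before uploading.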