August 23, 2007
Sitemap Protocol | Deep Crawling and
Faster Indexing
I recently took part in a discussion regarding the use of sitemaps.
The debate was whether or not they are necessary or beneficial on smaller
sites. So today I am going to spell the process out, as I understand
it.
First of all, I will admit there are two
downsides to the sitemap protocol as I see it.
-
Your sitemap can provide an easy road map for scrapers. I recommend
hiding your sitemap: put it in an odd directory somewhere. This will
only help a little if your sitemap is listed in your robots.txt, as the
new auto-discovery calls for. The fact is, if a scraper wants your
content, they will just crawl it using one of the many free programs
available, and they will have it.
-
The second problem with the protocol is that only 50,000 URLs are
allowed per sitemap file. If your site has over 50K pages, I'm jealous,
but look into nested sitemaps (a sitemap index file that points to
several sitemap files).
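To make the nested sitemap idea concrete, here is a minimal Python sketch that splits a URL list into files of at most 50,000 entries and builds an index pointing at them. The example.com URLs, the sitemap1.xml style file names, and the helper names are my own placeholders, not something the protocol dictates.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit in the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Return sitemap index XML (as a string) listing each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,000 pages.
pages = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(pages)

# Hypothetical file names for the child sitemaps.
children = [f"http://www.example.com/sitemap{n}.xml"
            for n in range(1, len(chunks) + 1)]
index_xml = build_index(children)

print(len(chunks), [len(c) for c in chunks])  # 3 [50000, 50000, 20000]
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted.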
There are far more benefits to using the
sitemap protocol, which is now supported by Google, Ask.com, Yahoo, and
MSN / Live.
-
Sites with complicated structure or deep
clicks to internal pages are more easily crawled.
-
You will receive a more complete crawl, and save bandwidth by
setting a low priority on the pages that spiders don't need to visit
as often. You can set a different priority and change frequency for
every page, though the engines treat these values as hints.
-
When you update or add new pages, you can ping all of the above
engines except MSN / Live with a new sitemap. In my experience this
has brought a fairly quick crawl.
-
Using and submitting your sitemap makes many of the Google
Webmaster Tools reports work, such as 404s, broken links, and crawl
errors.
-
You can report a "last modified" date for each URL.
-
If you have a huge number of URLs, you can submit a sitemap of just
the changed URLs by creating a sitemap index file and using a lastmod
tag to spell out when each sitemap in it was last modified. On a
really large site this saves a ton of bandwidth.
-
There are many programs to convert your sitemap.xml to an HTML version
suitable for visitors. The real benefit of this is that you can link
to it from each page, perhaps in the footer, and it will seriously
reinforce your internal linking structure.
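As an illustration of that last point, here is a rough Python sketch of turning a sitemap.xml into an HTML link list. It is not how any particular converter works; the sample sitemap and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace used by sitemap protocol 0.9 files.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-page sitemap used as sample input.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""


def sitemap_to_html(xml_text):
    """Turn the <loc> entries of a sitemap into an HTML unordered list."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS)]
    items = [f'<li><a href="{escape(u)}">{escape(u)}</a></li>' for u in locs]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


html_list = sitemap_to_html(SAMPLE)
print(html_list)
```

A real visitor-facing page would want page titles as the link text instead of raw URLs, but the sitemap alone does not carry titles.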
So if you are considering making yourself
a sitemap, let's explore that. The first thing is to find a service
suitable for making your map. Here are some generators; check them
out. You want them to create sitemap protocol 0.9, and make sure they
will spider enough URLs for your site. If your site is too large for
these, then look at the
Google sitemap generator. It is also nice to have a generator that
pings the engines. Anyhow, explore them.
-
WordPress sitemap plugin
-
xml-sitemaps.com
-
Audit My PC - Has directions for encoding to ping too (click the
webmaster tool image)
-
GSite Crawler - PC Platform
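If you are curious what these generators produce, here is a minimal Python sketch of a protocol 0.9 sitemap with the optional lastmod, changefreq, and priority tags. The page list and the build_sitemap helper are invented for illustration; a real generator would discover the URLs by crawling your site.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for a list of dicts; only `loc` is required."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(root, "url")
        # Emit tags in the order the protocol lists them, skipping absent ones.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages: a busy home page and a rarely changed archive.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2007-08-23",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/archive.html",
     "changefreq": "yearly", "priority": "0.2"},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The low priority and yearly changefreq on the archive page are the kind of per-page settings that tell spiders where not to spend their time.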
Now that you have a sitemap, let's put it
in your robots.txt. The syntax is a single line: the word Sitemap, a
colon, and then the full URL of your sitemap file.

Leave a blank line above and below the
directive, and capitalize the word Sitemap. Before we go any further,
validate your
robots.txt and
sitemap.
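Here is a small Python sketch of adding that line to an existing robots.txt and checking that a parser can find it. The robots.txt content and sitemap URL are placeholders, add_sitemap_line is my own helper, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def add_sitemap_line(robots_txt, sitemap_url):
    """Append a Sitemap directive after a blank line, unless already present."""
    line = f"Sitemap: {sitemap_url}"
    if line in robots_txt.splitlines():
        return robots_txt
    return robots_txt.rstrip("\n") + "\n\n" + line + "\n"


# Hypothetical existing robots.txt.
robots = "User-agent: *\nDisallow: /cgi-bin/\n"
updated = add_sitemap_line(robots, "http://www.example.com/sitemap.xml")
print(updated)

# Sanity check: the standard-library parser should report the sitemap.
parser = RobotFileParser()
parser.parse(updated.splitlines())
print(parser.site_maps())  # ['http://www.example.com/sitemap.xml']
```

This only checks that the directive parses; it is no substitute for running both files through a proper validator.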
Now let's tell Google, Yahoo, and Ask.com
about your new sitemap.
Google has its own Webmaster Tools; just add your site and
verify it. Then, in the dashboard, click "add sitemap", enter
the path to your sitemap, and submit.
Yahoo Site Explorer is pretty much the same process as Google. Now
ping
Ask.com with its location. MSN / Live tools are in beta, to be
released in the fall.
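A ping is just a request to an engine's ping address with your sitemap URL encoded in the query string. Here is a hedged Python sketch of building one in the generic form the protocol describes. The searchengine.example.com endpoint is a placeholder and not any engine's real address; each engine publishes its own, and those have changed over time, so check their current documentation.

```python
from urllib.parse import quote


def build_ping_url(ping_endpoint, sitemap_url):
    """Build a ping URL with the sitemap location percent-encoded."""
    # safe="" forces ':' and '/' to be encoded as well.
    return f"{ping_endpoint}?sitemap={quote(sitemap_url, safe='')}"


# Placeholder endpoint: substitute the address the engine documents.
PING_ENDPOINT = "http://searchengine.example.com/ping"

ping_url = build_ping_url(PING_ENDPOINT, "http://www.example.com/sitemap.xml")
print(ping_url)
# http://searchengine.example.com/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.xml
```

The sketch stops at building the URL; requesting it, for example from a browser, is what actually notifies the engine.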
I have some helpful articles on this
subject you might find interesting. There is a little-known fact about
a type of sitemap Yahoo still uses;
read Yahoo Site Explorer.
To ROR or not to ROR: ROR sitemaps are currently supported by
Yahoo and Google.
I would be interested to hear about your sitemap
experiences. Comment below.
Peace and SEO
Melanie Prough
"Baby"
SEOCog.com

This work is licensed under a
Creative Commons Attribution-No Derivative Works 3.0 License.
© 2007-2008 Webchonic.com
Feel free to reprint as long
as credit & links remain intact.