To control how and when your website is crawled, create a robots.txt file in the top-level (root) directory of your website. In the robots.txt file, you can specify which web crawlers to allow or block. Note that while MSNBot complies with the standards for robots.txt, not all web crawlers comply.
To conform to the Robots Exclusion Standard, MSNBot looks for a file named exactly robots.txt. When you create the file, make sure that it has that name. Crawling and indexing restrictions may not work correctly if the file has a different name, such as robot.txt.
Each time MSNBot crawls your website, it looks in your web server's root directory for a robots.txt file. If the file exists, MSNBot checks whether MSNBot is an allowed user agent and whether any crawling or indexing restrictions have been set.
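The check described above can be approximated with Python's standard urllib.robotparser module. This is only a sketch of how a compliant crawler might consult the file, not MSNBot's actual implementation; the may_crawl helper, the EXAMPLE_RULES text, and the www.example.com URLs are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # Parse the file contents directly so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: keep msnbot out of one directory only.
EXAMPLE_RULES = """\
User-agent: msnbot
Disallow: /private/
"""

print(may_crawl(EXAMPLE_RULES, "msnbot", "http://www.example.com/index.html"))
print(may_crawl(EXAMPLE_RULES, "msnbot", "http://www.example.com/private/a.html"))
# An empty robots.txt places no restrictions on any crawler.
print(may_crawl("", "msnbot", "http://www.example.com/private/a.html"))
```

A real crawler would first download the file from the root of the site; the sketch skips that step and only shows the decision made once the contents are known.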
To set which web crawlers can access your website, use the syntax in the table below for your robots.txt file. MSN Search also includes image searching provided by Picsearch. If you do not want your images indexed, you can block the Picsearch crawler, Psbot, as described in the following table.
Text strings in the robots.txt file are not case-sensitive.
To do this: | Use this syntax: |
---|---|
Allow all robots full access and prevent "file not found: robots.txt" errors | Create an empty robots.txt file |
Allow all robots complete access | User-agent: *<br>Disallow: |
Allow only MSNBot access | User-agent: msnbot<br>Disallow:<br>User-agent: *<br>Disallow: / |
Exclude all robots from the entire server | User-agent: *<br>Disallow: / |
Exclude only MSNBot | User-agent: msnbot<br>Disallow: / |
Exclude only Psbot (Picsearch) | User-agent: psbot<br>Disallow: / |
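Before you publish a robots.txt file, it can help to confirm that the rules do what you expect. The following sketch tests two rule sets with Python's standard urllib.robotparser module. It assumes the standard Robots Exclusion Standard form in which a User-agent line is followed by a Disallow line; the is_allowed helper, the constant names, and the www.example.com URL are hypothetical, and Python's parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: allow only MSNBot, exclude every other robot.
ALLOW_ONLY_MSNBOT = """\
User-agent: msnbot
Disallow:

User-agent: *
Disallow: /
"""

# Assumed rule set: exclude only Psbot (Picsearch).
EXCLUDE_ONLY_PSBOT = """\
User-agent: psbot
Disallow: /
"""


def is_allowed(rules: str, user_agent: str,
               url: str = "http://www.example.com/") -> bool:
    """Return True if rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("msnbot", "psbot"):
    print(agent,
          is_allowed(ALLOW_ONLY_MSNBOT, agent),
          is_allowed(EXCLUDE_ONLY_PSBOT, agent))
# msnbot True True
# psbot False False
```

An empty Disallow value means that nothing is disallowed, which is why the first rule set grants MSNBot full access while the wildcard group blocks everyone else.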