• The Web Robots Pages
  • The /robots.txt checker can check your site's /robots.txt file and meta tags. The IP Lookup can help you find out more about which robots are visiting your site.


  • Robots exclusion standard - Wikipedia, the free encyclopedia
  • The Robot Exclusion Standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web spiders and other web robots from ...


  • The Web Robots Pages
  • About /robots.txt In a nutshell. Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.


  • www.google.com
  • User-agent: * Disallow: /search Disallow: /groups Disallow: /images Disallow: /catalogs Disallow: /catalogues Disallow: /news Allow: /news/directory
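
As a rough illustration of how a crawler might read rules like the ones in the www.google.com snippet above, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent name `ExampleBot`, the helper `build_parser`, and the two test URLs are assumptions made for the example, not part of the original listing. How an overlapping `Disallow: /news` and `Allow: /news/directory` pair is resolved differs between parsers, so the sketch deliberately checks only unambiguous paths.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the www.google.com snippet, one directive per line.
GOOGLE_RULES = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Allow: /news/directory
""".splitlines()


def build_parser(lines):
    """Hypothetical helper: parse robots.txt lines without any network access."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(GOOGLE_RULES)

# "ExampleBot" is a made-up crawler name; it falls under the "User-agent: *" group.
search_allowed = parser.can_fetch("ExampleBot", "http://www.google.com/search?q=robots")
about_allowed = parser.can_fetch("ExampleBot", "http://www.google.com/about")

print("may fetch /search:", search_allowed)
print("may fetch /about:", about_allowed)
```

A real crawler would more likely call `set_url()` and `read()` to download the live file; parsing a fixed list of lines keeps the sketch self-contained.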


  • www.whitehouse.gov
  • User-agent: * Crawl-delay: 10 Sitemap: http://www.whitehouse.gov/feed/media/video-audio
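
The `Crawl-delay` and `Sitemap` directives in the www.whitehouse.gov snippet above can be read the same way. The following is a small sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available) and a made-up crawler name `ExampleBot`; it only shows how the two values could be extracted, not how any particular crawler honors them.

```python
from urllib.robotparser import RobotFileParser

# Directives taken from the www.whitehouse.gov snippet, one per line.
WHITEHOUSE_RULES = [
    "User-agent: *",
    "Crawl-delay: 10",
    "Sitemap: http://www.whitehouse.gov/feed/media/video-audio",
]

parser = RobotFileParser()
parser.parse(WHITEHOUSE_RULES)

# "ExampleBot" is a hypothetical name; it is covered by the "User-agent: *" group.
delay = parser.crawl_delay("ExampleBot")  # seconds to wait between requests
sitemaps = parser.site_maps()             # list of Sitemap URLs, or None if absent

print("crawl delay:", delay)
print("sitemaps:", sitemaps)
```

A polite crawler could sleep for `delay` seconds between requests to the host, though `Crawl-delay` is a non-standard extension and not every robot respects it.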


  • Robots.txt Generator - McAnerin International Inc.
  • robots.txt generator designed by an SEO for public use. Includes tutorial.


  • Block or remove pages using a robots.txt file - Webmaster Tools Help
  • A robots.txt file restricts access to your site by search engine robots that crawl the web. These bots are automated, and before they access pages of a site, they check to see if a ...


  • Robots.txt and Search Indexing - Search Tools Report
  • Information on using the robots.txt file to keep web crawlers, spiders and robots from indexing certain sections of a site.


  • Introduction to "robots.txt"
  • Learn about robots.txt and how it can be used to control what search engines and crawlers do on your site.


  • en.wikipedia.org
  • # robots.txt for http://www.wikipedia.org/ and friends # # Please note: There are a lot of pages on this site, and there are # some misbehaved spiders out there that go _way_ too fast.