January 16th, 2005

Block Search Engines From Indexing Site: META Tags & Robots.txt



You have created your very own personal home on the web. However, the site is turning up in search engines and web directories, revealing your personal details. It is simple to block search engines from indexing your website by adding a small META tag to your web pages and a small text file called robots.txt to your site. Protect your privacy.

Method 1
Add this meta tag to the head section of your web pages:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
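If you want to confirm that a page really carries the tag, a short script can look for it. The following is a minimal sketch in Python using only the standard library; the class name RobotsMetaFinder, the helper robots_directives and the sample page are made up for illustration and are not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case,
        # so <META NAME="ROBOTS"> arrives here as tag "meta", attribute "name".
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "NOINDEX, NOFOLLOW" into {"noindex", "nofollow"}.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Sample page for illustration only.
page = (
    "<html><head>"
    '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'
    "</head><body></body></html>"
)
found = robots_directives(page)
print(sorted(found))
```

A page without the tag gives an empty set, which is a quick sign that the tag is missing from that page.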

Method 2
Create a file named robots.txt (all lower-case) using any simple text editor like Notepad. Save it in the root directory of your domain to keep compliant search bots from accessing pages on your site. Type the details exactly as given below into the robots.txt file, with each directive on its own line.

To exclude all robots from the entire server
User-agent: *
Disallow: /

To allow all robots complete access
User-agent: *
Disallow:
Or create an empty “/robots.txt” file.

To exclude all robots from part of the server
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
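Before uploading the file, you can test rules like these on your own computer. Here is a rough sketch using Python's standard urllib.robotparser module; the domain www.example.com, the bot name ExampleBot and the helper allowed are placeholders chosen for this example.

```python
from urllib.robotparser import RobotFileParser

# The partial-exclusion rules, one directive per line.
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if the rules let the given (hypothetical) bot fetch the path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/index.html", "/private/diary.html", "/cgi-bin/form.pl"):
    print(path, allowed(path))
```

This only shows how a rule-following parser reads the file. A bot that ignores robots.txt will not be stopped by it.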

To exclude a single robot
User-agent: BadBot
Disallow: /

To allow a single robot
User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
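Rules for individual robots are easy to get wrong, so it helps to check how each robot is treated. The sketch below again uses Python's urllib.robotparser; the page URL is a placeholder, and BadBot simply stands in for any robot other than WebCrawler.

```python
from urllib.robotparser import RobotFileParser

# Two records separated by a blank line: the first applies to WebCrawler,
# the second to every other robot.
rules = """\
User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

page = "http://www.example.com/index.html"  # placeholder URL
webcrawler_ok = parser.can_fetch("WebCrawler", page)
other_ok = parser.can_fetch("BadBot", page)

print("WebCrawler allowed:", webcrawler_ok)
print("BadBot allowed:", other_ok)
```

With these rules the empty Disallow line lets WebCrawler fetch everything, while the catch-all record shuts out every other robot that follows the file.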

More information is available here.


Comments

  • 1. Brian_Evans | 17/11/05  #

    nice blog!

  • 3. conservatories | 11/07/07  #

    Useful but i found some places will ignore these codes.

  • 4. Lisa | 3/03/08  #

    I followed your directions and made a robot.txt file. It worked for yahoo, but not google. I’m afraid it might not be working on google because someone found my page using google before I made the robot.txt file and it is somehow archived with google. Is there a way to fix this? Google is the most common search engine of all. I really don’t want my personal page popping up on it. Thanks!

  • 5. QuickOnlineTips | 3/03/08  #

    Read how to prevent content from being indexed or remove content from Google’s index here.
