Search Engine Optimization-Robots.txt – Telling the search engines what they can and cannot index
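As marketers, we spend a lot of time trying to get into the search engines… but sometimes the search engines index things we don’t want them to. Here’s how to tell the search engines what they can’t look at.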
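So your site has made it into the search engines: you’ve got Googlebot, Slurp, MSNBot and a whole host of other spider-like creatures crawling your site on a regular basis. You can find your site by typing its name into the search engines… although it won’t be showing up on the front page for your keywords just yet.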
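But what if you’ve got sections of your website that you don’t want indexed? Remember that search engine spiders just crawl the internet, following links and putting whatever they find into their databases. If there’s a link to the page… the spiders will find it. They don’t know that your photos are strictly for friends only, or that there are pages on your website you’d really rather not have in a search engine, being that they are an archive, like long-expired special offers.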
That is, of course, unless you tell them not to.
That’s where robots.txt comes in handy.
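Robots.txt is a plain text document that sits in the root of your website and tells "robots" which parts of your website they cannot access. When one of these "robots" visits your site, the first thing it does is look for the robots.txt file. Well-behaved robots adhere to your requests, and won’t index the pages you’ve disallowed.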
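To make that behaviour concrete, here is a minimal Python sketch of what a well-behaved robot does: work out where robots.txt lives, then check each page against the rules before fetching it. It uses Python's standard urllib.robotparser module. The helper names (robots_url, may_fetch), the bot name ExampleBot and the sample rules are assumptions made up for illustration, not anything a particular search engine runs.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Where a robot looks for the rules: /robots.txt at the site root."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def may_fetch(robots_txt, agent, page_url):
    """True if the given robots.txt text lets `agent` fetch `page_url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, page_url)


# Hypothetical file contents and page addresses, for illustration only.
SAMPLE = "User-agent: *\nDisallow: /personal/\n"

print(robots_url("http://www.yoursite.com/personal/diary.html"))
# http://www.yoursite.com/robots.txt
print(may_fetch(SAMPLE, "ExampleBot", "http://www.yoursite.com/personal/diary.html"))
# False
print(may_fetch(SAMPLE, "ExampleBot", "http://www.yoursite.com/index.html"))
# True
```

A real crawler would download the file from the address robots_url returns. The sketch passes the text in directly so it runs without a network connection.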
How do I make a robots.txt file?
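Pretty easily. Open up a text editor (Notepad is good for all Windows users… it lurks in the Start menu. Don’t use a program like Word, because it adds lots of hidden formatting bits. You can use an HTML editor to create your robots.txt file, but make sure you’re in code view, make sure you delete everything from the page, and save it as a .txt file, not an .html file.)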
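Decide which areas of your website you want the search engines to index, and which ones you don’t want them going through. Also decide if there are any spiders you would rather not have on your site.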
Write this information into your robots.txt file.
To block all spiders from all of your website:
User-agent: *
Disallow: /
To allow all spiders access to all of your site:
User-agent: *
Disallow:
To block all spiders from certain directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /personal/
Disallow: /photos/staffchristmasparty/
To block a single spider:
User-agent: Googlebot
Disallow: /
To allow a single spider, while blocking all others:
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
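Before uploading a set of rules, it can help to check that they do what you expect. The sketch below feeds the "allow one spider, block the rest" rules to Python's standard urllib.robotparser and asks whether two spiders may fetch a page. The page address and the name SomeOtherBot are made up for illustration. Real search engines use their own parsers, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as they would appear in robots.txt,
# one instruction per line, with a blank line between the two groups.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://www.yoursite.com/photos/index.html"  # hypothetical page
googlebot_ok = parser.can_fetch("Googlebot", page)
other_ok = parser.can_fetch("SomeOtherBot", page)

print(googlebot_ok)  # True: the empty Disallow lets Googlebot in everywhere
print(other_ok)      # False: every other spider falls under "Disallow: /"
```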
Tips:
- You need a new line for each instruction.
- Blank lines are used to separate sets of instructions (as in the last example).
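- The * in the User-agent line has a special meaning in robots.txt and can’t be used as a general wildcard. If you wanted to block all the GIF images on your website, you couldn’t go Disallow: *.gif. That won’t work under the original robots.txt standard, although some crawlers accept wildcard patterns as their own extension.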
- Your file must be called robots.txt, all lower-case.
- Your file must be in the root directory of your website: www.yoursite.com/robots.txt. That’s the first place spiders look when they visit your site, and they won’t find it if you put it somewhere else.
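The wildcard tip makes more sense once you see how Disallow values are matched under the original standard: a rule blocks any path that starts with it, and nothing more. The function below is a simplified sketch of that prefix matching, written for illustration only. The name is_blocked and the sample paths are assumptions, and crawlers that support wildcard extensions will behave differently.

```python
def is_blocked(path, disallow_rules):
    """Original-standard matching: a rule blocks any path that starts with it.

    Empty rules are skipped, because an empty Disallow blocks nothing.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# "*.gif" is treated as literal text, and no path starts with "*.gif".
print(is_blocked("/images/logo.gif", ["*.gif"]))     # False

# Blocking the directory that holds the images does work.
print(is_blocked("/images/logo.gif", ["/images/"]))  # True
```

So under the original rules, the practical way to keep images out is to put them in their own directory and disallow that directory.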
Now save the file and upload it to your website.
Robots.txt and sitemap
If you’ve read our post on creating sitemaps, you’ll know that your robots.txt file is a handy place to let the search engines know where that sitemap is.
All you have to do is leave a blank line after the last instruction in your robots.txt file, then paste in this line:
Sitemap: http://www.example.com/sitemap.xml
If you’ve got more than one sitemap, you can add more than one line.
Sitemap: http://www.example.com/sitemap1.xml
Sitemap: http://www.example.com/sitemap2.xml
Sitemap: http://www.example.com/sitemap3.xml
This way you don’t need to tell every search engine where to find your sitemap. They’ll find it as soon as they read your robots.txt file, which every well-behaved bot does when it visits your site anyway.
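As a rough illustration of how a bot picks up those Sitemap lines, the sketch below reads them with Python's standard urllib.robotparser (the site_maps() method needs Python 3.8 or later). The file contents are a made-up example, and real search engines use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one ordinary rule, a blank line, then the sitemaps.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/

Sitemap: http://www.example.com/sitemap1.xml
Sitemap: http://www.example.com/sitemap2.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
# ['http://www.example.com/sitemap1.xml', 'http://www.example.com/sitemap2.xml']
```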
Do all spiders pay attention to the robots.txt file?
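No. Most search engine spiders do, but the search engine spiders aren’t the only spiders. There are others out there as well, such as the ones that harvest content for email addresses to add to spam lists. These ones tend to pay no attention to your robots.txt file, so there’s not much point trying to block them with it.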
Your robots.txt is accessible!
Don’t try to use your robots.txt file to hide secret content on your site… this file is able to be viewed by anybody, simply by typing www.yoursite.com/robots.txt into a browser. They can see exactly which things you don’t want indexed!
If there’s content on your website that you really, really don’t want anyone seeing, your best bet is to password-protect the directory. There should be an option to do this in your hosting control panel (cPanel or similar).
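To see why robots.txt is no hiding place, consider how little effort it takes to list everything a site has disallowed. The short sketch below pulls the Disallow values out of a robots.txt text. The function name disallowed_paths and the sample contents are assumptions for illustration.

```python
def disallowed_paths(robots_txt):
    """List every non-empty Disallow value found in a robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments and padding
        field, _, value = line.partition(":")     # split "Field: value"
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents, as anyone could read them in a browser.
SAMPLE = (
    "User-agent: *\n"
    "Disallow: /cgi-bin/\n"
    "Disallow: /photos/staffchristmasparty/\n"
)

print(disallowed_paths(SAMPLE))
# ['/cgi-bin/', '/photos/staffchristmasparty/']
```

Anything listed there is effectively a signpost to the content, which is why password protection is the safer choice for private material.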