Robots.txt: How important is it?
Fortunately, I am already using a self-hosted WordPress blog, and one more reason to be happy about that is that I can edit my robots.txt. If your blog is hosted on WordPress.com or Blogspot.com, this post does not apply to you, guys and gals.
This post aims to educate you about the importance of robots.txt for your blog in terms of SEO.
What is Robots.txt? Google Says:
How do I use a robots.txt file to control access to my site?
A robots.txt file provides restrictions to search engine robots (known as “bots”) that crawl the web. These bots are automated, and before they access pages of a site, they check to see if a robots.txt file exists that prevents them from accessing certain pages.
You need a robots.txt file only if your site includes content that you don’t want search engines to index. If you want search engines to index everything in your site, you don’t need a robots.txt file (not even an empty one).
So there it is: robots.txt is used to restrict bots from crawling and indexing parts of one's site. If you are a regular reader here, you know that I have been having a problem with the unusual string showing up in my Google search results, which is why I studied this matter. I later found out that this is a must-do for SEO purposes.
Here is how my robots.txt is set up. I changed it just yesterday, and only 4 hours ago Google crawled it without problems.
Sitemap: http://techathand.net/sitemap.xml.gz
User-agent: *
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-
Disallow: /page/*/
Disallow: /feed/
Disallow: /?wpcf7=json*
Disallow: /*/feed/rss/
Disallow: /*/feed/
Disallow: /trackback/
Disallow: /*/trackback/
Disallow: /category/
Disallow: /2007/*/
Disallow: /2008/*/
Disallow: /cgi-bin/
Allow: /wp-content/uploads/
Explanation:
Update: I have removed Disallow: /2007/*/ and Disallow: /2008/*/, thanks to Marhgil.
User-agent: * means that the rules below apply to all search engine bots.
The robots.txt shown above tells bots not to index the WordPress files and directories, paged listings, feeds, ?wpcf7 URLs, trackbacks (URL only), category pages, and the 2007 and 2008 archives, while allowing my uploads to be indexed.
In SEO, it is good for each page to be unique when a search engine bot visits it. Duplicates come from scrapers, copiers, and even from your own domain.
You may ask why I restricted feeds, pages, trackbacks, categories, and the 2007 and 2008 archives. It is because they are all archives, and bots see them as duplicate pages.
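To see what rules like these actually block, here is a rough Python sketch of wildcard matching. It assumes the behavior Google documents for its own crawler: * matches any run of characters, the longest matching rule wins, and Allow wins a tie. Other bots may behave differently. The function names and the sample paths (including the /2008/01/some-post/ permalink) are made up for illustration, and only some of the rules from my file are included.

```python
import re

def rule_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex.
    '*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules):
    """rules is a list of (directive, pattern) pairs.
    Assumed precedence: longest matching pattern wins, Allow wins ties.
    A path that matches no rule is allowed."""
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        # re.match anchors at the start of the path, like robots.txt prefixes
        if rule_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed

# A subset of the rules shown above (illustrative, not the full file)
RULES = [
    ("Disallow", "/wp-content/"),
    ("Disallow", "/wp-admin/"),
    ("Disallow", "/feed/"),
    ("Disallow", "/*/feed/"),
    ("Disallow", "/category/"),
    ("Disallow", "/2008/*/"),
    ("Allow", "/wp-content/uploads/"),
]

# Hypothetical paths, for illustration only
for path in ["/wp-content/uploads/photo.jpg",
             "/wp-content/themes/style.css",
             "/2008/01/some-post/",
             "/about/"]:
    print(path, "allowed" if is_allowed(path, RULES) else "blocked")
```

Note how a date-based permalink such as /2008/01/some-post/ is caught by Disallow: /2008/*/, which is the reason that rule was removed in the update above.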
Ultimate Tip
Go to your favorite blogs and check their robots.txt, and you will see how the experts in the field direct the bots. No wonder they are always on top.
But how? Use this syntax: [ Domain name/robots.txt ], e.g. http://www.abc.com/robots.txt
I saw lots of variations yesterday, and you will see lots of them too.
Combine their strategies and make your own that fits your blog. Note that you cannot view the file if they block access to it using .htaccess.
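If you want to check several blogs quickly, the lookup can be scripted. This is only a small sketch using Python's standard library. The function names robots_url and fetch_robots are my own, and www.abc.com is just the placeholder domain from the example above.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

def robots_url(site):
    """Build the robots.txt URL from a bare domain or any page URL on the site."""
    if "://" not in site:
        site = "http://" + site  # assume plain http when no scheme is given
    parts = urlsplit(site)
    # robots.txt always lives at the root of the host
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

def fetch_robots(site, timeout=10):
    """Download a site's robots.txt and return it as text.
    Raises an error from urllib if the file is missing or access is blocked."""
    with urlopen(robots_url(site), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

# Usage (needs a network connection):
#   print(fetch_robots("www.abc.com"))
```

A site that blocks the file, for example through .htaccess, will make fetch_robots raise an error instead of returning text.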
The Reality
If you want your site to be search engine friendly, you have to direct the bots to the parts of your blog that need to be visited. By doing this, you are not creating duplicate content on the web that pushes your posts into Google's supplemental index. This is a must, especially if you monetize your site. And if you are thinking of moving to a self-hosted blog, check how to do it in an SEO-friendly way.
I hope you liked it and that you will subscribe to my email or feeds for future SEO tips and tricks. Just don't forget to verify your email subscription.
Comments
15 Responses to “Robots.txt How important it is?”
8 pings
Links To This Post
-
Weekly Link Love from Pinoy Tech Guy | Pinoy Tech Guy on
January 25th, 2008 4:11 pm
[...] is one of the most important considerations on a self-hosted site. Kuya Dex shows us how important a Robots.txt is to protecting your site and guiding online bots to see only what you want them to [...]
-
Visitors : How do I Classified You ? » Tech At Hand on
January 30th, 2008 11:16 am
[...] Yes, Among all other visitors aside from Visitor via ads, The Search Engine Visitor has high percentage of clicking your ads, because usually those type of visitors are searching for [...]
-
7 must read Webmaster Central Blog Post » Tech At Hand on
February 15th, 2008 2:02 am
[...] blocked by robots.txt - I have also made a post regarding the robots.txt , and all I can say is, The experiment was a success [...]
-
Welcome Pacquiao And Marquez Fans » Tech At Hand on
March 17th, 2008 5:37 pm
[...] Robots.txt How important it is? [...]
-
SEO Tips : Changing Category, Tags and Search Pages : Tech At Hand dot Net | Philippine, Blogging, SEO & Tips on
June 7th, 2008 12:37 pm
[...] have also modified my robot.txt during implementation of this experiment and remove restriction in my Category pages and tag Pages. [...]
-
Better Check Your Robots.txt says Adsense : Tech At Hand dot Net | Philippine, Blogging, SEO & Tips on
June 11th, 2008 8:27 am
[...] have taught my visitors before on how to properly implement their Robots.txt and just today AdSense just warned their Publisher to check their Robots.txt file in order not to [...]
-
How to Setup a WordPress Blog | Niche Store Strategies on
July 31st, 2008 12:57 pm
[...] Robots.txt How important is it? [...]
-
How to Add a WordPress Blog to Your Site | Niche Store Strategies on
August 2nd, 2008 5:52 am
[...] Robots.txt How important is it? [...]
aww I must apply this one on my blogs!
@ Amy
I believe it must be installed on blogs for search engine ranking.
nice post, I actually edited my robots.txt file today for digitalfrap.com, I think I’ll edit again… hehe
i’m afraid your permalink pages will also be deindexed by google because of these lines:
Disallow: /2007/*/
Disallow: /2008/*/
Your permalink format falls under that category, so, it will not only deindex the archive page but also your permalink pages.
@ Marhgil
A very important correction, thanks. This is what is good about blogging: there are lots of fellow bloggers who are helpful to others.
Again Thanks..
@ Ordnacin
I do hope you find good information in this post.
Mine is similar to that but with less lines. ^_^
sitemap: http://silkenhut.com/blog/sitemap.xml
User-agent: *
Disallow: /cgi-bin/
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/author/
Disallow: */page/
Disallow: /blog/archives/
Disallow: */trackback/
Disallow: */feed/
Disallow: /blog/stats/
@ Allen
Thanks for sharing
hmm.. I screwed up my robots.txt a few weeks ago.. it's kind of scary to edit it again haha
@ sylv3rblade
That's why I posted mine. That's how I was able to fix it right away once Marhgil checked it.
thats all very nice information, thank you for that!! http://zobibiseo.harenatv.com .Great article. Honestly, I wish i could write like you.
@ Zobibi
Thanks for the compliment. I wish I could comment on your blog, but unfortunately I cannot understand it.
Thank you for sharing your techniques.. very talented website.. Keep up good work!http://zobibiseo.harenatv.com
I haven’t had any success with this robots.txt, maybe I’m doing it wrong. But so far, my posts are still OK in the SERPs.
@ derek
What seems to be the problem? Why do you say you weren't successful?