Increase Google Rankings With Robot Control
Do you want to increase your Google rankings? I have discovered an easy way to get your self-hosted WordPress blog and its posts a better ranking in Google. A self-hosted WordPress blog can consist of hundreds, if not thousands, of individual posts and pages. Google is standing by to rank every one of them. All you have to do is control that robot!
Robot control is a little bit like traffic control. With traffic control you have stop signs, traffic lights and crossing guards that control automobiles on the streets. On the Internet, instead of cars traveling throughout your blog, there are people and there are robots. Controlling people and robots on the Internet and within your site is actually much easier than controlling cars on the streets. In the real world, there are many different street signs that you need to recognize and understand before you can safely drive on the streets.
Traffic Control In The Real World
In order to be a safe driver on the streets you have to know what stop signs mean, and you have to understand what Yield, Merge and Dead End stand for. If you don’t understand the signs and the commands, you could cause an accident and get yourself and other drivers seriously injured or even killed. Driving on the road without understanding the rules of the road is dangerous. To understand all the rules takes time. You have to study the driving book and gain actual experience. Luckily for you, understanding how to control people and robots on the Internet is much easier, and a failed attempt is far less risky!
Traffic Control In The Internet World
There are two elements that can be controlled on the Internet: people and robots. Generally speaking, most bloggers already know how to control and direct people on their blogs. This is done by placing emphasis on certain sections within their blogs or putting various other things in the spotlight for their visitors and readers to see. What is not apparent is that there are actually two versions of your blog. You have the graphical and pretty version that people see, and you have the coded, text-based version that robots see. Moreover, the order of importance that is visible to people will in most cases be different from the order the robots see. In other words, if you place top value on your category section on the graphical side, this section might actually be presented to robots midway down, possibly last, or even not at all.
The Robots.txt File
The solution to coaching robots along the right paths is to create a robots.txt file. This file actually has many uses, and over the next few weeks I am going to show you all the functions of a robots.txt file. However, for the time being, I will explain how the robots.txt file helps with controlling robot traffic.
Controlling robot traffic is much easier than controlling street traffic. The only two commands you need to memorize are allow and disallow. In other words, in your robots.txt file, you specifically tell robots where they are allowed to visit and where they are not. Generally speaking, it is as simple as that.
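To see how the two commands behave, here is a minimal sketch using Python's standard urllib.robotparser module. The /private/ folder, the welcome page and the example.com domain are made-up names for illustration only; they do not come from my site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: everything under /private/ is off limits,
# except the one page that is explicitly allowed first.
RULES = """\
User-agent: *
Allow: /private/welcome.html
Disallow: /private/
"""


def build_parser(rules_text):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(RULES)

for path in ("/", "/private/", "/private/welcome.html"):
    url = "https://example.com" + path
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(path, "->", verdict)
```

Python's parser applies the first rule that matches, which is why the Allow line is listed before the broader Disallow line in this sketch. Other robots may resolve overlapping rules differently, so it is safest to keep rules simple.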
If you run a self-hosted WordPress blog, you can create a robots.txt file and upload it into your root directory. A simple version of this file can give instructions to robots that tell them where they are not allowed to enter. See example:
In this example, I am telling all robots NOT to access the various folders shown. For example, I don’t want robots entering and indexing content within my /feed directory. The reason I don’t want robots to access this directory is that my content is already available in my category directory. When I purposely ban robots from accessing sections of my site that produce duplicate entries of my content, I am able to direct them to the sections that are important to me. This is essential when you want to ensure that both people and robots focus on the sections that matter most.
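As a rough sketch of the idea, the following Python snippet builds a small robots.txt that bans all robots from a /feed/ folder and a /comments/ folder, then checks the result with the standard urllib.robotparser module. The helper name make_robots_txt, the exact folder paths and the example.com URLs are assumptions for illustration, not a copy of my actual file.

```python
from urllib.robotparser import RobotFileParser


def make_robots_txt(blocked_dirs, user_agent="*"):
    """Return robots.txt text that disallows each folder for user_agent."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {folder}" for folder in blocked_dirs]
    return "\n".join(lines) + "\n"


# Assumed folder names; adjust them to match your own blog.
robots_txt = make_robots_txt(["/feed/", "/comments/"])
print(robots_txt)

# Check the rules the way a well-behaved robot would.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())

feed_ok = checker.can_fetch("Googlebot", "https://example.com/feed/")
post_ok = checker.can_fetch("Googlebot", "https://example.com/category/seo/my-post/")
print("feed allowed:", feed_ok)
print("post allowed:", post_ok)
```

The printed text is what you would save as robots.txt and upload into your root directory. Keep in mind that Python's checker matches rules by path prefix, so a rule for /feed/ in this sketch covers only URLs that begin with /feed/, not a feed URL nested under an individual post.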
Learn More About Robots and the Robots.txt file
In the upcoming weeks, I will be showing you everything there is to know about robots and the robots.txt file. However, in the meantime, if you would like to venture out and learn more on your own, I recommend reading the following articles and sites:
12 comments
Garry, I don’t really understand the robots function. Why would you want to restrict search engines from any page? Surely having them index, say, feeds, comments and categories is better than just categories, even if they have the same information. Categories still get indexed anyway, don’t they, and wouldn’t get a ‘high priority indexing’?
I can see why you would want to restrict, say, private photos etc., but can’t figure out why you would restrict other public stuff.
Your example says Disallow comments. Would that act the same as no-follow for fellow commenters (link backs etc. stopped)?
cheers,
– GoldCoaster
GoldCoaster, please review this screen shot:

The above is a screenshot of the results page on Google when I ask Google to search my site for the terms: “MyBlogLog Got Hacked”. This search term exactly matches the title of a post I wrote back on May 12th, 2007. In the Google results, notice that the URL of the first listing is actually the comments feed URL, and that the second listing is the actual post page. The second listing is also in the supplemental index, while the first listing is in their main index.
I am glad that I am responding to this comment, because I just modified my robots.txt file again to disallow */feed from being indexed.
In this example, you can see that I have allowed Google to index both the .php/feed/ URL as well as the actual post page URL. Google has said in the past that people shouldn’t worry about indexing and that people should leave it in the hands of Googlebot. However, what Google is failing to tell us is that Googlebot is like a dog… it’s smart, but not that smart. You can teach a dog how to sit and you can teach a dog how to roll over and fetch… but without training you can’t expect the dog to do these tasks… additionally, you can’t expect a dog to make major decisions either. Likewise, you can’t expect Googlebot to decide which URL to index when presented with duplicate content.
By using the robots.txt file you tell Googlebot what to do… in other words, if you have three sticks in the yard, and you tell your dog to fetch… instead of giving the dog the option of choosing any of the three… you block access to two of the sticks, which leaves the dog only one stick to fetch.
Now, the Disallow: /comments/ does not make it so yours or any of my readers’ comments aren’t indexed…. I am very generous with sharing link love. What this command does is tell robots NOT to crawl and index content through the /comments directory on my server… instead I tell the robots to find my content through my home page and my category pages. From there, when the robots index my individual post pages… I have the “DO FOLLOW” plugin installed, so the robot will visit and index your site from my individual post pages. Not only will the robot reindex your site, but it will also add a backlink to the total number of backlinks assigned to your site.
I have carefully plotted out how I have my site coded and how robots travel thru my site. Sharing the traffic ride and index ride with my readers is one of my most important priorities. In addition to all mentioned above, Google will index your site from the “Top Contributors” section, the “Most Recent Comments” section, as well as my brand new “Links Page”. Additionally, with the links page… I have submitted and maintain a separate sitemap alone… just for my links page…
I love link love…and I love giving it out to my readers and contributors.
Sorry for the long response back, I hope that this clarifies things.
Thanks for the explanation, Garry. I understand completely now.
To use your dog example: I was thinking not that the dog would choose one stick but that it would choose all three, and how could that be a bad thing? I understand why now. You want people to click the article, not, for example, the feed of the article.
cheers,
– GoldCoaster
Some extra useful links:
http://www.seomoz.org/blog/how-to-deal-with-pagination-duplicate-content-issues
http://www.seomoz.org/blog/the-illustrated-guide-to-duplicate-content-in-the-search-engines
The last one especially is very useful in illustrating the problems with duplicate content.
I understand the “why” but I’m a long way from understanding the “how.” Guess I’ll go follow your links and see what I can find out. If you ever want to experiment on another blog, I volunteer! lol
Joost,
Awesome references, thanks so much for taking the time to include them.
Yes. I would rather have my wonderfully written article find a home in the SERPs rather than the post feed url, or for that matter the comment feed url.
It has nothing to do with not allowing free and powerful link love to my commenters… hopefully with all the work I have done last week to the site, the valve should be wide open for you guys! So, comment away and know that these posts I write have open arms to Google for indexing and open arms for commenters who need a lot of link love.
Tell me more about what you would like help with… I totally invite you to dig around my previously written articles… however, if you don’t know what you are looking for, that can be frustrating too.
What can I help you with, my friend?
I am no hand at editing my template, and don’t understand enough about what I’m doing to even tell you!! I keep paying the woman who designed my template to add and change (though I can edit my sidebar and a little of my header), because I’m so technologically challenged. I’m certain I need to direct those robots, but again I just think I need to read some more. That’s a kind offer of help, though, and much appreciated. If I can ever figure out what I need…I’ll come begging. Meanwhile, I’ll just watch for other wisdom that you impart.
Thanks, Garry
[...] Increase Google Rankings With Robot Control Garry Conn is starting a series about the robots.txt file. This article was an excellent start to his series. Go check it out. [...]
Good information, you should also cover the AutoDiscovery feature of the Robots.txt file that Google, Yahoo, ASK and MS have agreed to support.
Google link
Oh yes, this is indeed a great tip. I have been researching robots but had no idea until now.