How To Use a Robots.txt File To Prevent Duplicate Content In WordPress From Indexing In Google
WordPress creates a lot of duplicate content. I don’t believe that is as much of an issue today as it was a few years ago, because WordPress is obviously very popular and I don’t think Google is going to penalize millions of blogs for something that publishers aren’t aware of. Instead, Google does its best to figure out which version of the duplicate content is the main copy. Leaving that choice to Google can produce less than desired results.
In this video, I show you how you can use a robots.txt file to prevent duplicate content on your WordPress blog. Also, if you would like to view the video in full quality and even download a copy to your computer, I have it available in my membership site.
Here are the footnotes as promised in the video:
- http://www.garryconn.com/robots.txt
- http://www.robotstxt.org/
- http://www.google.com/search?q=how+to+create+a+robots.txt+file
- http://www.thesitewizard.com/archive/robotstxt.shtml
Here is a screenshot of an example of a /comment-page-1 entry indexed in Google:

One of the things that has been bugging me the most is these strange /comment-page-1 entries that have been getting indexed in Google. This seems to have started around WP 2.71. At times it is very annoying because the comment-page-1 version of my post will get indexed and NOT the actual post itself.
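As a rough sketch of the kind of rules that could target those entries, here is a small Python check. The rules in `ROBOTS_TXT` are assumptions made up for illustration and are not copied from my own robots.txt file linked above; the helper names (`disallow_patterns`, `pattern_to_regex`, `is_blocked`) are hypothetical too. It treats `*` as "any characters" and a trailing `$` as "end of URL", which is how Google documents its pattern matching, and it ignores `Allow` lines and separate `User-agent` groups to stay short.

```python
import re

# Hypothetical rules for illustration only (an assumption, not the author's file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /*/comment-page-
Disallow: /category/
Disallow: /tag/
"""


def disallow_patterns(robots_txt):
    """Collect Disallow path patterns (this sketch ignores User-agent groups)."""
    patterns = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            patterns.append(value.strip())
    return patterns


def pattern_to_regex(pattern):
    """Translate a path pattern: '*' is any characters, trailing '$' is end of URL."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where a '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, robots_txt=ROBOTS_TXT):
    """Return True if any Disallow pattern matches the start of the URL path."""
    return any(
        pattern_to_regex(p).match(path) for p in disallow_patterns(robots_txt)
    )


if __name__ == "__main__":
    for path in ("/2009/03/my-post/", "/2009/03/my-post/comment-page-1/"):
        print(path, "blocked" if is_blocked(path) else "allowed")
```

With these example rules the post itself stays crawlable while its comment-page-1 copy does not. Keep in mind that not every crawler supports the `*` wildcard, so test any rule like this before relying on it.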

12 comments
I can’t seem to find the video in the membership forum with the other videos. Am I looking in the right place?
I have duplicate pages on my website. (pages repeated in different categories).
It seems Google is selecting one or the other. Will Google penalise me, or does it know this is what happens on the Internet and get on with its job?
I haven’t loaded it yet.
From what it seems, Google will choose one or the other for indexing as well as ranking. Sometimes they index both and then choose which one to rank. I don’t think there is a MAJOR issue of being penalized because there are millions of blogs on the Internet and most publishers don’t know anything about duplicate content, SEO, or things like that. If you want to control what gets indexed and what doesn’t, then you can do that with your robots.txt file.
I have been a regular reader of your blog but it disappoints me that you haven’t published the video freely but on the membership site.
Thanks for that post. I too have duplicate pages on some of my web properties, using different keywords each time. Do I need to completely change all the articles I put up?
What do you mean exactly?
OMG… can you like give me maybe a day before you jump my ass. Umm… I get kinda busy too you know. LOL!!!
This information is available for free. Just google the post title and you’ll find tons of information.
Ya, Garry has been saying in many posts that he doesn’t often make posts to sell products, but almost every second post is selling or promoting some kind of product or service.
Just google the post title and you’ll find tons of information.
Ah ok,
I think I will leave it for now. Maybe chat to the web designer next time I see him.
Thanks.