How to Make a WordPress Blog Duplicate Content Safe ...

In one of my recent posts I wrote about the duplicate content issue. This topic is especially important to me since my blog uses the WordPress content management system which, when used with the default configuration, is not duplicate content proof. In fact, this CMS is capable of rendering almost 100% of your content duplicate. As usual, the faults of the system are rooted in its advantages. WordPress has many features that facilitate blogging and linking, such as RSS feeds for posts and comments, trackback URLs, monthly archives and so on. At the same time, this variety of URLs returning similar or identical pages is a clear case of duplicate content.


WordPress And Duplicate Content

The first evidence of duplicate content produced by WordPress can be found in your sidebar: category pages and monthly/daily archives. Category pages list the articles posted under the same topic – a category. Such pages have no unique content; they are just a collection of your previous posts. Monthly and daily archives likewise simply group your previous articles by posting date. When you have only one post on a given day, the archive page for that date and the post itself can be completely identical.

The next case of duplicate content is even more prominent: your home page itself. If it shows the full text of your posts rather than excerpts, then it duplicates your post pages. This also applies to the ‘next/previous entries’ pages – those accessible via /page/2, /3, /4 etc.

Feeds. Search engine spiders crawl all the content they can reach and of course this includes RSS feeds too. The additional problem with them is that Google may choose to display your RSS URL in the search results over the link to the original post. In this case the user who clicks this result will see an XML formatted page which is not ‘human-friendly’.

Trackback URLs. Many WordPress templates add trackback links after posts. These links enable authors to track who links to their posts. Usually, if your post URL looks like ‘www.yoursite.com/2006-11-30/yourpost/’, its trackback URL will be ‘www.yoursite.com/2006-11-30/yourpost/trackback/’.

Identical meta-description. By default WordPress doesn’t provide a tool to add unique meta description tags to your posts, so they either have none or share a single site-wide description. Having no meta description at all is a disadvantage, as a properly written one can make your snippet stand out in a SERP. Having an identical description for all your pages is a threat, as Google might filter them out as too similar. (see a thread here)

Because of duplicate content, Google search can return less desirable URLs (such as feeds or archives instead of the original posts); your pages can be dropped from the index, or placed in the supplemental results, which are rarely displayed to users.

Solving the Duplicate Content Issue in WordPress

Adding ‘noindex, follow’ tags

What can you do to avoid this problem? You can tell the search engines which URLs to index by using a ‘noindex, follow’ meta tag, robots.txt exclusions or 301 redirects. Let’s say you want Google to index your front page, posts, single pages and category pages, and to keep archives and ‘next entries’ pages (/page/2, /3, …) out of the index. Feeds do not load header.php, so they are handled with robots.txt further down. To do this, add the following code to your header.php:
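Code:

     <?php
     // WordPress sets $paged to the page number of a paginated listing (/page/2, /3, ...)
     if ((is_home() && ($paged < 2)) || is_single() || is_page() || is_category()) {
         echo '<meta name="robots" content="index,follow" />';
     } else {
         echo '<meta name="robots" content="noindex,follow" />';
     }
     ?>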

For those not familiar with editing templates in WordPress: in your dashboard, click the Presentation menu item and, once the new page opens, click Theme Editor. In the Theme Editor choose ‘header.php’ and paste the above code into the editor form. The code can be inserted anywhere between the <head> and </head> tags.

Here the ‘index,follow’ tag is added to the home page but not to its ‘next entries’ pages (is_home() && ($paged < 2)); to your posts (is_single()); to standalone pages, like ‘About me’, if you created any (is_page()); and to category pages (is_category()). If you don’t want your categories to be indexed, just delete || is_category(). All other pages will get <meta name="robots" content="noindex,follow" />. They will not be indexed, but crawlers will still follow their outgoing links.
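To check that a template change like this took effect, you can look for the robots meta tag in the page source. The following is a minimal Python sketch, not part of WordPress: the names RobotsMetaParser, robots_directives and is_indexable are made up for illustration, and the sample pages are hand-written strings rather than real responses from a blog.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so fall back to ""
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # Split "noindex,follow" into ["noindex", "follow"]
            self.directives.extend(
                part.strip().lower()
                for part in attr_map.get("content", "").split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the list of robots directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """A page counts as indexable unless it carries a noindex directive."""
    return "noindex" not in robots_directives(html)


# Hand-written sample pages (assumed, not fetched from a real site)
post_page = '<html><head><meta name="robots" content="index,follow" /></head></html>'
archive_page = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'

print(is_indexable(post_page))     # the post should stay in the index
print(is_indexable(archive_page))  # the archive should be kept out of it
```

In practice the HTML would come from fetching a post URL and an archive URL of your own blog after editing header.php.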


Adding unique meta description

For this purpose I use the Head Meta Description plugin. It can be configured to use an excerpt of your post as the meta description, which is especially useful if you have to add this tag to hundreds of existing pages. Or you can add your own description manually as a custom field, which is my personal preference.
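Whichever approach you choose, it helps to confirm that no two pages end up sharing a description. Below is a small Python sketch of such a check. It assumes you have already collected each page's meta description into a dictionary keyed by URL; the function name find_shared_descriptions and the sample data are hypothetical.

```python
from collections import defaultdict


def find_shared_descriptions(descriptions):
    """Group URLs by description and return only the groups with 2+ URLs.

    Descriptions are compared case-insensitively with whitespace collapsed,
    and pages with an empty description are ignored.
    """
    groups = defaultdict(list)
    for url, text in descriptions.items():
        key = " ".join(text.lower().split())
        groups[key].append(url)
    return {
        key: sorted(urls)
        for key, urls in groups.items()
        if key and len(urls) > 1
    }


# Hypothetical sample data: two posts still share the site-wide description
sample = {
    "/2006-11-30/yourpost/": "My blog about SEO",
    "/2006-12-01/another-post/": "my blog  about SEO",
    "/about/": "Who writes this blog and why",
    "/2006-12-02/no-description/": "",
}

for description, urls in find_shared_descriptions(sample).items():
    print(description, urls)
```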


Using the more tag

By placing the more tag (<!--more-->) in a post, you tell WordPress to display on the home page only the part of the post above the tag. This greatly reduces the similarity between the home page and your articles. If you have too many existing posts to edit, you can use an ‘excerpt’ plugin, such as this one from Semiologic.
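To see why this reduces duplication, here is a rough Python sketch of the split. It is not WordPress's own code: it handles only the plain <!--more--> form, and the helper names split_at_more and teaser_share are invented for this example.

```python
MORE_TAG = "<!--more-->"


def split_at_more(post_content):
    """Return (teaser, remainder); remainder is empty if there is no more tag."""
    # str.partition yields (text, "", "") when the separator is missing
    teaser, _tag, remainder = post_content.partition(MORE_TAG)
    return teaser.strip(), remainder.strip()


def teaser_share(post_content):
    """Fraction of the post (by characters) that the home page would repeat."""
    teaser, remainder = split_at_more(post_content)
    total = len(teaser) + len(remainder)
    return len(teaser) / total if total else 0.0


post = "Intro paragraph.\n<!--more-->\nThe rest of the article, which is longer."
teaser, remainder = split_at_more(post)
print(teaser)
print(round(teaser_share(post), 2))
```

A post without the tag gives a share of 1.0, meaning the home page repeats the whole post.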

Redirect to a canonical URL

You should edit your .htaccess file to perform 301 redirects. Non-www addresses like yoursite.com should be redirected to www.yoursite.com. URLs without a trailing slash, like www.yoursite.com/category, should be rewritten to include it: www.yoursite.com/category/. This can be done by inserting the following code into your .htaccess file:


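RewriteEngine On
RewriteBase /
# Send any host other than www.yoursite.com to the www host with a permanent (301) redirect
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
# Add the missing trailing slash to URLs that are not real files
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ http://www.yoursite.com/$1/ [R=301,L]
# WordPress permalink rules: hand everything that is not a file or directory to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]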

For more details I advise you to read this: the process of rewriting the URL layout.
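Before touching a live .htaccess file, it can help to write down what the redirect target should be for a few sample URLs. The Python sketch below models the two rules described above (force the www host, add a trailing slash). It only imitates the intended result and is not a replacement for mod_rewrite; CANONICAL_HOST is the article's placeholder domain, and treating a last path segment that contains a dot as a file is a simplifying assumption.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"  # placeholder host used in the article


def canonical_url(url):
    """Return the URL that the redirects are meant to end up at."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: a final segment with a dot (e.g. "style.css") is a file,
    # so it gets no trailing slash
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, path, parts.query, parts.fragment)
    )


for sample in (
    "http://yoursite.com/category",
    "http://www.yoursite.com/2006-11-30/yourpost/",
    "http://yoursite.com/wp-content/style.css",
):
    print(sample, "->", canonical_url(sample))
```

Comparing these targets with the Location header that your server actually returns is a quick way to spot a rule that is missing or out of order.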

Preventing spiders from crawling feeds and auxiliary pages

For this purpose, edit your robots.txt file and insert the following rules:

User-agent: *
Disallow: /wp-
Disallow: /search
Disallow: /feed
Disallow: /comments/feed
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/*/feed/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/feed/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/*/*/trackback/$
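
The * and $ characters in these rules are pattern extensions that Google's crawler understands; not every crawler does. As a rough way to see which paths the rules catch, the Python sketch below translates each Disallow pattern into a regular expression, treating * as "any sequence of characters" and a trailing $ as "end of URL". The function names are invented for this example, and the matching is a simplification that ignores Allow rules and per-crawler sections.

```python
import re

# Disallow rules copied from the robots.txt above
DISALLOW_RULES = [
    "/wp-", "/search", "/feed", "/comments/feed", "/feed/$",
    "/*/feed/$", "/*/feed/rss/$", "/*/trackback/$",
    "/*/*/feed/$", "/*/*/feed/rss/$", "/*/*/trackback/$",
    "/*/*/*/feed/$", "/*/*/*/feed/rss/$", "/*/*/*/trackback/$",
]


def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regular expression."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # '*' becomes '.*'; every other character is matched literally
    pattern = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    """True if any rule matches the path starting at its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


for path in (
    "/2006-11-30/yourpost/",
    "/2006-11-30/yourpost/trackback/",
    "/category/seo/feed/",
    "/feedback/",
):
    print(path, is_blocked(path))
```

Note that rules without a trailing $ are prefix matches, so under this model Disallow: /feed also blocks a page such as /feedback/. It is worth checking your own page slugs against the list.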

Another two practical tips

Some people find it useful to restrict the number of posts displayed on the home page to 4-5, so that fewer posts are duplicated.

A great article on customizing the more tag in WordPress.

To Sum Up:

    To avoid the duplicate content issue in WordPress you should:

    * Add a ‘noindex, follow’ meta tag to your monthly/weekly/daily archives, ‘next entries’ pages and, if necessary, category pages
    * Ensure that all your pages have unique meta description tags
    * Set up 301 redirects for your non-www URLs and for URLs without trailing slashes
    * Keep search engine crawlers away from your feeds and trackbacks
    * Use the more tag to show excerpts on your home page instead of full posts
    * Restrict the number of posts displayed on your home page


 Original Source:
http://seoresearcher.com

Posted at 12:19:14 pm | Permalink | Posted in How to Blog  SEO  

© 2006 www.webloglines.com. All Rights Reserved. Powered by IRange