Avoid Duplicate Content on WordPress Websites

Posted on April 16, 2015
by Marcus Fishman

WordPress is a great tool for building a website, but when it comes to search engine optimization, a few areas need improvement. Duplicate content is one of them, so I’d like to explain what duplicate content is before presenting 5 easy ways to fix the problem in WordPress.

What is duplicate content?

Simply put, duplicate content is any text on your website that either completely matches, or is similar to, content elsewhere on your website or on another website. While there are acceptable kinds of duplicate content – print-only versions of a web page, for example – in other cases people intentionally use duplicate content across multiple domains in an attempt to get more traffic to their website from search engine results.

That latter usage is why Google and the other major search engines penalize you for having duplicate content on your website. They have no way to understand why content is duplicated, so even if you aren’t duplicating content maliciously, they penalize it all the same.

That being the case, optimizing your website to avoid duplicate content is something you need to do if you care about your website’s placement in search engine results.

(For more tips and explanations from Google about duplicate content, read through their page on the subject in the webmasters/site owners guide.)

5 ways to avoid duplicate content

When you install WordPress out of the box, it’s not duplicate-content-proof, and that especially applies to your blog posts and how they’re displayed. For example, if you have your category, archive, and home pages all set up so that they display the full text of your blog posts, guess what you have? Duplicate content.

Here are 5 simple changes you can make to avoid duplicate content on your WordPress website:

  • Display the full text once and only once – My rule of thumb is that the full text of a blog post should only be displayed on the actual page of the blog post itself. Everywhere else your recent blog posts are listed, you should show either the excerpt, or just the name of the post and a link to its full text. To display the excerpt only, you can update your theme files, find a plugin to do it for you, or just use the <!--more--> tag when writing your content.
  • Fix your page header – Insert the following code into your theme’s header file so that certain pages (such as the homepage, posts, pages and category pages) are indexed by search engine spiders, while others (date archives, tag archives, search results, etc.) are excluded:
    <?php
    // Index the first page of the blog home plus posts, pages and categories;
    // mark everything else (archives, search results, etc.) as noindex.
    if ( ( is_home() && (int) get_query_var( 'paged' ) < 2 ) || is_single() || is_page() || is_category() ) {
        echo '<meta name="robots" content="index,follow" />';
    } else {
        echo '<meta name="robots" content="noindex,follow" />';
    }
    ?>
  • Be aware of comment pagination – In WordPress 2.7, you have the option of splitting your comments across multiple pages rather than lengthening the actual post page. The only problem is that every page of comments repeats the content that people are commenting on. This option is enabled by default in WordPress 2.7, so if you don’t need your comments paginated, go to the “Discussion” area under Settings and uncheck that option.
  • Add unique META descriptions to each post – I’ve written about META tag issues in WordPress previously, but the most important META tag to consider here is your description. If you have the same META description on all of your blog posts or pages, that’s duplicate content. I recommend the All-in-One SEO Pack plugin (what I use on this website) because it lets you use your excerpt (or whatever text you want) as the META description, thus avoiding duplicate content.
  • Update your robots.txt file – If you don’t want search engine spiders to find unintentional duplicate content on your website, put instructions in your robots.txt file that tell them what they shouldn’t crawl. In WordPress, that means excluding your feeds and any other auxiliary pages that duplicate content you have elsewhere. The following rules will do the trick – just copy and paste them into your robots.txt file:
    User-agent: *
    Disallow: /wp-
    Disallow: /search
    Disallow: /feed
    Disallow: /comments/feed
    Disallow: /feed/$
    Disallow: /*/feed/$
    Disallow: /*/feed/rss/$
    Disallow: /*/trackback/$
    Disallow: /*/*/feed/$
    Disallow: /*/*/feed/rss/$
    Disallow: /*/*/trackback/$
    Disallow: /*/*/*/feed/$
    Disallow: /*/*/*/feed/rss/$
    Disallow: /*/*/*/trackback/$
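
Before pasting rules like these, it can help to check which URLs they actually block. The following is only a rough sketch, not how any particular search engine implements matching: it assumes the common interpretation in which a rule is a path prefix, "*" matches any run of characters, and a trailing "$" anchors the end of the URL. The function names and sample paths are made up for illustration, and Allow rules and rule precedence are ignored.

```php
<?php
// Sketch of wildcard matching for robots.txt Disallow rules.
// Assumption: rules are path prefixes, "*" matches any run of characters,
// and a trailing "$" anchors the end of the path.

/** Convert one Disallow pattern into a regular expression. */
function robots_pattern_to_regex(string $pattern): string
{
    $anchored = substr($pattern, -1) === '$';
    if ($anchored) {
        $pattern = substr($pattern, 0, -1);
    }
    // Quote everything, then turn the escaped "*" back into ".*".
    $regex = str_replace('\*', '.*', preg_quote($pattern, '#'));
    return '#^' . $regex . ($anchored ? '$' : '') . '#';
}

/** Return the first rule that blocks $path, or null if none does. */
function blocking_rule(string $path, array $disallow): ?string
{
    foreach ($disallow as $rule) {
        if ($rule !== '' && preg_match(robots_pattern_to_regex($rule), $path) === 1) {
            return $rule;
        }
    }
    return null;
}

// A subset of the rules listed above.
$rules = ['/wp-', '/search', '/feed', '/comments/feed', '/*/feed/$', '/*/trackback/$'];

// Hypothetical paths to try against the rules.
$paths = ['/2015/04/my-post/', '/category/seo/feed/', '/wp-content/uploads/logo.png'];

foreach ($paths as $path) {
    $rule = blocking_rule($path, $rules);
    echo $path, ' => ', $rule === null ? 'allowed' : "blocked by $rule", PHP_EOL;
}
```

Under these assumptions the post URL stays crawlable, the category feed is caught by /*/feed/$, and the uploaded image is caught by /wp-. That last result is worth a second look: /wp- also covers /wp-content/, so check whether you really want images and theme files there excluded. Likewise, if "*" can match across slashes, the deeper /*/*/… rules may be redundant, though keeping them does no harm.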

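For the META description tip, a quick way to find repeats is to group your pages by description and look for groups with more than one URL. This is a minimal sketch under stated assumptions: the function name and the sample data are hypothetical, and it assumes you have already collected each page's description (from a crawl or an export, for example) into an array keyed by URL.

```php
<?php
// Sketch: group pages by META description to spot duplicates.
// The function name and the $descriptions data are made up for illustration.

/**
 * @param array<string,string> $descriptions URL => META description
 * @return array<string,string[]> normalized description => URLs sharing it (2 or more only)
 */
function find_duplicate_descriptions(array $descriptions): array
{
    $byText = [];
    foreach ($descriptions as $url => $text) {
        // Normalize case and whitespace so trivial differences don't hide a duplicate.
        $key = strtolower(trim(preg_replace('/\s+/', ' ', $text)));
        $byText[$key][] = $url;
    }
    // Keep only descriptions used by more than one URL.
    return array_filter($byText, fn(array $urls): bool => count($urls) > 1);
}

$descriptions = [
    '/blog/first-post/'  => 'Tips and tricks for WordPress.',
    '/blog/second-post/' => 'Tips and tricks for  WordPress. ',
    '/about/'            => 'About the author of this blog.',
];

foreach (find_duplicate_descriptions($descriptions) as $text => $urls) {
    echo '"', $text, '" is shared by: ', implode(', ', $urls), PHP_EOL;
}
```

With the sample data, the two blog posts are reported as sharing one description, while the about page is left out because its description is unique.
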
Thoughts?

If you’re a web designer or developer and have a WordPress website (or have built WordPress websites for your clients), how do you help them avoid duplicate content on their websites? Are there any tips or suggestions that I didn’t mention that you feel would be useful to others? Share your thoughts by leaving a comment below!

About Marcus Fishman

Marcus has been working professionally with websites since 2001, and offers a wide range of website knowledge from his years of experience working with, designing, and building websites. He lives in Raleigh, North Carolina.