How to Find Thin Content on Your Site

Was your site traffic hit by Google Panda? It is essential to find thin content on your site and remove it to avoid being slapped by Google algorithm penalties like Google Panda, which specifically lowers the search engine rankings of sites with thin content.

What is Thin Content?

Simply put, it is poor, low-quality content, often just a few words long, which search engines classify as ‘low quality’ and which fails to generate any significant search engine traffic. It adds to site bloat and dilutes your link equity. Note that low pageviews do not necessarily mean thin content; it's possible the content is good but not highly searched.

If you autogenerate or scrape content, you can end up with thousands of low-quality web pages that get no traffic. This thin content needs to be fixed or moved to a different domain or subdomain to recover from a Google Panda penalty (e.g. we moved our job board to a different domain, as it synced data from a huge dataset).

How to Find Thin Content?

Our site was recently struck by the Google Panda update, and we have seen very low traffic since then. This update especially affected sites with heavy ads above the fold, but there are several reasons Panda can strike. When we looked into the site archives, there was indeed a lot of thin content that had accumulated over the years since 2004, when the site started.

So let's get started. Open up your Google Analytics statistics (Google Analytics is a must-have site traffic tracking tool, and it's free!). Go to Content > Site Content > Content Drilldown, then set the date range to the last year.

[Screenshot: yearly traffic]

We use month-based archives with paths like /archives/year/month/, so we simply drill down to Content Drilldown > Archives > Year > Month.

[Screenshot: monthly low traffic]

Now you can analyze in detail which articles are getting low traffic. Remember to sort by pageviews so the lowest-traffic articles are easy to spot.

[Screenshot: poor content]

We tend to delete or unpublish all articles which have fewer than 10 pageviews in a year!

Depending on your site, you can instead use 3-6 months of data and a lower cutoff such as 5 pageviews. Either way, you need to review all these pages and analyze why they are getting low traffic, then fix the content, move it to drafts to fix later, or simply delete it. Since these articles received almost no traffic over the year, removing them should not cause a traffic drop anyway.
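If you prefer to work from an exported report rather than clicking through each month, the same cutoff can be applied with a short script. This is only a minimal sketch: it assumes a CSV export with columns named Page and Pageviews (your export may label them differently), and the function name and sample rows are made up for illustration.

```python
import csv
import io
import re

# Assumed URL layout from the article: /archives/<year>/<month>/...
ARCHIVE_PATH = re.compile(r"^/archives/(\d{4})/(\d{1,2})/")

def low_traffic_pages(csv_text, threshold=10):
    """Return (path, pageviews) pairs below the threshold, lowest first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in rows:
        path = row["Page"]
        # Exports often format large numbers as "1,234", so strip commas.
        views = int(row["Pageviews"].replace(",", ""))
        # Only archive articles are candidates; other pages are ignored.
        if ARCHIVE_PATH.match(path) and views < threshold:
            flagged.append((path, views))
    return sorted(flagged, key=lambda item: item[1])

# Hypothetical sample data standing in for a real export.
sample = """Page,Pageviews
/archives/2005/03/old-quote-post/,4
/archives/2006/11/popular-howto/,"1,250"
/archives/2004/07/short-note/,0
/about/,7
"""

for path, views in low_traffic_pages(sample):
    print(f"{views:>5}  {path}")
```

The result is a review list, not a delete list: each flagged page still needs a manual look before you decide to fix, unpublish, or delete it.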

Remove Thin Content

Most of the low-traffic articles on our site contained large blocks of blockquoted text. Blogging styles were different in 2004-2007, and there was a tendency to repeatedly quote text from authoritative sources. Beyond a certain threshold, this might have been viewed as copied or duplicate content from another site (especially since the blockquotes also linked back to the source).

Another important group was articles with broken links. WordPress has an excellent plugin to find broken links and nofollow them all in minutes! We are still working on removing broken links from our old archive pages, as over 10% of the links were broken!

We reviewed lots of old content month by month, identified lots of low-traffic content, removed quoted content, fixed the articles which were useful, and still deleted over 500 articles over the last week! We plan to remove at least 200 more this week!
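To spot quote-heavy posts without opening each one, you could measure how much of a post's text sits inside blockquotes. The sketch below uses Python's standard html.parser; the class and function names are hypothetical, and any cutoff you pick (say, more than half the text quoted) is your own judgment call, not a known search engine rule.

```python
from html.parser import HTMLParser

class QuoteRatioParser(HTMLParser):
    """Count text characters inside and outside <blockquote> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting level of open <blockquote> tags
        self.quoted_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Ignore surrounding whitespace so layout does not skew the ratio.
        length = len(data.strip())
        self.total_chars += length
        if self.depth > 0:
            self.quoted_chars += length

def quoted_fraction(html):
    """Return the share of visible text that is inside blockquotes (0.0 to 1.0)."""
    parser = QuoteRatioParser()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.quoted_chars / parser.total_chars

# Hypothetical post body: one short comment followed by a long quote.
post = "<p>My take.</p><blockquote><p>A long quoted passage from the source.</p></blockquote>"
print(round(quoted_fraction(post), 2))
```

A high fraction only suggests that a post is mostly quoted material; it is worth reading the post before deciding whether to trim the quotes or add original commentary.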
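If you are not on WordPress, or want to see what such a check involves, here is a rough sketch using only the Python standard library. It is not the plugin mentioned above: the names LinkCollector, is_reachable and broken_links are made up for this example, and the example.com URLs are placeholders. Some servers reject HEAD requests, so treat a "broken" result as a prompt to check by hand.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)

def is_reachable(url, timeout=10):
    """Return True if a HEAD request succeeds without an HTTP or network error."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return False

def broken_links(html, check=is_reachable):
    """Return the links in the HTML for which the check function reports failure."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [url for url in collector.links if not check(url)]

# Offline demonstration with a stand-in checker instead of real requests.
post = '<a href="https://example.com/ok">ok</a> <a href="https://example.com/gone">gone</a>'
fake_check = lambda url: not url.endswith("/gone")
print(broken_links(post, check=fake_check))
```

Passing the checker in as a parameter keeps the link extraction testable without network access; drop the check argument to use real requests.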

Warning: Remove content only if you clearly understand what you are doing. It might be a better idea to unpublish rather than delete posts, in case you need to restore content later. Many of these articles may have backlinks and may have had traffic further in the past, so if you created the content, you know better than anyone else what your past popular content was. It is a good idea to edit or delete the content yourself, rather than outsourcing it to an intern.

Note: Low pageviews do not necessarily mean thin content. It's possible the content is good but not highly searched, and removing it can harm your site. So it is essential to check your content and identify why it's ranking poorly. If it is actually good, original content that targets rarely searched terms and therefore gets low pageviews, keep it. (Thanks @Sumesh)

It is essential for webmasters to find thin content today to recover from Panda updates or, even better, to prevent their sites from getting penalized. Subscribe to our blog as we continue to fight Google Panda and post the tips that helped us.

About the Author: P Chandra is the editor of QOT and one of India's earliest tech bloggers, blogging since 2004. A tech enthusiast with expertise in coding, WordPress, web tools, SEO and DIY hacks.