7 Steps to Stop RSS Feed Scraping

By 05-03-2013   BloggingTutorials

How can you stop RSS feed republishing of your blog and prevent content theft, so that you avoid duplicate content penalties or scraped copies ranking higher in search results than your own article? All bloggers experience RSS scraping to some extent, and it is simply not possible to go after every blog with a DMCA takedown request.

Stop RSS Scraping & Content Theft

You can keep complaining, or you can try the simple methods we use to deal with scraper sites that republish your content.

1. Install Feed Copyright

To deal with sites republishing our RSS feed, we created the Simple feed copyright WordPress plugin, which adds a copyright notice to each feed item along with links back to the blog URL and the permalink of the post. The plugin has already been downloaded 15,000+ times from the official WordPress Plugins repository. Content republished with this notice not only gets you backlinks, but also lets Googlebot and readers of the scraper site know that the original content came from your site.
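To make the idea concrete, here is a minimal Python sketch of what such a feed footer amounts to. It is not the plugin's code (the plugin is a WordPress plugin, and its internals are not shown here); the function name, parameters and footer wording are all assumptions made for illustration.

```python
from html import escape


def add_feed_copyright(item_html, post_title, permalink, blog_name, blog_url, year):
    """Return a feed item's HTML with a copyright footer appended.

    Hypothetical helper: the footer carries a copyright notice, a link to
    the blog URL and a link to the post permalink, as described above.
    """
    footer = (
        '<p>&copy; {year} <a href="{blog_url}">{blog_name}</a>. '
        'Originally published as <a href="{permalink}">{title}</a>.</p>'
    ).format(
        year=year,
        # Escape values so a stray quote or angle bracket cannot break the markup.
        blog_url=escape(blog_url, quote=True),
        blog_name=escape(blog_name),
        permalink=escape(permalink, quote=True),
        title=escape(post_title),
    )
    return item_html + "\n" + footer


if __name__ == "__main__":
    # Example values only; replace them with your own blog details.
    print(add_feed_copyright(
        "<p>Post body goes here.</p>",
        "My Post",
        "https://example.com/my-post/",
        "Example Blog",
        "https://example.com/",
        2013,
    ))
```

Because the footer is part of the item body, a scraper that republishes the feed verbatim also republishes both links.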

2. Find Scraped Content

The easiest way is to search Google for your latest post title in quotes; the results will show every page that carries the same title. You can ignore the many sites that only syndicate your headlines, but you must find the sites that republish your full content.
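If you check many posts, building the quoted search link by hand gets tedious. The sketch below is one possible way to generate it in Python. The function name is hypothetical, and the optional `-site:` exclusion of your own domain is our addition rather than part of the step above.

```python
from urllib.parse import urlencode


def quoted_title_search_url(title, exclude_domain=None):
    """Build a Google search URL for an exact-phrase match on a post title."""
    # Drop any double quotes in the title so they cannot end the phrase early.
    query = '"{}"'.format(title.replace('"', ""))
    if exclude_domain:
        # Assumption: hide results from your own site to spot copies faster.
        query += " -site:{}".format(exclude_domain)
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(quoted_title_search_url("7 Steps to Stop RSS Feed Scraping", "example.com"))
```

Open the printed link in a browser and review the results by hand; the script only builds the URL and does not query Google itself.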

If you want a more powerful content theft checker, I prefer Copyscape Premium, an online plagiarism finder for detecting copies of our web content. See the latest report for our site. (If you want a Copyscape Premium report for your top-level domain name, just ask, as I have a few extra credits.)

3. Stop Image Hotlinking

Once you find the scraped content, it is usually not possible to stop the scraper from republishing it. So we identify the site and then block it from hotlinking our images. You can display a copyright image of your choice in place of the hotlinked image.

This really works: we replace hotlinked images with a bright green notice stating that the image belongs to our site, along with our short URL QOT.co, which is easy to type should readers decide to check the original source. It works even if the scraper has removed the copyright notices.
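Hotlink blocking is normally set up in the web server or through a plugin, and the exact rules depend on your host. The Python sketch below only illustrates the decision involved, assuming you keep a list of scraper hostnames you have identified. The function names, the example hostnames and the notice image path are all hypothetical.

```python
from urllib.parse import urlparse


def is_blocked_referer(referer, blocked_hosts):
    """Return True if the request's Referer header points at a blocked site."""
    if not referer:
        # No Referer (direct visits, some feed readers): serve normally.
        return False
    host = (urlparse(referer).hostname or "").lower()
    if not host:
        return False
    # Match the blocked host itself and any of its subdomains.
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def image_to_serve(requested_path, referer, blocked_hosts,
                   notice_path="/images/copyright-notice.png"):
    """Pick the original image or the replacement copyright notice."""
    if is_blocked_referer(referer, blocked_hosts):
        return notice_path
    return requested_path


if __name__ == "__main__":
    scrapers = ["scraper.example"]  # hosts you identified in step 2
    print(image_to_serve("/uploads/photo.png", "http://scraper.example/post", scrapers))
    print(image_to_serve("/uploads/photo.png", "http://myblog.example/post", scrapers))
```

A blocklist matches the approach described here, where only identified scraper sites are targeted. The stricter alternative is to allow only your own hostnames and replace images for every other referrer, but that also affects legitimate sites that embed your images.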

4. Show the Copyright Policy

This rarely works, because most RSS scrapers have no contact page, publicly displayed email address or comment form through which you could get a response to a request to stop scraping. If there is a contact form, you can post a link to your copyright policy, and some scrapers, fearing a lawsuit, might stop their auto-publishing bots from republishing your site.

5. Google Likes Spam Reports

If these sites rank higher than you, Google wants to know. Reporting them is what moves these scrapers from the top of the search engine results to the bottom of the queue. Google has a special form for reporting web spam. You can fill it in and complain about these sites ranking higher than your site in search engine results. Remember not to overdo the reporting, and only report genuine spam. Of course, you can also file a DMCA request and track it via the Webmaster Tools dashboard, but that is a longer process.

6. Stop the Money Supply

Many such sites carry AdSense ads, and if you report such made-for-AdSense sites, Google will be happy to prevent its paying advertisers from wasting their money. All ads now carry a blue arrow marked AdChoices; click that arrow and you will be directed to Google pages where you can report problems with the site. Blocking the flow of free money makes the scraping futile, with no income to justify the effort.

7. Bring the Site Down

Web hosting companies take web spam very seriously and will take prompt action against sites hosting or engaging in illegal activities. Find the web hosting company of the website, then report the site through the hosting company's contact form. If it is a genuine request, they will take care of the rest.

We look forward to hearing in the comments how you deal with this.

28 comments on “7 Steps to Stop RSS Feed Scraping”

  1. Brad says:

    I know a Spanish website which is scraping my content but the hosting company is also Spanish and ignores my emails.

  2. Sanjib Saha says:

    This article turns out to be very useful and enlightening..it is crazy to think that someone can ruin your site…the tools and methods suggested here can really be helpful for bloggers and webmasters..thanks for sharing..

  3. Pinky Jindal says:

    Very very helpful to others, thanks for share this with everyone

  4. raman bathina says:

    All are nice, but most of them are useful only to WordPress users. So I suggest one tool for all bloggers: “tynt”. It works well on all blogging platforms. If anyone copies text from your blog, it adds an attribution to your blog with your website address. Nice tool, try it.

  5. Matt Baran says:

    These are awesome tips for newbies. I am going to implement the RSS copyright right now. Thanks!

  6. online progamming Tutorials says:

    Recently I was suffering from RSS feed scraping. I will apply the tips from your post on my site.

    Thanks for your important post.

  7. e says:

    You might want to edit most instances of your featured word in the post so they indicate “scraping” instead of “scrapping” (trashing).

  8. Shahreer says:

    These are great steps to stop RSS feed scraping. I would like to implement them on my site.

  9. Patrick Tasner says:

    Good to know this very helpful information. I am not aware of it. It’s probably the best time to have these tips be put to implementation. Thanks!

  10. golzar says:

    What a nice tips. But how does it hamper a site?
