7 Steps to Stop RSS Feed Scraping

How can you stop other sites from republishing your blog's RSS feed, prevent content theft, and avoid duplicate content penalties or scraped copies ranking higher in search results than your own article? All bloggers experience RSS scraping to some extent, and it is simply not possible to go after every blog with a DMCA takedown request.

Stop RSS Scraping & Content Theft

You can keep complaining, or you can act. Here are some simple ways we deal with scraper sites that steal our content.

1. Display Feed Copyright Notice

To deal with sites republishing our RSS feed, we created the Simple Feed Copyright WordPress plugin, which adds a copyright notice to each feed item along with links back to the blog URL and the permalink of the post. The plugin has already been downloaded 15,000+ times from the official WordPress Plugins repository. With the notice in place, republished content not only gets you backlinks but also lets Googlebot and readers of the scraper site know that the original content came from your site.
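To make the idea concrete, here is a minimal Python sketch of what such a feed footer could look like. It is not the plugin's actual code (the plugin itself runs inside WordPress); the function names, the footer wording, and the example.com URLs are all assumptions made for illustration.

```python
from html import escape


def feed_copyright_footer(blog_name, blog_url, post_title, permalink, year):
    """Build an HTML footer that credits the original blog and post.

    All names and the wording of the notice are hypothetical; adapt them
    to your own site.
    """
    return (
        '<p>&copy; {year} <a href="{blog_url}">{blog_name}</a>. '
        'Original article: <a href="{permalink}">{post_title}</a>.</p>'
    ).format(
        year=int(year),
        # Escape everything so titles containing & or quotes stay valid HTML.
        blog_url=escape(blog_url, quote=True),
        blog_name=escape(blog_name),
        permalink=escape(permalink, quote=True),
        post_title=escape(post_title),
    )


def add_footer_to_feed_item(content_html, footer_html):
    """Append the footer to the HTML body of a single feed item."""
    return content_html + "\n" + footer_html


if __name__ == "__main__":
    footer = feed_copyright_footer(
        "My Blog",                      # placeholder blog name
        "https://example.com/",         # placeholder blog URL
        "7 Steps to Stop RSS Feed Scraping",
        "https://example.com/stop-rss-scraping",
        2024,
    )
    print(add_footer_to_feed_item("<p>Post body goes here.</p>", footer))
```

Because the footer travels inside the feed item itself, a scraper that republishes the item unchanged also republishes the two links back to your site.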

2. Find Scraped Content

The easiest way is to search Google for your latest post title in quotes; the results will show every page that uses the same title. You can ignore the many sites that only syndicate your headlines, but you must find the sites that republish your full content.
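The two steps above (search for the exact title, then separate headline syndication from full republishing) can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, the 5-word window is an arbitrary assumption, and you still have to fetch the text of each suspect page yourself.

```python
import re
from urllib.parse import urlencode


def exact_title_search_url(title):
    """Return a Google search URL for the post title as an exact phrase."""
    query = '"{}"'.format(title.replace('"', ""))
    return "https://www.google.com/search?" + urlencode({"q": query})


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, candidate, size=5):
    """Fraction of the original's word sequences that appear in the candidate.

    A value near 0 suggests the page carries only your headline or a short
    excerpt; a value near 1 suggests the whole post was republished.
    """
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(candidate, size)
    return len(shared) / len(original_shingles)


if __name__ == "__main__":
    print(exact_title_search_url("7 Steps to Stop RSS Feed Scraping"))
    post = "All bloggers experience RSS scraping to some extent every single week"
    print(copied_fraction(post, "Scraper intro. " + post))   # full copy
    print(copied_fraction(post, "All bloggers experience"))  # headline only
```

Where you draw the line between "excerpt" and "republished" is a judgment call; pick a threshold after looking at a few real examples.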

If you want a more powerful content theft checker, I prefer Copyscape Premium, a useful online plagiarism finder for detecting copies of our web content. See the latest report for our site. (If you want a Copyscape Premium report for your top-level domain name, just ask, as I have a few extra credits.)

3. Stop Image Hotlinking

After you find the scraped content, it is usually not possible to stop the site from republishing it. So we identify the site and then block it from hotlinking our images. You can choose a copyright image to display in place of each hotlinked image.

This really works: we replace hotlinked images with a bright green notice stating that the image belongs to our site, along with our short URL QOT.co, which is easy to type should people decide to check the original source. It works even if the scraper has removed the copyright notices.
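Hotlink blocking is normally configured in the web server, but the decision behind it is simple enough to sketch in Python. The example below assumes a blocklist of scraper hostnames you have already identified and checks the `Referer` header of each image request; the hostnames, paths, and function names are placeholders, not our actual setup.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of scraper sites found in the previous step.
BLOCKED_HOSTS = {"scraper.example.net"}

# Hypothetical path of the replacement copyright image.
NOTICE_IMAGE = "/images/copyright-notice.png"


def is_blocked(host, blocked_hosts):
    """True if host is a blocked domain or one of its subdomains."""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def image_to_serve(requested_path, referer, blocked_hosts=BLOCKED_HOSTS,
                   notice_path=NOTICE_IMAGE):
    """Pick the image to return for a request, based on its Referer header.

    Requests with no Referer (direct visits, many feed readers) get the
    real image, so ordinary readers are not affected.
    """
    if not referer:
        return requested_path
    host = (urlparse(referer).hostname or "").lower()
    if is_blocked(host, blocked_hosts):
        return notice_path
    return requested_path


if __name__ == "__main__":
    print(image_to_serve("/img/chart.png", "https://scraper.example.net/copied-post"))
    print(image_to_serve("/img/chart.png", "https://example.com/original-post"))
```

A blocklist only affects the sites you have named. Allowing only your own domains instead would block every third-party embed, which is stricter but also catches legitimate uses.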

4. Show Copyright Policy

This rarely works, as most RSS scrapers have no contact page, publicly displayed email address, or comment form through which you can ask them to stop scraping your feed. If there is a contact form, you can post a link to your copyright policy, and some scrapers may stop their auto-publishing bots from republishing your site for fear of a lawsuit.

5. Report Spam to Google

If these sites rank higher than yours, Google wants to know. Reporting them is what moves scrapers from the top of the search engine results to the bottom of the queue. Google has a special form for reporting web spam, which you can fill in to complain about sites that outrank you with your own content. Remember not to overdo the reporting, and report only genuine spam. Of course, you can also file a DMCA request and track your DMCA requests via the Webmaster Tools dashboard, but that is a longer process.

6. Stop the Site Income

Many such sites carry AdSense ads, and if you report these Made-for-AdSense sites, Google will be happy to stop paying advertisers from wasting their money. All ads now carry a blue arrow marked AdChoices. Click that arrow and you will be directed to Google pages where you can report problems with the site. Cutting off the flow of free money makes the scraping futile, as there is no income to justify the effort.

7. Report to Web Hosting Provider

Web hosting companies take web spam very seriously, and they will take prompt action against sites hosting or engaging in illegal activities. Find the web hosting company of the scraper site and report the site through that company's contact form. They will take care of the rest if it is a genuine request.

We look forward to hearing how you deal with this in the comments.