Two Tools that Help Protect Your Blog from Content Theft (Scrapers)

Categories: Blog Security, Blog Tools
Written By: BloggerSavvy

Ever searched for something in Google and found your content on another site? I have, often. One of the more damaging things that can happen to your blog is having your copyrighted content stolen and placed on another blog - and here’s the kicker - the other blog has a higher PageRank than yours and sits ahead of you in the search results! Quite frankly, that’s terrible, as it negatively impacts your blog. Not to mention the annoyance of noticing that the blog stealing your content appears to be earning advertising revenue, with excessive ads plastered all around it.

I’ve posted about this subject before in “How to Deter Scrapers and Hotlinkers”, which covers more of the hands-on side and some web-based tools you can use to help protect your content.

Many of us may have read all sorts of articles and blog posts that delve into the legalities, copyright laws, rights and so forth. That’s not something I’m going to discuss, as I think there’s too much discussion (and not enough action). If you haven’t, just take a look on Google and you’ll find a plethora of posts on the subject. The fact remains that we can file as many DMCA notices, cease and desist letters, etc. as we want, and often the content thief (called a “scraper” or “splogger”) does not care. After all, they already have your content, and you can cry and stamp your feet as much as you want - many of them will simply not budge, especially if they are outside your geographical area or jurisdiction.

A case in point: on one of my other (self-hosted) WordPress blogs, several pages were regularly being lifted by a web site in China. The only recourse was having Google remove the stolen content from its search results (the hosting provider, etc. did nothing) - and even though Google was removing those results, that didn’t stop the site from adding more of my stolen content! Locally (North/South America and Europe), however, that blog has had a 100% success record in takedowns of stolen content found on blog sites such as wordpress.com (Automattic), blogspot.com (Google), etc. Provided I followed their DMCA procedures, all issues were resolved.

In my opinion, relying on takedown notices alone is ineffective, costly (with long distance faxing) and quite frankly a waste of time (at one point I was filing about 15 DMCA notices a day for that blog). Why? We need to be proactive, not reactive! A client of mine uses the tagline “Predictable is preventable” for his blog and security business. And he’s right! We need to deter, curtail and control such theft of our blog content.

It’s difficult to stop such activity until you can see it occurring. With that in mind, I hope the following tools will help you:

Grab yourself a copy of Antileech. It’s a plugin that does not stop sploggers; rather (in the developer’s words) it “…produces a fake set of content especially for them that includes links back to your site and sends it only to them. When they steal this content, it appears online just like normal, except now you’ve turned the tables on them and have provided them with useless content…” The benefit here is that sploggers seldom read the content they take. They have an automated system grabbing thousands of pages - and now they will have backlinks to your original content, inviting the reader of the splog (containing fake content) to visit your blog instead. That’s link love!
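To make the decoy idea concrete, here is a minimal sketch in Python. It is not Antileech’s actual code (which I have not reproduced here), and the address list, function name and wording of the decoy are all made up for illustration. It simply assumes you have already spotted a scraper’s IP address in your logs, and it hands that address a short stand-in linking back to the original post while everyone else gets the real thing.

```python
import html

# Hypothetical: addresses you have identified as scrapers in your own logs.
KNOWN_SCRAPER_IPS = {"203.0.113.7"}


def feed_body_for(client_ip, post_title, post_url, real_body):
    """Return the real post body for normal readers, a decoy for known scrapers."""
    if client_ip not in KNOWN_SCRAPER_IPS:
        return real_body

    # Escape the values so a title or URL with odd characters cannot break the markup.
    safe_title = html.escape(post_title)
    safe_url = html.escape(post_url, quote=True)

    # The decoy carries no real content, only a backlink to the original post.
    return (
        "<p>This article was originally published as "
        f'<a href="{safe_url}">{safe_title}</a>. '
        "Please read it on the original site.</p>"
    )
```

A regular reader calling `feed_body_for("198.51.100.1", ...)` gets the post unchanged, while the flagged address only ever receives the backlink paragraph.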

Another effective tool in your arsenal is ©Feed, which allows you to place a digital fingerprint and copyright notice in your content feed (RSS). For those not familiar, in most cases it is your RSS feed that is used to facilitate content theft. What’s RSS? Common Craft’s video on the subject explains it simply.

In the words of the ©Feed developers: “…You can use html. You can add the IP of a feed reader and digital fingerprint for an explicit key. There can also be a domain name for a whitelist and this domains became not the message [sic]. The plugin search for this key at content theft [sic]. It is furthermore possible to add comments and related posts to the feed. For the related post feature it uses a database-search for the content. You can use the plugin “Simple Tagging” for related posts in a feed. The copyright notice can be added even when using entry excerpts…”
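If the “digital fingerprint” part sounds abstract, the rough idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how ©Feed itself works: the secret key, the `fp-` prefix and the function names are all hypothetical. Each feed item gets a copyright footer containing a short key derived from the post’s URL, and you can later check whether a suspicious page contains that key.

```python
import hashlib
import hmac
import html

# Hypothetical secret; pick your own and keep it private.
SECRET_KEY = b"replace-with-your-own-secret"


def fingerprint(post_url: str) -> str:
    """Derive a short, repeatable key for a post from its URL."""
    digest = hmac.new(SECRET_KEY, post_url.encode("utf-8"), hashlib.sha256).hexdigest()
    return "fp-" + digest[:16]


def add_copyright_footer(item_html: str, post_url: str, owner: str) -> str:
    """Append a copyright notice and the fingerprint to one feed item."""
    safe_url = html.escape(post_url, quote=True)
    notice = (
        f"<p>&copy; {html.escape(owner)}. Originally published at "
        f'<a href="{safe_url}">{safe_url}</a>. Ref: {fingerprint(post_url)}</p>'
    )
    return item_html + notice


def contains_fingerprint(page_text: str, post_url: str) -> bool:
    """Check whether a page (e.g. a suspected splog) carries the post's fingerprint."""
    return fingerprint(post_url) in page_text
```

Because the key is an odd-looking string that appears nowhere else, it also makes a handy search term: if it turns up on a site that isn’t yours, that page almost certainly came from your feed.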

Needless to say, the above two tools are very robust and take a proactive approach. There are some other resources I’ve bumped into over time that provide good reading and further insight:

Hopefully the above helps you (especially those who emailed me asking)!

Do you have any good solutions? What tools do you use? Feel free to share your thoughts in the comments below.


7 Responses to “Two Tools that Help Protect Your Blog from Content Theft (Scrapers)”

  1. Jonathan Bailey Says:

    Ok, lots to talk about. First, the two plugins mentioned.

    1) Antileech - I’ve heard mixed things about this in terms of RSS scraping. Some think it walks on water, some say it doesn’t do much. I don’t know how it works with WordPress 2.7 but there doesn’t seem to be much harm in it.

    2) Copyfeed has had some serious issues with WordPress 2.7, from what I’ve read. I don’t know if it is all installs or just some. You can simulate the digital fingerprint element with a feed footer or even a FeedBurner feed flare.

    Real fast, some other tools I recommend:

    Fairshare (fairshare.cc) - Lets you give it your RSS feed and it will produce another feed of the matches its system detects, with info about each match. Works very well and is both free and easy. Works with almost any RSS feed.

    Whoishostingthis (whoishostingthis.com) - A great site for finding the host of any domain. Very handy.

    FeedBurner Uncommon Uses (feedburner.com) - Works great for tracking “uncommon” uses of your RSS feed, including aggregators and outright scrapers.

    Hope that this helps! Thank you for the great article!

  2. Scott Mahler-Datex Media Says:

    Sadly, there will always be people that want success without any effort. Stealing is never good, and it’s nice to see there are tools out there that help alleviate this problem, but life being what it is, thieves will never go away completely. Thanks for sharing the tools that can help.

  3. BloggerSavvy Says:

    @Jonathan Bailey - Thanks for the feedback. I’ll have to keep a close eye on those two then and see if I find any challenges. Thanks for mentioning Fairshare as I’ve never heard of them. I was not aware that Feedburner tracked uncommon uses, thanks for the heads-up on that.

    I looked through your site by the way and was quite impressed with the resources and content. It really is chock-full of goodness! Thanks for a great resource. :)

  4. BloggerSavvy Says:

    @Scott Mahler - In the beginning I was not aware of all the resources that helped. I was frustrated and had a steep learning curve to figure out how to deter and “fix” my stolen content. In many respects that blog was a true learning environment - immersion - Yikes!

    Also, check out @plagiarismtoday on Twitter. His site is dedicated to copyright related issues - tons of resources!

  5. Shiftos Says:

    I find that the Antileech is good. The only thing to watch for is that it cannot change the DB table prefix. Otherwise it’s great.

  6. Kikolani Says:

    Right now, all I have is the RSS footer, and I usually link to a lot of my other articles throughout my posts if possible. I hope that will prevent the auto-scrapers from stealing my content without any credit to its source.

    ~ Kristi

  7. RaiulBaztepo Says:

    Hello!
    Very Interesting post! Thank you for such interesting resource!
    PS: Sorry for my bad english, I’v just started to learn this language ;)
    See you!
    Your, Raiul Baztepo
