I was quite surprised to see my server logs today. Someone decided to scrape my blog content (including my WP translations cache, 100 MB+). The offending URI: http://www.shouker.com/user1/baiheinet/2008/1/16/80897.html
I've blocked the site, but that won't stop search engine crawlers from indexing the copied content.
This is a nasty black-hat SEO method for getting the target website penalized for duplicate content on the major search engines. Here are a few solutions that I found in various resources ↓.
- Report it to Google at proxyreports@gmail.com; provide the URL and the Google search query.
- Block the proxy referrer IP.
- Add a special noindex meta tag for unknown search engine spiders:
<META NAME="ROBOTS" CONTENT="NOARCHIVE, NOINDEX, NOFOLLOW">
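Here is a rough sketch of how the "unknown spiders only" part could work: emit the restrictive tag unless the visitor's user agent looks like a crawler you trust. The function name and the list of user-agent substrings are my own assumptions for illustration, and a user agent is trivial to fake, so treat this as a first filter only.

```python
# Hypothetical list of user-agent substrings for crawlers we want to index us.
# Adjust to taste; user agents can be spoofed, so this is only a first filter.
KNOWN_CRAWLERS = ("googlebot", "slurp", "msnbot")

RESTRICTIVE_META = '<META NAME="ROBOTS" CONTENT="NOARCHIVE, NOINDEX, NOFOLLOW">'


def robots_meta_for(user_agent):
    """Return the restrictive meta tag unless the user agent names a known crawler."""
    ua = (user_agent or "").lower()
    if any(name in ua for name in KNOWN_CRAWLERS):
        return ""  # known crawler: emit nothing, so the page can be indexed
    return RESTRICTIVE_META
```

Normal browsers ignore the tag, so serving it to everyone who is not on the list should not hurt regular visitors.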
How to track Google Proxy Hacked Duplicate Contents
- Monitor your content with Google Alerts. Try using search terms that are unique to your website, e.g. blog.kakkoi, myname, my unique keywords, the URL http://blog.kakkoi.net, or a URL-safe base64-encoded string.
If you have a Google Webmaster account, go to Statistics » What Googlebot sees and use those keywords as your Google Alerts search terms.
- Search for copies of your page on the web with Copyscape.
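One way to read the "base64 safe URI encode" idea is to turn your own URL into an odd-looking token, drop it somewhere in your pages, and use it as the alert term. A minimal sketch, assuming that interpretation; the helper names are made up for this example.

```python
import base64


def unique_marker(url):
    """Encode a URL as URL-safe base64, without padding, for use as a search term."""
    token = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
    return token.rstrip("=")


def decode_marker(marker):
    """Reverse unique_marker(): restore the padding and decode back to the URL."""
    padding = "=" * (-len(marker) % 4)
    return base64.urlsafe_b64decode(marker + padding).decode("utf-8")
```

Calling unique_marker("http://blog.kakkoi.net") gives a string that is unlikely to appear on anyone else's site, so any search hit outside your own domain is probably a copy.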
Whitelisting Search Engine Crawler
IMO, blocking the IP ranges of proxy servers is not very practical. Having a whitelist of search engine crawler IPs (class C) might do the trick. I'm working on a script for whitelisting search engine crawlers for my WordPress. Hopefully I can finish it later this week.
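To give an idea of the whitelist approach, here is a small sketch (not the WordPress script itself): a request that claims to be a search engine crawler is only accepted if its IP falls inside a whitelisted /24 ("class C") range. The ranges below are documentation placeholders, not real crawler addresses, and all names are assumptions for illustration.

```python
import ipaddress

# Placeholder /24 ("class C") ranges taken from the RFC 5737 documentation blocks.
# These are NOT real crawler ranges; replace them with ranges you have verified.
CRAWLER_WHITELIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical user-agent substrings that mark a request as "claims to be a crawler".
CRAWLER_UA_HINTS = ("googlebot", "slurp", "msnbot")


def claims_to_be_crawler(user_agent):
    """True if the user agent string names one of the crawlers we care about."""
    ua = (user_agent or "").lower()
    return any(hint in ua for hint in CRAWLER_UA_HINTS)


def is_whitelisted(ip):
    """True if the IP parses and falls inside one of the whitelisted ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # unparsable address: never whitelisted
    return any(addr in net for net in CRAWLER_WHITELIST)


def should_block(ip, user_agent):
    """Block requests that claim to be a crawler but come from outside the whitelist."""
    return claims_to_be_crawler(user_agent) and not is_whitelisted(ip)
```

Regular visitors are never blocked by this check; only a "crawler" arriving from an unknown range (for example, through a proxy) gets turned away.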
Google Algo bugs
Dan Thies at seofaststart.com posted a detailed analysis of this issue; check out his post → Google Proxy Hacking: How A Third Party Can Remove Your Site From Google SERPs.
Recent Update
- February 1, 2008 at 6:29 am
- February 1, 2008 at 5:53 pm
One Response to “How to track Google Proxy Hack Duplicate Contents”
Update: baiheinet.shouker has stopped indexing raw content. They use an IE screenshot now. Nice.
http://www.shouker.com/v/view.aspx?id=80897