Blog comment spam
April 5th, 2011
Comment spammers on my WordPress blog are very nicely filtered out by the free plugin Anti-Captcha.
There is literally no day without spam comments, so I think Anti-Captcha should be installed by default on every WordPress blog. The plugin simply inserts invisible input fields (invisible to humans, that is), and it works so well because most comment spam comes from brain-dead bots that stupidly fill out the first form they see.
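The invisible-field idea can be sketched in a few lines of Python. This is only an illustration of the general "honeypot" technique under my own assumptions, not Anti-Captcha's actual code; the field name `website_url` and the function `looks_like_bot` are made up for the example.

```python
# Hypothetical name of the hidden form field; a real plugin would pick its own.
HONEYPOT_FIELD = "website_url"


def looks_like_bot(form_data: dict[str, str]) -> bool:
    """Return True when the hidden honeypot field was filled in.

    Humans never see the field, so they leave it empty; a bot that
    fills out every input it finds gives itself away.
    """
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())


# A human submission leaves the hidden field empty, a naive bot does not.
human = {"author": "Alice", "comment": "Nice post!", HONEYPOT_FIELD: ""}
bot = {"author": "x", "comment": "Great site", HONEYPOT_FIELD: "http://spam.example"}

print(looks_like_bot(human))
print(looks_like_bot(bot))
```

A check like this only catches bots that blindly fill every field, which, judging by my spam folder, is most of them.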
But what is astonishing is the nastiness with which these spam bots write the comments:
Look at this example:
“This is a good blog. Keep up all the work. I too love blogging and expressing my opinions. Thanks”
(original spam)
or
“I can see you’re an expert at your field! I am launching a website soon, and your facts will probably be quite intriguing for me.. Thanks for all your aid and wishing you all the success.”
(original spam)
or
“Woah! I’m really digging the template/theme of this blog. It’s simple, yet effective. A lot of times it’s very hard to get that “perfect balance” between superb usability and visual appearance. I must say you have done a superb job with this. Additionally, the blog loads super quick for me on Firefox. Superb Blog!”
(original spam)
This kind of “great-site-spam” is somewhat fine with me.
Then there is the p*nis enlargement spam:
“hi, if you find and need information about p*nis enlargement or male enh*ncement products that are guarantee to work…”
(original spam)
Crude but immediately recognizable as spam.
Then there is the nonsense spammer:
“attr gvhpw free xxx hsrtkc r bj g hka ondu niyal www.abc.com tiava fqlysp y pq c bpy”
(original spam)
and the grammatically correct nonsense spammer:
“Will the cable reserve the alien slag? The concealing intolerance reasons. The juvenile brushes like outdoors the handler. A strong blame results. My error sickens without a dropping axiom.”
(original spam)
But now look at this Blog Comment Spam:
“This last post was a little pitchy. Not your best effort. As you know dude I am one of your biggest fans, but now is when you have to bring the goods. For me the post was just O.K. It’s time to show America what you got.”
(original spam)
or
“Great site but your CSS looks wrong in my browser”
(original spam)
or
“Hi, Neat post. There’s a problem with your site in internet explorer, would check this… IE still is the market leader and a good portion of people will miss your magnificent writing due to this problem.”
(original spam)
or
“Does your blog have a contact page? I’m having trouble locating it but, I’d like to send you an email. I’ve got some creative ideas for your blog you might be interested in hearing.”
(original spam)
or
“How is it that just anyone can publish a blog and get as popular as this? Its not like you’ve said anything incredibly impressive more like you’ve painted a quite picture through an issue that you know nothing about! I don’t want to sound mean, right here. But do you genuinely think that you can get away with adding some quite pictures and not truly say something?”
(original spam, username linking to p*nis enlargement)
or
“hello there and thank you for your info – I have definitely picked up anything new from right here. I did however expertise a few technical points using this web site, since I experienced to reload the web site lots of times previous to I could get it to load correctly. I had been wondering if your web host is OK? Not that I’m complaining, but sluggish loading instances times will often affect your placement in google and can damage your quality score if ads and marketing with Adwords.”
(original spam)
or
“Im tired of this, should you spam my web site and also blog page one more instance I will expose you!”
(original spam)
If you didn’t know that this junk is being posted by a bot, you might spend hours searching for what’s wrong with your blog, or you’d be wondering what’s going on.
Disgusting.
NOARCHIVE in robots.txt (no more CACHED search results)
March 21st, 2011
I was looking for a way to make Google (or any search engine) NOT display the link to the “cached” version in the search results. And this is what you have to add to your robots.txt to prevent crawlers and spiders from creating a cache of your webpages:
User-agent: *
Noarchive: /
It’s the word “Noarchive” that prevents Google (and Yahoo, Bing, etc.) from caching a copy of your website. Other words that might be useful include “Nosnippet” and “Notranslate”.
And, yes, “Nosnippet” suppresses the snippet itself (= the text below the link) AND the zoom preview. Unfortunately, Google doesn’t use something like “Nopreview”. So if you want to suppress snippets, translation, or caching, the robots.txt would look like this:
User-agent: *
Noarchive: /
Nosnippet: /
Notranslate: /
Disallow caching by Alexa and the Wayback Machine
You might also add …
User-agent: ia_archiver
Disallow: /
… to your robots.txt, since I think that someone who dislikes caching search engines might also dislike the Wayback Machine (archive.org).
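The Disallow part can be checked locally with Python's standard `urllib.robotparser`. This is a minimal sketch: the `example.com` URL and the helper `is_allowed` are placeholders of mine. Note that this parser only understands standard rules such as Allow and Disallow, so it can confirm the ia_archiver block but says nothing about whether a search engine honors “Noarchive” and friends.

```python
from urllib.robotparser import RobotFileParser

# The combined robots.txt from this post.
ROBOTS_TXT = """\
User-agent: *
Noarchive: /
Nosnippet: /
Notranslate: /

User-agent: ia_archiver
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if user_agent may fetch url according to robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ia_archiver is blocked by the Disallow rule. The Noarchive, Nosnippet and
# Notranslate lines are not Allow/Disallow rules, so this parser skips them
# and other crawlers stay free to fetch the page.
print(is_allowed("ia_archiver", "http://example.com/some-post/"))
print(is_allowed("Googlebot", "http://example.com/some-post/"))
```

Run against your own robots.txt text, this at least catches typos in the part that blocks crawling outright.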
