“Holy smokes, I’ve been hit!
My comment spam ‘secret code’ filter is working like a charm – no spam in weeks, but now they’ve decided to spam through trackback. The other day I had two new trackback pings on older entries, both spam. This morning I had 135, all spam. Yikes. So, later today I’ll be deleting away, but it will take a while…” – Salguod.net, February 01, 2005
I had to make some changes: the spammer decided to ‘teach me a lesson’ by adding healyourchurchwebsite to his referrer. I’ve tightened that up (and in the process snarfed some information from him/them) and am now filing a formal complaint with the Feds. Take note of the sections in yellow where my examples include healyourchurchwebsite…
Like many of you, I noticed a spike in Trackback Spam pointing to various card-shark subdomains at terashells.com, chat-nett.com and other domains that are sure to change on a daily basis.
First thing I noticed: the same crap coming in from a variety of anonymous proxies. This means blocking by IP would quickly become a full-time job. As a stop-gap, I employed a girthy but quick-n-dirty .htaccess solution offered at Aaron Logan’s Loblogomy blog.
I knew I’d have to find a more efficient approach. I also knew that Mark Pilgrim’s ‘How to block spambots …’ was causing some other issues on my server, I suspect because my server is configured slightly differently than his. This happens.
Still, I didn’t want snoopers like the one I saw from BranDimensions.com. Not that I’m hiding anything, but they’re not paying me for my bandwidth even though they profit from it. I needed a solution that would solve my short-term trackback spam issue and take care of my long-term no-pay, no-play policy regarding the commercial abuse of my bandwidth.
The Not-So-Final Solution
Without all that much searching, I found that Parker Morse of Flashes of Panic offered an elegant .htaccess approach that would get me 98% of what I needed. In a post entitled ‘A little meanness,’ Morse employs a Blocking Referer Spam – mod_rewrite technique developed by Ed Costello back in May of 2004.
The (obligatory) Warning
Before we go any further, I need you to understand that while this is an excellent approach, it is not without its dangers. Those dangers are made clear in an absolutely must-read related post entitled “Killing referrer spam,” in which Caveat Lector offers this excellent advice:
BE AWARE: YOU CAN BORK YOUR WEBSITE WITH THIS. I’ve done it. (In fact, I did it two minutes ago. Go me.) How will you know your .htaccess file is borking your site? Well, usually, when you browse to your weblog’s URL you’ll get a “500 Internal Server Error” page of some sort instead of your beloved weblog.
Always, always, always keep a last-known-good version of your .htaccess file! If you’re using FTP to place your .htaccess file and you bork your site, you just upload the last-known-good file, and you’re golden.
Or in my case, working from a jailed ssh session I was able to do the following:
pico htaccess_parker.txt #see modifications below#
cp .htaccess htaccess_02feb05.txt
cp htaccess_parker.txt .htaccess
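The same backup-then-swap routine can be wrapped in a couple of small shell functions so the last-known-good copy never gets skipped. This is only a sketch: the function names install_htaccess and rollback_htaccess and the dated backup filename are made up for illustration and are not part of Parker's file.

```shell
#!/bin/sh
# Sketch only: function names and the dated backup filename are hypothetical.

# install_htaccess CANDIDATE [DIR]
# Save DIR/.htaccess as DIR/htaccess_YYYYMMDD.txt, then install CANDIDATE.
install_htaccess() {
    candidate=$1
    dir=${2:-.}
    backup="$dir/htaccess_$(date +%Y%m%d).txt"

    # Refuse to continue if the candidate file is missing.
    [ -f "$candidate" ] || { echo "no such file: $candidate" >&2; return 1; }

    # Keep a last-known-good copy, but never overwrite an earlier
    # backup made on the same day (it may be the only good one).
    if [ -f "$dir/.htaccess" ] && [ ! -e "$backup" ]; then
        cp "$dir/.htaccess" "$backup" || return 1
    fi

    cp "$candidate" "$dir/.htaccess" || return 1

    # Print the backup path (the file exists only if there was a
    # live .htaccess to save).
    echo "$backup"
}

# rollback_htaccess BACKUP [DIR]
# Put a saved copy back if the site starts returning 500 errors.
rollback_htaccess() {
    cp "$1" "${2:-.}/.htaccess"
}
```

Typical use would be something like `backup=$(install_htaccess htaccess_parker.txt)`, then a quick look at the site in a browser, and `rollback_htaccess "$backup"` if the dreaded 500 page shows up.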
After downloading Parker’s text file version of his .htaccess file, I gave it a quick inspection and replaced the first line below (Parker’s) with the second (mine):
SetEnvIfNoCase Referer .*flashesofpanic\.com.* !spam_com
SetEnvIfNoCase Referer ".*(blogs4god|healyourchurchwebsite|redlandbaptist|mission4me) *" !spam_com
The script also needed to be modified because I found some problems when trying to enter a post using my crufty old version of MovableType, a problem also described in Laurabelle’s Blog article “Die spammers die!”. So I had to add a line to Parker’s otherwise excellent approach: after adding a few more drug names to the kill list, I immediately followed with another line of code:
SetEnvIfNoCase Referer www\.healyourchurchwebsite\.com\/cgi-bin\/mt/mt\.cgi.* !spam_ref
I suspect this fix was necessary because, the way the .htaccess file is set up, everyone is considered a spammer until we say they’re not. More on how to modify the file, and on the mechanics of how this all works, can be found over at Caveat’s column.
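For anyone fuzzy on the mechanics: in a SetEnvIfNoCase line, a plain variable name sets a flag when the pattern matches, and a name prefixed with ! removes it again. The shell sketch below mimics that set-then-clear logic with grep so patterns can be tried out without touching the live server. The keyword list, the function name referer_is_spam and the anchored trusted pattern are all assumptions for illustration (anchoring is just one plausible way to keep a spammer from whitelisting himself by dropping the domain name into his referrer); Parker's actual rules are not reproduced here.

```shell
#!/bin/sh
# Sketch only: mimics "set the flag, then clear it for trusted referers".
# KILL_LIST and TRUSTED are illustrative; substitute your own patterns.
KILL_LIST='(poker|casino)'
TRUSTED='^https?://(www\.)?healyourchurchwebsite\.com(/|$)'

# referer_is_spam REFERER
# Exit status 0 if the spam flag would still be set after both passes.
referer_is_spam() {
    flag=0
    # Pass 1: like "SetEnvIfNoCase Referer <pattern> spam_ref"
    if printf '%s\n' "$1" | grep -Eiq "$KILL_LIST"; then flag=1; fi
    # Pass 2: like "SetEnvIfNoCase Referer <pattern> !spam_ref"
    if printf '%s\n' "$1" | grep -Eiq "$TRUSTED"; then flag=0; fi
    [ "$flag" -eq 1 ]
}
```

With these sample patterns, a referrer that really starts at the trusted domain clears the flag even if a kill-list word appears later in the URL, while a spam referrer that merely mentions the domain somewhere in its query string stays flagged.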
Finally, you may want to block the user agent CandyGenius has identified in this delicious post which asserts:
The trackback spammer is leaving the same signature as the comment spammer. It’s the same guy. Use the code above to block it all. (psxtreme & freakycheats but that will change tomorrow.)
A quick-n-dirty test of this is to Google your domain along with one of the forbidden words, for example “healyourchurchwebsite poker,” and click through to your site. Because that word will now appear in the referrer header from Google, you should find yourself blocked. Not the most fool-proof test, but close enough for government work.
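The same check can be made from the command line by having curl send a forged referrer. This is a minimal sketch: check_referer is a hypothetical helper name, the CURL override exists only so the function can be dry-run without a network, and the URLs in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: check_referer is a hypothetical helper name.
# CURL may be set to another command for dry runs; it defaults to curl.

# check_referer URL REFERER
# Request URL with the given Referer header and print only the
# HTTP status code the server answers with.
check_referer() {
    "${CURL:-curl}" -s -o /dev/null -w '%{http_code}\n' -e "$2" "$1"
}
```

For example, `check_referer http://www.example.com/ 'http://www.google.com/search?q=example+poker'` would typically print 403 if the block is working and 200 if the request sailed through, assuming the rules answer blocked requests with a Forbidden response.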
Likewise, let me know if you have improvements or patches … I’d be interested in seeing them.
Update 11:54 AM
It is becoming evident that this trackback spamming is less about advertising, and more about denial of service. For about 2 hours this morning, my server was under attack – the information below thwarted all but two trackbacks out of several hundred attempts. In the meantime, I am pondering whether or not I should enforce my terms of service and provide the spammer a bandwidth test using a variation of the following wget command:
However, if this is about denial of service, and since the spammer is abusing several anonymous proxies, it could be that the owners of the URLs are also innocent victims. Your thoughts?