Inside Out Outside In

Checking for HTML injection in all your comment fields.

When checking for HTML injection in your blog comments, remember (unless you're moderating your comments) to check and strip the HTML from all the fields, not just the body.  I was reading some old blog entries of some friends (*cough* Kevin's, sorry bub!) and noticed quite a bit of spam in the comment titles.  The spammer had put a link in the comment title.  Unfortunately, the link doesn't have a rel="nofollow" attribute either, so the blog is just contributing to the spammer's ranking.
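
Here is a rough sketch of what "check all the fields" might look like in Python.  The field names (title, author, body) and the helper names strip_html and sanitize_comment are made up for illustration, and the regex is a blunt instrument rather than a real HTML parser, so treat this as a starting point and not a finished sanitizer.

```python
import html
import re

# Anything that looks like a tag: "<", then any run of non-">" characters, then ">".
TAG_RE = re.compile(r"<[^>]*>")


def strip_html(value: str) -> str:
    """Drop tag-like text, then escape whatever markup characters remain."""
    without_tags = TAG_RE.sub("", value)
    return html.escape(without_tags, quote=True)


def sanitize_comment(comment: dict) -> dict:
    """Clean every string field of a comment, not just the body."""
    return {
        field: strip_html(value) if isinstance(value, str) else value
        for field, value in comment.items()
    }


if __name__ == "__main__":
    # Hypothetical spam comment with the link hidden in the title.
    spam = {
        "title": '<a href="http://spam.example">Cheap pills</a>',
        "author": "bot",
        "body": "Nice post <b>!</b>",
    }
    print(sanitize_comment(spam))
```

Because the whole dict is walked, a new field added to the comment form later gets the same treatment without anyone having to remember it.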

A Google Wish: Site BlackListing

An old post (almost 2 years old now) by Ray Camden on Experts Exchange continues to receive comments.  While there are ways to remove Experts Exchange from your results, using creative search term formatting or Firefox/Greasemonkey, I only wish that Google would provide a blacklist feature, something akin to a [Remove This Site from your future searches] link.  Would I trade a bit of personal privacy for this feature, letting them track my Google search habits?  Absolutely!

Stopping Spam is like trying to stop a thief.

Spammers, like thieves, are generally opportunistic in nature.  Email is free, so they spam.  Thieves steal if they think they can get away with it.  Drive by a big electronics store and look at the front.  It used to be that all you would see were iron bars protecting the windows, until the thieves learned it was easy to steal a car and just drive it through the window for a smash and grab.  Now you see big cement columns.

Other than the obviously escalating security measures, the main idea is to make it so unattractive for the thieves to steal from you that they choose to steal from somebody else.  To actually nab the offenders requires a bit more effort.  Cameras help, but to really stop them, you have to be able to catch them in the act.  This requires onsite enforcement, which can be costly.

The same goes for spam.  As coders, we have an obligation to prevent the opportunistic spammer from abusing the systems we create, such as blog engines.  There are many ways to accomplish this, mostly by attacking the core nature of the spammer, which means you have to understand your adversary to begin with.

The following steps, in escalating fashion, help to direct the spammer's attention elsewhere.

  • Eliminate automated system attacks
    Spammers use bot networks because it's easy and requires little effort.  A wide enough spam campaign using bots can manage enough of a distribution to make money.  This is where the Captcha comes in.  A Captcha is a distorted image containing a pass phrase which must be entered.  The strongest Captcha is the one that can't be deciphered by a machine, which also means it's sometimes hard for a human as well.  A variant of the Captcha is a fill-in-the-blank approach using a phrase that your target audience can identify, or a multiple choice question such as "Who's buried in Grant's Tomb?".  Both Captchas and the variants must also be randomized to prevent automated systems from being adapted to them.
  • Eliminate Volume
    This is actually part of the automated system attack, but if you have a blog it would be pretty unusual for anyone to post 100 comments in an hour.  Set a threshold so that a specific user can only send n comments in a given period.  This can be accomplished by checking the IP address of the submitter (except for wide area bot attacks, see next step).
  • Eliminate Anonymity
    Spammers thrive on not being found out.  Requiring a validation step, where an email is sent which must be replied to, attacks this aspect of the spammer.  There are many levels of validation, including a simple reply, a formal approval request, or even submission of documents of some type.  Also in this category is the "Pay to Play" idea where you have to be a paying member or donate to your favorite charity.  PayPal is great for this.
  • The Gate Keeper
    The final step is the most resource intensive: it's YOU!  Moderate the comments and require approval of all submissions before they are posted or re-broadcast.  This can be a little easier if you have trusted friends that can be set up as moderators.
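
As a concrete take on the "Eliminate Volume" step, here is a minimal Python sketch of a per-IP threshold.  The class name CommentRateLimiter, the default of 5 comments per hour, and the in-memory storage are all assumptions made for the example; a real blog engine would more likely keep the counts in its database or cache, and as noted above, an IP check alone won't stop a wide area bot attack.

```python
import time
from collections import defaultdict, deque


class CommentRateLimiter:
    """Allow at most max_comments per IP within a sliding time window."""

    def __init__(self, max_comments=5, window_seconds=3600):
        self.max_comments = max_comments
        self.window_seconds = window_seconds
        # Maps each IP address to the timestamps of its recently accepted comments.
        self._history = defaultdict(deque)

    def allow(self, ip, now=None):
        """Return True and record the comment if this IP is under its limit."""
        now = time.time() if now is None else now
        recent = self._history[ip]
        # Forget timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_comments:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = CommentRateLimiter(max_comments=2, window_seconds=60)
    # Explicit timestamps make the behaviour easy to follow.
    for t in (0, 10, 20, 61):
        print(t, limiter.allow("203.0.113.9", now=t))
```

Rejected attempts are not recorded, so a bot hammering the form doesn't keep extending its own lockout; whether that is the behaviour you want is a design choice.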

Black Frog

I read an interesting article on ZDNet UK by way of Slashdot regarding a new anti-spam service called Black Frog, which will step into the role formerly occupied by Blue Frog, but this time will use a distributed system and other safeguards which will make a DDoS attack much harder.  While I think this is a great step, I'd like to see it target bandwidth consumption like the now discontinued "Make Love Not Spam" campaign, where the site being advertised by the spammer was targeted in the background with HTTP requests.