DRUPAL: How To Use BadBehavior 2.1x with Drupal 6.x

If you are a regular reader of my blog you know that BadBehavior is one of my favorite Drupal modules.  Bad Behavior is a script that does a great job of blocking bots, hijacked PCs and proxies, which helps protect your site from spammers, scrapers and trolls.

Using BadBehavior on your Drupal 6.x installation is a bit confusing.  First, you install the Drupal Bad Behavior module, then you must download the Bad Behavior script from ioerror.us and drop those files into a sub-directory of your Drupal BadBehavior module files.  To make things even more confusing, the official Drupal Bad Behavior module seems to have been abandoned by the maintainer since April 2008, and the module only works with the BadBehavior 1.x script.  If you try to install the latest and greatest 2.x scripts from ioerror.us, you’ll get nothing more than a WSOD (white screen of death).   You can use the official Drupal Bad-Behavior module with the older BadBehavior 1.x script, but you’ll be missing out on one of the best features of the newer Bad Behavior 2.x: http:BL IP blacklisting provided by projecthoneypot.org.  This new feature blocks known spammers’ IPs using Project Honey Pot’s live, continually updated database of dirty IP addresses that you probably don’t want hanging around your website.

Confused yet?  Don’t worry.  I’ll show you how you can use the latest and greatest Bad-Behavior 2.x scripts with Drupal and the best part is it’s even easier than installing the officially supported, older Bad Behavior module.

How To Use Bad Behavior 2.x with Drupal:

Drupal.org member Gregarios has created a fully-patched (yet not officially supported) version of the Drupal 6.x-1.0-rc2 Bad Behavior module that works with the newer version 2.1.2 of the Bad Behavior script.  All you need to do is download the patched module from Drupal.org, unzip it and drop it into your sites/all/modules directory.  If you are using the older Bad Behavior module and script, you can simply over-write your existing files – there are no changes to the database structure and you do not need to run update.php.  After dropping in the new files, you’re done, that’s it!  You are now using the new 2.1.x version of Bad Behavior!

To activate the new http:BL feature of Bad Behavior, bop over to projecthoneypot.org and create your (free) http:BL access key.  Once you have your 12-character access key, go to your Drupal Bad Behavior configuration page (Admin/settings/badbehavior), paste it into the Project Honey Pot Key field and save it.
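If you are curious what that access key is actually used for, here is a minimal Python sketch of how an http:BL lookup is put together.  It is based on my understanding of Project Honey Pot's public http:BL API (a DNS query made of your key, the visitor's IP with its octets reversed, and the dnsbl.httpbl.org zone), not on the Bad Behavior source code, so treat it as an illustration only.  The function names and the sample key are made up for the example.

```python
import ipaddress

# Zone name as I understand it from the public http:BL API docs (assumption).
HTTPBL_ZONE = "dnsbl.httpbl.org"


def build_httpbl_query(access_key, visitor_ip):
    """Build the DNS name to look up: <key>.<reversed octets>.<zone>."""
    # Validate the address first so garbage never reaches the resolver.
    octets = str(ipaddress.IPv4Address(visitor_ip)).split(".")
    return ".".join([access_key] + list(reversed(octets)) + [HTTPBL_ZONE])


def decode_httpbl_answer(answer):
    """Decode an answer such as '127.3.5.1' into readable fields.

    Assumed layout: 127.<days since last activity>.<threat score>.<type>,
    where <type> is a bit set (1 suspicious, 2 harvester, 4 comment spammer)
    and 0 means a known search engine. For search engines the middle octets
    mean something else, so only trust 'days' and 'threat' when type != 0.
    """
    first, days, threat, kind = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not an http:BL style answer: " + answer)
    return {
        "days_since_last_activity": days,
        "threat_score": threat,
        "search_engine": kind == 0,
        "suspicious": bool(kind & 1),
        "harvester": bool(kind & 2),
        "comment_spammer": bool(kind & 4),
    }


if __name__ == "__main__":
    # 'abcdefghijkl' is a placeholder, not a real access key.
    print(build_httpbl_query("abcdefghijkl", "127.9.1.2"))
    print(decode_httpbl_answer("127.3.5.1"))
```

The sketch deliberately stops short of making the DNS request itself; an answer of "no such host" simply means the IP is not listed.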

Note on the Bad Behavior 2.1.x whitelist:

If you were using the whitelist in the older version of BadBehavior, you will have to move your current whitelist from the (older) whitelist.inc.php to the (newer) whitelist.ini.

The newer whitelist.ini file uses the following format:
[ip]
; Digg whitelisted as of 2.0.12
ip[] = "64.191.203.34"
ip[] = "208.67.217.130"
ip[] = "198.66.255.84"
; websiteoptimization.com whitelisted
ip[] = "67.225.164.12"

So if you had an older entry in whitelist.inc.php like this:
"66.249.64.0/19", // Google Bots
You will have to add it to whitelist.ini like this:
; Google Bots
ip[] = "66.249.64.0/19"
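If your old whitelist is long, converting it by hand gets tedious.  Here is a rough Python sketch that does the conversion shown above.  It assumes every old entry sits on its own line in the `"address", // comment` shape from my example; anything that does not match that shape is skipped, so check the result against your original file.  The function name is just something I made up.

```python
import re

# Matches lines like:  "66.249.64.0/19", // Google Bots
# The trailing comma and the // comment are both optional.
ENTRY = re.compile(r'^\s*"(?P<ip>[^"]+)"\s*,?\s*(?://\s*(?P<note>.*))?$')


def convert_whitelist(php_lines):
    """Turn old whitelist.inc.php entry lines into whitelist.ini lines."""
    ini_lines = ["[ip]"]
    for line in php_lines:
        match = ENTRY.match(line)
        if not match:
            continue  # not an entry line (PHP tags, array( ... ), blanks)
        note = (match.group("note") or "").strip()
        if note:
            ini_lines.append("; " + note)
        ini_lines.append('ip[] = "{}"'.format(match.group("ip")))
    return ini_lines


if __name__ == "__main__":
    old = ['"66.249.64.0/19", // Google Bots', '"67.225.164.12",']
    print("\n".join(convert_whitelist(old)))
```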

Of course, before upgrading or replacing any files, be sure to back up all your current files and your Drupal database.

I hope this helps – if you have questions, post a comment!

Spam Prevention For Drupal

If your website gets traffic then it probably also gets spam – and since we get a lot of traffic we have had to deal with a lot of spam.  Over the years we have tried virtually every spam solution available for Drupal with the exception of pure-captcha spam solutions because they are so damned annoying to visitors.  Now that we have finally upgraded from Drupal 5 to Drupal 6 a whole new world of additional anti-spam modules is now available for us to use and I think we’ve finally found the perfect anti-spam solution for Drupal.

Here is a quick roundup of the spam solutions currently available for Drupal – if you want to just see what the best anti-spam for Drupal is, scroll to the bottom:

  • Akismet – The Akismet module for Drupal lets you use the (very effective) Akismet service on your Drupal site.  Akismet works great at blocking spam and is very popular with WordPress users.  Unfortunately the Drupal Akismet module has been abandoned, is no longer supported, and doesn’t really work.
  • AntiSpam – The AntiSpam module is the successor to the now-dead Akismet module for Drupal.  With AntiSpam, however, you can choose between the Akismet service, TypePad’s AntiSpam service and the Defensio anti-spam service.
  • Badbehavior – The BadBehavior module allows you to use the BadBehavior script available from ioError.us on your Drupal site.  It works by stopping spambots from accessing your site before they get a chance to post any spam.  They can’t spam what they can’t see.  (there is also a WordPress plugin for Badbehavior)
  • Captcha – The Drupal Captcha module is a standard challenge-response that presents a “captcha” image for the visitor to figure out and answer.  The idea is that spam bots are not smart enough to “see” what the captcha image is.
  • Captcha Riddler – Presents a captcha in the form of a riddle that must be answered before the post is published.
  • Egglue Captcha – Another riddle-type captcha that requires human intelligence to make a post.
  • Mollom – Uses a combination of CAPTCHA’s, user/IP reputation as reported back by other Mollom users, and text analysis.  Posts are first checked against IP’s reported as spammers and then the text is analyzed. If Mollom decides the post is spam, it is blocked.  If Mollom is not sure about the validity of a post it presents the user with a CAPTCHA to figure out.  If the user submits the CAPTCHA correctly the post is published, if not, the post is blocked.  Mollom was created and is maintained by the guy that developed Drupal so you might think it is the best choice to use on your Drupal website – but my experience and a quick look at the Mollom support queue says otherwise.  Mollom is often ‘hit ‘n miss’ on stopping spam, suffers from system slowness or downtime, and does not have the best track record for support.
  • RECaptcha – More annoying captcha’s for your visitors to figure out.
  • Spam – Uses filtering and “learning” to block spam.
  • Spambot – Checks member details against the Stop Forum Spam system.  Probably not very effective for anonymous posts.
  • Spamcide – Adds a hidden field to forms that only spam-bots will see and fill in.  Not effective for human spammers.

So what is the best anti-spam module for Drupal? In my opinion there is no ONE best spam system – however the combination of these two antispam modules seems to work perfectly:

BadBehavior – If you read my blog regularly you know that I love this module.  BadBehavior works a bit like the Mod_Security Apache application firewall by blocking visitors (spammers, bots & bad guys) based on pre-defined rules which analyze the URL requested, the browser’s user-agent string, and IP checks against http:BL.  Not only is BadBehavior very effective at blocking spambots, it also does a decent job at blocking certain types of hacking such as SQL injections, and it blocks many proxies which spammers (and gutless assholes) love to hide behind.  Bad Behavior works on Drupal with the BadBehavior module and on WordPress blogs with the BadBehavior WordPress Plugin.

Akismet – As I posted in my list above, the Akismet module for Drupal has been abandoned, however, you can make use of the Akismet system on your Drupal site with the AntiSpam Module for Drupal.   Just install AntiSpam like any other Drupal module, jump over to Akismet.com to get your free API key and you are set.  Once you setup Akismet in the AntiSpam module you can virtually forget about it.  Akismet will unpublish the spam that it detects and will automatically delete it after a preset amount of time that you choose.  You may want to review all the spam it blocks until you trust it, but in my experience Akismet has been right virtually 100% of the time.

Badbehavior + Akismet = NO MORE SPAM !  On Drupal or WordPress.

How We Block Proxies, Bots, Scrapers, Trolls & Assholes

As a website owner, you probably have at least a few good reasons to block bots and scrapers.  Scrapers steal your content and unruly bots can do anything from eating your bandwidth to trying to hack into your site.

As a forum or community owner, you may also have reasons to block proxies.  Proxies are what give many trolls, fakes, assholes, idiots, jerk-offs, and other pitiful people in general, their false bravado.   For some reason, these “tech experts” that have the elite skills to be able to type the words “free proxy” into Google, or figure out how to install a TOR client, grow giant balls when they think you can’t track them down to their real IP address.  Give this kind of anonymity to these socially unbalanced people (that’s a nice way of saying losers in real life, or people that forget to take their meds) and they suddenly become “tough guys” with no fear of wreaking havoc in your community.  BUT, take away their proxy, force them to log in from home or work, and they suddenly become able to follow the rules or, more likely, are too chicken to do or say anything and alas, they go away!  If they DO continue to insist on making themselves feel better (it’s sad, I know) by bullying or causing trouble in your online community, then one report to their ISP (or the FBI if they are REALLY going overboard) or employer will usually take care of it.  Imagine what mommy and daddy will do when their internet account gets terminated!  If they are adults (yes, sadly “adults” do pull this kind of shit), then they’ll have to deal with the hassle of getting a new ISP or deal with mommy and daddy if they live with their parents in the basement (a common trait of internet trolls).   If reporting them doesn’t help, you can ban their IP and have no worries that they’ll just come right back via a proxy.   Sure, since you can never block 100% of the proxies out there, they may still find a proxy that works, but as your proxy blocking skills grow, eventually it will become too much hassle for all but the most pitiful of trolls or assholes and they’ll give up and go get their kicks bothering some other community.

So here are a few updated tips for blocking bots, scrapers, and proxies (aka trolls and assholes).  Much of this is Drupal focused, but much can be applied to any website/blog/forum.

Start with the obvious:  The Drupal Troll Module.  The Drupal 5.x version of this module was abandoned several months ago after a critical security flaw was discovered.  But after popular outcry it has been updated and is supported again.  The Troll module allows you to block IP addresses and re-direct them to a static HTML page, but it also allows you to search your member database by IP address or email address (very handy in some situations).  It supports wildcard searching (just leave the last octet of an IP address blank, for example, and it will return all matches) so even tracking down asshole trolls using DHCP is easy.  The Troll module will also easily show you every IP address that a member has ever signed in with (User|Troll Track) and the domain name.  A member using a legit IP will show a history from the same address or ISP, whereas someone using a proxy will show as coming from many different locations and domains.  After you’ve looked at a few IP histories, the proxies will stand out like a sore thumb.  You can then block those IPs using the Troll module or your IPTables firewall.
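To make the "same ISP versus all over the map" idea concrete, here is a small Python sketch.  It has nothing to do with how the Troll module works internally; it just takes a list of sign-in IPs (which you would copy out of a member's history yourself), groups them into /16 networks, and flags members whose history is spread over many of them.  The /16 grouping and the threshold of 3 are arbitrary assumptions of mine, so tune them and always eyeball the history before blocking anyone.

```python
import ipaddress


def distinct_networks(login_ips, prefix=16):
    """Return the set of /prefix networks that the sign-in IPs fall into."""
    networks = set()
    for ip in login_ips:
        # strict=False lets us pass a host address and get its network back.
        networks.add(ipaddress.ip_network("{}/{}".format(ip, prefix), strict=False))
    return networks


def looks_like_proxy_user(login_ips, max_networks=3):
    """Rough heuristic: many unrelated networks suggests proxy hopping."""
    return len(distinct_networks(login_ips)) > max_networks


if __name__ == "__main__":
    # Documentation/example addresses, not real members.
    home_user = ["24.1.2.3", "24.1.9.9", "24.1.200.1"]
    hopper = ["67.159.1.17", "85.17.3.4", "203.0.113.9", "198.51.100.7", "192.0.2.44"]
    print(looks_like_proxy_user(home_user))
    print(looks_like_proxy_user(hopper))
```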

Next on the list is BadBehavior.  If you use Drupal, you need to install the Drupal BadBehavior module and the BadBehavior script.  If you use WordPress, you need only the script.   BadBehavior can also be modified to work with virtually any PHP based website/forum.  BadBehavior blocks almost all automated bots, scrapers, and spammers – and if used in combination with something like Akismet or Mollom, spam becomes almost a non-issue.  When put in “strict mode” BadBehavior blocks many (but not all) proxies, and is a great first line of defense, but you can also use information from Bad Behavior together with the CSF/IPTables firewall to locate proxy/server farms and block them en masse.

Now for the big guns: The IPTables Firewall.  IPTables allows you to block individual IP addresses or CIDRs (entire ranges of IPs) from accessing your website/server, but instead of simply re-directing blocked addresses to a static page at the domain level like TROLL does, IPTables/CSF “drops” all the packets, leaving the troll/asshole/proxy user nothing but an “unable to connect” error.  IPTables is very powerful, and almost by definition that makes it difficult to use.  Because of that, I recommend using CSF Firewall, which is almost a GUI for IPTables and also adds some great additional features.    To use IPTables/CSF you need either a VPS or dedicated server with root access.  If you are on a shared host and have asshole problems, you might have to put your big-boy pants on and move to a dedicated or VPS server.

Once you get CSF up and running (it’s really not that tough), do the obvious things like activating the Real Time Block Lists (RBLs) and use the CC_Deny setting to block entire countries that you don’t need hanging around your site (North Korea, China, Turkey, Russia, India come to mind).

After you’ve blocked all the undesirable countries with CC_Deny, you can move on to the CSF.DENY file, which allows you to block IPs and ranges of IP addresses in CIDR format.   The first thing you can do is import any IP addresses that you’ve already blocked with the TROLL module – then you can start building your proxy-blocking list.

In building your proxy-block list, you aren’t just blocking proxy servers, you really want to block all servers.  There is really no reason for any server other than Google bots, Yahoo, etc, to access your site so blocking any/all ‘server farms’ will protect you not only from assholes using proxies, but also from compromised servers trying to hack your site.  The best source I have found for building my block list (now blocking hundreds of thousands of IP’s and several million domains) is the Bad Behavior module (mentioned above).  By learning how/why Bad Behavior blocks IP’s you can identify servers and server farms and add them by the thousands to your CSF.DENY file.

What to look for in Bad Behavior:  Each time Bad Behavior blocks an IP it logs the IP address and the reason.  The following reasons often (not always, you have to be careful) mean that the originating IP belongs to a proxy or a server:

  • Header ‘Connection’ contains invalid values
  • Required header ‘Accept’ missing
  • Prohibited header ‘Proxy-Connection’ present
  • Header ‘Referer’ is corrupt
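If you have a lot of log entries to wade through, the filtering step above can be sketched in a few lines of Python.  This assumes you have already exported the Bad Behavior log into simple (IP, reason) pairs; that export format is my own assumption, not something Bad Behavior produces for you, and the reason strings must match whatever text your log actually stores (watch out for straight versus curly quotes).

```python
from collections import Counter

# The four reasons listed above, written with straight quotes.
PROXY_HINTS = (
    "Header 'Connection' contains invalid values",
    "Required header 'Accept' missing",
    "Prohibited header 'Proxy-Connection' present",
    "Header 'Referer' is corrupt",
)


def whois_candidates(log_rows):
    """Return (ip, hit_count) pairs worth a WHOIS lookup, busiest first.

    log_rows is an iterable of (ip, reason) tuples. Only rows whose reason
    is one of PROXY_HINTS are counted; everything else is ignored.
    """
    hits = Counter(ip for ip, reason in log_rows if reason in PROXY_HINTS)
    return hits.most_common()


if __name__ == "__main__":
    rows = [
        ("67.159.1.17", "Required header 'Accept' missing"),
        ("67.159.1.17", "Header 'Referer' is corrupt"),
        ("10.0.0.5", "Some other reason"),
        ("198.51.100.7", "Prohibited header 'Proxy-Connection' present"),
    ]
    for ip, count in whois_candidates(rows):
        print(ip, count)
```

This only produces a short list of suspects.  The WHOIS check that follows is still what decides whether an IP really belongs to a server or a proxy.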

Take an IP address that Bad Behavior blocked for one of the reasons above and do a quick WHOIS lookup on it.  I like to use http://whois.domaintools.com, but any WHOIS server will do.  Usually (not always) a server or proxy will show other sites listed, an SSL cert, etc.  For example, look at this WHOIS for 67.159.1.17.  A WHOIS lookup for a regular home ISP connection or a business won’t show much info at all; for example, look at this WHOIS for this Comcast home user.

So now you have your IP (in our example above, 67.159.1.17), but you don’t want to block just that IP, you want to block every server in that entire IP range.  To do that, you add the CIDR to your CSF.DENY file in CSF.   The example server/proxy above has the following CIDR in its WHOIS info:

OrgName:    FDCservers.net
OrgID:      FDCSE
Address:    141 w jackson blvd.
Address:    suite #1135
City:       Chicago
StateProv:  IL
PostalCode: 60098
Country:    US
ReferralServer: rwhois://rwhois.fdcservers.net:4321
NetRange:   67.159.0.0 - 67.159.63.255
CIDR:       67.159.0.0/18   <--------------  This is the CIDR
NetName:    FDCSERVERS
NetHandle:  NET-67-159-0-0-1
Parent:     NET-67-0-0-0-0
NetType:    Direct Allocation
NameServer: NS3.FDCSERVERS.NET
NameServer: NS4.FDCSERVERS.NET

If you aren’t positive this is a server farm you could visit the domain listed, in this case, FDCservers.net.  Their website clearly shows that they are a server hosting company.  You could also Google the company name or even the IP to dig up more info.  Now that you are positive that you want to block this entire range or CIDR of 67.159.0.0/18, simply add it to your CSF.DENY.  Sometimes, usually with foreign servers, a CIDR won’t be listed.  In a case like that you can still block an entire range of IPs by using a CIDR Calculator and entering the beginning IP address and the mask or range/number of IPs to block.  I usually block an entire 16-bit range, which for the example above would be 67.159.0.0/16 instead of the listed “/18”.  The “/18” covers only the FDCServers allocation (16,384 addresses), while “/16” blocks everything that starts with 67.159 (65,536 addresses).
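If you would rather not depend on an online CIDR calculator, Python's standard `ipaddress` module can do the same arithmetic.  The sketch below turns a NetRange (first and last IP) into CIDR notation and widens a CIDR to a /16; the two helper names are mine, and the numbers are the FDCservers values from the WHOIS output above.

```python
import ipaddress


def range_to_cidrs(first_ip, last_ip):
    """Convert a WHOIS NetRange (first IP, last IP) into CIDR blocks.

    A range that does not line up on a power-of-two boundary comes back
    as several CIDRs rather than one.
    """
    first = ipaddress.IPv4Address(first_ip)
    last = ipaddress.IPv4Address(last_ip)
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


def widen(cidr, new_prefix=16):
    """Return the larger network (default /16) that contains the given CIDR."""
    return str(ipaddress.ip_network(cidr).supernet(new_prefix=new_prefix))


if __name__ == "__main__":
    print(range_to_cidrs("67.159.0.0", "67.159.63.255"))
    print(widen("67.159.0.0/18"))
    print(ipaddress.ip_network("67.159.0.0/18").num_addresses)
```

Keep in mind that widening to a /16 also blocks whoever else owns the rest of that range, so it is worth a second WHOIS lookup on an address outside the original /18 before you commit.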

When adding your IP’s or CIDR into CSF.DENY be sure to add “# do not delete” after each entry.  Otherwise, once you hit the limit of IP’s specified in your CSF configuration file, older entries will get overwritten with newer entries.
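Since it is easy to forget the comment when pasting in a batch of entries, here is a tiny Python sketch that formats them for you.  It only builds the text lines; the helper name and the optional note are my own invention, and you still paste the output into CSF.DENY and restart CSF yourself.

```python
import ipaddress


def deny_line(target, note=""):
    """Format one CSF.DENY entry with the 'do not delete' comment attached."""
    # Parsing validates the input and normalizes things like 67.159.1.17/18.
    net = ipaddress.ip_network(target, strict=False)
    if net.prefixlen == net.max_prefixlen:
        entry = str(net.network_address)  # single IP: write it without /32
    else:
        entry = str(net)
    comment = "do not delete"
    if note:
        comment += " - " + note
    return "{} # {}".format(entry, comment)


if __name__ == "__main__":
    print(deny_line("67.159.0.0/18", "FDCservers"))
    print(deny_line("67.159.1.17"))
```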

How to block TOR: The Onion Router or TOR is a network of proxies intended to protect the anonymity of internet users.   TOR is great for whistleblowers or government protesters, but not so great for website owners trying to keep assholes out of their community.  TOR is fairly easily blocked by adding the list of “TOR Exit nodes” into CSF.DENY or TROLL.  You can get an updated list of TOR exit nodes here: TOR Exit Node list.  TOR is dynamic and the list changes, so you’ll have to update it every few days or so.
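Because the list changes every few days, it helps to know what actually changed between two downloads.  This Python sketch assumes the exit node list is plain text with one IPv4 address per line and optional `#` comments (the real list you download may be formatted differently, so adjust the parsing), and it reports which addresses to add and which stale ones to remove.

```python
import ipaddress


def parse_exit_list(text):
    """Return a sorted, de-duplicated list of the valid IPv4 addresses in text."""
    addresses = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        try:
            addresses.add(ipaddress.IPv4Address(line))
        except ValueError:
            continue  # ignore anything that is not a plain IPv4 address
    return [str(addr) for addr in sorted(addresses)]


def diff_lists(old_ips, new_ips):
    """Return (to_add, to_remove) when moving from the old list to the new one."""
    old, new = set(old_ips), set(new_ips)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    last_week = parse_exit_list("192.0.2.2\n192.0.2.10\n")
    today = parse_exit_list("# fresh download\n192.0.2.10\n198.51.100.7\nnot-an-ip\n")
    print(diff_lists(last_week, today))
```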

How to block Port Proxies or SOCKS proxies: Port or SOCKS proxies are almost always blocked by Bad Behavior.

Sometimes you may end up blocking legitimate users, particularly when blocking entire ranges of IPs – it’s unavoidable.  When someone complains, confirm their IP address and just remove them from CSF.DENY or your TROLL list – no big deal.  I’ve been using these methods for over a year and I’ve only blocked 10 or so legitimate users (that I know of at least).
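Once the list has thousands of CIDRs in it, finding the entry that is catching a legitimate user by eye is painful.  Here is a short Python sketch that takes the user's confirmed IP and the lines of your deny list and returns every entry that covers it.  It assumes each line is an IP or CIDR optionally followed by a `#` comment; lines in any other format are simply skipped.

```python
import ipaddress


def entries_covering(user_ip, deny_lines):
    """Return the deny-list lines whose IP or CIDR contains user_ip."""
    ip = ipaddress.ip_address(user_ip)
    matches = []
    for line in deny_lines:
        target = line.split("#", 1)[0].strip()  # strip the trailing comment
        if not target:
            continue  # blank line or comment-only line
        try:
            net = ipaddress.ip_network(target, strict=False)
        except ValueError:
            continue  # not a plain IP or CIDR, leave it alone
        if ip in net:
            matches.append(line)
    return matches


if __name__ == "__main__":
    deny = [
        "67.159.0.0/16 # do not delete",
        "192.0.2.10 # do not delete",
        "# just a comment",
    ]
    print(entries_covering("67.159.40.2", deny))
```

If the match turns out to be a whole range you still want blocked, removing it is not the answer; you would need to handle that one user some other way, for example through an allow list.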

If you don’t have/can’t use IPTABLES/CSF, you can also use some of the techniques above to block IP’s and CIDRs in your .HTACCESS file, but I cannot vouch for how well it will perform when the list grows large – and to be effective it needs to be really, really large.
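For the .HTACCESS route, here is a hedged Python sketch that turns a list of IPs and CIDRs into deny rules.  It emits the older `Order`/`Allow`/`Deny` directives from Apache 2.2 (newer Apache versions use `Require` rules instead, so adapt it if that is what your host runs), and it merges overlapping ranges first so the file stays as small as possible.  The function name is made up for the example.

```python
import ipaddress


def htaccess_block(targets):
    """Build Apache 2.2 style deny rules from a list of IPs and CIDRs."""
    networks = [ipaddress.ip_network(t, strict=False) for t in targets]
    # collapse_addresses drops duplicates and ranges already covered by a
    # larger one, which keeps the rule count down.
    merged = ipaddress.collapse_addresses(networks)
    lines = ["Order Allow,Deny", "Allow from all"]
    for net in merged:
        if net.prefixlen == net.max_prefixlen:
            lines.append("Deny from {}".format(net.network_address))
        else:
            lines.append("Deny from {}".format(net))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(htaccess_block(["67.159.0.0/18", "67.159.0.0/16", "192.0.2.10"]))
```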

This has turned out to be one of my longest and mostest rambling posts.  If I’ve been unclear or if you have any questions, please post a comment.  And oh – if you’re reading this via a proxy, post a comment and tell me that my techniques don’t work!