How to Fight Blog Spammers with Bash, mod_rewrite and Cron

February 11th, 2005

If you run your own webserver, you are certainly a blog spammer’s target. Yes, you are.

They use compromised boxes or open proxies to launch their bots against your website, posting comments or sending trackbacks to your blog, or faking referer hits with their domain names, all to increase their visibility.

This new way of polluting the World Wide Web is bound to become as much of a nuisance as mail spam. Here is how I block these kinds of attacks using basic, well-known tools: mod_rewrite for denying access, bash for a simple script that grabs IP addresses, and cron for scheduling.

The strategy here is to block requests that match one or more of the following conditions:

  1. The user agent is known to be a spambot.
  2. The IP address is blacklisted.
  3. The referer is known to be a fake one.

1. Grabbing IP addresses

Let’s start by grepping your access log to find the IP addresses related to the attacks.
A tiny shell script can do this job. It takes as its only argument the pattern to grep for in the access log. Your only job is to choose a good pattern.

Once you have launched this script, the file /usr/share/apache/maps/ip-baned.txt (or whatever path you chose) will contain all the IP addresses you don’t want to serve.
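For illustration, here is a minimal sketch of what such a grabber could look like; it is not necessarily the exact script referred to above. The access log location, the `grab_ips` function name and the `-` placeholder value are assumptions. Each address is written as an `IP -` pair because a txt RewriteMap is read as key/value lines, and the client IP is assumed to be the first field of each log line (as in the common and combined log formats).

```shell
#!/bin/bash
# Hypothetical sketch of an IP grabber. The log path, the log format and
# the "-" map value are assumptions, not taken from the original script.

ACCESS_LOG="${ACCESS_LOG:-/var/log/apache/access.log}"       # assumed location
MAP_FILE="${MAP_FILE:-/usr/share/apache/maps/ip-baned.txt}"  # path used in this post

grab_ips() {
    local pattern="$1"
    local tmp
    tmp="$(mktemp)" || return 1

    # Keep the entries already present in the blacklist, if any.
    [ -f "$MAP_FILE" ] && cat "$MAP_FILE" > "$tmp"

    # The client IP is the first whitespace-separated field of each matching line.
    # Emit "IP -" so every line of the map is a key/value pair.
    grep -E -- "$pattern" "$ACCESS_LOG" | awk '{ print $1 " -" }' >> "$tmp"

    # Sort and remove duplicates, then replace the map file.
    sort -u "$tmp" > "$MAP_FILE"
    rm -f "$tmp"
}

# Run only when a pattern is given on the command line.
[ "$#" -eq 0 ] || grab_ips "$@"
```

Saved as, say, grab-ips.sh (a hypothetical name), it would be run as `./grab-ips.sh 'PATTERN'`, where PATTERN is an extended regular expression that matches the spammers' requests in your log.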

2. Apache configuration

We can now update the Apache configuration to set up mod_rewrite:

First filter the User Agents:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (AGENT_1) [OR]
...
RewriteCond %{HTTP_USER_AGENT} (AGENT_N)
RewriteRule ^(.*) - [F]

Then use the blacklisted IP addresses:

RewriteMap ipbaned txt:/usr/share/apache/maps/ip-baned.txt
RewriteCond ${ipbaned:%{REMOTE_ADDR}|NOTFOUND} !=NOTFOUND
RewriteRule ^(.*) - [F]

And filter fake referers:

RewriteCond %{HTTP_REFERER} (DOMAIN_1) [NC,OR]
...
RewriteCond %{HTTP_REFERER} (DOMAIN_N) [NC]
RewriteRule ^(.*) - [F]
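To decide which domains (or user agents) deserve a place in these conditions, one option is to tally the values that show up most often in the access log. The sketch below assumes the combined log format, where splitting a line on double quotes puts the referer in field 4 and the user agent in field 6; the log path and the `top_values` name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list the most frequent referers or user agents.
# Assumes the combined log format and the log location below.

ACCESS_LOG="${ACCESS_LOG:-/var/log/apache/access.log}"   # assumed location

# $1: quoted field to tally (4 = referer, 6 = user agent)
# $2: how many values to show (default 20)
top_values() {
    local field="$1"
    local count="${2:-20}"

    # Split each line on double quotes, print the requested field,
    # then count identical values and show the most frequent first.
    awk -F'"' -v f="$field" '{ print $f }' "$ACCESS_LOG" \
        | sort | uniq -c | sort -rn | head -n "$count"
}
```

Calling `top_values 4` lists the most frequent referers and `top_values 6` the most frequent user agents, each preceded by its hit count. The list still needs a human eye: a frequent referer is not necessarily a fake one.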

Restart Apache and enjoy all the 403 errors you’ll send to the spammers.

3. Using Cron for updating the blacklist

The last thing to do is to set up a cron job that periodically updates your IP blacklist using the little script I provide.

You’ll then receive a mail from cron whenever the blacklist file changes, showing which IP addresses were added.
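One way to get that behaviour, sketched here under assumptions rather than taken from the original setup, is a small wrapper that snapshots the blacklist, runs the update command, and prints only the newly added lines. Cron mails whatever a job prints, so no output means no mail. The `update_and_report` name is hypothetical, and the update command is passed in as arguments.

```shell
#!/bin/bash
# Hypothetical cron wrapper: run an update command and print only the
# lines it added to the blacklist. Cron mails any output a job produces.

MAP_FILE="${MAP_FILE:-/usr/share/apache/maps/ip-baned.txt}"  # path used in this post

# "$@": the command that updates the blacklist, with its arguments
update_and_report() {
    local before
    before="$(mktemp)" || return 1

    # Snapshot the current blacklist (mktemp leaves an empty file if there is none).
    if [ -f "$MAP_FILE" ]; then
        cp "$MAP_FILE" "$before"
    fi

    # Run the update; give up quietly on the report if it fails.
    "$@" || { rm -f "$before"; return 1; }

    # diff marks lines present only in the new file with "> "; print just those.
    diff "$before" "$MAP_FILE" | sed -n 's/^> //p'
    rm -f "$before"
}

# Run only when an update command is given on the command line.
[ "$#" -eq 0 ] || update_and_report "$@"
```

A crontab entry for an hourly run might then look like `0 * * * * /usr/local/bin/update-blacklist.sh /usr/local/bin/grab-ips.sh 'PATTERN'`, where both script paths and the schedule are placeholders to adapt.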

This simple solution works great for me: my log analyzer shows more than 1400 hits refused with a 403 error in less than 3 days of use…
