Block all those unnecessary requests you see in the logs

I assume you have looked into your website visitor logs and are fed up with those unwanted requests. They may not necessarily be bad or spam, but they may not add any value to the business. Imagine if the majority of the hits are coming from those unwelcome user-agents or referrers: you think your site is getting good traffic, but in reality, it is useless.

The best way to manage them is to stop them at the edge, such as a network device, load balancer, firewall, or CDN. But I understand that may not be feasible for a personal blogger or a small website, and you may want to block them at a lower level instead, like the web server, WordPress, etc.

I hope you already have a list of the referrers and user-agents you want to block. Let's get started.

As a best practice, take a backup of the configuration file before you modify it so you can roll back when things go wrong.

Nginx

Nginx powers millions of sites and is very popular among web hosts. If you are using Nginx, here is how you can stop those requests. Let's say you are getting lots of automated requests with the following user-agents and you have decided to block them.

  • java
  • curl
  • python

if ($http_user_agent ~* "java|curl|python") {
    return 403;
}

If you would rather redirect those requests somewhere, then:

if ($http_user_agent ~* "java|curl|python") {
    return 301 https://yoursite.com;
}

The above configuration must be under the server block.
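
Once the change is live (after the Nginx reload covered below), you can verify it with a quick curl test; yoursite.com here is just a placeholder for your own domain. The -A flag sets the user-agent, so the request below should come back as 403 Forbidden:

curl -I -A "python" https://yoursite.com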

And use the following to block by referrer. This example, which should go under the location block, blocks requests coming from semalt.com, badsite.net, and example.com.

if ($http_referer ~ "semalt\.com|badsite\.net|example\.com")  {
  return 403;
}
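
Similarly, you can check the referrer rule after reloading Nginx by sending a fake Referer header with curl; again, substitute your own domain:

curl -I --referer "https://semalt.com" https://yoursite.com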

After making the necessary changes, save the file and restart Nginx for them to take effect.

To restart Nginx, you can use:

service nginx restart
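
Depending on your distribution, Nginx may be managed by systemd instead of the older service command, and it is a good idea to validate the syntax before restarting. Assuming a systemd-based setup, something like this works:

sudo nginx -t
sudo systemctl restart nginx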

Nginx is a powerful web server, and if you are interested in learning more, check out this online course.

Apache HTTP

To block user-agents in Apache, you can use the mod_rewrite module. Ensure the module is enabled, and then add the following in either the .htaccess file or the respective .conf file.

If you have multiple sites configured and want to block requests for a specific one only, then you may want to put the rules in the respective VirtualHost section.

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} badcrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} badbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} badspider [NC]
RewriteRule . - [R=403,L]

The above rule will block any request whose user-agent contains badcrawler, badbot, or badspider.
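
As with Nginx, you can confirm the rule once Apache has been restarted by sending a request with one of the blocked user-agents; yoursite.com is a placeholder for your own domain, and the response should be 403 Forbidden:

curl -I -A "badbot" https://yoursite.com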

And the example below blocks by the referrer names BlowFish, CatchBot, and BecomeBot.

RewriteEngine on
RewriteCond %{HTTP_REFERER} blowfish|CatchBot|BecomeBot [NC]
RewriteRule . - [R=403,L]

As usual, restart the Apache server and test the results.
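
The exact commands depend on how Apache is packaged. On a systemd-based Debian/Ubuntu server, for example, you can check the configuration and restart like this (on RHEL/CentOS the service is typically named httpd instead of apache2):

sudo apachectl configtest
sudo systemctl restart apache2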

WordPress

If you are using WordPress on shared hosting, don't have access to the web server configuration, or are not comfortable modifying the files, then you can use a WP plugin. There are many WP security plugins, and one of the popular ones for blocking bad bots is Blackhole for Bad Bots.

Conclusion

I hope the above tips help you stop the bad requests so legitimate ones are not impacted. If you are looking for comprehensive security protection, then you may also consider using a cloud-based WAF like Astra or SUCURI.