
Too many Invalid HTTP_HOST header exception errors

Have you deployed an application to production?

Are you getting too many, seemingly random Invalid HTTP_HOST header exception errors?

Background

I have recently deployed a Django app:

  • served via gunicorn/Nginx
  • onto an AWS Lightsail instance
  • with a CloudFlare certificate for https/SSL (followed this answer on StackOverflow for configuring it)
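For context, the host- and proxy-related Django settings behind such a deployment look roughly like the sketch below. The values are illustrative assumptions, not the project's actual settings.

# settings.py (sketch, hypothetical values): the parts relevant to
# HTTP_HOST validation behind the Nginx/CloudFlare proxy
DEBUG = False

# Django rejects any request whose Host header is not listed here and raises
# django.core.exceptions.DisallowedHost, which surfaces as the
# "Invalid HTTP_HOST header" error.
ALLOWED_HOSTS = ["subdomain.example.com"]

# Trust the X-Forwarded-Proto header set by the proxy so Django knows the
# original request came in over HTTPS.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")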

Following deployment I had a barrage of Invalid HTTP_HOST header error emails. Every. Single. Day.

I looked at the Nginx configuration. But it looked identical to other configurations I have in production. Configurations that do not have this kind of error.

The Nginx server block looks like:

upstream dbr_project {
  server unix:/home/ubuntu/[..]/gunicorn.sock fail_timeout=0;
}

server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name subdomain.example.com;

    if ($http_x_forwarded_proto = "http") {
      return 301 https://$server_name$request_uri;
    }

    ...

}

On further inspection, these exceptions were being caused by bots. Example user agent strings:

HTTP_USER_AGENT = 'Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)'
HTTP_USER_AGENT = 'masscan/1.0 (https://github.com/robertdavidgraham/masscan)'

Even though /robots.txt is set up to prevent robots from visiting any URL. This is its content:

User-agent: *
Disallow: /
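Worth noting: robots.txt is purely advisory. A compliant crawler checks it before fetching anything, as in the small Python sketch below; the bots showing up in these logs simply never perform that check.

from urllib.robotparser import RobotFileParser

# Parse the robots.txt above and ask whether a crawler may fetch the root URL.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])
print(rp.can_fetch("Nimbostratus-Bot", "https://subdomain.example.com/"))  # False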

So the cause looks external. And the “dirty” fix I resorted to is to stop reporting this class of error.

Fix

Update: I revised this fix following the comments made below. What finally fixed this is an Nginx change more than a Django one. I describe the Nginx change I made in the section “Revised Solution” below.

I’ve followed this answer on StackOverflow to stop reporting, or “suppress”, this error.

My previous LOGGING config (which I stripped down to a bare minimum for this post) looked like:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
        },
        "mail_admins": {
            "level": "ERROR",
            "class": "django.utils.log.AdminEmailHandler",
        },
    },
    "loggers": {
        "django.request": {
            "handlers": ["console", "mail_admins"],
            "level": "ERROR",
            "propagate": False,
        },
    },
}

To stop reporting this error:

  • I’ve added a “null” handler; docs on LOGGING handlers here
  • I pointed the logger for the django.security.DisallowedHost exception at this null handler

This is how the logging configuration looks after the changes. The newly-added pieces are the “null” handler and the django.security.DisallowedHost logger:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
        },
        "mail_admins": {
            "level": "ERROR",
            "class": "django.utils.log.AdminEmailHandler",
        },
        "null": {
            "level": "DEBUG",
            "class": "logging.NullHandler",
        },
    },
    "loggers": {
        "django.security.DisallowedHost": {
            "handlers": ["null"],
            "propagate": False,
        },
        "django.request": {
            "handlers": ["console", "mail_admins"],
            "level": "ERROR",
            "propagate": False,
        },
    },
}
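As a sanity check, a test along these lines (a sketch, assuming the default locmem email backend used by Django's test runner and a non-empty ADMINS setting) confirms that a request with a bogus Host header still gets a 400 response but no longer queues an admin email:

from django.core import mail
from django.test import TestCase, override_settings


@override_settings(ALLOWED_HOSTS=["subdomain.example.com"])
class DisallowedHostLoggingTests(TestCase):
    def test_bad_host_is_rejected_without_admin_email(self):
        # A Host header not in ALLOWED_HOSTS raises DisallowedHost,
        # which Django turns into a 400 response.
        response = self.client.get("/", HTTP_HOST="evil.example.org")
        self.assertEqual(response.status_code, 400)
        # With the "null" handler above, nothing reaches AdminEmailHandler,
        # so no error email is queued.
        self.assertEqual(len(mail.outbox), 0)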

Final Thoughts

I’m not sure this is the best “solution”. In fact I regard it as a “dirty fix” more than a solution.

Because ideally I wouldn’t be suppressing any django.security exception notifications.

But it follows Django’s standard machinery to suppress logging a specific exception. Which is a good thing.

Let me know if you handle this differently. Or in any way better!

P.S. Sentry or similar can group this exception into one “error” page, which in effect contains thousands of occurrences of this exception. That does not count as a better approach 😊

Revised Solution

Take a look at the comments thread below. Bots are making requests without passing the correct Host header, and because the original server block is the default_server, Nginx forwards those requests to Django, which then rejects the unknown host.

The solution is to enforce the expected domain, subdomain.example.com, at the Nginx level. This is the resulting Nginx config:

upstream subdomain_project {
  # fail_timeout=0 means we always retry an upstream even if it failed
  # to return a good HTTP response (in case the Unicorn master nukes a
  # single worker for timing out).
  server unix:/home/ubuntu/[..]/gunicorn.sock fail_timeout=0;
}

server {
  listen 80 default_server;
  listen [::]:80 default_server;

  if ($host !~* ^(subdomain\.example\.com)$ ) {
    return 444;
  }

}

server {
  server_name subdomain.example.com;
  ...
}

Notice how:

  • requests for subdomain.example.com are handled by the second server block
  • any request whose Host does not match subdomain.example.com lands in the first, default server block and gets HTTP 444, i.e. Nginx closes the connection without a response.
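A quick way to confirm the behaviour from the outside is to send a request with a bogus Host header straight at the server; with the configuration above Nginx simply drops the connection. A rough Python sketch (the IP address is a placeholder):

import http.client

# Connect to the server's public IP directly (placeholder address) and send
# a request whose Host header does not match the expected domain.
conn = http.client.HTTPConnection("203.0.113.10", 80, timeout=5)
conn.request("GET", "/", headers={"Host": "evil.example.org"})
try:
    print(conn.getresponse().status)
except http.client.RemoteDisconnected:
    # "return 444" makes Nginx close the connection without a response.
    print("connection closed without a response (Nginx 444)")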
