I've been abusing HTTP Status Codes in my APIs for years

update: Hacker News raises some good points. I’d like to clarify I’m not talking about a RESTful service, but rather HTTP RPC. That said, I’m basically implying a RESTful endpoint, so perhaps this is the source of my confusion. The important question then is this: Is there a distinction between a bad URL and a missing record, and if so, should you try to represent it differently?

You’re using HTTP status codes wrong

Got your attention? Good. That’s what got my attention too during my colleagues' discussion this morning.

In this post I want to put forth the argument that as an application programmer building HTTP APIs, you should never touch HTTP status codes. Ever. For any reason.

Naturally, this argument does not apply to anyone working on actual web servers, such as nginx.

A prime example

You have some GET HTTP resource defined, say /api/v1/employees/<employee_id>. It’s pretty simple: If you pass a valid employee ID to this service, it will return all the details the service has stored for the employee. If you don’t pass a number for the ID, or if you pass an ID that doesn’t exist, it will return some kind of error.

If your mind went straight to 404 and 400, then this post is for you.

What actually happens

You try employee 1, fantastic, it works!

You try employee 100, not fantastic, it 404’d.

Huh?

Why do I get a 404 here? The path is clearly correct, otherwise employee 1 wouldn’t have worked either.

“Ah”, you may be thinking “but it clearly means that the employee wasn’t found!”

No, there’s nothing clear about that. If I were to call /api/v11/employees/1 I would get the exact same error. As an API consumer, all I want to do here is raise my middle finger.
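To make the ambiguity concrete, here is a minimal sketch of the conventional approach. The route pattern, the in-memory EMPLOYEES store and the conventional_status function are all assumptions of mine for illustration, not code from any real service.

```python
import re

# Hypothetical in-memory store standing in for the employee service.
EMPLOYEES = {1: {"id": 1, "name": "slim"}}

# Assumption: this is the only route the service exposes.
ROUTE = re.compile(r"^/api/v1/employees/(\d+)$")


def conventional_status(path: str) -> int:
    """Return the status code a conventional handler would send for a GET."""
    match = ROUTE.match(path)
    if match is None:
        return 404  # the path does not exist
    if int(match.group(1)) not in EMPLOYEES:
        return 404  # the path exists, the record does not
    return 200


# Two different failures, one indistinguishable answer.
missing_record = conventional_status("/api/v1/employees/100")
bad_path = conventional_status("/api/v11/employees/1")
```

Both missing_record and bad_path come back as 404, so the status code alone can't tell a consumer which mistake was made.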

But as an API producer, this results in a conundrum: What am I supposed to do then?

Not the application layer you’re looking for

Maybe part of my confusion is that RFC 7230 defines HTTP as an Application Layer protocol, which means it should represent application logic, right?

Right?

Wrong.

HTTP is just a protocol defining behavior that belongs in layer 7. It’s not a transport layer in a technical sense, but from the perspective of an API it’s mostly just TCP with extra steps. The problem comes in when we start to use protocol errors to denote application problems.

HTTP status codes are used to denote the state of the HTTP transaction, and using them to denote the result of the application logic is abusing the specification.

You wouldn’t return a TCP error to denote you couldn’t find an employee, so why are you using HTTP errors?

2 problems to solve

In any API call, there are 2 problems to solve as a client when you are processing the response:

1: Did the technical request succeed?

2: Did the business/domain request succeed?

Technical request

Networks are flaky, everyone knows that. Sometimes you lose packets, sometimes you fat-finger URLs. Either way, application logic cannot be processed until you satisfy the technical constraints in front of it. In this instance, we’re talking about HTTP.

For HTTP/1.1 requests, Host is a mandatory header. Failure to provide this results in an immediate 400 from a compliant web server.

Further parts of the request are required to locate a resource, such as the path and possibly the method. These live in the request line rather than in the headers.

If I’ve provided all of these constraints to the target service, then the HTTP transaction has succeeded (provided the server doesn’t blow up processing it, looking at you NullPointerException).

Business/Domain request

Now that the web server is happy with the incoming payload, it can correctly identify which resource to call. We’re discussing APIs, so the resource in this instance is some business/domain logic. In our example, that would be the service responsible for retrieving the employee record.

Any business error that comes out of this layer (such as providing a letter instead of a number or an ID that doesn’t exist) is not something the web server cares about in any shape or form. Importantly, neither does the client. The HTTP client only cares about whether or not it successfully created a valid HTTP request for the server to parse.

You might not like where this is going, but your consumers will thank you: regardless of the business/domain outcome, return a 2xx status code.

Opinionated payloads should be mandatory

Returning a 2xx code immediately tells the client that the HTTP response contains a payload that they can parse to determine the outcome of the business/domain request. That is to say:

  • client checks HTTP response is valid (2xx status)
  • client can confidently parse the response and make a domain oriented decision, as opposed to a technical one
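On the client side, those two steps could look something like the sketch below. The interpret function and the two exception classes are hypothetical names I'm using for illustration, and the envelope fields (result, payload, errorMessage) are the ones used in this post's examples.

```python
import json


class TechnicalError(Exception):
    """The HTTP transaction itself failed (non-2xx status)."""


class DomainError(Exception):
    """The HTTP transaction succeeded but the business request did not."""


def interpret(status: int, body: str) -> dict:
    """Turn a raw (status, body) pair into a payload or a typed error."""
    # Step 1: the technical check, based only on the status code.
    if not 200 <= status < 300:
        raise TechnicalError(f"HTTP {status}")
    # Step 2: the domain check, based only on the parsed payload.
    envelope = json.loads(body)
    if not envelope["result"]:
        raise DomainError(envelope.get("errorMessage", "unknown error"))
    return envelope["payload"]
```

The caller never has to guess which layer failed: a TechnicalError means the request itself needs fixing, a DomainError means the request was fine and the answer was "no".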

This makes your client happy. Very, very happy. Using our above examples, here is what we would see:

Success scenario: /api/v1/employees/1

StatusCode: 200

Body:

{
    "result": true,
    "payload": {
        "id": 1,
        "name": "slim",
        "surname": "jim",
        "email": "james@slimjim.xyz",
        "role": "chief doughnut"
    }
}

Failed scenario: No such employee /api/v1/employees/100

StatusCode: 200

Body:

{
    "result": false,
    "errorMessage": "No employee found for ID 100"
}

Failed scenario: Bad path /api/v11/employees/1

StatusCode: 404
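For the producer side, the three scenarios above can be sketched roughly as follows. This is a framework-free illustration under my own assumptions: handle_get, the ROUTE pattern and the in-memory EMPLOYEES store are hypothetical, and a real service would hang the same logic off whatever web framework it uses. The wording of the invalid-ID message is also made up.

```python
import json
import re

# Hypothetical in-memory store; a real service would query its own storage.
EMPLOYEES = {
    1: {"id": 1, "name": "slim", "surname": "jim",
        "email": "james@slimjim.xyz", "role": "chief doughnut"},
}

# Accepts any single ID segment so that non-numeric IDs reach the domain layer.
ROUTE = re.compile(r"^/api/v1/employees/([^/]+)$")


def handle_get(path: str) -> tuple[int, str]:
    """Return (status code, body) for a GET request to `path`."""
    match = ROUTE.match(path)
    if match is None:
        return 404, ""  # technical failure: no such resource path
    raw_id = match.group(1)
    if re.fullmatch(r"[0-9]+", raw_id) is None:
        # Domain failure: the ID is not a number.
        body = {"result": False, "errorMessage": f"Invalid employee ID {raw_id}"}
    elif int(raw_id) not in EMPLOYEES:
        # Domain failure: no record for this ID.
        body = {"result": False, "errorMessage": f"No employee found for ID {raw_id}"}
    else:
        body = {"result": True, "payload": EMPLOYEES[int(raw_id)]}
    return 200, json.dumps(body)
```

Only the routing step can produce a non-2xx status here. Everything that happens after the route matches is reported inside the payload.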

Conclusion

This approach throws ambiguity straight out of the window. I can now immediately differentiate between a failure in the technical layer and a failure in the business domain.

My API is clean, easy to understand and easy to debug. A client no longer needs to send me a request to ask for clarity on an endpoint that sometimes returns a 200 and other times returns a 404.