Panic. AbEnd (or “abend”). GPF. Blue Screen o’ Death.
There are many names for it but I’m certain you’ve seen one of those situations where your computer throws a fit, gives up, and basically checks out on you. In the world of web programming this is essentially what a “500 error” is from an application’s standpoint. I’d like to take a moment to explain what it is and why you should never send it.
Simplified Version
For those of you who are in the “TL;DR camp”, let me offer the cheat sheet version…
What are they?
Concept | Level | Definition |
---|---|---|
Status | Request Context | Overall status and reliability of the request and response. |
Exception | Method Scope | Returned result does not match the shape of the method result. |
Error | Local Scope | Problem or event usually preventing processing. |
When to return them?
Concept | Level | Use When |
---|---|---|
Status | Request Context | Sent to the client on every response. |
Exception | Method Scope | Handled by the calling method and translated. Never sent back. |
Error | Local Scope | May be the cause of an exception. Only visible in the exception. |
Which status code set to use?
Qualifier | Status Code Range |
---|---|
Is your service working the way it is supposed to? | 2xx |
Did the client mess up what they were supposed to send? | 4xx |
Is your process or server dead and/or no longer reliable? | 5xx |
Clarification
We all do it. For some reason, certain words are used interchangeably which are actually not interchangeable. So, to help clarify the mystery, let’s address those items up front.
Context vs Scope
In short, scope is the more granular of the two while context is more encompassing. Both have various modifiers like “global scope” vs “local scope” or “data context” vs “object context”. So, while both can be broken down even further, we need a starting point to work from without repeating the first month of a CompSci program.
Contexts (Request Contexts)
Whether you’re discussing the aging phrase “nTier” or the newfangled “microservices” buzzword, all multi-tier solutions do the same thing. They all receive requests from one app, service, tier, or layer and then make calls to other apps, services, tiers, or layers. Since I’m speaking primarily to web developers, our “context” will be referring to the request context of a typical multi-tier solution.
When a browser-based app calls a webservice, that browser-based app is the “client” and our back-end service is the “server.” The relationship of these two items, working together, for that moment in time is a context. However, the moment that back-end service calls out to a different webservice, the webservice initiating that second call becomes the “client” and the upstream webservice is considered the “server”. And, following the convention we discussed with the first client-server relationship, this second client-server pair is also a context. While both calls end up being chained together, each is performing some type of logic in its own little world where that snippet of logic is only really valid for those few nanoseconds.
Let us take a peek at this scenario using some awe-inspiring shapes and colors…
In this example, we show two separate contexts. The first context (in green) begins when the `doIt()` function is called from within the user interface. That function makes a call to the back-end webservice’s `/api/do-it-now` route and is not complete until it receives a response and returns it to the operation’s caller. In this example the server has determined it requires data and invokes a `search()` operation to call an upstream data service to fetch the needed information. Although one may depend on the other, that second context (in purple) is completely separate and detached from the first. Because they are separate, the status codes and error numbers are only considered valid and logical within each context.
Status Codes
Before we get into the dreaded `500` code, let us imagine a more easily understood code for a moment… the beloved `401 - Unauthorized` status. Receiving this status from a webservice can only mean one thing: you did not have permission to call that service. Or, as the error name specifies, you were “unauthorized” to make your request. Assume, for a moment, that the second context (the one in purple) received a `401 - Unauthorized` from the upstream webservice when fetching data. While this status code may make complete sense, should probably be logged, and is something we can troubleshoot, we would never send it directly back to the user interface as this would be a lie. Think about it. If we were to tell the user interface application that it was “unauthorized” we are saying that the UI application did not have permission to call the business service at all. And, since we happily received its request via the `/api/do-it-now` route, and began processing said request, that is clearly not true.
What we do send back will need to make sense within the first context and will depend entirely on whether the `/api/do-it-now` function can still proceed, that is, on how critical the upstream data is to processing that initial request. If the data was absolutely essential, and we cannot continue processing at all, then we need to explain why that specific call failed. Since we obviously expected the business service to be configured with the correct credentials for calling the upstream data service, some better status codes may be either a `417 - Expectation Failed` or maybe even a `412 - Precondition Failed`.
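As a rough sketch of that translation step, the function below maps a failed upstream call to a status that makes sense in the first context. The name `translateUpstreamFailure`, the `essential` flag, and the message strings are all hypothetical, and the choice of `417` simply follows the suggestion above rather than being the only reasonable answer.

```javascript
// Hypothetical helper: decide what the business service tells ITS caller
// after the upstream data service rejected the call made by search().
function translateUpstreamFailure(upstreamStatus, { essential }) {
  // The upstream status only has meaning in the second context, so it is
  // logged here and never forwarded as-is.
  console.warn(`upstream data service responded with ${upstreamStatus}`);

  if (!essential) {
    // Processing can continue without the data, so the first context still succeeds.
    return { status: 200, message: 'Completed without optional data.' };
  }

  // The data was essential: explain the failure in terms of the first context.
  return {
    status: 417,
    message: 'The service could not retrieve data it requires to complete this request.',
  };
}
```

Calling `translateUpstreamFailure(401, { essential: true })` yields a `417` for the user interface, while the original `401` stays in the logs of the second context where it belongs.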
Scope
Rounding out this first comparison is the concept of scope. In the diagram above, the internal functionality of functions like `doIt()` or `search()` happens within that method’s scope (often referred to as “local scope” for each method). Similarly, if the `doIt()` method relied upon a `BUS_SERVICE_URL` variable that other methods share, more than likely, this variable would be set in the global scope. Regardless, just keep in mind that scope relates to the smaller internal functionality within those services or libraries.
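A minimal sketch of the two scopes, assuming a made-up value for `BUS_SERVICE_URL` and invented function bodies (the real `doIt()` and `search()` would make network calls):

```javascript
// Global scope: visible to every function in this module.
// The URL is a placeholder, not a real endpoint.
const BUS_SERVICE_URL = 'https://business.example.com';

function doIt() {
  // Local scope: `route` and `target` exist only while doIt() is running.
  const route = '/api/do-it-now';
  const target = BUS_SERVICE_URL + route;
  return target;
}

function search() {
  // A different local scope: it can read BUS_SERVICE_URL but cannot see `route`.
  return typeof route === 'undefined' ? 'route is not visible here' : route;
}
```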
Errors vs Exceptions
The only real similarity errors and exceptions share is that they appear at the lower scope level. Aside from that, they are completely different and not interchangeable. The best way to think of them is that errors are bad and exceptions are not (at least, that’s how they’re intended). Exceptions are used to provide an intelligent response which does not match the documented shape of a method’s response and is used within the caller’s local scope. Errors, on the other hand, mean that bad things are happening and you need to take cover.
Exceptions
Let’s look at the following snippet…
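const stepA = (value) => {
  // ...
  if (result === 'redrum') {
    // Signals that this step could not produce its normal result.
    throw new Error('No wire hangers!');
  }
  // ...
};
const stepB = (value) => {
  // ...
};
const stepC = (value) => {
  // ...
};
export const doSomething = (value) => {
  let result = null;
  try {
    result = stepA(value);
    result = stepB(result);
    // ...
  } catch (ex) {
    // Handled here: logged with its error number and never passed to the caller.
    logger.info(90120, ex.message);
  } finally {
    if (result !== null) {
      // Finish with the last step (the original called doSomething() here,
      // which would recurse; stepC is the step that was otherwise unused).
      result = stepC(result);
    } else {
      return false;
    }
  }
  return true;
};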
In the example above, the public `doSomething()` function calls several smaller methods and returns a final boolean indicator to the caller. It does this even when an exception is thrown. The `throw` is used as a way to indicate processing could not continue in that one step. It does not convey the underlying system is unstable or malfunctioning. Exceptions are simply a means of returning a synchronous result to a caller which does not match the documented and expected result. So, if the caller is about to receive something other than the normal result, then go for it! Throw it, baby!
Playing Catch
Exceptions can be either handled or unhandled. This distinction exists for a reason. Many third-party components throw exceptions excessively while others should probably leverage this functionality more. The authors of those components are throwing exceptions to communicate with you and expect you to catch and handle the scenario. Regardless of whether an exception comes from a third-party component or your own code, it is the responsibility of the parent function to account for it, handle it, and craft a meaningful response that makes sense within the calling operation’s context.
The HTTP client Axios, for example, throws an exception by default any time it receives a status outside of the 200 range. The authors of Axios expect these exceptions to be understood and handled. It is assumed that processing will either continue normally or, should processing need to stop, that the parent operation will return a translated and intelligent message.
Every single exception should be handled. An unhandled exception, on the other hand, means something completely catastrophic has happened and we need to get all hands on deck. However, even if this were the case, in my mind, an unhandled exception would only ever be seen once since, once we know it can happen, we will add code to ensure we gracefully recover from it in the future.
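Here is one way that catch-and-translate responsibility might look. This is only a sketch: `httpGet` is a hypothetical stand-in for a client call such as `axios.get`, assumed to reject with an error carrying `response.status` for non-2xx replies, and the `SearchFailed` class, the `/data` path, and the decision to treat `404` as “no results” are all invented for illustration.

```javascript
// Hypothetical exception type that is meaningful to callers of search().
class SearchFailed extends Error {
  constructor(reason) {
    super(`Search could not be completed: ${reason}`);
    this.name = 'SearchFailed';
    this.reason = reason;
  }
}

// `httpGet` stands in for an HTTP client call; it is assumed to reject with
// an error carrying `response.status` when the reply is outside the 200 range.
async function search(httpGet, term) {
  try {
    const response = await httpGet(`/data?q=${encodeURIComponent(term)}`);
    return response.data;
  } catch (ex) {
    const status = ex.response ? ex.response.status : null;
    if (status === 404) {
      // Nothing matched: not a failure for our caller, so continue normally.
      return [];
    }
    // Anything else is translated into terms our caller understands.
    const reason = status === null
      ? 'no response from data service'
      : 'data service rejected the request';
    throw new SearchFailed(reason);
  }
}
```

The caller of `search()` never sees the upstream status. It either gets a result or a `SearchFailed` exception that it, in turn, is expected to handle.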
Errors
Unlike exceptions, errors are generally bad and usually indicate a critical situation. For example, your database may be up and running but return an error when trying to execute a basic query. Or you may see the term **ERROR** in place of a result set. In either scenario, you know that something bad happened under the hood.
Error Numbers
Since we’ve already discussed status codes specifically, I guess it’s only fair to spell out what an “error number” is and when it is generally used. In short, error numbers are commonly used by developers to document exactly where in their source code a situation occurred (usually an undesirable event) and are rarely put in front of the end user (or, at least, not in a very prominent manner). Quality error numbers will be globally unique and only appear in one specific scenario. However, keep in mind that they are attached to errors. And, since errors are generally never displayed, neither will the number be (again, unless it’s shown very discreetly and [hopefully] with a plain-English explanation so as not to cause an end user to panic).
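To make that concrete, here is a small sketch of an error that carries its own number. The `NumberedError` class, both helper functions, and the wording of the messages are hypothetical; the number `90120` is borrowed from the logging call in the earlier snippet.

```javascript
// Hypothetical error type carrying a unique number that points at one place in the code.
class NumberedError extends Error {
  constructor(errorNumber, message) {
    super(message);
    this.name = 'NumberedError';
    this.errorNumber = errorNumber;
  }
}

// What developers see: full detail for the logs.
function toLogLine(err) {
  return `[${err.errorNumber}] ${err.message}`;
}

// What the end user sees: plain English, with the number tucked away discreetly.
function toUserMessage(err) {
  return `Sorry, we could not save your changes. (Ref: ${err.errorNumber})`;
}
```

Given `new NumberedError(90120, 'INSERT failed: connection reset')`, the log line keeps the technical detail while the user message exposes only the reference number.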
Quick recap…
Before we go on, let’s ensure we are all on the same page …
- Errors may pop up during normal operation and may be the thing that causes an exception to be thrown. In most situations they are bad;
- Exceptions are natural and should be used in local scope to indicate when a method call will not receive its expected result shape. They exist for you to use;
- All exceptions are handled gracefully within the scope where they occurred and are translated to something meaningful to the rest of the application. They are generally not passed along; and,
- Status codes have nothing to do with errors or exceptions and, instead, convey the reliability of the response to the caller.
The 500 Family
First up: Mr. 500!
His official description says it all: “The server encountered an unexpected condition which prevented it from fulfilling the request.” Basically, it’s the web server’s way of saying, “Hey, you know all of that time you spent accounting for all of the bad things that could possibly happen? Well, this is something totally new that you never thought would happen. You should probably plan for a long day of troubleshooting.”
Consider the following event handler commonly used in a middleware pattern:
app.use(function (err, req, res, next) {
  console.error(err.stack)
  res.status(err.statusCode || 500).send(err.message || 'Something broke!')
})
As the example shows, both a `statusCode` and a `message` property are expected to be passed in. As the authors of our code we are the experts. Likewise, we have taken the time to understand any third-party or external components we may be leveraging. We therefore have an intrinsic opportunity to return an intelligent message in any scenario. This pattern assumes we have leveraged those opportunities but also continues processing should something happen which we never expected. In that unlikely scenario, a `500` status code is sent with the intention of communicating a completely unpredictable event.
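The sketch below exercises that handler with one anticipated condition and one surprise. The `HttpError` class and the `fakeResponse()` stand-in for the Express response object are hypothetical helpers added only so the handler can be run on its own.

```javascript
// Hypothetical error type that carries the status we intend to send.
class HttpError extends Error {
  constructor(statusCode, message) {
    super(message);
    this.statusCode = statusCode;
  }
}

// Same logic as the middleware above, named so it can be called directly.
function errorHandler(err, req, res, next) {
  console.error(err.stack);
  res.status(err.statusCode || 500).send(err.message || 'Something broke!');
}

// Minimal stand-in for the response object, for illustration only.
function fakeResponse() {
  return {
    statusCode: null,
    body: null,
    status(code) { this.statusCode = code; return this; },
    send(body) { this.body = body; return this; },
  };
}

// An anticipated condition: we chose the status and the message.
const planned = fakeResponse();
errorHandler(new HttpError(404, 'No such widget.'), {}, planned, () => {});

// Something we never planned for: no statusCode, so the fallback applies.
const unplanned = fakeResponse();
errorHandler(new Error(), {}, unplanned, () => {});
```

Only the second call ends up as a `500`, which is the point: that status is what is left over when none of our own translations applied.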
The Rest of the 500 Clan
In general, the 500 series of status codes is meant to convey that something about our server is either not healthy or is broken. Some of these conditions may be temporary (like the non-standard `509 - Bandwidth Exceeded`) and may resolve themselves at some point in time. Personally, I use the `501 - Not Implemented` status regularly when adding a placeholder that I don’t expect to be called (kinda like a “to do” note), or when removing logic from an older application. However, what is key here is that the status codes from `501` onward are also used intentionally when we need to intelligently communicate a condition about our server.
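A placeholder of that kind can be as small as the Express-style handler below. The function name and the message text are my own choices for illustration.

```javascript
// Express-style handler for a route that exists only as a placeholder.
function notImplemented(req, res) {
  res.status(501).send('This operation is not implemented.');
}
```

It could be registered with something like `app.post('/api/do-it-later', notImplemented)`, where the route is a made-up example.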
Opinions are like…
As with anything on this blog, please remember that these are my opinions. This is the way I like to work. And, yes, I realize I’m a bit anal at times (thanks, Mike, for pointing that out… again). After all, that is software development… a never-ending trail of opinions. Granted, our code actually does something in the end. However, along the way, we need to appease the opinions or feelings of an end user or a business owner or, yes, even other teammates. It doesn’t mean any one of them is “wrong” or “right.”