Vorner's random stuff

My private take on error handling in Rust

I’ve had a note in my to-do list to write down some of my own thoughts about error handling in Rust for quite some time and mostly got used to it sitting there. Nevertheless, a Twitter discussion brought it back to my attention, since I wanted to explain them and, honestly, Twitter is just not the right medium for explaining design decisions, with its incredibly limited space and impossible-to-follow threading model.

Anyway, this is a bit of a brain dump that’s not very sorted. It contains how I do error handling in Rust, why I do it that way and what I could wish for. Nevertheless, my general view is that error handling is mostly fine ‒ it could use some polishing, but hey, what couldn’t.

And of course, the way I do error handling doesn’t necessarily mean it’s the way you need to be doing it too; this is based on personal preferences as much as on technical reasons. You’re free to think I’m doing it all wrong :-).

Language syntax

I know this is a somewhat contentious topic. But I’m a strong opponent of adding more specialized syntax for error handling specifically. Currently, error handling is done through the Result type. It’s just a type, has some methods, implements some traits and it composes well. You can have Vec<Result<(), Error>> or even monsters like:

HashMap<String, Box<dyn FnMut() -> Box<dyn Future<Item = Result<Option<u32>, Error>>>>>

(that would be a registry of asynchronous handlers of commands, each promising to eventually maybe return a u32, but being able to fail; and I probably put too few or too many >s there, sorry if you get a headache from an unclosed delimiter)
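To make the "it composes" point a bit more concrete, here is a minimal sketch using only std. The function names (parse_each, parse_all, sum_valid) are made up for illustration; the point is that because Result is an ordinary type, it can sit inside a Vec, be collected into a single Result, or be filtered like any other value.

```rust
use std::num::ParseIntError;

/// Parse every input, keeping one `Result` per element.
fn parse_each(inputs: &[&str]) -> Vec<Result<u32, ParseIntError>> {
    inputs.iter().map(|s| s.parse::<u32>()).collect()
}

/// Parse every input, but stop at the first failure.
/// `collect` can turn an iterator of `Result`s into a `Result` of a `Vec`.
fn parse_all(inputs: &[&str]) -> Result<Vec<u32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<u32>()).collect()
}

/// Sum only the inputs that parsed, dropping the failures.
fn sum_valid(inputs: &[&str]) -> u32 {
    parse_each(inputs).into_iter().filter_map(Result::ok).sum()
}
```

None of this needs special language support; it is the same iterator machinery that works for any other type.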

Any new syntax like fn x() -> u32 throws Error makes it take longer to grasp that this is really a Result (with useful methods on it and able to be stored in a Vec), without an obvious (to me) advantage. Furthermore, it promotes error handling into some special place in the language ‒ you could no longer write your own fully-featured Result, making std more privileged. And it opens the door further to „Should Option also have a special syntax sugar, so you could write fn x() -> maybe u32 and should it compose to fn x() -> maybe u32 throws Error? What about fn x() -> maybe u32 maybe throws Error? Should we have locked String instead of Mutex<String>?“

That’s my two cents on this, but I really don’t want to dive into it more.

So, if anything were to be added to the language to help with error handling, I believe it should be of general use and in line with expressing a lot with types instead of special keywords.

Some time ago I saw an idea (I believe by Withoutboats, but I might be mistaken) that error handling would really get better if Rust handled dynamic dispatch & downcasting in some nicer way. I kind of agree on that front. Let’s see below.

Open vs. closed error types

We have these leaf error types that describe one specific error:

/// We failed to synchronize stuff with the backend.
#[derive(Copy, Clone, Debug)]
struct SynchronizationError;

// Some more boilerplate here...
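For reference, the boilerplate hinted at above could look roughly like this when written by hand. This is only a sketch; the display message is just an example string.

```rust
use std::error::Error;
use std::fmt;

/// We failed to synchronize stuff with the backend.
#[derive(Copy, Clone, Debug)]
struct SynchronizationError;

// `Display` is what ends up in logs and user-facing messages.
impl fmt::Display for SynchronizationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Failed to synchronize something with backend")
    }
}

// All methods of `Error` have defaults; a leaf error has no `source`,
// so an empty impl is enough.
impl Error for SynchronizationError {}
```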

That’s nice, but what if our function can fail in multiple different ways? There are two general approaches to that.

The closed error type applies when we know all the ways the function can fail. Something along the lines of:

#[derive(Clone, Debug)]
#[non_exhaustive] // Make sure we can add more variants in future API versions
enum PossibleError {
    SyncError(SynchronizationError),
    OutOfCheeseError(MouseError),
    ...
}

// Some more boilerplate here...

Well, one could hope for somewhat less boilerplate (that I’ve excluded here) ‒ and there are crates for that. One could also hope for some way to just list the damn errors in-line instead of having to create the whole enum out of band manually, but that comes with a whole new can of worms (like creating unnameable types, which are harder to work with on the caller side) and this isn’t really that bad anyway. And working with these errors is quite nice, Rust really likes enums:

match fallible_function() {
    Ok(result) => println!("Cool, we have a {}", result),
    Err(PossibleError::SyncError(e)) => error!("{}", e),
    Err(PossibleError::OutOfCheeseError(mouse)) => error!("{}: Squeak!", mouse),
    // Catch-all for variants added later (the enum is #[non_exhaustive])
    Err(_) => error!("Unknown error"),
}
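Putting the closed case together, a self-contained sketch might look as follows. The helper functions (synchronize, feed_mouse, describe), the parameters of fallible_function and the messages are invented for illustration, and describe returns a String instead of logging so that the snippet needs nothing beyond std.

```rust
use std::error::Error;
use std::fmt;

#[derive(Copy, Clone, Debug)]
struct SynchronizationError;
#[derive(Copy, Clone, Debug)]
struct MouseError;

impl fmt::Display for SynchronizationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "synchronization failed")
    }
}
impl Error for SynchronizationError {}

impl fmt::Display for MouseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "out of cheese")
    }
}
impl Error for MouseError {}

#[derive(Clone, Debug)]
#[non_exhaustive]
enum PossibleError {
    SyncError(SynchronizationError),
    OutOfCheeseError(MouseError),
}

impl fmt::Display for PossibleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PossibleError::SyncError(_) => write!(f, "backend problem"),
            PossibleError::OutOfCheeseError(_) => write!(f, "mouse problem"),
        }
    }
}

impl Error for PossibleError {
    // Expose the wrapped leaf error as the cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            PossibleError::SyncError(e) => Some(e),
            PossibleError::OutOfCheeseError(e) => Some(e),
        }
    }
}

// The `From` impls are what let `?` convert leaf errors automatically.
impl From<SynchronizationError> for PossibleError {
    fn from(e: SynchronizationError) -> Self {
        PossibleError::SyncError(e)
    }
}
impl From<MouseError> for PossibleError {
    fn from(e: MouseError) -> Self {
        PossibleError::OutOfCheeseError(e)
    }
}

fn synchronize(backend_up: bool) -> Result<(), SynchronizationError> {
    if backend_up { Ok(()) } else { Err(SynchronizationError) }
}

fn feed_mouse(cheese: u32) -> Result<u32, MouseError> {
    if cheese == 0 { Err(MouseError) } else { Ok(cheese - 1) }
}

fn fallible_function(backend_up: bool, cheese: u32) -> Result<u32, PossibleError> {
    synchronize(backend_up)?;
    Ok(feed_mouse(cheese)?)
}

fn describe(result: Result<u32, PossibleError>) -> String {
    // Inside the defining crate the match may be exhaustive; callers in other
    // crates would need a wildcard arm because of `#[non_exhaustive]`.
    match result {
        Ok(value) => format!("Cool, we have a {}", value),
        Err(PossibleError::SyncError(e)) => format!("{}", e),
        Err(PossibleError::OutOfCheeseError(mouse)) => format!("{}: Squeak!", mouse),
    }
}
```

Most of the length here is the Display, Error and From impls, which is exactly the part the derive crates take over.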

But let’s say we don’t really know all the ways a function can fail, either because we are lazy slackers that can’t be bothered to track it down and we don’t really care (speed of development is a valid reason), or because somewhere in there there’s a user-provided callback that can also fail for whatever reason our caller likes, so we can’t really limit them to our own preset of error types. That’s the open case.

So let’s have something like Box<dyn Error + Send + Sync> (some people prefer to wrap that up into another type, but the high-level idea is the same). If we want to just log the error and terminate (either the application, or one request, or whatever), it’s fine. This thing can be printed, because it implements Display. All well-behaved errors do.
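As a small sketch of the open case: the function below accepts a user-provided callback that may fail with any error it likes, and the caller just prints whatever comes out. The names (AnyError, parse_port, report) are hypothetical.

```rust
use std::error::Error;

type AnyError = Box<dyn Error + Send + Sync>;

/// Parse a port number and run a user-supplied validator on it.
/// Neither failure needs to be known to us in advance.
fn parse_port<F>(text: &str, validate: F) -> Result<u16, AnyError>
where
    F: Fn(u16) -> Result<(), AnyError>,
{
    // `?` boxes the ParseIntError into AnyError for us.
    let port: u16 = text.trim().parse()?;
    validate(port)?;
    Ok(port)
}

/// Log-and-move-on style handling: all we rely on is `Display`.
fn report(result: Result<u16, AnyError>) -> String {
    match result {
        Ok(port) => format!("port {}", port),
        Err(e) => format!("error: {}", e),
    }
}
```

A callback can return something as simple as Err("privileged port".into()), since std provides a conversion from &str into the boxed error.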

But what if we want to check if it happens to be one of the specific error types we can somehow handle? If our cache fails to load, that sucks, but we can recover and regenerate it. Now we do something like:

if let Some(e) = error.downcast_ref::<CacheError>() {
    ...
} else if let Some(e) = error.downcast_ref::<OtherError>() {
    // This is getting tedious
}
// And this doesn't really work, does it?
// else if let Some(e) = error.downcast_ref::<Some|More|Errors>() {

Note that this is not a problem of just error handling. Any time we get a dyn Something, it’s kind of painful. I mean, one should generally not downcast things in a perfect world, but one of the valid reasons to use Rust is because the situation is not perfect and one has to do things that generally should not be done. So, why make it painful? With a very tentative syntax, this would make it much nicer:

match e {
    e@dyn CacheError => { ... }
    e@dyn OtherError => { ... }
    e@(dyn Some | dyn More | dyn Errors) => { ... }
}

And yes, this syntax probably can’t be used because it collides with something that’s valid today and means something entirely different. I want to demonstrate the idea, not the exact syntax.
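For comparison, this is what works today, written out as a runnable sketch of the downcast chain from earlier. CacheError and OtherError are stand-in types with made-up messages, and handle returns a String so the outcome is easy to check.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct CacheError;
#[derive(Debug)]
struct OtherError;

impl fmt::Display for CacheError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "cache failed to load")
    }
}
impl Error for CacheError {}

impl fmt::Display for OtherError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "other failure")
    }
}
impl Error for OtherError {}

type AnyError = Box<dyn Error + Send + Sync>;

/// Try the error types we know how to handle, one downcast at a time.
fn handle(error: &AnyError) -> String {
    if let Some(e) = error.downcast_ref::<CacheError>() {
        format!("recoverable ({}), regenerating", e)
    } else if let Some(e) = error.downcast_ref::<OtherError>() {
        format!("handled: {}", e)
    } else {
        // Anything else can still be printed, just not inspected.
        format!("unhandled: {}", error)
    }
}
```

It works, but every additional type is another branch, and there is no way to group several types into one arm.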

Some history: the failure and spirit crates

Finally, let’s move from the syntax part (which I believe is OK) to the library part, starting with a bit of historical context.

I’m the author of the spirit family of… let’s call it configuration manager helpers. It takes care of loading and reloading configuration and setting up parts of an application. In that area lives a lot of error handling.

So where does that stand?

At that time, the failure crate appeared and it was the perfect tool for the job, because:

All in all, I believe failure was a great success in the sense that it showed a way forward. Nevertheless, it has a bunch of drawbacks. Specifically:

Evolution

After failure got more popular than expected and people discovered that the reasons why it didn’t use std’s Error trait could be fixed, they started to discuss the ways forward ‒ including a std-compatible failure-0.2, extending the trait in std, etc. And when the Rust community starts discussing something, it is a very thorough discussion. Which is good because the result is eventually great. But it takes ages. When I need or want something, I need it right now, not eventually. And I needed to move forward with my error handling ‒ I wanted to stop using failure for spirit.

But I didn’t want to tie myself to one specific library again, both because everything was (and is) in flux and the landscape can change, and because I no longer wanted to force anything specific onto my users.

Fortunately, someone did the work and extracted the derive part of failure and modified it to work with the Error trait ‒ and err-derive was born. It saves the boilerplate for leaf error types and for the closed-enum error case:

#[derive(Copy, Clone, Debug, Error)]
#[error(display = "Failed to synchronize something with backend")]
struct SynchronizationError;

// No more boilerplate!!

(I’ve been pointed to thiserror, which seems to be another implementation of the same thing)

But there was the other half of failure, so I went and wrote a very minimal err-context crate.

It provides:

However, I don’t really try to publicize err-context too much, or to develop it too much. It works, I use it, but I’m waiting for something more „official“ to appear eventually. Then I can just deprecate it, because the whole design is prepared for it getting replaced.
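To give a rough idea of what "adding context" on top of the std Error trait can mean, here is a std-only sketch. It is not the err-context API; WithContext, ResultExt, render_chain and load_port are hypothetical names, and the design is just one plausible way to layer a message over an underlying error through source().

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper: a higher-level message plus the error that caused it.
#[derive(Debug)]
struct WithContext {
    message: String,
    source: Box<dyn Error + Send + Sync>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        let inner: &(dyn Error + 'static) = &*self.source;
        Some(inner)
    }
}

/// Hypothetical extension trait adding `.context(..)` to any suitable `Result`.
trait ResultExt<T> {
    fn context(self, message: &str) -> Result<T, WithContext>;
}

impl<T, E> ResultExt<T> for Result<T, E>
where
    E: Into<Box<dyn Error + Send + Sync>>,
{
    fn context(self, message: &str) -> Result<T, WithContext> {
        self.map_err(|e| WithContext {
            message: message.to_owned(),
            source: e.into(),
        })
    }
}

/// Render an error followed by all of its causes, separated by ": ".
fn render_chain(error: &(dyn Error + 'static)) -> String {
    let mut out = error.to_string();
    let mut current = error.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}

fn load_port(text: &str) -> Result<u16, WithContext> {
    text.parse::<u16>()
        .context("failed to load port from configuration")
}
```

Because the wrapper only relies on the std trait, anything that understands Error and source() can consume it, which is what makes such a layer easy to swap out later.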

Why don’t I use XYZ?

Sometimes, I get asked why I wrote my own err-context instead of using something else. I believe generally one of these three reasons applies to whatever XYZ:

Things I miss

I’ve mentioned some of the dynamic matching above as a nice to have. I have some more things that I’d consider nice from the crates ecosystem:

Conclusion

Overall, I’m mostly happy with where error handling already is. Some improvements are possible and may make it nicer to use, but I guess it’s mostly a matter of time for one library to take the lead and win, followed by some time of polishing. There’s nothing I’m outright missing, nor any form of error handling that would be impossible.

Also, I don’t plan to be pulled into endless discussions about error handling. This is more of a report of what I prefer, not an attempt to start a flame war.