Error handling in Upspin

  • Pike is totally wrong here, and trying to explain his way out of a poorly thought out error type.

    > In contrast, a stack trace-like error is worse in both respects. The user does not have the context to understand the stack trace, and an implementer shown a stack trace is denied the information that could be presented if the server-side error was passed to the client.

    Absolutely not. The stack is useful to the implementer, the error message is useful to the user. You include both. Want it on both sides? Send it to both sides. Turning a stack trace into a string and propagating it around is not so expensive, and incredibly useful.

    Here's a common error I see in Go: "os: file does not exist". What file doesn't exist? Upspin fixes that oversight, but the very next thing to ask is who wants that file? Without the stack trace you'll never know why /tmp/dfgsdfg was needed and who couldn't find it.

    > For those cases where stack traces would be helpful, we allow the errors package to be built with the "debug" tag

    Ever have those times where a binary is acting wonky, and once in a blue moon raises an unusual error? You can't just rebuild the binary and deploy it to 10,000 machines to debug it. It needs to be on all the time.

    It is unbelievable the mental hoops the Go implementers jump through to explain how their anemic error type is actually better.
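
    As a minimal sketch of the "include both" idea above: a wrapper can keep the short message for the user and carry the stack as a plain string for the implementer. The names tracedError, withStack, stackOf, and loadConfig are hypothetical and not part of Upspin or the Go standard library; only runtime/debug.Stack, errors.As, and os.Open are real calls.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"os"
    	"runtime/debug"
    )

    // tracedError is a hypothetical wrapper: the message stays short for the
    // user, while the stack is kept as a string for the implementer.
    type tracedError struct {
    	err   error
    	stack string
    }

    func (e *tracedError) Error() string { return e.err.Error() }
    func (e *tracedError) Unwrap() error { return e.err }

    // withStack records the stack of the calling goroutine the first time an
    // error is wrapped. An error that already carries a stack is returned as is.
    func withStack(err error) error {
    	if err == nil {
    		return nil
    	}
    	var te *tracedError
    	if errors.As(err, &te) {
    		return err
    	}
    	return &tracedError{err: err, stack: string(debug.Stack())}
    }

    // stackOf returns the recorded stack, or "" if the error has none.
    func stackOf(err error) string {
    	var te *tracedError
    	if errors.As(err, &te) {
    		return te.stack
    	}
    	return ""
    }

    // loadConfig is a stand-in for whatever code wanted the file.
    func loadConfig(path string) error {
    	f, err := os.Open(path)
    	if err != nil {
    		return withStack(err)
    	}
    	return f.Close()
    }

    func main() {
    	err := loadConfig("/tmp/dfgsdfg")
    	fmt.Println("for the user:", err)
    	fmt.Println("for the implementer:")
    	fmt.Println(stackOf(err))
    }
    ```

    Because the stack is just a string, it could be sent across a process or network boundary alongside the message. Whether the capture cost is acceptable on every error path is a separate judgment call.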

  • Is there a name for this technique/pattern of using an argument's type to figure out which parameter it is, rather than using its position or a keyword? I've just started doing the same recently and it's quite ergonomic.

    Are any (currently existing) type systems able to encode this sort of thing such that it can be statically checked? IIUC from here[1] the compiler allows any type to be passed in. Sum types catch the error of passing the wrong type in, but don't do anything about passing the same type in multiple times. It's unclear whether the Upspin solution of last-write-wins is simply an easy default that falls out of how they process the arguments, or whether they actually use that property somewhere. I've made the equivalent case an error in my code, since it seems much more likely to indicate a logic error. I'd love to hear arguments for doing it the other way.

    [1] https://upspin.googlesource.com/upspin/+/master/errors/error...
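
    For illustration, here is a minimal sketch of the stricter variant described above, where a repeated argument type is an error instead of last-write-wins. It is not the Upspin implementation: Op, Kind, Error, and E are hypothetical stand-ins, and the check happens at run time because the compiler still accepts any value for an interface{} parameter.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // Hypothetical types for this sketch; not the Upspin errors package.
    type Op string
    type Kind int

    const (
    	Other Kind = iota
    	NotExist
    	Permission
    )

    type Error struct {
    	Op   Op
    	Kind Kind
    	Err  error
    }

    func (e *Error) Error() string {
    	return fmt.Sprintf("%s: kind %d: %v", e.Op, e.Kind, e.Err)
    }

    // E assigns each argument to a field based on its type. It rejects an
    // unsupported type and rejects any field that is supplied twice.
    func E(args ...interface{}) (*Error, error) {
    	e := &Error{}
    	seen := map[string]bool{}
    	for _, arg := range args {
    		var slot string
    		switch a := arg.(type) {
    		case Op:
    			slot, e.Op = "Op", a
    		case Kind:
    			slot, e.Kind = "Kind", a
    		case error:
    			slot, e.Err = "Err", a
    		default:
    			return nil, fmt.Errorf("E: unsupported argument type %T", arg)
    		}
    		if seen[slot] {
    			return nil, fmt.Errorf("E: %s supplied more than once", slot)
    		}
    		seen[slot] = true
    	}
    	return e, nil
    }

    func main() {
    	e, err := E(Op("open"), NotExist, errors.New("no such file"))
    	fmt.Println(e, err)

    	_, err = E(NotExist, Permission) // two Kinds
    	fmt.Println(err)

    	_, err = E(42) // a type E does not know about
    	fmt.Println(err)
    }
    ```

    The trade-off is that the constructor itself can now fail, so callers have to handle a second error or the constructor has to panic.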

  • It's good to see the Go authors spending more of their time implementing a system using Go.

    It should inform any deeper changes with Go 2, hopefully.

  • The error constructor errors.E is without a doubt the worst design for this type of code.

    It seems Rob Pike is looking for the simplicity of method overloading here, which Go (which Pike co-designed) explicitly doesn't provide [1]: "Method dispatch is simplified if it doesn't need to do type matching as well. Experience with other languages told us that having a variety of methods with the same name but different signatures was occasionally useful but that it could also be confusing and fragile in practice. Matching only by name and requiring consistency in the types was a major simplifying decision in Go's type system."

    So Pike turned to the catch-all interface{} type, which allows you to specify anything and everything as a parameter. A few quick questions. Without reaching for the source of the implementation of errors.E, what happens when: 1. You supply two or more Kinds as arguments? 2. Same as above for all other types (Op, Err, PathName, UserName)? 3. You supply a completely random object, string or integer as argument?

    Couple this with the fact that error-paths are often the least-tested code and you have a recipe for disaster.

    This is a hack. It might be quick to write but it is certainly not well-designed or elegant. I say that as a proponent of Go.

    [1] https://golang.org/doc/faq#overloading

  • This blog is worth a read even if you don't care about Upspin or Go. Some details are certainly specific to Upspin (and they're marked as such), but there are also patterns worth thinking about for programmers of any background.

    My favorite highlights:

    > it is critical that communications [between Upspin servers] preserve the structure of errors

    YES. This is something that's wildly underconsidered in most projects and indeed most programming languages. Errors need to be serializable... and deserializable, losslessly. Having errors which can serialize round-trip is a superpower. It unlocks the ability to have -- well, the blog gets to it later:

    > an operational trace, showing the path through the elements of the system, rather than as an execution trace, showing the path through the code. The distinction is vital.

    YESSS. A personal anecdote: I've been writing a collection of software recently which is broken up into several different executables. This is necessary and good in the system's design, because they have different failure domains (it's nice to send SIGINT to only one of them), and they also need to do some Linux voodoo that's process-level, yada yada. The salient detail is that in this system, there are typically three processes deep in a tree for any user action. That means I need to have errors from the third level down be reported clearly... across process boundaries.

    An operational trace is the correct model for this kind of system. Execution traces from any single process are relatively misguided.

    > The Kind field classifies the error as one of a set of standard conditions

    This is something I've seen emerge from many programmers lately, independently (and from a variety of languages)! There must be something good at the bottom of this idea if it's so emergent.

    The "Kind" idea is particularly useful and cross-cutting. Kinds are serializable -- trivially, and non-recursively, because they're a simple primitive type. And, per the guidelines of what would be useful in a polyglot world, they're also virtuous in that they're pretty much an enum, which can be represented in any language, with or without typesystem-level help.

    (I also have a library which builds on this concept; more about that later.)

    ---

    Now, some things I disagree with:

    Right after describing the importance of serializable errors, Pike goes on to mention:

    > we made Upspin's RPCs aware of these error types

    This is a colossal mistake. Or rather, it's a perfectly reasonable thing to do when developing a single project; but it severely limits the reusability of the technique, and also limits what we can learn from it.

    I'm a polyglot. I'd like to see a community -- with representatives from all languages -- focus on a simple definition of errors which can be reliably parsed and round-tripped to the wire and back in any language. For gophers, a common example should be Go and JavaScript: I would like my web front-end to understand my errors completely!

    I think this is eminently doable. Imagine a world where we could have the execution flow trace bounce between a whole series of microservices, losslessly! It's worth noting, however, that it would require an investment in keeping things simple; and in particular, controversially, not leaning on the language's type system too much, since it will not translate well if your program would like to interact with programs in other languages. Simplicity would be key here. Maps, arrays, and strings.
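
    To make "maps, arrays, and strings" concrete, here is a minimal sketch of an error that survives a JSON round trip and keeps an operational trace across components. The WireError type, its field names, and the encode/decode helpers are assumptions made up for this example; they are not the Upspin wire format and not the go-errcat API.

    ```go
    package main

    import (
    	"encoding/json"
    	"fmt"
    )

    // WireError is a hypothetical schema built only from strings, a map, and
    // a nested cause, so any language with a JSON parser can read it.
    type WireError struct {
    	Component string            `json:"component"`
    	Category  string            `json:"category"`
    	Message   string            `json:"message"`
    	Details   map[string]string `json:"details,omitempty"`
    	Cause     *WireError        `json:"cause,omitempty"`
    }

    func (e *WireError) Error() string {
    	msg := e.Component + ": " + e.Category + ": " + e.Message
    	if e.Cause != nil {
    		msg += ": " + e.Cause.Error()
    	}
    	return msg
    }

    // Trace lists the components the error passed through, outermost first.
    // This is the path through the system, not through the code.
    func (e *WireError) Trace() []string {
    	var hops []string
    	for cur := e; cur != nil; cur = cur.Cause {
    		hops = append(hops, cur.Component)
    	}
    	return hops
    }

    func encode(e *WireError) ([]byte, error) { return json.Marshal(e) }

    func decode(data []byte) (*WireError, error) {
    	var e WireError
    	if err := json.Unmarshal(data, &e); err != nil {
    		return nil, err
    	}
    	return &e, nil
    }

    func main() {
    	inner := &WireError{
    		Component: "storage",
    		Category:  "not-exist",
    		Message:   "block missing",
    		Details:   map[string]string{"path": "/tmp/dfgsdfg"},
    	}
    	outer := &WireError{
    		Component: "frontend",
    		Category:  "io",
    		Message:   "read failed",
    		Cause:     inner,
    	}

    	data, err := encode(outer)
    	if err != nil {
    		panic(err)
    	}
    	back, err := decode(data)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println(string(data))
    	fmt.Println(back.Trace())
    }
    ```

    The category is a plain string here, so a receiver in another language can switch on it without sharing any Go types.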

    ---

    Lastly, I said earlier I have a library. It's true. (I have two libraries, actually; one in which I tried to use the Go type system more heavily, and included stack traces -- both of which I've come to regard as Mistakes over time and practice; I'll not speak of that one.)

    Gophers might be interested in this, but also I'd love commentary from folk of other language backgrounds:

    https://godoc.org/github.com/polydawn/go-errcat

    This library is an experiment, but using it in several projects has been very pleasing so far. It has many of the same ideas as in this blog, particularly "categories" (which are a clear equivalent of "kinds" -- what's in a name?). Some ideas are taken slightly farther: namely, there is one schema for serializing these errors, and it very clearly coerces things into string types early in the program. The intention is to make sure your program logic has exactly as much information available to it as another program will have when deserializing this error later. Any more information being available in the original process would be a form of moral hazard, tempting you to write logic which would be impossible to replicate in an external process.

    The main question I want to explore with this library is: what's the appropriate scope for a kind/category? Does it map cleanly to packages? Whole programs? Something else entirely?

  • Oh, it looks more and more like regular exceptions (nested errors, details, kinds).