Description
Sometimes when a function fails, there is extra information that you have on hand that may help the caller respond to the problem or produce a diagnostic. For example, in the parseU64 example by andrewrk here,
const ParseError = error {
InvalidChar,
Overflow,
};
pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
it would be useful if the function could return the position of the invalid character so that the caller could produce a diagnostic message.
Because Zig treats error types specially, when using errors you get a bunch of nice features, such as ! error-set inference, try/catch, and errdefer; you currently lose these features if you want to return extra diagnostic information since that information is no longer an error type.
While something like index-of-bad-character is less useful for parsing an integer, getting "bad character" with no location when parsing a 2KiB JSON blob is very frustrating! -- this is the current state of the standard library's JSON parser.
There are currently two workarounds that let this extra information get out, neither of which is very ergonomic, and both of which work against Zig's error types:
Workaround 1: Return a tagged union
You could explicitly return a tagged union that has the extra information:
const ParseError = error {
Overflow,
};
const ParseResult = union(enum) {
Result: u64,
InvalidChar: usize,
};
pub fn parseU64(buf: []const u8, radix: u8) ParseError!ParseResult {
This is unfortunate in a number of ways. First, because InvalidChar is no longer an error, you cannot propagate/handle the failure with try/catch. Second, because the InvalidChar case is no longer an error, you cannot use errdefer to clean up partially constructed state in the parser. Finally, calling the function is made messy because it can fail in two separate ways -- either in the error union, or in the explicitly returned union. This means calls that distinguish different errors (as opposed to just propagating with try) need nested switches.
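To make the "two failure channels" point concrete, here is a minimal sketch of Workaround 1 as it can be written today. The body of parseU64 and the classify helper are hypothetical, written only for illustration, and the code assumes a reasonably recent Zig (0.11 or later, for the indexed for-loop syntax).

```zig
const std = @import("std");

const ParseError = error{Overflow};

const ParseResult = union(enum) {
    Result: u64,
    InvalidChar: usize,
};

// Hypothetical body: a bad character is reported through the union,
// while overflow is still reported through the error set.
pub fn parseU64(buf: []const u8, radix: u8) ParseError!ParseResult {
    var x: u64 = 0;
    for (buf, 0..) |c, i| {
        const digit = std.fmt.charToDigit(c, radix) catch
            return ParseResult{ .InvalidChar = i };
        x = try std.math.mul(u64, x, radix);
        x = try std.math.add(u64, x, digit);
    }
    return ParseResult{ .Result = x };
}

// The caller has to unwrap the error union first and the tagged union second.
fn classify(buf: []const u8) []const u8 {
    const result = parseU64(buf, 10) catch |err| switch (err) {
        error.Overflow => return "overflow",
    };
    return switch (result) {
        .Result => "ok",
        .InvalidChar => "invalid character",
    };
}

test "workaround 1 needs two levels of handling" {
    const r = try parseU64("12x4", 10);
    try std.testing.expectEqual(@as(usize, 2), r.InvalidChar);
    try std.testing.expectEqualStrings("overflow", classify("99999999999999999999999"));
    try std.testing.expectEqualStrings("ok", classify("1234"));
}
```

Note that `try parseU64(...)` succeeds even for the input "12x4": the invalid character is not an error as far as the language is concerned, so try, catch, and errdefer never see it.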
Workaround 2: Write to an out parameter
You could also leave the error set alone, and instead expand the contract of parseU64 to write to an out parameter whenever it returns an InvalidChar error:
pub fn parseU64(buf: []const u8, radix: u8, invalid_char_index: *usize) ParseError!u64 {
However, this makes the function's interface much messier: it now includes mutation, and it makes it impossible to indicate that it's being called in such a way that it cannot fail, since the pointer parameter is required (where previously a catch unreachable could handle it). Also, it won't be immediately obvious which out parameters are associated with which errors, especially if inferred error sets are being used. In particular, it gives library writers the opportunity to sometimes re-use out parameters (in order to prevent function signatures from growing out of hand) and sometimes not (they at least cannot when the types aren't the same).
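A minimal sketch of Workaround 2, again with a hypothetical function body and assuming Zig 0.11 or later. It shows both the successful recovery of the index and the awkwardness of having to supply a pointer even when the call cannot fail.

```zig
const std = @import("std");

const ParseError = error{ InvalidChar, Overflow };

// Hypothetical body: invalid_char_index is only written when
// error.InvalidChar is returned.
pub fn parseU64(buf: []const u8, radix: u8, invalid_char_index: *usize) ParseError!u64 {
    var x: u64 = 0;
    for (buf, 0..) |c, i| {
        const digit = std.fmt.charToDigit(c, radix) catch {
            invalid_char_index.* = i;
            return error.InvalidChar;
        };
        x = try std.math.mul(u64, x, radix);
        x = try std.math.add(u64, x, digit);
    }
    return x;
}

test "out parameter carries the index" {
    var bad_index: usize = undefined;
    try std.testing.expectError(error.InvalidChar, parseU64("12x4", 10, &bad_index));
    try std.testing.expectEqual(@as(usize, 2), bad_index);

    // Even a call that cannot fail must provide somewhere to write.
    var ignored: usize = undefined;
    const n = parseU64("1234", 10, &ignored) catch unreachable;
    try std.testing.expectEqual(@as(u64, 1234), n);
}
```

Nothing in the signature ties invalid_char_index to error.InvalidChar; that link lives only in the comment.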
Proposal: Associate each error with a type
EDIT: Scroll down to a comment for a refreshed proposal. It looks essentially the same as here but with a bit more detail. The primary difference is not associating errors with value types, but an error within a particular error-set with a type. This means no changes to the anyerror type are necessary.
I propose allowing a type to be associated with each error:
const ParseError = error {
InvalidChar: usize,
Overflow, // equivalent to `Overflow: void`
};
pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
......
if (digit >= radix) {
return error.InvalidChar(index);
}
......
The value returned would be available in switches:
if (parseU64(str, 10)) |number| {
......
} else |err| switch (err) {
error.Overflow => {
......
},
error.InvalidChar => |index| {
......
}
}
This allows a function which can fail in multiple ways to associate different value types with different kinds of failures, or just return some plain errors that worked how they did before.
With this proposal, the caller can use inferred error sets to automatically propagate extra information, and the callsite isn't made messy with extra out-parameters/an extra non-error failure handling switch. In addition, all of the features special to errors, like errdefer and try/catch, continue to work.
Errors in the global set would now be associated with a type, so that the same error name assigned two different types would be given different error numbers.
I'm not sure what happens when you have an error set with the same name twice with different types. This could possibly be a limited case where "overloading" a single name is OK, since instantiating an error is always zero-cost, but I'll ask what others think.
I'm fairly new to Zig, so some of the details may not be quite right, but hopefully the overall concept and proposal makes sense and isn't unfixably broken.
Activity
hryx commented on Jun 10, 2019
I see potential in that. A world where error sets are just regular unions, but given all the syntax-level amenities of today's errors.
Taking it further, perhaps all today's good stuff about errors could be applied to any type, not just unions. Maybe the error keyword "taints" a type as an error type. (Although, making errors non-unions would probably have too many effects on the language.)
Because you could now "bloat" an error set with types of larger size, this might affect how strongly use of the global error set is discouraged.
daurnimator commented on Jun 10, 2019
I remember seeing this proposed before but I can't find the issue for it. Maybe it was only on IRC?
andrewrk commented on Jun 10, 2019
Thank you @CurtisFenner for a well written proposal
shawnl commented on Jun 11, 2019
This is just a tagged union.
And as they seem so useful, maybe we can add anonymous structs, so we can just use tagged unions instead of multiple return values.
Don't worry about the optimizations here. The compiler can handle that.
ghost commented on Jun 11, 2019
There's a previous issue here #572 (just for the record)
emekoi commented on Jun 11, 2019
because errors are assigned a unique value, how about allowing for tagged unions to use errors as the tag value? this would avoid adding new syntax to language and making this feature consistent with other constructs in the language. this tangentially relies on #1945.
shawnl commented on Jun 12, 2019
Agreeing with @emekoi, I'd like some syntactic sugar for multiple arguments to an error switch, if the type is defined in the same tagged union:
CurtisFenner commented on Jun 13, 2019
I think what @emekoi suggested is excellent, as it removes the need for extra syntax and sidesteps the issues of increasing the size of anyerror and dealing with error names assigned different types, while still enabling the core idea here!
daurnimator commented on Jun 13, 2019
I assume this should be:
Otherwise I love the idea!
emekoi commented on Jun 16, 2019
that's what i wasn't sure about. would you still have to explicitly name the error even when using an inferred error set? or would you just use error as you normally would with an inferred error set?
ghost commented on Jul 20, 2019
Not a proposal, but something possible currently: here's a variation on OP's "Workaround 2" (the out parameter). A struct member instead of an "out" parameter. It's still not perfect, but this or Workaround 2 is still the most flexible as they make it possible to allocate memory for the error value (e.g. a formatted error message).
This might be a solution for std InStream and OutStream which currently have that annoying generic error parameter?
Also, for parsers and line numbers specifically, you don't need to include the line number in the error value itself. Just maintain it in a struct member and the caller can pull it out when catching. If these struct members aren't exclusive to failed states, then there's no smell at all here.
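A minimal sketch of the struct-member idea for line numbers, assuming a hypothetical Parser type that accepts only digits and newlines (neither the type nor its fields come from the standard library). The error value stays a plain error; the caller reads the position from the parser after catching.

```zig
const std = @import("std");

// Hypothetical parser: the position bookkeeping is ordinary state,
// not something that exists only on failure.
const Parser = struct {
    input: []const u8,
    pos: usize = 0,
    line: usize = 1,

    fn run(self: *Parser) error{InvalidChar}!void {
        while (self.pos < self.input.len) : (self.pos += 1) {
            const c = self.input[self.pos];
            if (c == '\n') {
                self.line += 1;
            } else if (!std.ascii.isDigit(c)) {
                // pos and line are left pointing at the offending byte.
                return error.InvalidChar;
            }
        }
    }
};

test "caller pulls the location out of the parser" {
    var p = Parser{ .input = "12\n34\n5x" };
    try std.testing.expectError(error.InvalidChar, p.run());
    try std.testing.expectEqual(@as(usize, 3), p.line);
    try std.testing.expectEqual(@as(usize, 7), p.pos);
}
```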
Tetralux commented on Jul 21, 2019
I like @emekoi's suggestion here, but I'll note that I'd like to be able to have parseU64 return !u64 and have the error type inferred, just as we do now, and still be able to do return error{ .InvalidIndex = index };.
ericlangedijk commented on Oct 22, 2024
I just started with Zig and as far as I can see now, the error handling is absolutely perfect and extremely elegant.
(I tried Rust for a while, where error handling is insanely complicated and hardly usable, in my opinion).
For more complicated error handling, where feedback (which can be a lot) is needed, maybe it is better to leave this to the application or library programmers to handle in a creative way.
CorruptedVor commented on Dec 8, 2024
Perhaps liberror can be a source of inspiration?
callee sets an error string, caller can decide to simply ignore it
it's one alternative to additionally returning an arbitrary type - just let the callee do the heavy lifting of setting errstr
plan9 uses error strings in its C library
andrewrk commented on Dec 8, 2024
Error codes are for control flow.
ayende commented on Dec 8, 2024
I agree that error codes are good for control flow, but it is super common to not care what the actual error was until much higher in the stack, and there is no idiomatic way to attach that state.
For example, consider this API:
Which will save the message in a buffer and write it once the buffer is full to a file with the current date as the name.
We run into an error saving the file, and we want to both show the user the error (error.FileAlreadyExists) and what the filename is.
There is no idiomatic way to do that, which is problematic.
marler8997 commented on Dec 8, 2024
I've discovered this pattern that I use when I both want to have error control flow (i.e. trigger errdefers) but would also like a "side-channel" for extra error information. Consider this example,
We want to add a side-channel to foo. The general pattern is to define a FooError type that will house this side-channel data; using it looks like this:
FooError gives you a place to answer questions about how this side-channel data should be managed. Does the data need to be freed? Add a deinit function. Do you want to print the error? Add a format function. Do you still want to support other zig error codes that don't populate this FooError side-channel? Reserve a special zig error code (probably error.Foo in this case) that indicates the side-channel is initialized. Here's a more comprehensive example showing the sorts of things you can do:
And here's an example of what foo and FooError can look like:
@ayende, in your example with the persist message there are some unanswered questions, such as where this filename with the date in it is stored? Are you using an allocator, or maybe it's in a global with a fixed max size? When you return the filename should the caller free it or not? We can use this pattern to answer those questions, and here's one way that could look:
Here's a couple real-world examples:
direct2d-zig: https://github.com/marler8997/direct2d-zig/blob/6c6597a3a80203ee144bd97efc269e33c0653864/ddui.zig#L134
zware: https://github.com/malcolmstill/zware/blob/3ad3f4e10bafba1d927847720aacad78f690cec6/src/error.zig
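A minimal sketch of the side-channel pattern described above; it is an illustration, not the commenter's original code. FooError, its bad_index field, and the body of foo are all hypothetical, and the code assumes Zig 0.11 or later. error.Foo is the reserved code meaning "the side-channel has been filled in".

```zig
const std = @import("std");

// Hypothetical side-channel type. It owns no memory here, so it needs no
// deinit; a version holding an allocated message would add one.
const FooError = struct {
    bad_index: usize = 0,
};

// Sums the decimal digits of input. On a non-digit it fills the side-channel
// and returns error.Foo, so errdefer and try in callers still work as usual.
fn foo(input: []const u8, err: *FooError) error{Foo}!u32 {
    var total: u32 = 0;
    for (input, 0..) |c, i| {
        if (!std.ascii.isDigit(c)) {
            err.* = .{ .bad_index = i };
            return error.Foo;
        }
        total += c - '0';
    }
    return total;
}

test "error.Foo means the side-channel is populated" {
    var err: FooError = .{};
    try std.testing.expectError(error.Foo, foo("12a", &err));
    try std.testing.expectEqual(@as(usize, 2), err.bad_index);
    try std.testing.expectEqual(@as(u32, 6), try foo("123", &err));
}
```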
ayende commented on Dec 8, 2024
I ended up doing something like that, sure. But the key point isn't that this is possible.
The issue is what would be the idiomatic manner to do this.
nektro commented on Dec 9, 2024
there isn't one and that's part of the freedom of Zig. unless you limit error return values to only be primitives then you have to worry about the allocation strategy of the error return and Zig is intentionally not prescriptive about that in its design.
jamii commented on Dec 10, 2024
I find myself writing a lot of this:
When I would have liked to write:
Is there a nicer way to compose error values with this pattern?
andrewrk commented on Dec 10, 2024
Good question, yes there is:
I find your example a little unrealistic, because I don't see why you would have 3 different error types. By making them the same, then you can introduce a fail helper function which has return type error{AlreadyReported}. Then error.AlreadyReported is used to indicate the error state has already been recorded somewhere and you can keep using try.
jamii commented on Dec 31, 2024
Because I'm calling three different apis that all use this error pattern and have different values associated with their errors. The values are actually useful too, so I don't want to just turn everything into a string error message.
In your examples you have AnalysisFail, InnerError, LinkFailure etc. If you have some zig code that you want to analyze, codegen and link, then what error value do you return? Is it always just an opaque string?
Vulpesx commented on Jan 8, 2025
Considering how often we work with errors and how useful it is to have detailed information on the errors, I don't think the fact that we can work around it is a valid reason to not make it a better experience
loganbnielsen commented on Feb 20, 2025
Maybe I missed it somewhere, but is there a technical reason why we seem opposed to enabling errors with payloads? Is it anti-the_thesis_of_zig in some way to provide that type of functionality?
ayende commented on Feb 20, 2025
The biggest reason why not is lifetime issues.
Consider:
In the case of BadToken, it is easy to handle, you just return a struct payload, simple.
What about UnknownIdentifier? If you have that, you may need to free the memory.
Now what happens?
You return a pointer to invalid memory, probably.
And you need to handle this somehow.
The problem here is that this just doesn't compose.