Compilers – Embedded in Academia

June 1, 2017

Pointer Overflow Checking is in LLVM

Production-grade memory safety for legacy C and C++ code has proven to be a frustratingly elusive goal: plenty of research solutions exist but none of them appear to be deployable as-is. So instead, we have a patchwork of partial solutions such as CFI, ASLR, stack canaries, hardened allocators, and NX.

Today’s quick post is about another piece of the puzzle that very recently landed in LLVM: pointer overflow checking. At the machine level a pointer overflow looks just like an unsigned integer overflow, but of course at the language level the overflowing operation is pointer arithmetic, not unsigned integer arithmetic. Keep in mind that in these languages, unsigned overflow is defined but signed overflow is undefined. Pointer overflow is a weak indicator of undefined behavior (UB): the stricter rule is that it is UB to create a pointer that lies before the start of an allocated object or more than one element past its end. It is UB merely to create such a pointer; it does not need to be dereferenced. Also, it is still UB even if the overflowed pointer happens to refer to some other allocated object.
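To make the machine-level view concrete, here is a minimal sketch in C that treats addresses as plain unsigned integers and asks whether adding a byte offset would wrap. The name `address_add_wraps` is hypothetical, and working on `uintptr_t` rather than on real pointers is an assumption made so that the sketch itself performs no undefined pointer arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reports whether base + offset_bytes would wrap
   around the top of the address space. Unsigned arithmetic is defined
   to wrap, so the comparison below is itself well defined. */
bool address_add_wraps(uintptr_t base, uintptr_t offset_bytes) {
  return base + offset_bytes < base;
}
```

A wrap detected this way only shows that the address computation overflowed; as noted above, a pointer can be invalid at the language level long before that happens.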

Here is the patch; it was originally developed by Will Dietz (who is doing his PhD at UIUC under Vikram Adve) and then pushed into the tree by Vedant Kumar (a compiler hacker at Apple). In 2013, Will wrote a great blog post about the patch. He showed lots of examples of pointer overflows in open source programs. Also see an earlier post of mine.

To see pointer overflow checking in action you’ll need to build a very recent Clang/LLVM (r304461 or later) from source, and then you can try out this stupid little program:

$ cat pointer-overflow.c
#include <stdio.h>
#include <stdint.h>

int main(void) {
  for (int i, *p = &i; ; p += 1000)
    printf("%p\n", p);
}
$ clang -O3 pointer-overflow.c -Wall -fsanitize=pointer-overflow -fsanitize-trap=pointer-overflow -m32
$ ./a.out
0xff8623c4
0xff863364
0xff864304
0xff8652a4
...
0xffffd804
0xffffe7a4
0xfffff744
Illegal instruction
$

Of course the result is much the same if the pointer is decremented in the loop, instead of incremented; it just takes longer to hit the overflow.

The transformation implemented by the compiler here is pretty straightforward. Here’s IR for the uninstrumented program (I cleaned it up a bit):

define i32 @main() {
entry:
  %i = alloca i32, align 4
  br label %for.cond

for.cond:
  %p.0 = phi i32* [ %i, %entry ], [ %add.ptr, %for.cond ]
  %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32* %p.0)
  %add.ptr = getelementptr inbounds i32, i32* %p.0, i32 1000
  br label %for.cond
}

To instrument the program, the last two instructions are changed into these three instructions (and also a trap basic block is added, which simply calls the LLVM trap intrinsic):

  %1 = icmp ult i32* %p.0, inttoptr (i32 -4000 to i32*)
  %add.ptr = getelementptr inbounds i32, i32* %p.0, i32 1000
  br i1 %1, label %for.cond, label %trap

The icmp checks whether the not-yet-incremented pointer is below 0xfffff060, in which case it can be incremented without overflowing.
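The comparison against a constant is the wraparound test with the subtraction folded at compile time. The following sketch, in C over 32-bit addresses held in `uint32_t` (an assumption that mirrors the `-m32` build), shows the two forms side by side. The names `step_wraps` and `step_is_safe` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1000 elements of a 4-byte int, as in the loop above. */
#define STEP_BYTES 4000u

/* Direct form: does p + STEP_BYTES wrap in 32-bit unsigned arithmetic? */
bool step_wraps(uint32_t p) {
  return (uint32_t)(p + STEP_BYTES) < p;
}

/* Folded form, as in the instrumented IR: compare against the constant
   2^32 - 4000, which is 0xfffff060 (the bit pattern of i32 -4000). */
bool step_is_safe(uint32_t p) {
  return p < (uint32_t)(0u - STEP_BYTES);
}
```

For every 32-bit value of `p`, `step_is_safe(p)` is the negation of `step_wraps(p)`. The last address printed in the run above, 0xfffff744, is past the threshold, which is why the next increment traps.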

Can pointer overflow checking be used as a mitigation in production code? This should be fine if you (as I did above) use the -fsanitize-trap=pointer-overflow flag to avoid dragging in any of the UBSan runtime library. But how efficient is it? I ran SPEC INT 2006 with and without pointer overflow checking. 400.perlbench actually contains pointer overflows so we’ll leave it out. Here are the raw scores with and without pointer overflow checking, and here are the increases in runtime due to pointer overflow checking, sorted from best to worst:

462.libquantum	-1%
429.mcf	5%
471.omnetpp	5%
403.gcc	9%
483.xalancbmk	12%
473.astar	27%
401.bzip2	34%
445.gobmk	50%
458.sjeng	79%
464.h264ref	113%
456.hmmer	119%

Keep in mind that this implementation is totally untuned (the patch landed just today). No doubt these scores could be improved by teaching LLVM to eliminate unnecessary overflow checks and, when that doesn’t work, to hoist checks out of inner loops.
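As a rough illustration of what hoisting a check out of a loop means, the sketch below compares a per-iteration wraparound check with a single check performed before the loop. It works on integer addresses, the function names are hypothetical, and it is not a description of anything LLVM currently does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-iteration form: check before every one of the n increments. */
bool walk_checked_each_step(uintptr_t base, uintptr_t stride, uintptr_t n) {
  uintptr_t p = base;
  for (uintptr_t k = 0; k < n; k++) {
    if (p > UINTPTR_MAX - stride)
      return false; /* this increment would wrap */
    p += stride;
  }
  return true;
}

/* Hoisted form: one check before the loop covers all n increments.
   It asks whether n * stride fits in the room left above base. */
bool walk_checked_once(uintptr_t base, uintptr_t stride, uintptr_t n) {
  return n == 0 || stride <= (UINTPTR_MAX - base) / n;
}
```

The two functions agree on every input, but the second does a constant amount of work. A real compiler can only do this when the trip count and stride are known before the loop runs.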

Although, in the example above, I enabled pointer overflow checking using an explicit flag, these checks are now part of UBSan and -fsanitize=undefined will enable them.

May 30, 2017

Compiler Optimizations are Awesome

This piece, which I hadn’t gotten around to writing until now since I thought it was all pretty obvious, explains why Daniel J. Bernstein’s talk, The death of optimizing compilers (audio), is wrong, and in fact compiler optimizations are extremely wonderful and aren’t going anywhere.

First, the thesis of the talk is that almost all code is either hot, and therefore worth optimizing by hand, or else cold, and therefore not worth optimizing at all (even with -O). Daniel Berlin, a compiler person at Google, has looked at the data and disagrees. We can also refute Bernstein’s argument from first principles: the kind of people who can effectively hand-optimize code are expensive and not incredibly plentiful. If an optimizing compiler can speed up code by, for example, 50%, then suddenly we need to optimize a lot less code by hand. Furthermore, hand-optimized code has higher ongoing maintenance costs than does portable source code; we’d like to avoid it when there’s a better way to meet our performance goals.

Second, size matters. Most of the computers in the world are embedded and many of these are storage-constrained. Compiler optimization reduces code size and this phenomenon is completely independent of the hot/cold issue. Without optimization we’d have to buy more expensive deeply-embedded processors that have more on-chip flash memory, and we’d also have to throw away many of those 16 GB phones that are cheap and plentiful and fairly useful today.

Third, most future software won’t be written in C and C++ but rather in higher-level languages, which more or less by definition rely on the optimizer to destroy abstraction layers, do compile-time memory management, etc.

Finally, I claim that the economics of compiler optimization are excellent. A lot of dollars are spent each year making code run faster, either by buying hardware resources or by paying programmers to write faster code. In contrast, there are probably a few thousand people actively doing compiler optimization work, and just about everyone benefits from this. If we can centralize on fewer compiler infrastructures, like GCC and LLVM and V8, then the economics get even better.

In summary, of course there’s plenty of hot code that wants to be optimized by hand, and of course there’s plenty of cold code that sees little benefit due to optimizing compilers. But neither of these facts forms an argument against optimizing compilers, which are amazingly useful and will continue to be for the indefinite future.

May 9, 2017

Translation Validation of Bounded Exhaustive Test Cases

This piece is jointly authored by Nuno Lopes and John Regehr.

Compilers should be correct, but it is not straightforward to formally verify a production-quality compiler implementation. It is just too difficult to recover the high-level algorithms by looking at an enormous mess of arithmetic, loops, and memory side effects. One solution is to write a new compiler such as CompCert that is designed to be verified. Alternatively, we keep our large, low-level code base such as GCC or LLVM and settle for weaker forms of validation than formal verification. This piece is about a new way to do the second thing. Our focus is the middle-end optimizers, which seem to be the most difficult part of a compiler to get right. The target is LLVM.

End-to-end compiler testing, supported by a random source code generator like Csmith, is great — but it only gets us so far. The expressiveness of the program generator is one limitation, but a more serious problem is the normalization that happens in the compiler frontend. The issue is that there are a lot of valid code patterns that Clang will never emit and that are therefore impossible to test by driving Clang. As a Clang user you may not happen to care about this, but as LLVM people we want the middle-end optimizations to be free of logic errors and also the non-Clang-emittable code is important in practice since there are lots of frontends out there besides Clang.

The first step is to generate lots of LLVM IR. Rather than creating a relatively small number of large functions, as Csmith would do, this IR generator generates lots of tiny functions: it uses bounded exhaustive test generation to create every LLVM function up to a certain size. A fun thing about this kind of generator is its choose() operator. In random mode, choose() returns a random number; in exhaustive mode, it uses fork() to explore all alternatives. While this isn’t the most efficient way to do search, leveraging the OS keeps the generator very simple. The most vexing thing about this design is allowing it to use multiple cores while stopping it from being a fork bomb. The current version doesn’t contain code that tries to do this.
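The exhaustive-mode choose() described above can be sketched in a few lines of POSIX C. This is an illustration of the idea, not the generator's actual code: the driver `count_leaves` is hypothetical, and waiting for each child before forking the next is a design choice made here to keep the sketch from becoming a fork bomb, at the cost of using only one core.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exhaustive-mode choose(): returns each value in [0, n) in a different
   process. The parent waits for each child before forking the next, so
   only one branch of the search runs at a time. */
static int choose(int n) {
  for (int i = 0; i < n - 1; i++) {
    pid_t pid = fork();
    if (pid < 0)
      _exit(1);            /* give up on this branch if fork fails */
    if (pid == 0)
      return i;            /* child explores alternative i */
    waitpid(pid, NULL, 0); /* serialize the search */
  }
  return n - 1;            /* the caller itself takes the last alternative */
}

/* Hypothetical driver: makes `depth` choices among `n` alternatives and
   counts how many distinct executions reach the end. Each leaf writes
   one byte to a pipe, so this assumes the total fits in the pipe buffer. */
int count_leaves(int n, int depth) {
  int fds[2];
  if (pipe(fds) != 0)
    return -1;
  pid_t root = fork();
  if (root < 0)
    return -1;
  if (root == 0) {
    close(fds[0]);
    for (int d = 0; d < depth; d++)
      (void)choose(n);
    char mark = 'x';
    if (write(fds[1], &mark, 1) != 1)
      _exit(1);
    _exit(0);
  }
  close(fds[1]);
  waitpid(root, NULL, 0);  /* returns after the whole search tree is done */
  int leaves = 0;
  char buf[64];
  ssize_t got;
  while ((got = read(fds[0], buf, sizeof buf)) > 0)
    leaves += (int)got;
  close(fds[0]);
  return leaves;
}
```

With three alternatives at each of three choice points, `count_leaves(3, 3)` visits 27 executions. A real generator would emit a function at each leaf instead of a marker byte.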

The next step is to run LLVM optimizations on the generated functions. One thing we want to try is the collection of passes that implements “-O2,” but it also makes sense to run some individual passes since it is possible for sequences of passes to miss bugs: early passes can destroy constructs that would trigger bugs in later ones, and the late passes can clean up problems introduced by earlier ones. In fact both of those things seem to happen quite often.

We end up with lots of pairs of unoptimized and optimized LLVM functions. The obvious thing to do is run them with the same inputs and make sure that the outputs are the same, but that only works when the executions encounter no undefined behaviors. Solutions to the UB problem include:

1. Generating UB-free code, as Csmith would. At the level of these tiny functions that would be a major handicap on the generator’s expressiveness and we definitely do not wish to do it.
2. Create an LLVM interpreter that detects UB instead of silently skipping over it. The rule is that the optimizer is allowed to preserve UB or remove it, but never to add it. In other words, the correctness criterion for any compiler transformation isn’t input/output equivalence but rather input/output refinement. Someone needs to write this interpreter, perhaps using lli as a starting point (though the last time we looked, the slow/simple interpreter mode of lli had suffered some bit rot).
3. Formally verify the refinement relation using Alive. This is better than an interpreter because Alive verifies the optimization for all inputs to the function, but worse because Alive doesn’t support all of LLVM, but rather a loop-free subset.

It is option three that we chose. The Alive language isn’t LLVM but rather an LLVM-like DSL, but it is not too hard to automatically translate the supported subset of LLVM into Alive.

In the configuration that we tested (2- and 4-bit integers, three instructions per function, including select but not including real control flow, floating point, memory, or vectors) about 44.8 million functions are generated and binned into 1000 files. We identified seven configurations of the LLVM optimizers that we wanted to test: -O2, SCCP, GVN, NewGVN, Reassociate, InstSimplify, and InstCombine. Then, to make the testing happen we allocated 4000 CPU cores (in an Azure HPC cluster) to process in batch the 7000 combinations of file + optimization options. Each combination takes between one and two hours, depending on how many functions are transformed and how long Alive takes to verify the changes.

If we could generate all LLVM functions and verify optimization of them, then we’d have done formal verification under another name. Of course, practically speaking, there’s a massive combinatorial explosion and we can only scratch the surface. Nevertheless, we found bugs. They fall into two categories: those that we reported and that were fixed, and those that cannot be fixed at this time.

We found six fixable LLVM bugs. The most common problem was transformations wrongly preserving the nsw/nuw/exact attributes that enable undefined behaviors in some LLVM instructions. This occurred with InstCombine [1], GVN [1], and Reassociate [1,2]. InstSimplify generated code that produces the wrong output for some inputs [1]. Finally, we triggered a crash in llc [1].
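To give a feel for this class of bug, here is a small C model of 4-bit arithmetic that checks a reassociation by brute force. It is a hypothetical instance of wrongly preserving nsw, not a reproduction of any of the reported bugs, and the exhaustive loop stands in for what Alive proves symbolically. All names are made up for the sketch.

```c
#include <stdbool.h>

/* A 4-bit signed value that may be poison. */
typedef struct {
  bool poison;
  int value; /* in [-8, 7] when not poison */
} val4;

static val4 make(int v) { return (val4){false, v}; }

/* 4-bit add; with nsw set, signed overflow yields poison. */
static val4 add4(val4 a, val4 b, bool nsw) {
  if (a.poison || b.poison)
    return (val4){true, 0};
  int sum = a.value + b.value;
  if (sum < -8 || sum > 7) {
    if (nsw)
      return (val4){true, 0};
    sum = sum > 7 ? sum - 16 : sum + 16; /* wrap to 4 bits */
  }
  return (val4){false, sum};
}

/* tgt refines src if src is poison, or both are the same defined value. */
static bool refines(val4 src, val4 tgt) {
  return src.poison || (!tgt.poison && tgt.value == src.value);
}

/* Checks (a + b) + c  ==>  a + (b + c) over all 4-bit inputs. The source
   adds are nsw; the target adds are nsw only if keep_nsw is true. */
bool reassociation_is_refinement(bool keep_nsw) {
  for (int a = -8; a <= 7; a++)
    for (int b = -8; b <= 7; b++)
      for (int c = -8; c <= 7; c++) {
        val4 src = add4(add4(make(a), make(b), true), make(c), true);
        val4 tgt = add4(make(a), add4(make(b), make(c), keep_nsw), keep_nsw);
        if (!refines(src, tgt))
          return false;
      }
  return true;
}
```

Keeping nsw on the reassociated adds fails the check: with a = -1, b = 7, c = 1 the source computes 7, but the target's inner add overflows and becomes poison. Dropping the flag in the target makes the transformation a refinement.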

The unfixable bugs stem from problems with LLVM’s undefined behavior model. One way to fix these bugs is to delete the offending optimizations, but some of them are considered important. You might be tempted to instead fix them by tweaking the LLVM semantics in such a way that all of the optimizations currently performed by LLVM are valid. We believe this to be impossible: that there does not exist a useful and consistent semantics that can justify all of the observed optimizations.

A common kind of unfixable bug is seen in the simplification logic that LLVM has for select: it transforms “select %X, undef, %Y” into “%Y”. This is incorrect (more details in the post linked above) and, worse, has been shown to trigger end-to-end miscompilations [1]. Another source of problems is the different semantics that different parts of LLVM assume for branches: these can also cause end-to-end miscompilations [1,2].

In summary, this is a kind of compiler testing that should be done; it’s relatively easy and the resulting failing test cases are always small and understandable. If someone builds a UB-aware LLVM interpreter then no tricky formal-methods-based tools are required. This method could be easily extended to cover other compilers.

There are some follow-on projects that would most likely provide a good return on investment. Our test cases will reveal many, many instances where an LLVM pass erases a UB flag that it could have preserved; these could be turned into patches. We can do differential testing of passes against their replacements (for example, NewGVN vs. GVN) to look for precision regressions. The set of instructions that we generate should be extended; for example, opt-fuzz already has some limited support for control flow.

The code to run these tests is here.

April 14, 2017

Taming Undefined Behavior in LLVM

Earlier I wrote that Undefined Behavior != Unsafe Programming, a piece intended to convince you that there’s nothing inherently wrong with undefined behavior as long as it isn’t in developer-facing parts of the system.

Today I want to talk about a new paper about undefined behavior in LLVM that’s going to be presented in June at PLDI 2017. I’m an author of this paper, but not the main one. This work isn’t about debating the merits of undefined behavior; its goal is to describe and try to fix some unintended consequences of the design of undefined behavior at the level of LLVM IR.

Undefined behavior in C and C++ is sort of like a bomb: either it explodes or it doesn’t. We never try to reason about undefined programs because a program becomes meaningless once it executes UB. LLVM IR contains this same kind of UB, which we’ll call “immediate UB.” It is triggered by bad operations such as an out-of-bounds store (which is likely to corrupt RAM) or a division by zero (which may cause the processor to trap).

Our problems start because LLVM also contains two kinds of “deferred UB” which don’t explode, but rather have a contained effect on the program. We need to reason about the meaning of these “slightly undefined” programs which can be challenging. There have been long threads on the LLVM developers’ mailing list going back and forth about this.

The first kind of deferred UB in LLVM is the undef value that acts like an uninitialized register: an undef evaluates to an arbitrary value of its type. Undef is useful because sometimes we want to say that a value doesn’t matter, for example because we know a location is going to be over-written later. If we didn’t have something like undef, we’d be forced to initialize locations like this to specific values, which costs space and time. So undef is basically a note to the compiler that it can choose whatever value it likes. During code generation, undef usually gets turned into “whatever was already in the register.”

Unfortunately, the semantics of undef don’t justify all of the optimizations that we’d like to perform on LLVM code. For example, consider this LLVM function:

define i1 @f(i32) {
  %2 = add nsw i32 %0, 1
  %3 = icmp sgt i32 %2, %0
  ret i1 %3
}

This is equivalent to “return x+1 > x;” in C and we’d like to be able to optimize it to “return true;”. In both languages the undefinedness of signed overflow needs to be recognized to make the optimization go. Let’s try to do that using undef. In this case the semantics of “add nsw” are to return undef if signed overflow occurs and to return the mathematical answer otherwise. So this example has two cases:

1. The input is not INT_MAX, in which case the addition returns input + 1.
2. The input is INT_MAX, in which case the addition returns undef.

In case 1 the comparison returns true. Can we make the comparison true for case 2, giving us the overall result that we want? Recall that undef resolves as an arbitrary value of its type. The compiler is allowed to choose this value. Alas, there’s no value of type i32 that is larger than INT_MAX, when we use a signed comparison. Thus, this optimization is not justified by the semantics of undef.
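The argument can be checked by brute force if we scale it down. The sketch below uses 8-bit integers instead of i32 (an assumption made only to keep the search tiny) and tries every value the compiler could pick for undef. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Is there any value the compiler could choose for undef such that
   `undef > x` holds under a signed comparison? */
bool some_undef_choice_exceeds(int8_t x) {
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    if (v > x)
      return true;
  return false;
}
```

The search succeeds for every input except the maximum value, which is exactly the input for which the addition overflows.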

One choice we could make is to give up on performing this optimization (and others like it) at the LLVM level. The choice made by the LLVM developers, however, was to introduce a second, stronger, form of deferred UB called poison. Most instructions, taking a poison value on either input, evaluate to poison. If poison propagates to a program’s output, the result is immediate UB. Returning to the “x + 1 > x” example above, making “add nsw INT_MAX, 1” evaluate to poison allows the desired optimization: the resulting poison value makes the icmp also return poison. To justify the desired optimization we can observe that returning 1 is a refinement of returning poison. Another way to say the same thing is that we’re always allowed to make code more defined than it was, though of course we’re never allowed to make it less defined.
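Here is a minimal C model of the poison version of this example. The `pval` type and the function names are hypothetical, and the model covers only the two instructions used by the function f above, so treat it as a sketch of the reasoning rather than as LLVM's semantics.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit value that may be poison. */
typedef struct {
  bool poison;
  int32_t value;
} pval;

/* add nsw: poison on signed overflow, the mathematical sum otherwise. */
pval add_nsw(pval a, pval b) {
  int64_t wide = (int64_t)a.value + (int64_t)b.value;
  if (a.poison || b.poison || wide > INT32_MAX || wide < INT32_MIN)
    return (pval){true, 0};
  return (pval){false, (int32_t)wide};
}

/* icmp sgt: poison if either operand is poison, else 1 or 0. */
pval icmp_sgt(pval a, pval b) {
  if (a.poison || b.poison)
    return (pval){true, 0};
  return (pval){false, a.value > b.value};
}

/* Model of the function f above: x + 1 > x. */
pval f_model(int32_t x) {
  pval in = {false, x};
  pval one = {false, 1};
  return icmp_sgt(add_nsw(in, one), in);
}

/* Is "return true" a refinement of f at input x? It is if f returns
   poison there, or if f returns 1. */
bool true_refines_f(int32_t x) {
  pval r = f_model(x);
  return r.poison || r.value == 1;
}
```

At INT32_MAX the model returns poison, so replacing the whole function with a constant 1 is allowed there, and at every other input the comparison is already true.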

The most important optimizations enabled by deferred undefined behavior are those involving speculative execution such as hoisting loop-invariant code out of a loop. Since it is often difficult to prove that a loop executes at least once, loop-invariant code motion threatens to take a defined program where UB sits inside a loop that executes zero times and turn it into an undefined program. Deferred UB lets us go ahead and speculatively execute the code without triggering immediate UB. There’s no problem as long as the poisonous results don’t propagate somewhere that matters.

So far so good! Just to be clear: we can make the semantics of an IR anything we like. There will be no problem as long as:

1. The front-ends correctly refine C, C++, etc. into IR.
2. Every IR-level optimization implements a refinement.
3. The backends correctly refine IR into machine code.

The problem is that #2 is hard. Over the years some very subtle mistakes have crept into the LLVM optimizer where different developers have made different assumptions about deferred UB, and these assumptions can work together to introduce bugs. Very few of these bugs can result in end-to-end miscompilation (where a well-formed source-level program is compiled to machine code that does the wrong thing) but even this can happen. We spent a lot of time trying to explain this clearly in the paper and I’m unlikely to do better here! But the details are all there in Section 3 of the paper. The point is that so far these bugs have resisted fixing: nobody has come up with a way to make everything consistent without giving up optimizations that the LLVM community is unwilling to give up.

The next part of the paper (Sections 4, 5, 6) introduces and evaluates our proposed fix, which is to remove undef, leaving only poison. To get undef-like semantics we introduce a new freeze instruction to LLVM. Freezing a normal value is a nop and freezing a poison value evaluates to an arbitrary value of the type. Every use of a given freeze instruction will produce the same value, but different freezes may give different values. The key is to put freezes in the right places. My colleagues have implemented a fork of LLVM 4.0 that uses freeze; we found that it more or less doesn’t affect compile times or the quality of the generated code.
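The freeze semantics described above can be modeled in a few lines of C. Everything here is hypothetical scaffolding: `pval` is a value that may be poison, and the `pick` parameter stands in for whatever arbitrary value the compiler ends up choosing.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit value that may be poison. */
typedef struct {
  bool poison;
  int32_t value;
} pval;

pval make_defined(int32_t v) { return (pval){false, v}; }
pval make_poison(void) { return (pval){true, 0}; }

/* freeze: a defined value passes through unchanged; poison becomes some
   arbitrary but fixed defined value, represented here by `pick`. */
pval freeze(pval v, int32_t pick) {
  if (v.poison)
    return (pval){false, pick};
  return v;
}

/* One freeze, two uses: both uses see the same value, so the difference
   is 0 regardless of which value was picked. */
int32_t frozen_minus_itself(pval v, int32_t pick) {
  pval y = freeze(v, pick);
  return y.value - y.value;
}
```

Two separate freezes of the same poison value correspond to two calls with possibly different `pick` arguments, which is why they are not guaranteed to agree.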

We are in the process of trying to convince the LLVM community to adopt our proposed solution. The change is somewhat fundamental and so this is going to take some time. There are lots of details that need to be ironed out, and I think people are (rightfully) worried about subtle bugs being introduced during the transition. One secret weapon we have is Alive, where Nuno has implemented the new semantics in the newsema branch, and we can use this to test a large number of optimizations.

Finally, we noticed that there has been an interesting bit of convergent evolution in compiler IRs: basically all heavily optimizing AOT compilers (including GCC, MSVC, and Intel CC) have their own versions of deferred UB. The details differ from those described here, but the effect is the same: deferred UB gives the compiler freedom to perform useful transformations that would otherwise be illegal. The semantics of deferred UB in these compilers has not, as far as we know, been rigorously defined and so it is possible that they have issues analogous to those described here.

March 6, 2017

Do Expressive Programming Languages Always Have Undefined Behavior?

In the Hacker News comments on one of my previous posts about undefined behavior, someone said this:

AFAIK Gödel’s incompleteness theorems imply that _any_ language will have at least some undefined behaviour.

Let’s take a quick look at this statement, keeping in mind that incompleteness and undecidability can be remarkably tricky topics. Some years ago I read and enjoyed Gödel’s Theorem: An Incomplete Guide to Its Use and Abuse (and no doubt need to reread it).

First off, it is clear that there exist programming languages that are free of UB, such as one where the semantics of every program is to print “7”. Whatever it is that UB means (we have not formally defined it), it seems clear that the language that always prints “7” does not have it.

There are also useful languages that are obviously UB-free, such as an expression language that evaluates elementary functions over IEEE floats. These languages are particularly easy to reason about because they are not Turing-complete: all computations terminate and we simply have to ensure that they terminate with a defined result.

In contrast, the HN commenter may have intended to invoke Rice’s Theorem: “Any nontrivial property about the language recognized by a Turing machine is undecidable.” A consequence is that when f() is some arbitrary computation we cannot in general hope to decide whether a program like this invokes undefined behavior:

main() {
  f();
  return 1 / 0; // UB!
}

But this is a red herring. Rice’s Theorem only applies to non-trivial properties: “properties that apply neither to no programs nor all programs in the language.” To sidestep it, we only need to define a programming language where UB-freedom is a trivial property. This is done by ensuring that every operation that a program can perform is a total function: it is defined in all circumstances. Thus, programs in this language will either terminate in some defined state or else fail to terminate. This kind of extremely tight specification is not typically done for realistic programming languages because it is a lot of work, particularly if the language has open-ended interactions with other levels of the system, such as inline assembly. But the problem is only a practical one; there is no problem in principle. It is not too difficult to write a UB-free specification for a (Turing-complete) toy language or subset of a real language.
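As a small example of making an operation total, here is a signed division written in C that has a defined result for every pair of inputs. The results chosen for the two problem cases are arbitrary illustrative choices, not taken from any particular language specification.

```c
#include <stdint.h>

/* A total 32-bit signed division: every input pair has a defined result. */
int32_t total_div(int32_t a, int32_t b) {
  if (b == 0)
    return 0;         /* division by zero is defined to yield 0 */
  if (a == INT32_MIN && b == -1)
    return INT32_MIN; /* the overflowing case is defined to wrap */
  return a / b;
}
```

A language whose division operator behaved like `total_div` would give the `1 / 0` in the program above a defined meaning, no matter what f() does.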

Now let’s return to the original HN comment: it was about the incompleteness theorems, not about the halting problem. I’m not sure what to do with that, as I don’t see that Gödel’s theorems have any direct bearing on undefined behavior in programming languages.

February 14, 2017

Undefined Behavior != Unsafe Programming

Undefined behavior (UB) in C and C++ is a clear and present danger to developers, especially when they are writing code that will execute near a trust boundary. A less well-known kind of undefined behavior exists in the intermediate representation (IR) for most optimizing, ahead-of-time compilers. For example, LLVM IR has undef and poison in addition to true explodes-in-your-face C-style UB. When people become aware of this, a typical reaction is: “Ugh, why? LLVM IR is just as bad as C!” This piece explains why that is not the correct reaction.

Undefined behavior is the result of a design decision: the refusal to systematically trap program errors at one particular level of a system. The responsibility for avoiding these errors is delegated to a higher level of abstraction. For example, it is obvious that a safe programming language can be compiled to machine code, and it is also obvious that the unsafety of machine code in no way compromises the high-level guarantees made by the language implementation. Swift and Rust are compiled to LLVM IR; some of their safety guarantees are enforced by dynamic checks in the emitted code, other guarantees are made through type checking and have no representation at the LLVM level. Either way, UB at the LLVM level is not a problem for, and cannot be detected by, code in the safe subsets of Swift and Rust. Even C can be used safely if some tool in the development environment ensures that it will not execute UB. The L4.verified project does exactly this.

The essence of undefined behavior is the freedom to avoid a forced coupling between error checks and unsafe operations. The checks, once decoupled, can be optimized, for example by being hoisted out of loops or eliminated outright. The remaining unsafe operations can be — in a well-designed IR — mapped onto basic processor operations with little or no overhead. As a concrete example, consider this Swift code:

func add(a : Int, b : Int)->Int {
  return (a & 0xffff) + (b & 0xffff);
}

Although a Swift implementation must trap on integer overflow, the compiler observes that overflow is impossible and emits this LLVM IR:

define i64 @add(i64 %a, i64 %b) {
  %0 = and i64 %a, 65535
  %1 = and i64 %b, 65535
  %2 = add nuw nsw i64 %0, %1
  ret i64 %2
}

Not only has the checked addition operation been lowered to an unchecked one, but also the add instruction has been marked with LLVM’s nsw and nuw attributes, indicating that both signed and unsigned overflow are undefined. In isolation these attributes provide no benefit, but they may enable additional optimizations after this function is inlined. When the Swift benchmark suite is compiled to LLVM, about one in eight addition instructions has an attribute indicating that integer overflow is undefined.
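The same reasoning can be rendered in C. In this sketch `checked_add` stands in for Swift's trapping addition and uses `__builtin_add_overflow`, which is a GCC and Clang extension rather than standard C. Both function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Checked 64-bit addition: aborts on signed overflow, standing in for
   a trap. */
int64_t checked_add(int64_t a, int64_t b) {
  int64_t sum;
  if (__builtin_add_overflow(a, b, &sum))
    abort();
  return sum;
}

/* Both operands are at most 0xffff after masking, so the sum is at most
   0x1fffe and the overflow check can never fire. */
int64_t add_masked(int64_t a, int64_t b) {
  return checked_add(a & 0xffff, b & 0xffff);
}
```

Because the check in `add_masked` is provably dead, a compiler is free to drop it and emit a plain add, which is what the IR above shows.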

In this particular example the nsw and nuw attributes are redundant since an optimization pass could re-derive the fact that the add cannot overflow. However, in general these attributes and others like them add real value by avoiding the need for potentially expensive static analyses to rediscover known program facts. Also, some facts cannot be rediscovered later, even in principle, since information is lost at some compilation steps.

In summary, undefined behavior in programmer-visible abstractions represents an aggressive and dangerous tradeoff: it sacrifices program correctness in favor of performance and compiler simplicity. On the other hand, UB at lower levels of the system, such as machine code or a compiler IR, is an internal design choice that needn’t have any effect on the programmer-facing parts of the system. This kind of UB simply requires us to accept that safety checks can be usefully factored out of their corresponding unsafe operations to afford efficient execution.

February 1, 2017

Detecting Strict Aliasing Violations in the Wild

Type-based alias analysis, where pointers to different types are assumed to point to distinct objects, gives compilers a simple and effective way to disambiguate memory references in order to generate better code. Unfortunately, C and C++ make it easy for programmers to violate the assumptions upon which type-based alias analysis is built. “Strict aliasing” refers to a collection of rules in the C and C++ standards that restrict the ways in which you are allowed to modify and look at memory objects, in order to make type-based alias analysis work in these weakly-typed languages. The problem is that the strict aliasing rules contain tricky and confusing corner cases and also that they rule out many idioms that have historically worked, such as using a pointer type cast to view a float as an unsigned, in order to inspect its bits. Such tricks are undefined behavior. See the first part of this post for more of an introduction to these issues.
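For the float-as-unsigned idiom, the usual defined alternative is to copy the bytes. The sketch below assumes a 32-bit `float`; the function name is hypothetical, and `_Static_assert` requires C11.

```c
#include <stdint.h>
#include <string.h>

/* Undefined under the strict aliasing rules, because it reads a float
   object through an lvalue of type uint32_t:
     uint32_t bits = *(uint32_t *)&f;
   Copying the bytes instead is defined. */
uint32_t float_bits(float f) {
  uint32_t bits;
  _Static_assert(sizeof bits == sizeof f, "assumes a 32-bit float");
  memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

On a platform with IEEE 754 single-precision floats, `float_bits(1.0f)` is 0x3f800000. Compilers typically turn the small fixed-size `memcpy` into a register move, so the defined version need not cost anything.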

The purpose of this piece is to call your attention to a new paper, Detecting Strict Aliasing Violations in the Wild, by Pascal Cuoq and his colleagues. C and C++ programmers should read it. Sections 1 and 2 introduce strict aliasing; they’re quick and easy.

Section 3 shows what compilers think about strict aliasing problems by looking at how a number of C functions get translated to x86-64 assembly. This material requires perseverance but it is worth taking the time to understand the examples in detail, because compilers apply the same thinking to real programs that they apply to tiny litmus tests.

Section 4 is about a new tool, built as part of Trust-in-Soft’s static analyzer, that can diagnose violations of the strict aliasing rules in C code. As the paper says, “it works best when applied on definite inputs,” meaning that the tool should be used as an extended checker for tis-interpreter. Pascal tells me that a release containing the strict aliasing checker is planned, but the time frame is not definite. In any case, readers interested in strict aliasing, but not specifically in tools for dealing with it, can skip this section.

Section 5 applies the strict aliasing checker to open source software. This is good reading because it describes problems that are very common in the wild today. Finding a bug in zlib was a nice touch: zlib is small and has already been looked at closely. Some programs mitigate these bugs by asking the compiler to avoid enforcing the strict aliasing rules: LLVM, GCC, and Intel CC all take a -fno-strict-aliasing flag and MSVC doesn’t implement type-based alias analysis at all. Many other programs contain time bombs: latent UB bugs that don’t happen to be exploited now, but that might be exploited later when the compiler becomes a bit brighter.

Also see libcrunch and its paper and a paper about SafeType.