I have often heard other developers use that phrase to "advertise" some pattern or development best practice. Most of the time the phrase comes up when talking about the benefits of functional programming.

The phrase "easy to reason about" is usually used as-is, without any explanation or code sample. So for me it has become like the next buzzword that more "experienced" developers use in their talks.

Question: Can you provide some examples of "not easy to reason about", so they can be compared with "easy to reason about" examples?


put on hold as primarily opinion-based by gnat, BlueRaja - Danny Pflughoeft, Greg Burghardt, scriptin, Snowman 9 hours ago


@MartinMaat a more precise phrase that is widely used is equational reasoning; I'd suggest that this might be what Fabio is after – jk. yesterday
I like to use the phrase "cognitive load" for this sort of thing. – Baldrickk yesterday
Do you know what reasoning about programs means? – Bergi yesterday
In the non-formal sense, I use this to mean a solution is simple enough to understand (generally) what the results will be for any given input without testing it. It means that for any set of inputs, the results will be unsurprising. Solutions that have non-obvious corner cases, for example, are hard to reason about. Mainly I use this in reference to robustness. – JimmyJames yesterday
I am very guilty of using "easier to reason about" frequently; I note however that I try to be careful to say the comparative easier rather than the absolute easy. There was a day in my life when I could reason about no software at all, so it was not easy on that day; it became easy only by spending a great deal of time and effort. To say that any programming problem is easy is taking a pejorative stance towards anyone who might not (yet) find it easy. To say that one model is easier than another is to say that there are fewer concepts involved, fewer moving parts, and so on. – Eric Lippert yesterday

11 Answers

To my mind, the phrase "easy to reason about" refers to code that is easy to "execute in your head".

When looking at a piece of code, if it is short, clearly written, with good names and minimal mutation of values, then mentally working through what the code does is a (relatively) easy task.

A long piece of code with poor names, variables that constantly change value, and convoluted branching will normally require, e.g., a pen and piece of paper to help keep track of the current state. Such code cannot be easily worked through just in your head, so it isn't easy to reason about.

With a slight caveat that no matter how well you name your variables, a program that tries to disprove Goldbach's conjecture is inherently difficult to "execute", in your head or elsewhere. But it can still be easy to reason about, in the sense of being easy to convince yourself that if it claims to have found a counter-example then it's telling the truth ;-) – Steve Jessop yesterday
I would never want to execute code in my head. That, to me, would be the ultimate show of "not easy to reason about." I would want to be able to make predictive statements about what the computer would do without executing it. Code that is "easy to reason about" is code which doesn't have to be executed in your head, it can be reasoned about instead. – Cort Ammon 19 hours ago
How can one answer a question about reasoning about code without even mentioning formal verification? This answer suggests that reasoning about code is informal and ad hoc. It's not; it's usually done with very great care and mathematical approaches. There are certain mathematical properties that make code "easy to reason about" in an objective sense (pure functions, to give a very easy example). Names of variables have nothing to do with how easy it is to "reason" about code, at least not in any formal sense. – Polygnome 18 hours ago
@Polygnome Reasoning about code is not usually done with very great care and mathematical approaches. As I write this, people reasoning about code informally are outnumbering the mathematical approachers by millions to one, at least, or so I reckon. – Kaz 16 hours ago
@Polygnome "Code easy to reason about" almost exclusively alludes to its mathematical properties and formal verification - that roughly sounds like an answer to the question. You may want to post that as an answer instead of disagreeing about what the (subjective) answer is in the comments. – Dukeling 14 hours ago

A mechanism or piece of code is easy to reason about when you need to take few things into account to predict what it will do, and the things you do need to take into account are easily available.

True functions with no side effects and no state are easy to reason about because the output is completely determined by the input, which is right there in the parameters.

Conversely, an object with state is much harder to reason about, because you have to take into account what state the object is in when a method is called, which means you have to think about which other situations could lead to the object being in a particular state.

Even worse are global variables: to reason about code that reads a global variable, you need to understand where in your code that variable could be set and why - and it may not even be easy to find all those places.
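To make the three levels above concrete, here is a minimal sketch (Python is used purely for illustration, and all of the names are invented):

```python
# Pure function: the output is fully determined by the parameters.
def discounted_price(price, rate):
    return price * (1 - rate)

# Stateful object: the result of total() depends on the whole history
# of calls made before it, not just on its arguments.
class Cart:
    def __init__(self):
        self.items = []

    def add(self, price):
        self.items.append(price)

    def total(self):
        return sum(self.items)

# Global variable: to predict the result you must know every place in
# the program that may have assigned GLOBAL_RATE, and when.
GLOBAL_RATE = 0.5

def discounted_price_global(price):
    return price * (1 - GLOBAL_RATE)
```

The first function can be understood in isolation; the other two cannot, which is the whole point of the distinction.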

Just about the hardest thing to reason about is multithreaded programming with shared state, because not only do you have state, you have multiple threads changing it at the same time, so to reason about what a piece of code does when executed by one thread you have to allow for the possibility that at every single point of execution, some other thread (or several of them!) might be executing just about any other part of the code and change the data you're operating on right under your eyes. In theory, that can be managed with mutexes/monitors/critical sections/whatever-you-call-it, but in practice no mere human is actually able to do that reliably unless they drastically confine the shared state and/or parallelism to very small sections of the code.
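As a small sketch of the "drastically confine the shared state" advice (Python, hypothetical names): every access to the shared counter goes through one lock, so there is effectively only one interleaving to reason about.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # Without the lock, `counter += 1` is a read-modify-write that
        # another thread can interleave with, silently losing updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: predictable only because the state is confined
```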

I do agree with this answer, but even with pure functions, declarative approaches (like CSS, or XSLT, or make or even C++ template specialisation and function overloading) can put you back in the position of considering the whole program. Even once you think you've found the definition of something, the language allows a more specific declaration anywhere in the program to override it. Your IDE might help with this. – Steve Jessop yesterday
I'd add that in the multithreaded scenario you also have to have a reasonably deep understanding of what lower level instructions your code desugars to: an operation that looks atomic in source might have unexpected interruption points in the actual execution. – Jared Smith yesterday
@SteveJessop: Indeed, this point is often overlooked. There is a reason why C# makes you say when you want a method to be overridable rather than quietly making overridability the default; we wish to wave a flag saying "the correctness of your program might depend on code you can't find at compile time" at this point. (That said, I also wish that "sealed" was the default for classes in C#.) – Eric Lippert yesterday
    
@EricLippert What were the final reasons for sealed not being the default? – Zev Spitz 17 hours ago
    
@ZevSpitz: That decision was made long before my time; I don't know. – Eric Lippert 17 hours ago

In the case of functional programming, the meaning of "easy to reason about" is mostly that it is deterministic. By that, I mean that a given input will always lead to the same output. You can do whatever you want to the rest of the program; as long as you don't touch that piece of code, it won't break.

On the other hand, OO is typically more difficult to reason about because the "output" produced depends on the internal state of every involved object. The typical way this manifests is as unexpected side effects: when changing one part of the code, an apparently unrelated part breaks.
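A toy contrast (Python for illustration; the names are made up): the free function is deterministic, while the object's answer for the very same input drifts with its hidden state.

```python
def slug(title):
    # Deterministic: the same input always yields the same output.
    return title.lower().replace(" ", "-")

class SlugMaker:
    # Stateful: the output for the same input changes as the internal
    # counter advances, so callers must know the object's history.
    def __init__(self):
        self.count = 0

    def slug(self, title):
        self.count += 1
        return f"{title.lower().replace(' ', '-')}-{self.count}"
```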

...the downside of functional programming is of course that in practice, a lot of what you want to do is IO and managing state.

However, there are plenty of other things which are more difficult to reason about, and I agree with @Kilian that concurrency is a prime example. Distributed systems too.


Avoiding wider discussion, and addressing the specific question:

Can you provide some examples of "Not easy to reason about", so it can be compared with "Easy to reason about" examples?

I refer you to "The Story of Mel, a Real Programmer", a piece of programmer folklore that dates to 1983 and therefore counts as 'legend', for our profession.

It tells the tale of a programmer writing code that preferred arcane techniques wherever possible, including self-referential and self-modifying code, and deliberate exploitation of machine bugs:

an apparent infinite loop had in fact been coded in such a way as to take advantage of a carry-overflow error. Adding 1 to an instruction that decoded as "Load from address x" normally yielded "Load from address x+1". But when x was already the highest possible address, not only did the address wrap around to zero, but a 1 was carried into the bits from which the opcode would be read, changing the opcode from "load from" to "jump to" so that the full instruction changed from "load from the last address" to "jump to address zero".

This is an example of code that is 'hard to reason about'.

Of course, Mel would disagree...

+1 for referencing the story of Mel, one of my perennial favorites. – John Bollinger 18 hours ago
Read The Story of Mel here, since the Wikipedia article doesn't link to it. – TRiG 18 hours ago
    
@TRiG footnote 3 on the page, no? – AakashM 12 hours ago

I can provide an example, and a very common one.

Consider the following C# code.

// items is List<Item>
var names = new List<string>();
for (var i = 0; i < items.Count; i++)
{
    var item = items[i];
    var mangled = MyMangleFunction(item.Name);
    if (mangled.StartsWith("foo"))
    {
        names.Add(mangled);
    }
}

Now consider this alternative.

// items is List<Item>
var names = items
    .Select(item => MyMangleFunction(item.Name))
    .Where(s => s.StartsWith("foo"))
    .ToList();

In the second example, I know exactly what this code is doing at a glance. When I see Select, I know a list of items is being converted into a list of something else. When I see Where, I know that certain items are being filtered out. At a glance, I can understand what names is and make effective use of it.

When I see a for loop, I have no idea what is going on with it until I actually read through the code. And sometimes I have to trace through it to be sure I have accounted for all the side effects. I have to do a bit of work to even come to understand what names is (beyond the type definition) and how to effectively use it. Thus, the first example is harder to reason about than the second.

Ultimately, being easy to reason about here also depends on understanding LINQ methods Select and Where. If you don't know them, then the second code is harder to reason about initially. But you only pay the cost to understand them once. You pay the cost to understand a for loop every time you use one and again every time it changes. Sometimes the cost is worth paying, but usually being "easier to reason about" is far more important.


A related phrase is (I paraphrase),

It's not enough for code to have "no obvious bugs": instead, it should have "obviously no bugs".

An example of relatively "easy to reason about" might be RAII.

Another example might be avoiding deadly embrace: if you can hold a lock and acquire another lock, and there are lots of locks, it's hard to be sure there's no scenario in which deadly embrace might occur. Adding a rule like "there is only one (global) lock", or, "you're not allowed to acquire a second lock while you hold a first lock", makes the system relatively easy to reason about.
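The lock-ordering rule can be sketched like this (Python, hypothetical names): every code path acquires the locks in one fixed global order, so a cycle of waiting threads can never form.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

# Discipline: lock_a is always taken before lock_b, everywhere.
def move_a_to_b():
    with lock_a:
        with lock_b:
            return "moved a->b"

def move_b_to_a():
    # Even though this operation "wants" b first, it still honours the
    # global order, which rules out the deadly embrace by construction.
    with lock_a:
        with lock_b:
            return "moved b->a"
```

The point is not the two functions themselves but the invariant: with a single fixed order, you never have to enumerate interleavings to convince yourself no deadlock exists.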

Hmm. I'm not sure RAII is so easy to reason about. Sure, it's easy to understand conceptually, but it gets more difficult to actually reason about (i.e., predict) the behavior of code that makes extensive use of RAII. I mean, it's basically invisible function calls at scope level. The fact that plenty of people have trouble reasoning about this is very plain if you've ever done any COM programming. – Cody Gray yesterday
    
I meant relatively easy (C++ compared with C): for example the existence of a language-supported constructor means that programmers can't create/have/use an object which they forget to initialize, etc. – ChrisW yesterday
    
That COM-based example is problematic because it mixes styles, i.e. C++-style smart pointer (CComPtr<>) with C-style function (CoUninitialize()). I find it a bizarre example, too, so far as I remember you invoke CoInitialize/CoUninitialize at module scope and for the whole module lifetime, e.g. in main or in DllMain, and not in some tiny short-lived local function scope as shown in the example. – ChrisW yesterday
    
It is an overly simplified example for illustrative purposes. You're completely right that COM is initialized at module scope, but imagine Raymond's example (like Larry's example) as being the entry point (main) function for an application. You initialize COM at startup, and then you uninitialize it right before exiting. Except you have global objects, like COM smart pointers, using the RAII paradigm. Regarding mixing styles: a global object that initialized COM in its ctor and uninitialized in its dtor is workable, and what Raymond suggests, but it's subtle and not easy to reason about. – Cody Gray 22 hours ago
    
I would argue that, in many ways, COM programming is easier to reason about in C, because everything is an explicit function call. There's nothing hidden or invisible going on behind your back. It is a bit more work (i.e., more tedious), because you have to manually write all those function calls and go back and check your work to see that you've done it correctly, but it's all laid bare, which is the key to making it easy to reason about. In other words, "sometimes smart pointers are just too smart". – Cody Gray 22 hours ago

The crux of programming is case analysis. Alan Perlis remarked on this in Epigram #32: Programmers are not to be measured by their ingenuity and their logic but by the completeness of their case analysis.

A situation is easy to reason about if the case analysis is easy. This either means that there are few cases to consider, or, failing that, few special cases—there might be some large spaces of cases, but which collapse due to some regularities, or succumb to a reasoning technique such as induction.

A recursive version of an algorithm, for instance, is usually easier to reason about than an imperative version, because it doesn't contribute superfluous cases which arise through the mutation of supporting state variables that don't appear in the recursive version. Moreover, the structure of the recursion is such that it fits into a mathematical proof-by-induction pattern. We don't have to consider complexities like loop variants and weakest preconditions and whatnot.
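A toy illustration of that claim (Python): the recursive sum has exactly the two cases of an induction proof, while the imperative version adds the evolving values of i and acc to the case analysis.

```python
def total_recursive(xs):
    # Base case: the empty list sums to 0.
    if not xs:
        return 0
    # Inductive step: head plus the sum of the (smaller) tail.
    return xs[0] + total_recursive(xs[1:])

def total_imperative(xs):
    # Same result, but to reason about it you must track how acc and i
    # change on every iteration -- state that the recursive version
    # simply doesn't have.
    acc = 0
    i = 0
    while i < len(xs):
        acc += xs[i]
        i += 1
    return acc
```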

Another aspect of this is the structure of the case space. It is easier to reason about a situation which has a flat, or mostly flat division into cases compared to a hierarchical case situation: cases with sub-cases and sub-sub cases and so on.

A property of systems which simplifies reasoning is orthogonality: this is the property that the cases which govern subsystems remain independent when those subsystems are combined. No combinations give rise to "special cases". If a four-case something is combined with a three-case something orthogonally, there are twelve cases, but ideally each case is a combination of two cases that remain independent. In a sense, there aren't really twelve cases; the combinations are just "emergent case-like phenomena" that we don't have to worry about. What this means is that we still have four cases that we can think about without considering the other three in the other subsystem, and vice versa. If some of the combinations have to be specially identified and endowed with additional logic, then the reasoning is more difficult. In the worst case, every combination has some special handling, and then there really are twelve new cases, which are in addition to the original four and three.
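A sketch of orthogonality (Python; the domain is invented): a four-case axis and a three-case axis combine into twelve behaviours, but each axis is handled by its own independent table, so we only ever reason about 4 + 3 cases.

```python
# Four cases on one axis...
AREA = {
    "square":   lambda s: s * s,
    "circle":   lambda s: 3.14159 * s * s,
    "triangle": lambda s: s * s / 2,
    "hexagon":  lambda s: 2.598 * s * s,
}

# ...three cases on an orthogonal axis.
PRICE_PER_UNIT_AREA = {"red": 1.0, "green": 1.5, "blue": 2.0}

def price(shape, colour, size):
    # No combination gets special handling: the twelve outcomes are
    # "emergent" products of two independent case analyses.
    return AREA[shape](size) * PRICE_PER_UNIT_AREA[colour]
```

The moment one combination (say, blue hexagons) needs its own branch inside price, the orthogonality is broken and the real case count starts creeping toward twelve.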


Sure. Take concurrency:

Critical sections enforced by mutexes: easy to understand because there is only one principle (two threads of execution cannot enter the critical section simultaneously), but prone to both inefficiency and deadlock.

Alternative models, e.g. lock-free programming or actors: potentially much more elegant and powerful, but hellishly hard to understand, because you can no longer rely on (seemingly) fundamental concepts such as "now write this value to that place".

Being easy to reason about is one aspect of a method. But choosing which method to use requires considering all aspects in combination.

-1: really, really bad example that makes me think you don't understand what the phrase means yourself. "Critical sections enforced by mutexes" are in fact one of the hardest things to reason about out there - pretty much everyone who uses them introduces race conditions or deadlocks. I'll give you lock-free programming, but the whole damn point of the actor model is that it is much, much easier to reason about. – Michael Borgwardt yesterday
The problem is that concurrency is itself a very difficult topic for programmers to reason about, so it doesn't make for a very good example. You are completely correct that critical sections enforced by mutexes are a relatively simple way to implement concurrency, compared to lock-free programming, but most programmers are like Michael, and their eyes glaze over when you start talking about critical sections and mutexes, so this certainly doesn't seem like an easy thing to understand. Not to mention all the bugs. – Cody Gray yesterday

Let us limit the task to formal reasoning, since humorous, inventive, or poetic reasoning follow different laws.

Even so, the expression is dimly defined and cannot be pinned down in a strict manner. But that does not mean it should remain so dim for us. Let us imagine that a structure takes some test and receives marks for different points. Good marks for EVERY point mean that the structure is convenient in every aspect and thus "easy to reason about".

The structure "Easy to reason about" should get good marks for the following:

  • Inner terms have reasonable, easily distinguished, well-defined names. If elements form a hierarchy, the difference between parent and child names should differ from the difference between sibling names.
  • The number of types of structural elements is low.
  • The types of structural elements used are familiar things we are accustomed to.
  • The hard-to-understand elements (recursion, meta steps, 4+-dimensional geometry...) are isolated, not directly combined with each other. (For example, if you try to think about some recursive rule changing for 1-, 2-, 3-, 4-, ... n-dimensional cubes, it will be very complicated. But if you transfer each of these rules to some formula depending on n, you will have, separately, a formula for every n-cube and, separately, a recursion rule for that formula. And those two structures can each be easily thought about on their own.)
  • Types of structural elements are obviously different (for example, not mixing arrays starting from 0 with arrays starting from 1).

Is the test subjective? Yes, naturally it is. But the expression itself is subjective, too. What is easy for one person, is not easy for another one. So, the tests should be different for the different domains.


The idea that functional languages are possible to reason about comes from their history, specifically ML, which was developed as a programming language analogous to the constructs that the Logic for Computable Functions used for reasoning. Most functional languages are closer to formal programming calculi than imperative ones are, so the translation from code into the input of a reasoning system is less onerous.

For an example of a reasoning system: in the pi-calculus, each mutable memory location in an imperative language needs to be represented as a separate parallel process, whereas a sequence of functional operations is a single process. Forty years on from the LCF theorem prover, we are working with gigabytes of RAM, so having hundreds of processes is less of an issue. I have used the pi-calculus to remove potential deadlocks from a few hundred lines of C++; despite the representation having hundreds of processes, the reasoner exhausted the state space in around 3 GB and cured an intermittent bug. This would have been impossible in the 70s, or would have required a supercomputer in the early 1990s, whereas the state space of a functional-language program of similar size was small enough to reason about back then.

Judging from the other answers, the phrase is becoming a buzz-phrase, even as much of the difficulty which made it hard to reason about imperative languages is eroded by Moore's law.


Easy to reason about is a culturally specific term, which is why it's so hard to come up with concrete examples. It is a term which is anchored to the people who are to do the reasoning.

"Easy to reason about" is actually a very self descriptive phrase. If one is looking at the code, and wants to reason what it does, it's easy =)

Okay, breaking it down. If you're looking at code, you usually want it to do something. You want to make sure that it does what you think it should do. So you develop theories on what the code should be doing, and then you reason about it to try to argue why the code does indeed work. You try to think about the code like a human (rather than like a computer) and try to rationalize arguments about what the code can do.

The worst case for "easy to reason about" is when the only way to make any sense of what the code does is to go line by line through the code like a Turing machine for all inputs. In this case, the only way to reason anything about the code is to turn yourself into a computer and execute it in your head. These worst-case examples are easily seen in obfuscated programming contests, such as these 3 lines of Perl which decrypt RSA:

#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

As for easy to reason, again, the term is highly cultural. You have to consider:

  • What skills does the reasoner have? How much experience?
  • What sorts of questions might the reasoner have about the code?
  • How certain does the reasoner need to be?

Each of these affects "easy to reason about" differently. Take the skills of the reasoner as an example. When I started at my company, it was recommended that I develop my scripts in MATLAB because it is "easy to reason about." Why? Well, everyone in the company knew MATLAB. If I picked a different language, it would be harder for anyone to understand me. Never mind that MATLAB's readability is atrocious for some tasks, simply because it wasn't designed for them. Later, as my career progressed, Python became more and more popular. Suddenly MATLAB code became "hard to reason about" and Python was the language of choice for writing code that was easy to reason about.

Also consider what idioms the reader may know. If you can rely on your reader to recognize an FFT in a particular syntax, it's "easier to reason about" the code if you stick to that syntax. It lets them look at the text file as a canvas that you painted an FFT onto, rather than having to get into the nitty-gritty details. If you're using C++, find out how comfortable your readers are with the std library. How much do they like functional programming? Some of the idioms which come out of the container libraries are very dependent on which idiomatic style you prefer.

It's also important to understand what sorts of questions the reader may be interested in answering. Are your readers mostly concerned with a superficial understanding of the code, or are they looking for bugs deep in the bowels?

How certain the reader has to be is actually an interesting one. In many cases, hazy reasoning is actually enough to get the product out the door. In other cases, such as FAA flight software, the reader is going to want to have ironclad reasoning. I ran into a case where I argued for using RAII for a particular task, because "You can just set it up and forget about it... it will do the right thing." I was told that I was wrong about that. Those who were going to reason on this code weren't the sort of people who "just want to forget about the details." For them, RAII was more like a hanging chad, forcing them to think about all the things that can happen when you leave scope. Those who were reading that code actually preferred explicit function calls at the end of the scope so that they could be confident that the programmer thought about it.

The Perl code is hard to read, not reason about. If I had some stake in having to understand it, I would de-obfuscate the code. Code that is actually hard to reason about is that which is still hard to reason about when it is nicely formatted with clear identifiers for everything and no code-golfing tricks. – Kaz yesterday
