This bug, and many others, can be detected with a trivial amount of formal verification. I really think formal verification should see much wider use for anything as heavily used as a standard library, even if it's just for trivial properties like overflow and out-of-bounds access.
In the below code we can see a formal verification tool (GNATProve) detect both the original error and the out-of-bounds access that it causes. Doing the arithmetic using a larger type clears both reported errors without the need for any additional annotations for GNATProve.
function Search (A : A_Type; Target : Integer) return Integer is
   Left  : Integer := A'First;
   Right : Integer := A'Last;
begin
   while Left <= Right loop
      declare
         Mid : Integer := (Left + Right) / 2;
      begin
         if A (Mid) = Target then
            return Mid;
         elsif A (Mid) < Target then
            Left := Mid + 1;
         elsif A (Mid) > Target then
            Right := Mid - 1;
         end if;
      end;
   end loop;
end Search;
GNATProve output:
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
wrapper.adb:12:36: medium: overflow check might fail, cannot prove lower bound for Left + Right
12 | Mid : Integer := (Left + Right) / 2;
| ~~~~~~^~~~~~~~
reason for check: result of addition must fit in a 32-bits machine integer
wrapper.adb:12:45: info: division check proved
wrapper.adb:14:19: medium: array index check might fail
14 | if A (Mid) = Target then
| ^~~
reason for check: value must be a valid index into the array
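To make the failure concrete, here's a sketch in Python that simulates the 32-bit wrap-around (Python's own integers are unbounded, so a mask stands in for the machine type). The safe variant is the classic Left + (Right - Left) / 2 rewrite, which avoids the overflow just as widening the type does:

```python
def mid_naive_32bit(left, right):
    # Midpoint as written above, evaluated in wrapping 32-bit arithmetic.
    s = (left + right) & 0xFFFFFFFF   # keep only the low 32 bits
    if s >= 2**31:                    # reinterpret the bit pattern as signed
        s -= 2**32
    return s // 2

def mid_safe(left, right):
    # Overflow-free midpoint: the difference never exceeds the index range.
    return left + (right - left) // 2

left, right = 2**31 - 10, 2**31 - 2   # both are valid 32-bit values
print(mid_naive_32bit(left, right))   # -6: the sum wrapped to a negative value
print(mid_safe(left, right))          # 2147483642: the true midpoint
```

A negative Mid then indexes outside the array, which is exactly the second check GNATProve flags.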
These kinds of problems are present in very many standard treatments of algorithms, and in some cases they don't survive their first contact with real-world use.
"Needs a carry bit or a wider type" is a common failure mode for arithmetic operations that actually use the full range.
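One well-known carry-free trick for the average case, sketched in Python with a 32-bit mask (an illustration of the "carry bit or wider type" point, not code from the thread):

```python
MASK32 = 0xFFFFFFFF

def avg_wide(a, b):
    # With a wider type the obvious formula is fine; Python's unbounded
    # integers play the role of the wider type here.
    return (a + b) // 2

def avg_carry_free(a, b):
    # (a AND b) keeps the bits both values share; (a XOR b) >> 1 halves
    # the differing bits. Equal to (a + b) // 2 for unsigned values,
    # with no carry out of the top bit ever needed.
    return ((a & b) + ((a ^ b) >> 1)) & MASK32

a, b = 0xFFFFFFF0, 0xFFFFFFFE     # a + b would not fit in 32 bits
print(hex(avg_carry_free(a, b)))  # 0xfffffff7, matching avg_wide(a, b)
```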
They literally advertise this as a feature on their home page in that they give you X% of what they earn from affiliate links. It says a lot about the influencers advertising it if they did not realise that Honey takes their affiliate revenue while reading an ad that says exactly that. More realistically, most of them likely did know that this is what Honey does and determined that Honey paid more than any lost revenue, but knew that saying so would direct some of this outrage at them. As a bonus, by asking users to uninstall Honey they keep the money from previously running Honey ads and get the money from their affiliate links too. The only way I can maybe see influencers being unaware of this is if they read an ad that only talked about the coupon feature and never bothered to even visit the Honey homepage or try it themselves, which would be a pretty bad look for them.
I also think most users that use Honey have to know that this is what it does given that, again, it's an advertised feature. I suspect most people outraged at this are people that never used the tool in the first place.
I certainly think it's a bad product as it defeats the purpose of affiliate links and reduces revenue for anyone using affiliate links for their intended purpose, but I don't think they were misleading anyone about what the product does.
If a company says they make money “selling cars” you don’t assume they get those cars through theft. Same deal here: saying they get affiliate money doesn’t imply they overwrite existing affiliate links, which is about as clear a case of tortious interference as you could find. https://en.wikipedia.org/wiki/Tortious_interference
So, no Honey didn’t disclose what people are complaining about.
I agree that it's likely tortious interference, I just don't agree that they're being deceptive. I think it's reasonable to say that most of the target market would be considered tech-savvy enough to know how these purchase affiliate programs work and the influencers advertising it absolutely do.
If users are not aware of how this works then I have to question where they think the cashback comes from, and if they are aware that it comes from the business they are making a purchase from then I have to question why they think the business would give a commission to Honey on top of the existing commission to wherever a link was clicked from. I think a majority of users would have to be somewhat wilfully ignorant to not question why money is being given back to them whenever they make a purchase.
I would posit that the "average user" has no care about how the companies they interact with make their money. The modern world is filled with black boxes to them.
I think you’re missing the point. Honey is stealing affiliate credits from influencers who don’t have any relationship with them.
If a YouTuber posts a link with their own affiliate code and, during checkout, the user runs the Honey extension to look for coupons, Honey takes the affiliate credit even if it doesn't have a coupon, and the YouTuber gets nothing.
*Even if they know there is a coupon but are hiding it from you because of a partner relationship with a store that hasn't published any coupons to its Honey admin portal.
I very much agree that it's bad in this regard and potentially even illegal, I just don't think they were ever misleading about the fact that they do this.
It took LTT years to detect this behavior despite it coming up in forums and HN. They then decided to stop accepting sponsorships from them.
Sponsored ads also often boasted Honey will get you the best deal anywhere. Yet some have found them knowingly preferring their own coupons even though other users had manually entered and successfully used better coupons from elsewhere (while the extension was installed).
> Sponsored ads also often boasted Honey will get you the best deal anywhere.
I agree that this can be called misleading, or even potentially a scam.
> It took LTT years to detect this behavior despite it coming up in forums and HN.
They're well aware of how affiliate links work, so as I said above, even if they only advertised the coupon part and never the cashback part they'd have to have never bothered to visit the Honey homepage or tried it themselves. I'm sure they just determined that they'd make more revenue than the lost affiliate revenue at first and later determined they no longer would.
Just because they know how affiliate links work, would the wording on Honey's website not lead them to believe they're only being the "affiliate" if they find a deal?
How would they have detected this without investigating how the tool works themselves? They don't seem to be the kind of company to do that sort of due diligence when accepting ad deals.
There are some thick bindings to libtls that coincidentally happen to be written by the author of the article. There are also some OpenSSL bindings in Dmitry Kazakov's Simple Components and some in AdaCore's Ada Web Server, although they're pretty minimal.
I think most applications of Ada are in embedded systems where you don't often want anything not in the standard library.
> I think most applications of Ada are in embedded systems...
Ada is heavily used and carries a historical influence not only in the embedded software space, but also in hardware: VHDL is one of the two major hardware description languages used in ASIC design and FPGA implementations. (The other is Verilog, based on - you guessed it - C, because of its popularity.)
"Due to the Department of Defense requiring as much of the syntax as possible to be based on Ada, in order to avoid re-inventing concepts that had already been thoroughly tested in the development of Ada, VHDL borrows heavily from the Ada programming language in both concept and syntax." - https://en.wikipedia.org/wiki/VHDL#History
As far as I'm aware, the compiler AdaCore sells is just GCC. You can install GCC built with Ada support from your distro's package manager and it will just work. You can also download builds from here: https://github.com/alire-project/GNAT-FSF-builds
> As far as I'm aware, the compiler AdaCore sells is just GCC.
The compiler yes, but I'm convinced FOSS gnatprove must be outdated in some way: Last time I tried following AdaCore's SPARK manuals, certain basic aspects and pragmas didn't work correctly on the latest version.
Not to mention when SPARK aspects sometimes broke the LSP and formatter.
Unless something has changed, FSF builds are a year behind the libre version (by design), and the libre version is GPL3 cancer, which is not suitable for commercial development. You're then stuck either with a year-old version or buying into AdaCore's Pro version. Not great, not terrible... but that's kind of the only game out in the open, which is what makes it different from most other languages out there.
GNAT CE isn't a thing anymore, only FSF and Pro exist. And AdaCore now sponsors Alire, which installs FSF GNAT, and relicensed some of their tools more permissively.
> An example of something you can do in a dependently typed language is write a sorting function in such a way that the type checker proves that the output will always be in sorted order. I am pretty sure this cannot be done in Ada; checking at runtime does not count!
It actually can be done in Ada, but not purely with the type system; instead we rely on SPARK, which translates Ada code and passes it through various automatic theorem provers. Some examples of fully proven sorting functions are here: https://github.com/AdaCore/spark2014/blob/master/testsuite/g...
You can also see from the above code just how good theorem provers and SPARK are now with the reasonably low number of assertions required to both prove that the output is sorted and prove that the input and output contain the same elements, not to mention all the hidden proofs relating to integer overflow, out-of-bounds access, etc..
You could maybe do all this with types and SPARK, but it's not the approach that would usually be taken.
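For readers unfamiliar with the shape of those contracts, here's a runtime analogue in Python (assertions checked during execution, whereas SPARK proves the same two postconditions statically, with no runtime cost):

```python
from collections import Counter

def is_sorted(xs):
    # Postcondition 1: every element is <= its successor.
    return all(xs[i] <= xs[i + 1] for i in range(len(xs) - 1))

def same_elements(xs, ys):
    # Postcondition 2: the output is a permutation of the input.
    return Counter(xs) == Counter(ys)

def checked_sort(xs):
    result = sorted(xs)
    assert is_sorted(result)
    assert same_elements(xs, result)
    return result

print(checked_sort([3, 1, 2, 1]))  # [1, 1, 2, 3]
```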
Ah, very interesting. It does seem that the Ada community has done serious engineering work to build in powerful formal verification, in a way that is somehow parallel to the (much slower, for practical purposes, if more elegant) arc of type theory...
What is the flow for working through this kind of proof? Is there an interactive proof mode like you find in a lot of dependent type provers? Or is there some other guiding mechanism for telling you that you haven't provided enough guidance with asserts?
One other neat thing about discriminated records is that you're not limited to just a single field with a variable size, you can also write something like this:
type My_Record (A, B : My_Integer) is record
   X : My_Array (1 .. A);
   Y : My_Array (1 .. B);
end record;
A record created from this will have those arrays tightly packed rather than leaving space at the end of the first one like you might expect (this might be compiler-dependent, but it's definitely how GCC does it). Also note that these values can be set at runtime, so this isn't just a generic in disguise (although in Ada you can also instantiate generics at runtime).
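A byte-level sketch of that tightly packed layout, using Python's struct as a stand-in (assuming 32-bit discriminants and elements and little-endian order; GCC's actual Ada layout may differ in those details):

```python
import struct

def pack_record(x, y):
    # Discriminants A and B come first, then X's elements immediately
    # followed by Y's: no gap is left after X even though its length varies.
    a, b = len(x), len(y)
    return struct.pack(f"<2i{a}i{b}i", a, b, *x, *y)

def unpack_record(data):
    a, b = struct.unpack_from("<2i", data)
    x = struct.unpack_from(f"<{a}i", data, 8)
    y = struct.unpack_from(f"<{b}i", data, 8 + 4 * a)
    return list(x), list(y)

data = pack_record([1, 2, 3], [4, 5])
print(len(data))             # 28 bytes: 8 for A and B, 12 for X, 8 for Y
print(unpack_record(data))   # ([1, 2, 3], [4, 5])
```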
I'm a bit disappointed that we've ended up with Rust in the kernel but not Ada. The kernel relies on GCC and there's already an Ada compiler in GCC, so it wouldn't require adding another compiler as a build requirement like Rust does.
There are a couple of major advantages that Ada could have over Rust in the Linux kernel for safe drivers:
1. Having the option to use SPARK for fully verified code provides a superset of the correctness guarantees provided by Rust. As mentioned in the parent comment, Rust focuses purely on memory safety whereas SPARK can provide partial or complete verification of functional correctness. Even without writing any contracts, just preventing things like arithmetic overflows is incredibly useful.
Agreed. It would also alleviate the maintenance issue, as Ada was designed to reduce software costs. Realistically, the only way this will have any chance is if a few developers are funded full time to work on it. There seemed to be a few full-time devs pushing Rust support and some drivers before it was taken seriously. Honestly, Ada is the best language I have seen for drivers, network stacks, or registers received as bytes. Linux is missing out.
> 2. Representation clauses almost entirely eliminate the need for the error-prone manual (de-)serialisation that's littered all over Linux driver code:
Do representation clauses let you specify endianness? From a quick glance at that link it didn't appear so. I would imagine that invalidates most use cases for them.
Interesting, thank you. I think per-record is probably good enough for most applications, and less verbose than per-member. But it's not part of the language (that page calls it "implementation specific") and quite recent (that page is undated but references Ada 2012, so it must postdate that). It wouldn't have helped the Ada project I'm working on, which had an endianness crisis with serialisation a few decades ago.
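For contrast, this is roughly what the manual per-call approach looks like with Python's struct, where the byte-order prefix plays the role of the per-record attribute (the field layout here is hypothetical):

```python
import struct

# Hypothetical 6-byte register dump: one 16-bit field then one 32-bit field.
payload = bytes([0x12, 0x34, 0xDE, 0xAD, 0xBE, 0xEF])

# Endianness is chosen per unpack via the format prefix ('>' big, '<' little);
# a record mixing endianness per field would need to be unpacked piecemeal.
be_short, be_word = struct.unpack(">HI", payload)
le_short, le_word = struct.unpack("<HI", payload)

print(hex(be_short), hex(be_word))  # 0x1234 0xdeadbeef
print(hex(le_short), hex(le_word))  # 0x3412 0xefbeadde
```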
> I'm a bit disappointed that we've ended up with Rust in the kernel but not Ada.
Why? Do you program in Ada or Coq?
People can't be bothered to track lifetimes; what makes you think they are ready to track pre-/post-conditions and invariants, and do it efficiently and thoroughly?
For Rust and Ada to go into the kernel, they need to be accepted by the kernel maintainers.
And seeing that they can't stand Rust, whose constraints are much weaker than Ada/SPARK's, what chance does it have?
To paraphrase some computer science guru: the cyclical nature of programming languages is all down to programmers collectively. Rather than being guided by mathematical rigor, they are guided by a fashion sense.
I've been saying for a while that formal verification seems like the obvious way forward here. Test cases are good but only cover a limited number of cases, whereas formal verification can cover all cases.
The biggest reason that we don't see formal verification being used more is that it takes some time to formally define the output of an algorithm and significantly more time to convince a theorem prover that your code is correct according to the pre- and post-conditions. If we can get LLMs to work well alongside formal verification tools then I think that would greatly improve their usefulness.
Here's a made up and trivial example of how I imagine this might look:
User: Write me a binary search over "type A_Type is array (Integer range <>) of Integer;"
LLM: Implementing the following:
function Search (A : A_Type; Target : Integer) return Integer
   with Pre  => Sorted (A) and Target in A,
        Post => A (Search'Result) = Target;
First implementation:
function Search (A : A_Type; Target : Integer) return Integer is
   Left  : Integer := A'First;
   Right : Integer := A'Last;
begin
   while Left <= Right loop
      declare
         Mid : Integer := (Left + Right) / 2;
      begin
         if A (Mid) = Target then
            return Mid;
         elsif A (Mid) < Target then
            Left := Mid + 1;
         elsif A (Mid) > Target then
            Right := Mid - 1;
         end if;
      end;
   end loop;
end Search;
Compiler:
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
wrapper.adb:12:36: medium: overflow check might fail, cannot prove lower bound for Left + Right
12 | Mid : Integer := (Left + Right) / 2;
| ~~~~~~^~~~~~~~
reason for check: result of addition must fit in a 32-bits machine integer
wrapper.adb:12:45: info: division check proved
wrapper.adb:14:19: medium: array index check might fail
14 | if A (Mid) = Target then
| ^~~
reason for check: value must be a valid index into the array
LLM: Looks like there's a possible overflow for very large arrays, attempting new solution...
Etc. until a valid solution is found and the result is returned to the user.
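The feedback loop itself can be sketched in a few lines; everything below is a stub (prove and llm_fix stand in for a real GNATProve invocation and a real model call, and the "code" is just a string):

```python
def prove(code):
    # Stub prover: flags the naive midpoint the way GNATProve flags the
    # possible overflow; a real implementation would shell out to gnatprove.
    if "(left + right) // 2" in code:
        return ["medium: overflow check might fail for left + right"]
    return []

def llm_fix(code, errors):
    # Stub model: applies the standard overflow-free midpoint rewrite.
    return code.replace("(left + right) // 2", "left + (right - left) // 2")

def verify_loop(code, max_rounds=5):
    # Iterate until the prover reports no unproven checks.
    for _ in range(max_rounds):
        errors = prove(code)
        if not errors:
            return code
        code = llm_fix(code, errors)
    raise RuntimeError("no proven solution found within the round limit")

print(verify_loop("mid = (left + right) // 2"))
```

The interesting engineering questions are in the two stubs: how to feed prover counterexamples back to the model, and when to give up and ask the user for a stronger contract or a loop invariant.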
The use of formal methods has historically been confined to a few niche areas. The volume of code covered by formal verification is some ridiculously small percentage of the overall amount of code out there.
The reason is probably that it is too tedious and difficult, and you need some rare skills to do it. And mostly the categories of bugs it eliminates are not problematic enough. Either way, the people capable of writing code vastly outnumber the people capable of formally verifying that code. I know a lot of programmers without computer science backgrounds who have definitely never been exposed to any of this. I have been exposed to some of it, but that was 25 years ago, and the person teaching me spent his career in academia without ever working on real code that mattered. A lot of this stuff is rather academic and esoteric.
Of course, LLMs could change this quite a bit. A lot of programming languages are optimized for humans, and lots of programmers prefer languages that sacrifice correctness for flexibility. E.g. static typing is the simplest form of adding some formal verification to a language, and a lot of scripting languages get rid of it because the verification step (i.e. compilation) is somewhat tedious, and so is having to spell out your intentions. Python is a good example of a language that appeals to people without a lot of formal training in programming. Some languages go the other way and are harder to use and learn because they are more strict. Rust is a good example of that: great language, but not necessarily easy to learn.
With LLMs, I don't actually need to learn a lot of Rust in order to produce working Rust programs. I just need to be able to understand it at a high level. And I can use the LLM to explain things to me when I don't. Likewise, I imagine I could get an LLM to write detailed specifications for whatever verifiers there are and even make helpful suggestions about which ones to pick. It's not that different from documenting code or writing tests for code. Which are two things I definitely use LLMs for these days.
The point here is that LLMs could compensate for a lack of trained people that can produce formal specifications and produce larger volumes of such specifications. There's probably a lot of value in giving some existing code that treatment. The flip side here is that it's still work and it's competing with other things that people could spend time on.
That Java issue you mentioned is an example of something that wasn't noticed for 9 years; probably because it wasn't that big of a problem. The value of the fix was lowish and so is the value of preventing the problem. A lot of bugs are like that.
Formalism starts with intent and then removes the ambiguity from that intent. Having intent is easy; removing the ambiguity is not, especially when you do not know the limitations of what you're using to materialize that intent.
Python is easy because it lets you get somewhere: the inputs will roughly be the set of acceptable inputs, so the output will be as expected, and you can tweak as things go (much faster for scripting tasks). But when you need a correct program that has to satisfy some guarantees, this strategy no longer cuts it, and suddenly you need a lot more knowledge.
I don't think LLMs would cut it, because they don't understand ambiguity or how to chisel it away so that only the most essential understanding remains.