Ruby `reject!`
The reject!
method in Ruby selectively removes elements from an array in-place, based on a provided predicate (in the form of a block).
Between Ruby 1.9.3 and Ruby 2.3, reject!
was accidentally quadratic.
The underlying bug was fairly straightforward. Every time the provided predicate returned true
, the code immediately deleted that element:
if (RTEST(rb_yield(v))) {
rb_ary_delete_at(ary, i);
result = ary;
}
In an array, deleting an index necessitates shifting back all the following episodes, and so is an O(n) operation; If your predicate rejects a fixed fraction of elements, the total runtime is this O(n²).
The bug was fixed by keeping a count of accepted elements, and moving each element into its proper final position as it is scanned, and truncating the array at the end.
This bug is fairly straightforward, but the part I find most interesting was why it was introduced. The code used to be linear, but it regressed in response to bug #2545, which concerned the behavior when the block passed to reject!
executed a break
or otherwise exited early. Because reject!
is in-place, any partial modifications it makes are still visible after an early exit, and reject!
was leaving the array in a nonsensical state. The obvious fix was to ensure that the array was always in a consistent state, which is what resulted in the “delete every time” behavior.
I find this interesting as a cautionary tale of how several of Ruby’s features (here, ubiquitous mutability, blocks, and nonlocal exits) interact to create suprising edge cases that need to be addressed, and how addressing those edge cases can easily result in yet more problems (here, quadratic performance). In my mind, I’d just rather not have a reject!
at all, and callers who need to mutate an array in-place can figure out how to do safely with respect to their own use cases.
(Thanks to Russell Davis for bringing this one to my attention).