Daniel Lemire's blog

Science and Technology links (July 7th, 2017)

People magazine recently named Julia Roberts, who is 49, as the World’s Most Beautiful Woman.

Volvo plans to commercialize self-driving cars in 2020, and all electric by 2019. France will ban petrol cars in 2040.

The Fermi paradox is the idea that we ought to see intelligent life in the universe, since it is so vast… yet we have no evidence for it. Sandberg et al. claims that there is no paradox because the probability of life is simply too small: we might be a unique or nearly unique case.

According to an article in Nature, caffeine helps to fight obesity in mice.

The New York Times has an article about how tech companies have successfully lobbied schools to include computer science in their curriculum. I have mixed feelings about this entire story. I think we should resist the temptation to think that because learning to program can be highly beneficial for some, then many people should learn it. It is just not true that we will have millions of programmers in 20 years. Programming is and will remain a specialized activity.

There is a lot of talk about cancer vaccines, where your immune system is geared up to fight the specific kind of cancer you have. It is worth repeating that we are nowhere near a cure for cancer.
Scientists claim to have cured Alzheimer’s (in mice): “The drug completely erased evidence of Alzheimer’s synapse damage and memory loss in mouse models of the disease.”

Concerned with the poor quality of modern-day science, Vazire writes that “the drive for eminence is inherently at odds with scientific values”.

Are your strings immutable?

A value is immutable if it cannot change.

Immutability is a distinct notion than that of a constant. The speed of light in a vacuum is believed to be a universal constant, for example. Constants are immutable in the sense that they cannot change. However, immutability refers to values, not to the assignment of values. For example, the number 3 is immutable. However, if I say that your rank is 3, this rank could change. That’s because your rank is a variable, and variables may change their values even if these values are immutable.

That is, a variable may change its value to point to a different immutable value. That’s a somewhat confusing point for non-programmers. For example, my name is “Daniel”. To say that strings are immutable is to say that I cannot change the string “Daniel”. However, I can certainly go see the government and have my first name changed so that it is “Jack”. Yet this change does not modify the string “Daniel”. If I could change the string “Daniel” then, possibly, all individuals named “Daniel” would see their name changed.

So in the world around us, values are typically immutable. And that’s largely why it is believed that immutable values are safer and easier.

Working with mutable values requires more experience and more care. For example, not only does changing the string “Daniel” affect all people named “Daniel”, but what if two people try to change the string at the same time?

In fact, we mostly work with immutable constants because it makes reasoning easier. Change is hard.

So integer values are always immutable, not only in real life but also in software. There is no programming language where you can redefine the value “3” to be equal to “5”.

Yet I believe that most programming languages in widespread use have mutable arrays. That is, once you have created an array of values, you can always change any one of the entries. Why is that? Because immutability could get costly as any change to an immutable array would need to be implemented as a copy.

Arguably, the most important non-numeric type in software is the string. A string can be viewed as an array of characters so it would not be unreasonable to make it mutable, but strings are also viewed as primitive values (e.g., we don’t think of “Daniel” as an array of 6 characters). Consequently, some languages have immutable strings, others have mutable strings. Do you know whether the strings in your favorite language are mutable?

In Java, C#, JavaScript, Python and Go, strings are immutable. Furthermore, Java, C#, JavaScript and Go have the notion of a constant: a “variable” that cannot be reassigned. (I am unsure how well constants are implemented and supported in JavaScript, however.)
In Ruby and PHP, strings are mutable.
The C language does not really have string objects per se. However, we commonly represent strings as a pointer char *. In general, C strings are mutable. The C++ language has its own string class. It is mutable.
In both C and C++, string constants (declared with the const qualifier) are immutable, but you can easily “cast away” the const qualifier, so the immutability is weakly enforced.
In Swift, strings are mutable.
However, if you declare a string to be a constant (keyword let), then it is immutable.

Pruning spaces from strings quickly on ARM processors

Suppose that I give you a relatively long string and you want to remove all spaces from it. In ASCII, we can define spaces as the space character (‘ ‘), and the line ending characters (‘\r’ and ‘\n’). I am mostly interested in algorithmic and performance issues, so we can simplify the problem by removing all byte values less or equal to 32.

In a previous post where I asked how quickly we could prune spaces, the best answer involved vectorization using 128-bit registers (SSE4). It ends up being between 5 and 10 times faster than the naive approach.

Conveniently enough, ARM processors all have 128-bit vector registers, just like x64 processors. So can we make ARM processors go as fast as x64 processors?

Let us first consider a fast scalar implementation:

size_t i = 0, pos = 0;
while (i < howmany) {
    char c = bytes[i++];
    bytes[pos] = c;
    pos += (c > 32 ? 1 : 0);
}

This prunes all character values less or equal to 32, writing back the data in-place. It is very fast.

Can we do better with vector instructions?

On x64 processors, the winning strategy is to grab 16 bytes of data, quickly compare against white space characters, then extract a mask (or bitset) value made of 16 bits, one bit per character, where each bit indicates whether the value found is a white space. The construction of such a bitset is cheap on an x64 processor, as there is a dedicated instruction (movemask). There is no such instruction on ARM processors. You can emulate movemask using several instructions.

So we cannot proceed as we did on x64 processors. What can we do?

Just like with SS4, we can quickly check whether byte values are less or equal to 32, thus identifying white space characters:

static inline uint8x16_t is_white(uint8x16_t data) {
  const uint8x16_t wchar = vdupq_n_u8(' ');
  uint8x16_t isw = vcleq_u8(data, wchar);
  return isw;
}

Next we can quickly check whether any of the 16 characters is a white space, by using about two instructions:

static inline uint64_t is_not_zero(uint8x16_t v) {
  uint64x2_t v64 = vreinterpretq_u64_u8(v);
  uint32x2_t v32 = vqmovn_u64(v64);
  uint64x1_t result = vreinterpret_u64_u32(v32);
  return result[0];
}

This suggests a useful strategy. Instead of comparing characters one by one, compare 16 characters at once. If none of them is a white space character, just copy the 16 characters back to the input and move on. Otherwise, we fall back on the slow scalar approach, with the added benefit that we do not need to repeat the comparison:

uint8x16_t vecbytes = vld1q_u8((uint8_t *)bytes + i);
uint8x16_t w = is_white(vecbytes);
uint64_t haswhite = is_not_zero(w);
w0 = vaddq_u8(justone, w);
if(!haswhite) {
      vst1q_u8((uint8_t *)bytes + pos,vecbytes);
      pos += 16;
      i += 16;
 } else {
      for (int k = 0; k < 16; k++) {
        bytes[pos] = bytes[i++];
        pos += w[k];
     }
}

Most of the benefit from this approach would come if you can often expect streams of 16 bytes to contain no white space character. This seems like a good guess in many applications.

I wrote a benchmark where I try to estimate how long it takes to prune spaces, on a per character basis, using input data where there are few white space characters, placed at random. My source code is available, but you need an ARM processor to run it. I run the benchmark on a 64-bit ARM processor (made of A57 cores). John Regher has a few more benchmarks on this same machine. I think it is the same cores that you find in the Nintendo Switch.

scalar	1.40 ns
NEON	1.04 ns

The technical specification is sparse. However, the processor runs at 1.7 GHz as one can verify by using perf stat. Here is the number of cycles per character we need…

scalar	ARM	recent x64
scalar	2.4 cycles	1.2 cycles
vectorized (NEON and SSE4)	1.8 cycles	0.25 cycles

In comparison, on an x64 processor, the scalar version uses something like 1.2 cycles per character, which would put the ARM machine at half the performance of a recent x64 processor on a per cycle basis. That is to be expected as the A57 cores are hardly meant to compete with recent x64 processors on a cycle per cycle basis. However, with SSE4 on an x64 machine, I manage to use a little as 0.25 cycles per character, which is more than 5 times better than what I can do with ARM NEON.

This large difference comes from an algorithmic difference. On x64 processors, we are relying on the movemask/pshufb combo and we end up with a branchless algorithm involving very few instructions. Our ARM NEON version is much less powerful.

There is a lot to like about ARM processors. The assembly code is much more elegant than the equivalent with x86/x64 processors. Even the ARM NEON instructions feel cleaner than the SSE/AVX instructions. However, for many problems, the total lack of a movemask instruction might limit the scope of what is possible with ARM NEON.

But maybe I underestimate ARM NEON… can you do better than I did?

Note: The post has been edited: it is possible on 64-bit ARM processors to reshuffle 16 bits in one instruction as one of the commenters observed.

Science and Technology links (July 1st, 2017)

Canada is 150 years old today.

The iPhone is 10 years old this year. We can safely say that the iPhone 7 is over a hundred times faster, in almost every way than the original iPhone. Very few things get 100 times better over 10 years. You have to improve the performance by 60% each and every year.

Though mammals like us can heal injuries, there is often scarring. Scarring should be viewed as imperfect healing. It is not just a matter of looks, scars make your tissues less functional. As far skin healing is concerned, scientists have found a way to cause skin to heal without scarring, at least in mice.

Essentially, we can manipulate wound healing so that it leads to skin regeneration rather than scarring, (…) the secret is to regenerate hair follicles first. After that, the fat will regenerate in response to the signals from those follicles. (…) regenerating fat cells in skin can be beneficial for conditions beyond scarring. The process could potentially become a new anti-aging treatment, as the formation of deep wrinkles is thought to result from permanent loss of skin fat.

It seems that fasting (going without food) could be a key to regenerating your immune system:

The study has major implications for healthier aging, in which immune system decline contributes to increased susceptibility to disease as people age. By outlining how prolonged fasting cycles — periods of no food for two to four days at a time over the course of six months — kill older and damaged immune cells and generate new ones, the research also has implications for chemotherapy tolerance and for those with a wide range of immune system deficiencies, including autoimmunity disorders.

Chimpanzees are not that much stronger than we are:

But now a research team reports that contrary to this belief, chimp muscles’ maximum dynamic force and power output is just about 1.35 times higher than human muscle of similar size, a difference they call “modest” compared with historical, popular accounts of chimp “super strength,” being many times stronger than humans.

Human beings are optimized for high endurance:

The flip side is that humans, with a high percentage of slow-twitch fibers, are adapted for endurance, such as long-distance travel, at the expense of dynamic strength and power. When we compared chimps and humans to muscle fiber type data for other species we found that humans are the outlier, suggesting that selection for long distance, over-ground travel may have been important early in the evolution of our musculoskeletal system

So how do you fight a chimpanzee? I would guess that getting the fight to last as long as possible is your best bet as a human being. The chimpanzee will get exhausted first. So I would probably either keep the chimpanzee at bay or run away. If the chimpanzee pursues, I would just wear him down.

A few weeks ago, there was an article in Nature claiming that human lifespan is limited to 115 years. There are very few of us that can hope to ever reach 115 years of age at the present time, but the question is whether it will change. Some people believe that 115 years of age is a hard limit that cannot be exceeded. Several scientists have now issued counterpoints. Siegfried Hekimi from McGill University (Montreal) says that…

You can show the data are compatible with many different trajectories and not at all an ongoing plateau (…) by extending trend lines, we can show that maximum and average lifespans could continue to increase far into the foreseeable future. (…) If this trend continues and our life expectancy of the average person becomes 100, the longest person might make it to 150 (…)

Jim Vaupel from the Max Planck Institute writes:

The evidence points towards no looming limit. At present the balance of the evidence suggests that if there is a limit it is above 120, perhaps much above – and perhaps there is not a limit at all.

Maarten Rozing from the University of Copenhagen writes about a biological clock limiting our lifespan:

We now know not only that the idea of such a clock is highly implausible, but also that ageing is proving to be more amenable to change than used to be supposed

The rebuttals can be found in Nature:

Of course, the real answer is at this point is that we do not know how long human beings could live. This being said, Yuval Noah Harari makes a compelling case in his book Homo Deus: A Brief History of Tomorrow that homo sapiens has reached the end of the line. Very solid arguments can be made that, say, in 100 years, there won’t be any homo sapiens left on the planet. So it is entirely possible that we will never find out how long homo sapiens could live.

Video game review… Nier: Automata

Single-player RPG games are having a tough time. Last year I reviewed Deus Ex: Mankind Divided. Though I felt it was an excellent game, it was not a commercial success and it seems that there will not be a follow-up game in the series in the foreseeable future. More recently, I reviewed Mass Effect: Andromeda. I felt that it was a very solid game, but occasional poor writing and some botched graphical models opened up the game to harsh criticism. Again, it looks like Mass Effect might come to an end because of the poor sales.

I am currently playing another single-player RPG, this time from Japan, Nier: Automata. Sales-wise, it looks to be one of the top-10 games of all time on the PlayStation 4, so it is doing quite well.

The game mechanic itself is very much that of an old-school game. In fact, a fair amount of time is spent playing the game as if it were a two-dimensional shooter. Otherwise, the game plays quite a bit like a typical and classical action RPG “à la Zelda”.

The game looks good, but it is quite simple, even simplistic. There are only so many different enemy types. Most of the map looks the same. The 3D models are crude at times though always effective. The layouts are simplistic. I get the impression that the game engine must be simple. This gives the game an old-school look and feel. I also suspect that this means that the game is a massive success financially for its producers. A game like Mass Effect: Andromeda has a sophisticated design, with finely tuned non-trivial combat mechanics and lots of massive unique environments, so it has to be far more expensive to develop.

You play as an android that has two modes of attack that can be used simultaneously. Firstly, there is a drone that follows you around, and you can order this drone to shot continuously at enemies. Given that most enemies, including bosses, have a hard time damaging you if you stay far away, this drone almost trivializes the game. There are entire boss fights that you can win by jumping up a ledge and just having your drone shoot the enemy down. It helps that you have infinite ammunition. Secondly, you can use melee weapons like swords. That’s where the game gets interesting because though your melee weapons can cause a lot of damage, they also open you up to receiving a lot of damage. There is real skill involved in fighting powerful enemies up close.

Because you are an android, you can reprogram yourself by acquiring new programs. For example, you can make it so that whenever your health levels fall under a threshold, you automatically heal yourself using one of your “healing potions”. You can also make it so that after receiving some damage, you become invincible for a second or two. Combining these two programs is sufficient that, for most purposes, you are invincible… as long as you have enough “healing potions”… but these are cheap and widely available in stores.

When I first starting playing, I paid little to no attention to these programs, nor did I pay much attention to my choice of weapon. However, it ends up making a critical difference, at least on the default difficulty level.

There is no automatic save points, so you can die and have to restart the game from the beginning. You have to think about saving. If you die, your body will remain where you die along with some of your gear. You can retrieve it by playing again and getting back to your body.

Playing the game requires some skill, but on the default difficulty level, I only ever had trouble with one part of the game… there is a crazy boss at some point, “the Opera boss”, it is a giant lady with an armored dress. And I suspect that I had so much trouble because I did not understand the game very well.

Not everything is absolutely smooth. Several times I was left wondering about where I was supposed to go, what I was supposed to do, but I never got stuck long enough to be annoyed.

I have done an entire first playthrough but the game has this weird mechanic whereas you are supposed to beat the game several times, and each time you do so, you get to see a different side of the story. Looking at the Wikipedia entry for the game, it seems that I will need to play at least two more times through the game to really see the bulk of the story.

The music of the game really adds a lot to the experience. To be honest, I suspect that I play just to be immersed in the music and aesthetic of the game. I find it relaxing.

Though I have not played through the entire game, I know enough to appreciate the story and the theme. The game is set in our far future. It is supposedly very, very far in our future but, oddly, city structures are still holding more or less intact. There is no human being anywhere, though you are told that they reside on the Moon, unseen. You are an android that looks like a young human being, but there are cruder robots all over the surface of the Earth. The crude robots are your enemies, sometimes. Supposedly, there is a war going on between the crude robots and the androids, but looks can be deceiving.

It is probably most accurate to depict the story about being about the post-human era on Earth. Human beings are gone, but intelligent machines remain behind. It is very reminiscent of Stross’ Saturn’s Children. Though everybody around is a machine, you get to care for them, very much so.

That’s maybe the surest sign that the game is a success. You care for the characters. Even if they are machines that can be rebooted at will. It is saying a lot because I don’t normally empathize easily with Japanese characters, as I find the Japenese culture a bit too strange. So while the game is simple, it is skillfully made.

If you ever liked playing Zelda, and you don’t mind something a bit more serious where Zelda could die, this is a game for you.

Science and Technology links (June 23rd, 2017)

Elon Musk, Making Humans a Multi-Planetary Species, New Space. June 2017, 5(2): 46-61.

Reportedly, Ikea is working on augmented reality software that would allow you to see the furniture in your home before buying it.

Current virtual reality headsets provide a good experience, but if you ever tried to read text while where one of the headsets, you may have noticed that it is quite hard. It seems that it is because the image resolution is too low. When using prototypes with very high resolution, you can “read text across the room”.

You would think that a tree that is over 200 years old would have accumulated a lot of (random) genetic mutations. However, it seems that it is not the case: as trees grow old, even very old, they preserve their genes intact. We do not know why or how.

Top speed for top-k queries

Suppose that I give you a stream of values and you want to maintain the top-k smallest or largest values. You could accumulate all these values, sort them all and keep the smallest k values.

You can do somewhat better by using a binary heap to construct a priority queue. And you can also use the QuickSelect algorithm. I compared naive implementations of both the approach based on a binary heap and on QuickSelect in a previous blog post.

Let us review the code of these techniques. I use JavaScript because it is supported everywhere.

We can sort and slice…

var a = new Array();
for (var i = 0; i < streamsize; i++) {
            a.push(rand(i));
}
a.sort(function(a, b) {
            return a - b
});
var answer = a.slice(0, k);

We can dump everything naively into a priority queue (backed by a binary heap)…

var b = new FastPriorityQueue(reverseddefaultcomparator);
for (var i = 0; i < k; i++) {
            b.add(rand(i));
}
for (i = k; i < streamsize; i++) {
            b.add(rand(i));
            b.poll();
}
var answer = b.array.slice(0, k).sort(function(a, b) {
            return a - b
});

Each time we call poll in the priority queue, we remove the largest element. Thus we maintain a list of k smallest elements.

I now want to consider less naive implementations of these ideas.

At all time, we have access to the largest element in the priority queue through the peek function. Long-time reader Kendall Willets pointed out to me that we can make good use of the peek function as follows.

var b = new FastPriorityQueue(reverseddefaultcomparator);
 for (var i = 0; i < k; i++) {
            b.add(rand(i));
 }
 for (i = k; i < streamsize; i++) {
            var x = rand(i);
            if (x < b.peek()) {
                b.add(x);
                b.poll();
            }
 }
 var answer = b.array.slice(0, k).sort(function(a, b) {
            return a - b
 });

This has the potential of reducing substantially the amount of work that the priority queue has to do.

What about the QuickSelect algorithm?

Michel Lemay pointed out to me what the Google Guava library does, which is to use a small buffer (of size k) and to run QuickSelect on this small buffer instead of the whole stream of data.

var a = new Array();
for (var i = 0; i < 2 * k; i++) {
            a.push(rand(i));
}
QuickSelect(a, k, defaultcomparator);
var eaten = 2 * k;
while (eaten < streamsize) {
            for (var i = 0; i < k; i++) {
                a[k + i] = rand(i + eaten);
            }
            QuickSelect(a, k, defaultcomparator);
            eaten += k;
}
var answer = a.slice(0, k).sort(function(a, b) {
            return a - b
});

David Eppstein described to me something slightly more complex where we use our knowledge of the pivot in the QuickSelect algorithm to only merge potential candidates.

var a = new Array();
var i = 0;
for (; i < 2 * k; i++) {
            a.push(rand(i));
}
QuickSelect(a, k, defaultcomparator);
var pivotvalue = a[k];
var locationinbuffer = k;
for (; i < streamsize; i += 1) {
            var newvalue = rand(i);
            if (newvalue < pivotvalue) {
                a[locationinbuffer++] = newvalue;
            }
            if (locationinbuffer == 2 * k) {
                QuickSelect(a, k, defaultcomparator);
                locationinbuffer = k;
            }
}
if (locationinbuffer != k) {
            QuickSelect(a, k, defaultcomparator);
}
var answer = a.slice(0, k).sort(function(a, b) {
            return a - b
});

It is not exactly what David described, but I believe that it captures the essence of his idea.

So how do these techniques fare?
I have implemented a benchmark that you can run yourself, here are my raw results:

$ node test.js
Platform: linux 4.4.0-38-generic x64
Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Node version 4.5.0, v8 version 4.5.103.37


starting dynamic queue/enqueue benchmark
FastPriorityQueue x 3,095 ops/sec ±0.15% (97 runs sampled)
FastPriorityQueue-KWillets x 18,875 ops/sec ±0.59% (93 runs sampled)
sort x 490 ops/sec ±0.37% (94 runs sampled)
QuickSelect x 8,576 ops/sec ±0.31% (97 runs sampled)
QuickSelect-naiveGoogleGuava x 6,003 ops/sec ±0.20% (98 runs sampled)
QuickSelect-naiveDavidEppstein x 7,317 ops/sec ±0.22% (98 runs sampled)

So in my particular test, the approach proposed by Kendall Willets, using a priority queue with pruning based on the peek value, is a net winner. Your results might vary, but I think that the Willets approach is attractive. It is simple, it uses little memory and it should provide consistent performance.

Exercise for the reader: I am using a random input in my benchmark. As pointed out by David Eppstein, this means that we can bound the probability that the ith value is in the top-k by min(1, k/i), so that the Willets check only fails O(k log n) times. How well would the different techniques do for nearly sorted data?

Update: As pointed out in the comments, we can further improve the Willets approach by merging the add and poll functions into a single function. The code is similar…

var b = new FastPriorityQueue(reverseddefaultcomparator);
 for (var i = 0; i < k; i++) {
                b.add(rand(i));
 }
 for (i = k; i < streamsize; i++) {
                var x = rand(i);
                if (x < b.peek()) {
                    b.replaceTop(x);
                }
 }
 var answer = b.array.slice(0, k).sort(function(a, b) {
                return a - b
 });

So what is the performance this time?

$ node test.js
Platform: linux 4.4.0-38-generic x64
Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Node version 4.5.0, v8 version 4.5.103.37


starting dynamic queue/enqueue benchmark
FastPriorityQueue x 3,137 ops/sec ±0.13% (97 runs sampled)
FastPriorityQueue-KWillets x 19,027 ops/sec ±0.37% (94 runs sampled)
FastPriorityQueue-KWillets-replaceTop x 22,200 ops/sec ±0.46% (95 runs sampled)
sort x 479 ops/sec ±0.42% (92 runs sampled)
QuickSelect x 8,589 ops/sec ±0.31% (98 runs sampled)
QuickSelect-naiveGoogleGuava x 5,661 ops/sec ±0.22% (97 runs sampled)
QuickSelect-naiveDavidEppstein x 7,320 ops/sec ±0.25% (98 runs sampled)

So we improve the performance once more!

Update 2: You can combine a binary heap and QuickSelect using introselect.

Science and Technology links (June 16th, 2017)

How much bandwidth do we have? It seems that each of our eyes has 1 megabyte per second. That’s about 100GB per day assuming you close one of your eyes and you never sleep.

I found a fascinating article on “academic urban legends” including the myth that spinach is a good source of iron:

First, it falsified an idea that I had carried with me since I was a child, that spinach is an excellent source of iron. The most striking thing, however, was that a single decimal point, misplaced 80 years ago, had affected not just myself and my now deceased parents, but also a large number of others in what we place on our table.

(Read the whole article, it is not as simple as a decimal point.)
(via John D. Cook)

There is an app for the Apple Watch called cardiogram that is taking the wearers of the watch by storm:

Cardiogram uses the Apple Watch’s built-in heart sensor to give you advice about heart health — for instance, showing when you’re unusually stressed out by checking your heart rate.

I really need to get myself one of these watches!

As we age, we tend to lose muscle mass, to get weaker. The process is not entirely well understood, but it seems that it could be mediated at least in part by a loss of activity… As we age, we exercise less and this leads to detrimental changes in our metabolism. There is some hard evidence for this theory:

(…) we find that sedentary but not active humans display an age-related decline in the mitochondrial protein that is associated with muscle loss.

(via P. D. Mangan)

China just switched on the world’s largest floating solar power plant. It is still tiny (40MW) in comparison with many conventional power plants.

You can find melatonin at your local drug store or you can order it from Amazon. Many people take it at night to help them sleep. It seems that it could prevent neurological diseases. I sleep very well, so I am not tempted to take melatonin, but if I ever have trouble sleeping, I might quickly change my mind.

WordPress (the company) is shutting down its San Francisco office because employees never show up, preferring to work from home. I tend to work mostly from home myself, and I might never leave my home, except that I have graduate students and postdoctoral fellows to meet, and I am also department chair, so I have to show up for meetings from time to time. This being said, for programming or writing, I am far more productive at home. If I were a full-time programmer, I would certainly expect my employer to let me work from home. Somewhat ironically, it is part of my job to try to entice graduate students to show up to our lab.

When your skin gets red and your body aches, you are probably suffering from inflammation. It is a crude response from our immune system that helps us fight viruses and bacterias. We can suppress inflammation using something as simple as aspirin. However, inflammation is also involved in tissue repair. A recent paper shows that a protein that drives inflammation is critical for muscle regeneration.

Perutz wrote in Nature:

The number of scientists in the biomedical field is growing exponentially (…)

Peter Rothman comments:

Contrary to some assertions, exponential advances in science and technology are not due to some fundamental law of nature but rather the exponential increase in the number of scientists doing science. (…) Biological research progress doesn’t generally scale with computer chip density although some specific areas might scale with available computational power. In other areas, you still need to spend a large number of hours in the lab actually doing the work.

Science, even medical science, is not homogeneous. Many areas of science advance at a breakneck speed, even when relatively few people are involved. Back in 2012, doing gene editing was science-fiction for most people, but today high school students can do it, affordably. Back a few years ago, advanced computing applications such as face recognition required a Ph.D. in the field and advanced mathematics… whereas today you can get it done in a few hours using free software libraries. Meanwhile, other research areas, like cancer research, remain painfully slow and expensive. What we hope is that the areas that make rapid progress can rescue the slow moving fields. This has happened time and time again in our history. That is, it would have been useless for the Roman Empire to try to build the Internet. No matter how much they would have invested, the progress would have been non-existent. However, the Roman Empire could have benefited greatly from an industrial revolution.

Senescent cells are “old” cells that won’t die. We now have the technology to kill them, albeit that’s not yet perfected to the point where your doctor can get rid of your senescent cells. What do senescent cells do? A lot of bad. A recent article shows that senescent cells drive the “fatty liver” disease.

QuickSelect versus binary heap for top-k queries

In a previous post, I considered the problem of finding the k smallest (or k largest) elements from a stream of values.

The naive approach is to collect all the values in an array, sort the array, and return the first k values. When an index is not available, that is how some relational databases reportedly solve SELECT/ORDER BY/LIMIT queries. It is also how many programmers will solve this problem.

Using a JavaScript benchmark, I showed that an approach based on a binary heap could be several times faster. That is a net win for computer science since an approach based on a binary heap has complexity O( n log k ) where n is the number of elements. Moreover the binary heap is attractive because it has an optimal storage requirement of O( k ).

Lots of people objected that we could do even better using an algorithm called QuickSelect (e.g., Rasmus Pagh). Whereas QuickSelect has far higher memory requirements than a binary heap (O( n )) and a worst complexity of O( n² ), it has a superior average case complexity of O( n ). (Note: you can combine a binary heap and QuickSelect using introselect, I do not consider this possibility.)

Finally, Anno Langen sent me a code sample in JavaScript, so I no longer any excuse not to test QuickSelect. I put together a JavaScript benchmark that you can run and review.

In this benchmark, we receive 1408 random integers and we must collect the smallest 128.

The approach using a binary heap can run about 37,000 times a second, whereas QuickSelect runs 45,000 times per second, or about 20% faster. They are both about an order of magnitude faster than the naive sort/slice approach.

For all practical purposes, 20% faster is negligible in this case. I have actually hit a sweet spot where QuickSelect and the binary heap are comparable.

What are other cases of interest?

If you only seek the top 5 elements out of an array, then the binary heap is likely to beat QuickSelect, irrespective of how many elements I have. The binary heap will fit in one or two cache lines, and the log k factor will be small.
If I keep k at a sizeable 128, but I increase substantially the size of the array, then QuickSelect will start to win big.
However, if I keep increasing the array size, the benefits of QuickSelect might start to fade. Indeed, QuickSelect will start to operate in RAM whereas the binary heap will remain in CPU cache. QuickSelect will become increasingly limited by potentially expensive cache misses.
QuickSelect still has the worst case quadratic-time scenario that could be triggered by really bad input data. The binary heap is more likely to provide consistent speed.

What else could we do?

Martin Cohen suggested a simple insertion sort on the basis that, for large streams, you are unlikely to encounter many candidates for the top-k position. This would probably work well for very long streams, but it could degenerate badly in some cases.
Michel Lemay refered to a fancier approach used by Google Guava: maintain a buffer of 2k elements initially empty. Fill it up. Once it is full, use QuickSelect on it and discard half of it. Rinse and repeat. This seems like an appealing idea and Michel testifies that it provides very good practical performance.

So I am probably going to have to return to this problem.

Follow-up blog post: Top speed for top-k queries.

Science and Technology links (June 9th, 2017)

This week, Apple told us about two new interesting pieces of technology.

On the one hand, we have ARKit. It makes it easy for developers to build and deploy “augmented reality” applications to the iPhone in your pocket. What is augmented reality? In the visual space, it is a way to add “virtual” objects on top of what you’d normally see. So, for example, a furniture company could make it easy for you to “see” the new furniture in your living room as if you owned it. That’s nice except that looking at your mobile phone is an awkward way to do augmented reality. We’d prefer to have something like “smart glasses”. It is surely coming in the near future.
On the other hand, Apple released Core ML which is a way for app developers to make use of fancy machine learning techniques without too much effort. Suppose you want to build an app that can look at people’s skin to help diagnose a disease. You could train a model using your PC, and then drag-and-drop this model into your app, and then deploy it to all new iPhones.

Suppose that we had smart watches that measure your pace, your sleep, your heart rate and, one might dream, your glucose level… combine this with Core ML and you might have an app that assesses your health and makes personalized recommendations.

It is interesting to look at the contrast between Apple and the rest of the industry. Google and Microsoft want us to use their services, Apple wants us to buy devices on which we can run apps. This leads to a different focus.

Speaking of Apple, its new iPad Pro uses a processor designed by Apple. It is faster than the Intel processor Apple uses in its cheapest laptops. Given that even the cheapest laptop from Apple is good enough for most people since its cheap laptops are about “mid-range” compared to other vendors, we can say that an iPad Pro could, in theory, replace adequately a laptop for most people, if we only cared about performance. To be fair to Intel, the Apple processor used in its iPad Pro has extra cores and it runs at a much higher clock frequency than the mobile Intel processor used in the laptops (though the laptop has more RAM). Still, we are at the point where top-of-the-line mobile phones and tablets are more powerful than mid-range laptops. It will be interesting to see how the trend goes in coming years.

Brent Smith and Greg Linden review two decades of recommender systems at Amazon.com. Do read it no matter what your scientific level is, it is worth your time. The mathematics can be safely skipped if you want.

India Launches 104 Satellites From a Single Rocket.

A professor was fascinated by the fact that cats that had their spinal cord severed at birth could still walk. After decades, he managed to get some positive results in human beings that had their spinal cord damaged. That is, paralyzed human beings can regain some control of their legs. As a researcher, he was the subject of mockery.

It turns out that if you introduce a computerized symptom monitoring system in cancer patients, they do better.

If you combine machine learning and radiology, you can make predictions about one’s health, including longevity.

Digital therapeutics is the idea that you can use software as medicine. I don’t think it is as crazy as it sounds. That you can cure a disease with a pill is not so intuitive, it is only because we are used to it that we take it for granted. I do think we may see software in the future being a strong component of medical therapies. It is much cheaper to deploy software than to distribute and monitor pills.

Dubai has robot cops.

IBM claims to have breakthrough technology regarding the design of smaller and cheaper silicon-based microprocessors.

Sugar is often thought to be bad for you, but a special type of sugar called trehalose seems to protect against cardiovascular diseases. You can find trehalose on Amazon. It is widely used in Japan, so it ought to be safe, at least in modest quantities.