The Cockroach Principle

Sunday, July 24, 2016
By dreeves

[Image: a cockroach in the kitchen]

If you spot one cockroach in your kitchen you can rest assured [1] that there are hordes of them sneaking around not making themselves noticed. Or maybe possibly it was just that one passing through, but if you see another one you’re very probably supporting a colony with the biomass of a blue whale in your kitchen walls. And, three? Forget about it. Burn your house to the ground.

It might be bad form to analogize our users to cockroaches, but we have a mildly interesting point here. If 10% of users will actually complain about something and the rest will walk away, then, in expectation, 9 users walked away before this one complained.

(And it’s probably less than 10%. Especially bad problems can be even less likely to get reported, like if something’s so broken that people figure the whole site is moribund and there’s no point speaking up. At least anecdotally it seems that people are less likely to report something like “the site won’t let me create goals”.)

In conclusion, mentally 10x or 20x user complaints.
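
To make that arithmetic concrete, here is a minimal sketch. It assumes each affected user complains independently with the same probability, which is a simplification, and the function names are made up for illustration. Under that assumption the number of users who walk away before the first complaint follows a geometric distribution with mean (1 - p) / p.

```python
def expected_silent_users(report_rate: float) -> float:
    """Expected number of affected users who walk away before one complains.

    Assumes each affected user complains independently with probability
    report_rate, so the count of silent users before the first complaint
    is geometric with mean (1 - p) / p.
    """
    if not 0 < report_rate <= 1:
        raise ValueError("report_rate must be in (0, 1]")
    return (1 - report_rate) / report_rate


def estimated_affected(complaints: int, report_rate: float) -> float:
    """Scale observed complaints up to a rough estimate of affected users."""
    if not 0 < report_rate <= 1:
        raise ValueError("report_rate must be in (0, 1]")
    return complaints / report_rate


if __name__ == "__main__":
    # Hypothetical report rates: 10% as in the post, 5% for a nastier bug.
    for rate in (0.10, 0.05):
        silent = expected_silent_users(rate)
        total = estimated_affected(1, rate)
        print(f"report rate {rate:.0%}: ~{silent:.0f} silent, ~{total:.0f} affected")
```

With a 10% report rate one complaint stands in for roughly 10 affected users, and with a 5% rate roughly 20, which is where the 10x or 20x comes from.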


 

Case study 1: Infinitely buzzing bee

A few weeks ago we found a bug that seemed especially cockroachy. You know how when Beeminder is generating your graph it goes gray and a little bee buzzes around in a figure eight until it’s ready? Well, sometimes it would get stuck and never stop doing that. And if it’s happening mostly to new users, they have no way of telling whether that’s a bug or the server just being super slow or something. So it’s highly unlikely to get reported despite causing huge inconvenience for lots of people.

The point is, when a user complains about something, think about how likely that problem is to make a user complain. The lower that likelihood, the bigger the emergency you might have on your hands.

Case study 2: Spamboxed email, or, ninja cockroaches

So now imagine a user tells you that an email you sent wound up in their spam folder. This is like spotting the kind of cockroach breed that spends decades being trained by a giant anthropomorphic rat. We may be taking the analogy way too far. The point is, how can someone complain about something that didn’t happen? It’s pure luck that they checked their spambox and knew to complain! Pretty much everyone else won’t. So email failure is about the cockroachiest bug ever.

Remember the rule of thumb about supporting a blue whale’s worth of cockroach biomass? Well after sending a mass email last week (our so-called monthly beemail) we in fact got two independent reports from users about needing to fish it out of spam.

True confession: This was probably our fault (we’re still scrambling to find out for sure) because the first 5,000 or so emails went out with broken unsubscribe links. We did fix it quickly (and retroactively) so we don’t think too many people would’ve noticed but it might not take many people marking us as spam (which of course they’ll justifiably do if our dang unsubscribe link doesn’t work!) before email providers start penalizing us.

Fingers crossed that we don’t have to burn our Mailgun account to the ground and start over with a fresh IP address for sending email. And a plaintive plea to our faithful readers: if you could search your spam folder for Beeminder email and mark it Not Spam, that could be a huge help to us.


 

Selection bias

One more admonition before we get back to debugging our email woes, since it’s related to the cockroach principle. Be careful about straw polls of your users! Selection bias means you’ve filtered out all but the sufficiently tolerant or sufficiently lucky to still be hanging around willing to tell you about your crappy software. (Also, people are way too nice.)

Don’t get us wrong, straw polls are wonderful (we do them all the time) as long as you keep that in mind. If even the people willing to take your straw poll say that something’s a problem, then you know that among your full userbase it’s a massive problem.
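
Here is a small sketch of why a straw poll understates a problem. It assumes that users who hit the problem are more likely to leave than users who don't, and every number in it is hypothetical. The function name is ours, made up for illustration.

```python
def observed_problem_rate(true_rate: float,
                          churn_if_affected: float,
                          churn_if_unaffected: float) -> float:
    """Fraction of remaining users who report the problem in a poll.

    true_rate: share of all users who hit the problem.
    churn_if_affected / churn_if_unaffected: share of each group that
    leaves before the poll (hypothetical inputs, not measured values).
    Assumes everyone still around answers honestly.
    """
    for value in (true_rate, churn_if_affected, churn_if_unaffected):
        if not 0 <= value <= 1:
            raise ValueError("all rates must be in [0, 1]")
    affected_left = true_rate * (1 - churn_if_affected)
    unaffected_left = (1 - true_rate) * (1 - churn_if_unaffected)
    remaining = affected_left + unaffected_left
    if remaining == 0:
        raise ValueError("no users remain to be polled")
    return affected_left / remaining


if __name__ == "__main__":
    # Hypothetical: 30% of users hit the bug, 60% of them leave,
    # while only 10% of unaffected users leave.
    polled = observed_problem_rate(0.30, 0.60, 0.10)
    print(f"true rate 30%, poll shows {polled:.0%}")
```

In this made-up scenario the poll sees about half the true rate, so whatever the poll does surface is best read as a lower bound.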

So, brave readers who’ve made it to the end of this blog post, how’s Beeminder been beehaving for you?

UPDATE: We’re taking a bit of flak for comparing our users to cockroaches but Frank Wouters comes to our rescue eloquently on Quora. See also the hover text on that link or @logicalFramework in the comments who points out that we could’ve gone with the more innocuous “Iceberg Principle”.


 

Footnotes

[1] Highly ironic use of “rest assured”.


  • JayDugger

    “This might be bad form to analogize our users to cockroaches, but we have a mildly interesting point here. If 10% of users will actually complain about something and the rest will walk away then, in expectation, 9 users walked before this one complained.”

    Not “might be,” but certainly is, interesting point notwithstanding. Please save this metaphor for private conversations.

  • logicalFramework

    Hmm, I feel like both positive and negative feedback suffer from the cockroach problem (or, to call it something slightly more neutral, the Iceberg Problem), because people are unlikely to spontaneously tell you when things are going fine, too, while they are more likely to angrily complain when things go poorly. Just look at Yelp for examples of that! Myself, I always ignore those pop-up “tell us how our website is doing!” surveys, except when I’ve had an amazingly bad time with the website in question, and want to yell into the void to make myself feel better.

    I think the basic problem is there is an activation barrier for feedback, and so no matter what feedback you’re getting, it’s a distorted and tiny fraction of what the Silent Majority is actually thinking.

  • http://www.nickpascucci.com Nick Pascucci

    This sounds to me like you’re missing monitoring on your service. A lot of these issues can be detected automatically:

    + Graph generation latency went up? Time to look at the service.
    + User logins dropped below average levels? Something bad happened.
    + Goal creation rates dropped? Better ping whoever’s carrying the pager.

    One of the nice things about having large numbers of users is that you can create pretty strong models in terms of average behavior and alert when things deviate.
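
As a rough illustration of the alerting idea in the comment above, here is a minimal sketch that flags a metric when it strays too far from its historical average. The metric, the sample numbers, and the three-standard-deviation threshold are all assumptions for the example, not anything Beeminder is known to use.

```python
from statistics import mean, stdev


def deviates(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Return True if latest is more than `threshold` sample standard
    deviations away from the mean of history (in either direction)."""
    if len(history) < 2:
        raise ValueError("need at least two historical data points")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical daily counts of newly created goals.
    goal_creations = [120, 131, 118, 125, 122, 129, 124]
    for today in (126, 60):
        status = "ALERT" if deviates(goal_creations, today) else "ok"
        print(f"{today} goals created today: {status}")
```

The same check works for graph generation latency or login counts. A real setup would want to account for weekly cycles and trends, which this sketch ignores.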

  • Ian

    It’s been behaving fine for me. I have that infini-bee off and on but it’s never caused any real trouble for me.