How to manage impostor syndrome in data science
What if they find out you’re clueless?
Impostor syndrome is the elephant in the data science lab. Everyone has it, no one thinks other people have it, and no one talks about it.
I’m amazed that more people don’t discuss it openly. I work at a data science mentorship startup where I probably spend 20% of my time helping data scientists overcome impostor syndrome and the self-doubt that comes with it. I’ve seen it hold back more aspiring data scientists and machine learning engineers than I can count.
I’ve learned quite a bit from the hundreds of conversations I’ve had about impostor syndrome, and from what I’ve seen, the best way to overcome it is to understand it. And that’s what this post is all about.
Symptoms and causes
The strangest thing about impostor syndrome is that it’s positively correlated with technical skill: the most capable data scientists and ML engineers I talk to are often the most self-critical.
At first I found this confusing: why do the most knowledgeable data scientists suffer from the most self-doubt?
Then I remembered the Dunning-Kruger effect (DKE), a well-studied phenomenon whereby people who suck at a task wildly overestimate their level of ability at that task.
For example, if you give people a physics test and ask them to predict how they did (their “raw score”) and how confident they are in that prediction (“item confidence”), you’ll find that the people who perform the worst tend to think they did pretty well, thank you very much.
Impostor syndrome is the reverse of the Dunning-Kruger effect. Just as incompetent people tend to overestimate their performance, competent people tend to underestimate it.
In a way, this makes sense. If you know nothing about a task, you might not know enough to know it’s hard. Think of all the armchair coaches who spend 30% of the NBA finals shouting at their TV screens about how they’d do a better job than the professional staff of a pro basketball team.
On the other hand, once you’ve developed a solid understanding of a problem, you’re more familiar with the challenges you’ll need to overcome to solve it. You can also see where the holes are in your understanding.
Which brings me to data science.
Yes, it’s worse in data science
Every day, new algorithms, new libraries, and new workflows are being invented. So the number of holes in your understanding will never be zero. In fact, since we’re creating more knowledge each day than any person could possibly learn, the holes in your understanding will keep growing forever. And the more knowledgeable you are in data science, the more likely you are to be aware of that.
Because there’s no way to avoid ignorance in data science, you have to manage it. Here’s the good news: No company has ever hired a data scientist or ML engineer because of what they didn’t know or couldn’t do. Companies hire people for what they can do.
And that’s the key to overcoming impostor syndrome.
How to think about impostor syndrome
If you feel like an impostor because you don’t know everything about data science, start by changing the way you think about what a data scientist is: it’s not about knowledge, it’s about usefulness.
If you can derive accurate insights from data, or make models that are good enough for practical purposes, then you’re useful. You aren’t an impostor. You’re a value-add. Most data scientists never touch neural networks, most don’t do any clustering, and most don’t use k-nearest neighbors (KNN) on a day-to-day basis. Not knowing how to do any of these things doesn’t prevent them from being useful.
The bottom line is that it’s okay not to know everything if you can do one thing really well. If you can, then you should define your identity as a data scientist with respect to that thing: for example, “I’m a data scientist who specializes in using tree-based models for regression,” rather than “I’m a data scientist.” You can always define your way out of impostor syndrome by taking a realistic view of the role you expect yourself to play.
What if you *are* an impostor?
This won’t be a popular thing to say, but…
Maybe you are an impostor. Not in the sense that you don’t belong in data science, but in the sense that you might not be job-ready quite yet. Being good at data science is hard, and the bar is rising every year.
The danger with self-diagnosing your impostor syndrome is that it’s an awfully tempting diagnosis: telling yourself you have impostor syndrome lets you imagine that you already have all the skills you need to succeed, and that your feelings of inadequacy aren’t justified. Sometimes that’s true. But sometimes the “I feel awfully ignorant” alarm in our heads goes off for a reason.
So if you aren’t sure whether you’re suffering from impostor syndrome or correctly diagnosing your own ignorance, a good first step is to ask someone who knows what they’re doing. That’s where mentors and informational interviews can be valuable: line up a coffee chat with a LinkedIn connection, or go to a local meetup.
And when you do, always remember that key guiding principle — it doesn’t matter what you know; all that matters is what you can do.
If you want to chat, feel free to connect with me on Twitter. I’m @jeremiecharris!