Why do you ask about garbage collection in Python specifically for data scientists?
I have read about it out of curiosity, but it's not something I would expect anyone working in DS to know.
This might be the “harmonic mean” of 2023.
The correct answer is “It’s handled automatically, so I don’t have to worry about it”
I'm curious as well since it's not something I've ever had to worry about.
Thanks for posting. Fwiw, I'm a Principal DS with 20 years experience, and I couldn't answer questions 1, 2, 3, or 13, mainly because I haven't used those particular tools all that much.
Maybe it's domain specific, but as a hiring manager some of these seem higher level than what I'd expect from a junior DS, the one about backprop in particular. Can you share any more about your subject domain?
I'll be hiring a senior DS soon so I might do the same thing and post my interview questions here too
"Describe how you'd estimate the number of gallons of paint needed to paint my house" is ridiculous for a data scientist who is already at 2-4 years of professional experience.
There's value in seeing how someone explains a topic and their thought process, but there are plenty of more relevant topics to choose from. I'd immediately think less of the interviewer if I were asked that.
Any advice in general for juniors? In terms of learning, or portfolios / projects. Do you look at portfolios like GitHub or Kaggle?
Having been on both sides of the interview table, I passed a very low percentage of interviews structured like this one, and it's not an interview I'd give to anyone.
The basic problem is that there's a near-infinite scattering of information from across data science. This interview tests whether the candidate happened to bump into similar things as the interviewer, and in similar contexts.
The results are basically random for determining actual competence. You apply to 100 jobs, get 5 offers where things line up, and take one. That's no good for anyone.
I can almost guarantee I have more data science experience than OP, and deeper knowledge.
Good questions require design, analysis, or other complex skills. These I can answer in 30 seconds with a web search, and even faster with GPT. Examples of better questions:
- Pose a question ("We'd like to understand the impact of X on Y") and have the interviewee walk through how they would answer it.
- Ask for the design of a data visualization
- Coding / algorithms screen (can the person program?)
- Etc.
What people know comes out in those. But it's not about whether people ran into a particular factoid or specific quirk anymore.
What role would you recommend a new phd grad to apply for immediately who has lots of python/SE + DS/ML expertise, some internship exp, but otherwise lacks some of the industry fundamentals like sql, tableau?
It’s kind of a weird combination of super specific things like gc (which I have never used or thought about in my 8 years coding in python) and broad questions like “classes in ML” or “volatility”. I like the way you think though, it definitely covers a wide range of topics that I’ve come across in my day to day (other than gc, still confused about that one lol). You mention monitoring but nothing regarding the actual act of putting those models in production. Have you asked any questions about that?
How does Python handle garbage collection and memory management?
Why would a data scientist--junior or not--need to tinker with the underlying C code for memory management (assuming that we're talking about the CPython distribution)? I have 12 years of experience in data science and came at it from a background other than computer science. I have no idea how Python manages its memory. I mostly use distributed systems, where the bottleneck is usually network IO, not memory or CPU.
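For anyone curious what an answer to the gc question might touch on: CPython frees most objects through reference counting, and a separate cyclic collector (the `gc` module) cleans up objects that only reference each other. Here is a minimal sketch, assuming CPython; the `Node` class and both function names are made up for illustration.

```python
import gc
import sys
import weakref


class Node:
    """Hypothetical toy object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def refcount_demo():
    """Return how much the reference count grows when a container holds the object."""
    data = []
    before = sys.getrefcount(data)
    holders = [data]  # the list now holds one extra strong reference to data
    after = sys.getrefcount(data)
    del holders
    return after - before


def cycle_demo():
    """Show that a reference cycle survives until the cyclic collector runs."""
    was_enabled = gc.isenabled()
    gc.disable()  # pause automatic collection so the demo is deterministic
    try:
        a, b = Node("a"), Node("b")
        a.partner, b.partner = b, a  # a and b now keep each other alive
        probe = weakref.ref(a)       # weak reference: does not keep a alive
        del a, b                     # refcounts never reach zero because of the cycle
        alive_before = probe() is not None
        gc.collect()                 # the cyclic collector finds and frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("extra references held:", refcount_demo())
    print("cycle alive before/after gc.collect():", cycle_demo())
```

In day-to-day DS work this mostly matters when large objects (big arrays, DataFrames) stay alive longer than expected because something still references them.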
How do you stay up to date on recent trends / how would you bring that to the table here?
Unless you're hiring for R&D positions at Google (or the like), your data scientists won't need to keep up with the latest data science trends. It's completely unnecessary.
If I were to ask you to paint my house; how would you determine how many gallons of paint to buy?
How would you describe the color of the sun to someone who is blind?
Ah yes, these kinds of inane questions. [eyeroll] You put them in the "behavioral" section. What kind of behaviors are answers to these questions supposed to exhibit (other than behaviors indicating that the candidate has 110% had it with your bullshit)?
What kind of company?
This guy just deleted everything??
Thank you for this
Thanks for sharing OP! I’d like to know about the behavioural questions if you’re open to sharing them? Are they focused on DS, or more on working in a team and communicating with stakeholders?
For sure. I will put some of those in the post a little later today :).
None DS related :)
Describe to me how you would define volatility as a metric for risk analysis?
Haha, if I'm not wrong, you are working in the financial domain, aren't you? :)
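On the volatility question itself, one common answer (not necessarily the one OP is looking for) is the standard deviation of log returns, scaled to a yearly figure. A rough sketch, assuming daily prices and 252 trading days per year; the function names and the price series are made up for illustration.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def annualized_volatility(prices, periods_per_year=252):
    """Sample standard deviation of log returns, scaled by sqrt(periods per year).

    periods_per_year=252 assumes daily prices; use 12 for monthly data, etc.
    """
    returns = log_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices to estimate volatility")
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical price series, for illustration only.
    calm = [100, 101, 100, 101, 100]
    wild = [100, 110, 100, 110, 100]
    print("calm:", annualized_volatility(calm))
    print("wild:", annualized_volatility(wild))
```

The series with bigger swings gets the higher number, which is the sense in which volatility is used as a proxy for risk. It treats upside and downside moves the same, which is a common criticism worth raising in an interview answer.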
For what kind of junior data scientist position is this? Eg. Product Analytics or Algorithms?
My roles for data scientists are never new grad; I require 2-4 YOE
Otherwise, I do hire for Data Analysts/Data Engineers as new grads
When you're looking for 2-4 YOE, what kind of experience are you looking for? Do you want people with 2-4 years of DS experience specifically, or just 2-4 years of working with data (DA, DE, researcher, etc.)? What about something like experience as a SWE?