The mods are presently talking about how to deal with gatekeeping on the subreddit. We want to make a fair and precise rule for what is and what is not allowed. Your patience is appreciated.
I'm leaving the thread open as it seems you guys need to get some things off your chest.
Wait can someone tell me how to get a PhD salary with a PhD?
Sell your PhD certificate, then kill yourself once the money runs out. You will have earned a Phd salary for the rest of your life.
Hahaha. Thank you for making me genuinely laugh in the midst of this serious and kinda depressing conversation.
And by PhD salary, you’re talking about the NIH minimum, right? Isn’t it a whopping $40K now?
Be careful what you wish for.
You could do novel work that leads to publications/patents even without a PhD. The impact and value you can demonstrate in your track record define your salary. Being credited with a widely used technique that solves X problem says far more about your value than getting a PhD with a thesis/publication that no one aside from the advisor has read.
Write and patent an algorithm that saves or makes people money.
Looks like a PhD salary is about 50k these days--those crazy high-rollers! https://grants.nih.gov/grants/guide/notice-files/NOT-OD-19-036.html
That's for academia and doesn't even consider field of study (NIH grants are primarily for medical research, i.e. PhDs in medicine, biology, neurology, etc. rather than CS/Stats). Look at "Research Scientist" salaries at tech companies. Glassdoor gives most ranges as around USD$120-170k, (I actually expected more like $170-250k, maybe that job title isn't specific enough to denote a PhD requirement).
> (I actually expected more like $170-250k, maybe that job title isn't specific enough to denote a PhD requirement).
$250k is highly unrealistic as a base salary for all but an elite few with major name recognition in their field. At that level, a good chunk of comp is usually going to come in the form of stock options that do not count toward base salary.
> base salary
Who's talking about base? Why wouldn't we be talking about total comp?
Go into the pharmaceutical industry. Keep publishing in peer reviewed journals or you're going to have a tough time migrating towards that industry. Be good. The state I live in publishes all the salaries for state workers online. I saw that one statistician I knew was earning about 170k per year with tenure, and then he left for big pharma anyway - the state doesn't pay as well as the private sector.
Spend the time you would have spent on a PhD working and moving up the echelons.
Here ya go, it's as dumbfounding as it is hilarious
I mean, it does work if your goal is to increase the p-value, but that's about all it does
What the hell? I want to believe that there is a miscommunication between him and his manager because that’s more comfortable.
I just spent about 10 minutes trying to understand that question. At first I was embarrassed because I couldn’t understand what was the problem in sorting your data (not that it would make any difference, but at least it shouldn’t affect regression).
It was only after seeing the examples that I realized that people were talking about sorting X values and Y values “independently” i.e. making up new data so that any relation becomes a positive linear relation.
It never even crossed my mind that anyone could think that makes sense. It would be like trying to make a horse drink gasoline when it’s tired. Actually, that probably still makes more sense than this.
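For anyone who wants to see the effect rather than take it on faith, here is a minimal sketch in Python with NumPy. The data is made up (two unrelated random columns) and the variable names are my own; it only illustrates what sorting X and Y "independently" does to the correlation, it is not taken from the linked question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Two variables that are unrelated by construction (synthetic data).
x = rng.normal(size=200)
y = rng.normal(size=200)

# Correlation of the real (x, y) pairs: should be close to zero.
r_paired = np.corrcoef(x, y)[0, 1]

# Sort each column on its own. This throws away the original pairing
# and matches the smallest x with the smallest y, and so on.
x_sorted = np.sort(x)
y_sorted = np.sort(y)
r_sorted = np.corrcoef(x_sorted, y_sorted)[0, 1]

print(f"correlation of real pairs:      {r_paired:.3f}")
print(f"correlation after sorting each: {r_sorted:.3f}")
```

The second number comes out close to 1 even though the inputs have nothing to do with each other, because the sorted columns are no longer observations of anything; the pairing was the data.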
I needed to sigh, close my eyes, and take a few deep breaths after reading that.
Does this mean what I think it means? Literally separating your outcomes from your predictors by sorting them separately?
I think I get it but the idea is so dumbfounding that my brain is like this can’t be it, there has to be a smarter interpretation to this.
nope, it's that dumb. It was a stack question.. it's linked a couple comments above yours.
Reminds me a bit of the manager who sorts his X's and Y's separately to get a better linear regression
You just don’t appreciate that manager’s hustle at getting results, you gatekeeper /s
Honestly the top responses are almost as troubling... The only right answer here is "don't fucking do that"
I think this whole discussion is missing the far more predominant category of Data Scientists: people who have an MS or PhD in some highly specialized field but didn’t wind up continuing into academic research positions, who teach themselves coding in order to apply their probability and statistics training to more practical business applications. I count myself and every data scientist I’ve contracted with in this group, and it’s my distinct impression that the way the field got started was in fact with a few HR people taking a chance on people like this instead of straight-up business degree holders, who always had an advantage in industry but were getting overpaid relative to their skills whereas refugees from academia are a bargain because the research job market continues to suck. The true would-be gatekeepers are the other HR people who never understood this and now demand that everyone being hired for a business analytics role have a masters or PhD in computer science when the statistical training you get in almost any other advanced degree is way more important for understanding inference from data and predictive model-building.
Edit: my first gold! Thank you, kind Redditor, whoever you are...
I am currently on this track. I have a masters in physics and a job in industry where I can use minitab to supplement my learning while I teach myself python. My current position is more of an engineer/project manager role, but I've already discussed transitioning into a data science role over the next three years and my boss is supportive.
This starter pack sounds like it's made by someone just out of school who was super salty when they realized that schooling can be supplemented with job experience. Some of the engineers I work with don't have any college education. They just worked on the floor for 15 years and gained the experience they needed to become engineers. At the end of the day, education is less valuable than ability.
After their first job, many people don’t even put their schooling on their resume
I'm so tempted to just chuck out some of my 'education' from my resume at times ...
My data science team were all at one point in a PhD track for chemistry/bioinformatics
Exactly, people bitch and moan about the DS title without realizing it was never meant to be well defined, because it was historically intended as a workaround to get HR screeners to give the right people a shot based on their transferable skills
"Overfitting? Yeah bro, I know how it feels to hit the gym too hard at the start, but it'll get better."
People in Data Science are really bitter about low barriers to entry. Like any emerging and fast growing industry, those who have put in the most time (years of life) and resources (money for degrees, special certifications/trainings) are trying to erect higher barriers to entry to protect themselves.
If it were up to the “real data scientists” they would create an “American Association of Certified Data Scientists” that sets up the same sorts of barriers that we see in other established professions (teaching, medical, law, hell even hair styling).
If it were up to these guys you would need the right “pedigree” and have to jump through the right “hoops”, get all kinds of formal education, invest thousands in becoming “certified.”
Data Science is a great field because it’s growing and relatively not-established. If you have skills, show me and I’ll give you a job. No need to kiss any rings. Just prove you can play and bring value to the person paying you.
Don’t be bitter because you are having to compete with Data “plebs”. And the data “plebs” are winning and making a path for themselves. Don’t hate and moan, appreciate the hustle.
I think this division has become exacerbated by the vague definitions of what a data scientist is, at least in industry. It can be frustrating coming from an academic research background and contributing value through the design of novel network architectures, algorithms and providing fully fleshed out statistical analyses to be given the same title / role as someone who knows some SQL queries and can use scikit-learn. These differences are important to someone looking for a fulfilling role that will utilise their skills, on both sides of the fence.
Indeed, the problem does not exist everywhere and I would certainly say that almost anyone could become a data engineer or analyst with the things this “starter pack” includes, the clue is even in the name and it would be my first thought that a data engineer is someone who warehouses, wrangles and deals with data pipeline / processing whilst a scientist is someone more focussed on research etc. But a few years ago this line started to blur; beforehand the data scientist name barely existed and was really just a term for a computational statistician / machine learning research engineer.
Despite this, the majority of people still just seem like they are bitter and shooting in the dark and I welcome anyone to the field who wants to give it a go, part of what makes it great is how broad it is even if it can cause confusion. Like you say if people are finding their own success and some certain people feel it is of detriment to their own success then perhaps the problem is them and not the newcomers.
Perhaps the simplest solution to this is for people to stop valuing themselves by their job titles. Just do good work and everything else will follow.
I agree completely. I only refer to job titles as a vehicle for the expectations of what a role might encompass and the type of work it would involve. There is a lack of clarity from companies on what they expect from a data scientist / engineer / analyst etc. as the lines continue to be blurred for no real reason other than the prestige associated with a job title like you say.
Sure. We can stop once our employers stop setting pay brackets, thereby literally valuing our contributions, by job title...
I don’t think that’s a data science problem; that’s a life problem.
Upvoted you because I agree with the “let’s not have institutions gatekeeping people” argument, I think that ultimately hurts aspiring data scientists. But I do want to disagree with the “appreciate the hustle” framing of the boot camp people vs PhD math grads. You say people like the op of this post are bitter because they have to compete with data “plebs” but I’m not so sure about that. There are tiers within data science, like any field, and like any field, the more educated/qualified people will get the better roles. I don’t think boot camp people are taking jobs away from post docs, but they’re getting their own foot in the entry level door, which, you’re right, we shouldn’t prevent them from doing.
Quick edit: I do dislike the broadening of the DS term to include every SQL programmer and their mothers
I was a bit impassioned so I get what you are saying. I do agree that there are certainly tiers in the field, but when it comes to entry level, I’m sure the specialized major people are not too happy when someone who learned on YouTube landed a data science job.
Data science / analytics should all be about delivering value to the person who pays you. If you can deliver value and do what I need you to do, I don’t care if you went to a top University, went to boot camps, or taught yourself on YouTube. In fact, if there is any semblance of “training” and a “team to help develop” I’ll take the YouTube guy. Shows he’s a self-starter and willing to learn. Also will probably be able to pay him less because he’d be willing to get his foot in the door.
People are coming out of school with the pedigree expecting 70-80k for jobs that at most require easily taught ETL functions and mid level query writing with pivots, CTEs, and stored procs, then visualizing in a BI tool. I can teach this to someone in 3 months.
But yes, if the position is more strategic, more project-Analyst like, then I would want a more experienced analyst who has a more comprehensive understanding about how data flows through the org and can imagine creative solutions.
And call yourself the best data scientist west of the Mississippi if that makes you feel better inside. I’ll even get you a little trophy that says “Best Data Scientist.” I don’t care what you “consider yourself.” You're going to be an “x” for me and I need you to do “y”. Fair? (Speaking rhetorically, not at you)
If you can run a linear regression on weather and ice cream sold, you can save an ice cream store hundreds of thousands of dollars in costs. People have a really hard time understanding the fact that you don't need to be vectorizing for loops to deliver value to an organization. As long as you can save them (or make them) more than they will pay you, you can get a job in data. Not everyone has to work at OpenAI...
I agree with the general spirit of this post, but...
>Using a logistic regression to predict sales volume
Can you elaborate on this? Or is your complaint my lack of specificity?
Edit: nvm, I think you mean I should have said linear regression? My bad, can edit the post, just had logistic regression on the brain
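Since sales volume is a continuous number, linear regression is the natural fit here. A rough sketch of the weather and ice cream example in Python with NumPy, for anyone curious what it looks like in practice. Everything in it is an assumption for illustration: the data is synthetic, the "20 cones per degree" relationship is invented, and `predict_cones` is a hypothetical helper, not anything from a real store.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable

# Hypothetical data: 90 days of daily high temperature (deg C) and cones sold.
# The underlying relationship (50 + 20 per degree, plus noise) is made up.
temp_c = rng.uniform(10, 35, size=90)
cones = 50 + 20 * temp_c + rng.normal(scale=40, size=90)

# Ordinary least squares fit of a straight line: cones = intercept + slope * temp.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(temp_c, cones, deg=1)


def predict_cones(temperature_c):
    """Predicted cones sold for a given forecast temperature (hypothetical helper)."""
    return intercept + slope * temperature_c


print(f"estimated extra cones per degree: {slope:.1f}")
print(f"forecast for a 30 degree day: {predict_cones(30):.0f} cones")
```

With a forecast like that, the store can order stock and schedule staff for the expected demand, which is where the savings in the comment above would come from.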
The guy who went to a top university is more likely to have the math fundamentals and scientific method skills. That doesn't mean the bootcamp or YouTube person doesn't have them; TBF I would probably interview all 3 and pick the best one.
The problem with this assessment, that people are “bitter about low barriers to entry” and just want to create the “sorts of barriers that we see in other established professions”, is that some of the work done actually has an impact on the health and livelihood of populations, which goes beyond the “Business Analytics” listed. For example, medical students jump through hoops to become certified medical doctors (MDs) in their subspecialty. This is to ensure the integrity of the professions in medicine (VERY IMPORTANT, I CAN EXPLAIN HOW THIS IS EVEN ABUSED). MDs are required to go to seminars and courses to obtain continuing education units (CEUs) for periodic recertification, regardless of whether they own a private practice or work in a major academic hospital - this keeps them current. The academically inclined MDs publish papers in medical journals on their findings from seeing patients. This guides and pushes their profession forward, especially when new methods and treatments are discussed, and even the docs that publish need their seminars for their CEUs.
The problem is that sometimes “bad” science, or badly performed studies, can be published by doctors. An MD publishing research using statistics can do so under the guise of their prestigious title: by itself it carries clout, but it says nothing about any other educational experience. For example, I know someone (sorry, need to be vague) who is also a data scientist, an epidemiologist with a master's in public health (MPH). She worked for a company serving a subspecialty of medicine (focused on one type of disease, which the company later expanded to related disease types) that sat at the cross between academia and the pharmaceutical industry. Specifically, they owned a proprietary data set with no identifiers that was collected through the many efforts of doctors.
While pharmaceutical companies have 7 figure contracts with this company to access the results (they can’t use the data directly; the analysts from both companies work together on research), the contributing MDs have access to this service for their own publications. Often, these MDs have no idea about statistics or how to set up a proper study, so the epidemiologists and biostatisticians doing the analysis collaborate with the MDs to design a proper study and publish a peer reviewed paper. (Dirty little secret: right before conferences, the analysts coach the MDs on how to discuss the methodology behind their results, from population sample selection to the type of statistical modelling and the parameters of the model.) The MD gets first authorship on the paper (most of the credit), the other contributors and analysts also get authorship, and the data is referenced. (Another dirty little secret: it often takes a while for the docs to come up with a viable study, as the analysts do most of the leg work in figuring this thing out.) (Quick side note: I would also be curious about the integrity of the data upon reading this, since no one outside the company has direct access. Verification was in the comparison of results with other datasets in peer reviewed publications using the same methodology. That was at the beginning; now they have to adhere to FDA regs because of their work in pharmacovigilance.) So, everyone is happy and the MD has published and looks active.
Sometimes, an overreaching MD with a pedestrian knowledge of statistics and study design tries to publish in a peer reviewed journal (without the resources of a team of analysts and a huge dataset that took over 10 years to accumulate, like in the previous cases), and they succeed. In 1998, Dr. Andrew J. Wakefield, M.D. and a few others published “Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children”; they “found” that the “onset of behavioral symptoms was associated, by the parents, with measles, mumps, and rubella vaccination in eight of the 12 children,...” and they interpreted it as “developmental regression in a group of previously normal children,...” Since he was a board certified MD, and the implications of this finding were huge, the media outlets were able to jump on this and broadcast the finding, unknowingly giving credence to the fears of a much smaller anti-vaxxer movement: a bad study from a person of authority. A little over a year later, the CDC began the dismantling of A.J. Wakefield’s study using their own extensive resources from the “Vaccine Safety and Development Branch”; Frank DeStefano, MD, MPH,... led the charge. What followed were publications that poked holes in his study design, completely discrediting his work. This led to Wakefield losing his accreditation - at most these days he can publish on his blog to his followers.
But the damage was done, and here we are with countries like the US, France, and now Japan facing measles outbreaks. It did not help that celebrities like Jenny McCarthy (a mom trying to figure out why her son has autism) were used as pawns to push the similar agendas of other quack doctors with no controlled-study experience, like Jake Gordon, MD, … - now he’s another one who has changed his tune and is urgently advocating vaccinations. My point is that there is a reason why there are these educational and experience barriers to some of these professions: they carry authority in their findings. We can’t trust off-the-street data scientists to share the responsibility with actual health professionals, as it would diminish their voice. We are in the middle of observing the result of the misuse of this authority (measles outbreak).
Aside from the cost, the education of an analyst with an MPH often requires at least one publication in a peer-reviewed journal and a thesis paper.
Although, that does not mean that there is no room for a self-made data scientist working on projects in relation to those fields.
Also: An epidemiologist/data analyst bouncing around subspecialties can show their worth with their publications in peer reviewed journals
A data analyst or mathematician with a PhD in statistics can prove their worth in those subspecialty fields by publishing results in peer reviewed journals - a body of work on a CV works as credentials.
Entry into marketing / business analytics is not as strict as entry into healthcare policy and the pharmaceutical industry, which involve study design and pharmacovigilance of drugs out in the market. Those bitter people might be upset that they’re working alongside others with non-conventional career paths in business, but in the pharmaceutical industry that is rare, and the pay reflects high demand for people with expensive educational credentials and experience (for example, that epidemiologist screwed up on negotiation and was getting underpaid at $250k per year + annual bonuses that were often between $50k-100k).
I really take your overall points, which I see as: MDs are not statistical experts, and sometimes a little knowledge and a lot of motivation can lead to disastrous results.
All that said. No amount of education or experience makes someone immune (see what I did there) to making mistakes in research, or the desire to show positive results in business. The most experienced data scientist or business analyst will still have pressure to perform and deliver certain results. Anyone can make slight modifications to their practices in order to increase their apparent predictive ability.
That said, I think the more you understand something, the more you should be able to see your own fallacies.
The thing is, the start of an analysis is usually like a rough draft. There is the data, you start to consider different parameters and the type of modelling, and eventually it's polished. Some of these take weeks or even months to finish, especially when there are a few being done concurrently. In this line of work there is a lot of collaboration and back-and-forth. For stuff like pharmacovigilance, these pharma companies are paying too much for mistakes to be made - a mistake can be really bad, as it could lead to multiple fatalities.
I found this for you. NIH PHARMACOVIGILANCE A challenge is flagging events with data mining. It's a huge thing with pharma companies. This identifies some of the challenges for a drug company's tracking of their product use.
My point with MDs, and even epidemiologists that work at the CDC, is best said by Uncle Ben: "With power comes great responsibility." Their positions carry authority, and that is the reason for an educational and experience barrier. No one is error proof; that is why these collaborations take a while, and where something hasn't been tested, results would not be published. They are thorough. Sometimes there are new things in their data that haven't been tested, but they make sure what they publish is correct. A boss of mine once lamented about what is being taught at schools: "they teach you that 90-95% is great, but that means you're fucking up 5-10% of the time." It took 1 bad paper to catalyze the anti-vax movement, leading to outbreaks. On a different topic, but same point - it takes 1 terrorist attack to slip through in the US and the damage is done: people are afraid and mourning, and death gets plastered all over.
Pharmacovigilance
Pharmacovigilance (PV or PhV), also known as drug safety, is the pharmacological science relating to the collection, detection, assessment, monitoring, and prevention of adverse effects with pharmaceutical products. The etymological roots for the word "pharmacovigilance" are: pharmakon (Greek for drug) and vigilare (Latin for to keep watch). As such, pharmacovigilance heavily focuses on adverse drug reactions, or ADRs, which are defined as any response to a drug which is noxious and unintended, including lack of efficacy (the condition that this definition only applies with the doses normally used for the prophylaxis, diagnosis or therapy of disease, or for the modification of physiological disorder function was excluded with the latest amendment of the applicable legislation). Medication errors such as overdose, and misuse and abuse of a drug as well as drug exposure during pregnancy and breastfeeding, are also of interest, even without an adverse event, because they may result in an adverse drug reaction. Information received from patients and healthcare providers via pharmacovigilance agreements (PVAs), as well as other sources such as the medical literature, plays a critical role in providing the data necessary for pharmacovigilance to take place.
To put it another way, accreditation is not gatekeeping. Or maybe it is, but it's good gatekeeping
I'd say it's more like accreditation can be good gatekeeping, but it isn't always.
Nah, I wish I could check each of a barber's test grades before getting a cut.
That cert the state makes them display at their booth gives me the confidence I need.
100% agree. I also think criticising one’s credentials fails as a valid argument about one’s ability. Even within the well-established DS community (e.g. Gary Marcus vs Yann LeCun) this is apparent.
I always thought one of the strengths of the industry or field is that it accepts people from various degrees, thus making it more diverse in the sense of viewpoints and perspectives. Seems like the industry has to self-correct or create its own balance between accepting people from diverse fields (like medical, stat, math, engineering, heck even psychology or sociology) without being too inaccessible.
I think the issue is less about gatekeeping and more about how data / business analysts present themselves. Every analyst is referring to themselves as a data scientist and I think that’s where the harm is, not that there are bootcamp people trying to get in the field. By all means, take the jobs you qualify for with your skills, but for God’s sake please stop calling yourself a data scientist because you work in Excel all day.
On the other hand, it would be nice if places like Stack would allow the option to display Data Analyst as a title because I do not have a background in math. I enjoy being an analyst.
If everyone’s a data scientist then it doesn’t really mean much. I think it takes away from the profession and field if it’s oversaturated with people who are not truly data scientists. Also, if companies are just starting to build out a data science team and hire an analyst thinking they can do real data science, that’s going to negatively impact that company.
People who have a PhD are not really looking for Data Science jobs; they are either in academia, doing some kind of research in industry, or at least looking for an actual research job. The PhDs and the data "plebs" are not really competing for the same jobs, so I don't think they are the ones who are bitter. I think it's the slightly more experienced data "plebs" that are bitter.
“...slightly more experienced data “plebs” that are bitter.”
Yeah, and there’s a helluva lot more of them than PhDs.
IMHO, some people are bitter. Some of them are PhDs, some slightly more experienced plebes, and some are newbs. I’m not sure if the bitterness is caused by experience or degree; it’s a temperament.
Given the frequency of each of these classes, I think that the most common bitterness comes from the more experienced data plebs, simply based on their prevalence in the population.
The best data scientists I know don’t have that chip on their shoulders. They’re just excited about this stuff.
People who have a relevant PhD*
People that realize there really isn't a job market for their field except becoming a high school/community college teacher or slaving away as a post-doc on noodles for 10 more years and hoping for tenure track. These people flock to data science because they did some matlab/SPSS/R/numpy work and think they're better than anyone else, and quite frankly there's nothing else they could do.
People with a relevant PhD which is basically applied statistics or computer science don't really go for data science jobs. It's beneath them and a waste of their knowledge to clean data or do set up pipelines. You're far more likely to find them in management positions or something highly specialized such as machine learning engineer positions.
If you look at companies with big data science teams, they're filled with PhD's from fields that are barely relevant and people with software developer backgrounds. Computer science PhD's and applied statistics PhD's are usually absent because they're not called data scientists to distinguish them.
For some reason people think having a PhD instantly makes you qualified. It doesn't. Which is why it's getting harder and harder to get your foot in the door in this field. 5-6 years ago you got a job when you could do basic hypothesis testing and today you'll have to pass the same coding interviews as every other technical employee.
The quality of data scientists skyrockets once you start testing their ability to code well. 99.99% of data science work does not require anything beyond those 2-3 courses on coursera and it's easier to teach a software developer to do data science (they already have linear algebra, statistics, calculus, information theory as part of their education) than to teach someone else how to write code.
If you're thinking of becoming a data scientist, spend 90% of your time just doing programming courses and your computer science fundamentals, and do those first. You learn by doing and the only way to learn data science is to write code. If you're not proficient at writing code, you'll be spending most of your time making mistakes and trying to figure out basic programming stuff instead of learning what the course is about. It's like signing up for an ice hockey course when you can't even skate.
WTF? Lol a PhD in mathematics can get a good damn job homie in just about any Quantitative field where there are actual barriers to entry! ...And odds are they'll probably be suited at designing an actual Algorithm!
It takes dedication to get a PhD and passion, don't hate and moan, appreciate the hustle ....lol
Yeah I really don't understand all the gatekeeping and "no true scotsman" (or I guess "no true data scientist") fallacy here.
There's no such thing as a true data scientist. It's a vague title, and that's part of the problem.
Somewhere in this debate is the practicality of the data engineer vs the perfection of the data scientist who misses the critical practical inputs. Not a trivial issue.
Online bootcamps and courses are great resources to learn data science and machine learning.
Coursera has courses taught by Andrew Ng and Geoffrey Hinton. Their data science specialization is taught by JHU. Udacity's courses are taught by Georgia Tech and Google.
Aside from going over the applied aspects, they go in depth into all of the math in a very rigorous manner. Ng and Hinton's courses have you build many algorithms from scratch in matlab so you can understand it more intimately. The JHU courses include several weeks of courses on statistical inference and regression models.
The courses break the concepts down into digestible videos that you can watch at your own pace and quiz yourself for understanding.
The issue with bootcamps is that any doofus can take it and complete it to get the certificate. But like people who sit through courses and cram the night before the exam to pass the classes, most people who complete the courses don't have the rigor. With a real degree from an accredited university, at least the admissions process will weed out most of the doofuses. This is why most people think degrees are worth more than certificates.
But neither is as valuable as a portfolio of work that directly demonstrates someone's skills and knowledge. MOOCs can be a great way to obtain the skills needed to complete that portfolio of work.
I can't agree at all with the argument that MOOCs offer the same inherent value as in-person courses at a university (I'll just abbreviate this to 'university courses'), even though this is a pervasive opinion. I don't think this is exactly what you're arguing, but it seems pretty close to me, so I'm going to comment here anyways.
Yes, MOOCs will frequently cover the same material as university courses on the same subject. But to say that material is the be-all and end-all of any course is, in itself, an anti-intellectual opinion, because that forwards the view that knowledge is just a currency to be traded for material goods (data science jobs, in this case). To me, that's a fairly dismal philosophy, especially because one consequence of that worldview is a society where the appearance of knowing things becomes more important than actually knowing things.
I get that this is, arguably, the world we live in, but we don't have to like or agree with that.
Instead, I would argue that in-person courses are far better equipped to teach intangibles (not going to elaborate here because that's a really deep rabbit hole) than online courses, and that university courses which can be easily replaced with online courses are not worth teaching in the first place. Those sorts of courses, be they university courses or MOOCs, serve as nothing more than expensive, glorified textbooks or youtube tutorials.
This isn't to say that MOOCs are useless, or that people shouldn't try and learn the skills necessary for their chosen career. As you say, it's useful to be able to demonstrate your knowledge to potential employers. I'm just arguing that to equate MOOCs and university courses, one must also view knowledge as something needed primarily to make $$$, and that has some pretty unfortunate implications.
MOOCs are awesome as preparation for a graduate program in computer science or statistics. They are definitely useful if your undergraduate degree is not strongly related, since you're not rushing to learn in one semester everything you should have learned so you can process the more advanced stuff.
Yes, absolutely. To clarify, since I think a lot of people mistakenly assumed I was saying that MOOCs = bad, I was simply arguing that MOOCs are not a drop-in replacement for well-taught university courses. MOOCs can definitely be useful.
What intangibles do MOOCs not cover? There's this idea that MOOCs purely focus on the practical aspects as if it only teaches you to import keras like a code monkey. But that's simply not true.
I have a master's degree in engineering and the MOOCs covered material in similar breadth and depth as the courses at my degree.
Critical thinking, effective communication in both speaking and writing, deep reading, etc. MOOCs are not designed to teach anything beyond the material itself (speaking as a PhD student, like most graduate courses).
Those are some valuable points. A traditional university setting will often have classes with discussions and projects with presentations that can teach critical thinking and effective communication.
The closest equivalent with MOOCs would be online engagement and discussion with the mentors and with peers. Also, if someone is active in online communities, that person can get many more perspectives to critically consider than the discussions in class. While MOOCs don't have an equivalent to class presentations, you are required to write papers and proposals as part of the program and they have very well defined guidelines on how to do so effectively.
I'm surprised to see this here. A while back I asked on this subreddit what skills were required to be a data scientist and I got nothing but arrogant responses. A few good ones. So this meme just irritates me, the arrogance and egoism. Instead of putting people down, why don't you offer some advice: "How to be a good data scientist", "Skills you need to be a successful data scientist"?
Don't take it too seriously. I'm mainly poking fun at the people that chase the "data scientist" title because they think it will bring them prestige and wealth.
I agree with your sentiment. I think everyone could benefit from being kinder, more patient, and more humble, but I also hope you can understand that it can be exhausting seeing dozens of low-effort "How do I become a data scientist?" questions pop up every day. Oftentimes these questions are asked by people seeking fame and fortune instead of actual knowledge and advice, and asked with the minimum degree of effort. It's difficult to give individual advice on such a loaded question without knowing more about the asker's personal background and goals. What do you actually want to do? "Data science" isn't an answer to that question.
I think directed, genuine questions are often embraced and rewarded with directed, genuine responses on here and on other communities I participate in.
I'm mainly poking fun at the people that chase the "data scientist" title because they think it will bring them prestige and wealth.
As someone who started out as this meme (and is trying to improve), it's actually a great way to get some wealth. I'm doing the same work with a 20 percent raise and like 1 new skill required.
Lol I got a 60% raise and all I learned was Power BI and Excel more thoroughly
Find another job. I learned at my previous position and got laid off. Found a new job and I earn almost double what I was making before.
Best time to look for new work is when you're still working.
Working on it! Actually only have another week at my current job, I gave notice a couple weeks back. Going to take it easy for a bit before jumping back in.
Congrats! Highly recommend sinking yourself into a course or certification program for SQL or Python; you can make your life easier as a data analyst/scientist and get jobs that pay a lot more than what you think you should be making
Keep going though, and good luck
I'm mainly poking fun at the people that chase the "data scientist" title because they think it will bring them prestige and wealth.
Well then I'm not sure you hit the mark. Instead it just looks like you're making fun of aspiring data scientists, especially those who don't/can't jump through the traditional hoops.
And prestige? Since when is data scientist a prestigious title (outside of the DS community)? I think most people making posts about how to become a data scientist are just interested in data science and having a good career. Shocker.
Data scientist has been the 'sexiest' job for the past couple of years on various lists.
Reading the meme I can't get that interpretation; it would be hard for anyone to get that that's what you're saying. What I got from it was about title inflation/the field being crowded, and then also the fact that some people think they're a full-fledged data scientist after a MOOC while they still need more time to develop skills.
I see. Well I’m in academia and the term “data science” is new to me. We’ve been interviewing companies to get an idea of what skills are needed and it seems to be all over the place. I have a CS background so I’m trying to make the connection between data science and CS and particularly what skills a student should have to be successful. So far all I have is programming, databases and I’m thinking maybe SQL?
It's a little troubling that after interviewing companies 'stats fluency' didn't sort up to the top with 'programming' and 'databases.' I can throw code at data all day long and make all sorts of pretty visualizations, but it's meaningless if I can't justify the methods that I've used.
Key is understanding the “life cycle” of data in a company. Where does it come from? How is it stored in a warehouse? How is it “wrangled” or standardized? How is it queried from that data warehouse? How is it visualized to the end user to provide a meaningful insight?
Then have a basic knowledge of core systems/programs. After that, I just ask a new employee to be willing to learn. If they have that base knowledge, are willing to be coached, and can use google to solve code issues, you got yourself an entry-level Analyst.
Really hope that the people you're interviewing have nothing to do with HR...
SQL is essentially databases. Although it's a language, most relational databases are going to be accessed with SQL in some way. The exception is "non-relational" databases like MongoDB etc...
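To make the "SQL is essentially databases" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the `revenue_by_region` helper are made up for illustration. The only point is that the storing and querying steps happen in SQL, whatever language sits around them.

```python
import sqlite3

def revenue_by_region(rows):
    """Load (region, amount) rows into an in-memory SQLite table and total them with SQL.

    The table name and columns are hypothetical, chosen only for this example.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", rows
        )
        cur = conn.execute(
            "SELECT region, SUM(amount) FROM orders "
            "GROUP BY region ORDER BY region"
        )
        return cur.fetchall()
    finally:
        conn.close()

# Made-up sample data
sample = [("east", 10.0), ("west", 5.0), ("east", 2.5)]
print(revenue_by_region(sample))
```

Swapping SQLite for another relational database would mostly change the connection line, not the query.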
Programming can't be generalized, it's specialized programming with a focus on statistics as already mentioned. Things like knowing when to use stochastic methods versus neural networks...when does a problem actually warrant complex analysis versus being solvable by simple regression...
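One hedged way to read "solvable by simple regression": fit a straight line first and see how much of the variance it explains before reaching for anything heavier. The sketch below is plain ordinary least squares for one predictor. The function names and the sample numbers are my own, not from any particular library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line.

    Assumes ys are not all identical (otherwise the total variance is zero).
    """
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data that happens to lie exactly on a line
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

If the line already explains most of the variance, that is at least a hint the problem doesn't need a neural network. Where to draw that line is a judgment call, which is the statistics part of the job.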
Data Analytics is what most companies need...Data science is needed for industrial scale data flows. For instance GE uses Predix to help analyze digital twins of some machines. And then machine learning to detect patterns in that huge amount of data which can be investigated for improving performance or energy yield. Honestly it could even be argued that isn't so much Data Science as it is Big Data Analytics...
If you want a list of skills, just take a look at these profiles and their skills: https://www.linkedin.com/in/vineetvashishta https://www.linkedin.com/in/dpatil/
I know academia loves their interviews and formal ways of collecting data...but truth is data science is definitely hyped. And so people who will be willing to interview with you are going to more often be people who want to be popular. People who are doing a lot of real impactful work aren't going to be the first ones you get for interviews.
True, check the post I made today. Do you have any suggestions on how to be a good DS, though? If you got your answers, let me know. I'm currently in 3rd year CS Engineering.
data scientists were stats nerds before facebook created the title and now they all want to be seen as the uber mensch because they grok regression.
Instead of putting people down, why don't you offer some advice: "How to be a good data scientist", "Skills you need to be a successful data scientist"?
I think it’s because the questions wear people down.
It goes something like “how can I get a good foundation in data science”
A person gives a plan that lasts a year
Then another or the same person says that is too long how can they do the same thing in half a year
Then another or the same person says that is too long how can they do the same thing in three months
Then another or the same person says that is too long how can they do the same thing in three weeks
I have a non-STEM bachelor's. I have taken two statistics classes (one involving programming). I have created a decent full-stack JS app with modern frameworks. And I've messed around with Python's statistics frameworks.
Now I am reading books on Keras (built on TensorFlow) & OpenCV for image classification. It really does not seem very complex, at this level, assuming even a basic background in programming & statistics.
My basic background in statistics (literally just two courses) is handy, as is my programming experience. But this idea that one needs a PhD or even a masters is just silly.
I agree with the 'people trying to raise barriers to entry to secure their positions'. Same thing in software engineering. I really don't see a CS degree as being necessary unless a person is trying to develop some new algorithms or something. I am much more interested in business applications-- i.e. using what frameworks have been developed, creating a product, and selling/marketing it.
If I did study in the future, I think I'd study electronics or electrical engineering a bit, just because I want to learn more about circuitry and renewable energy electrical systems at an in-depth level.
Programming is easy enough to pick up. Statistics is pretty esoteric and less practical than it seems in theory (although I understand why government agencies prefer PhD statisticians-- they need credibility & assurance). The combination of the two is awesome though (ML). I can really see how it will revolutionize specific industries which rely on certain types of data (audio, video, imagery, and others, such as chemical analysis or physiological-metric based data)
My basic background in statistics (literally just two courses) is handy, as is my programming experience. But this idea that one needs a PhD or even a masters is just silly.
I think you're right at this moment. But part of the reason for that is a bunch of PhD level people (I'm assuming) already created the basic statistics that all of this is based on. Correctly applying those principles takes intelligence, but not 5 years of math. I think it's entirely possible that in 5 years, employing these techniques will be relatively commonplace. The top paying gigs will go to people who are truly experts in maximizing results from data sets and programmers who can streamline processes to make them run faster for end users.
Here it comes again, the underlying contempt for Analysts.
How do us heathens dare to touch the grandmasters' precious data and try learning some of their tools? How dare we come up with quick practical solutions to fix a business problem although we haven't spent 10 years studying quantum physics.
Heresy!
This is a symptom of academics and PhDs. I honestly hate when a company I'm working with hires a PhD with no prior work experience into a senior position - they have no understanding that you need something that works in weeks, not something that is basically a peer-reviewed solution two years down the line.
BS comp sci / MS business analytics. I thought it was actually a really good background, but yeah... some of those grad students who didn't have technical undergrad degrees seemed a little lost.
Am I the only person on planet earth who studies a business degree and also gets taught R, SQL, Python, Java etc? What good is a business degree in current times without the means to analyze or automate it.
You'd be surprised at how little coding some schools teach. I go to a small private school in Texas and literally the only statistics software that I was exposed to in undergrad was Minitab. This is as a Finance major too, so I had to take additional stats courses for my undergrad degree. I think R or Python should be a prerequisite to advanced stats, but that requires professors to learn the languages as well.
I’ve been accepted to several MSc in Applied Statistics programs and I find myself very frustrated with this. Loyola Chicago has one SAS course, mandatory for first years. The course is called “Statistical Computing” – yeah, right. Penn State is in love with Minitab (they invented it, after all). My undergrad used SPSS! It shouldn’t be as hard as it is to find a program that emphasizes R and Python, but a lot of programs seem stuck in the past. Boston University uses R and Python, mostly–hoping I hear some good news from them.