By Dimitri Antoniou and Maggie Giust
Data Science, Big Data, Machine Learning, NLP, Neural Networks…these buzzwords have rapidly spread into mainstream use over the last few years. Unfortunately, definitions are varied and sources of truth are limited. Data Scientists are in fact not magical unicorn wizards who can snap their fingers and turn a business around! Today, we’ll take a cue from our favorite Mythbusters to tackle some common myths and misconceptions in the field of Data Science.
Myth #1: Data Science = Statistics
At first glance, this one doesn’t sound unreasonable. Statistics is defined as, “A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data.” That sounds a lot like our definition of Data Science: a method of drawing actionable intelligence from data.
In truth, statistics is actually one small piece of Data Science. As our Senior Data Scientist puts it, “Statistics forces us to make assumptions about the nature of the relationship between variables, the distribution of the data, etc.” In the traditional Data Science venn diagram, you’ll see that math/stats make up ⅓ of a working professional. These are tools and skills to leverage, but data science itself is about drawing intelligence from data.
Myth #2: Data Scientist = Business/Data Analyst
This one is so common that we wrote a whole post about it! These are separate and different roles within the data field. While a data scientist will often do analytics, their spectrum of work is wider. A data analyst will use structured data to create dashboards and KPIs, while a Data Scientist deals with unstructured and messy data for a range of outputs. If they’re interested, business analysts will often progress to data scientists.
Myth #3: Data Science = Data Science
This one’s tricky, because it’s impossible to either confirm or bust! The ‘myth’ is that one person or company using the term Data Science is not necessarily the same as another person or company using the same term. Depending on organizational capacity, individual experience, educational background, and many other variables, we might be using the same name for different animals.
Tl;dr: don’t assume a common understanding across hiring managers, recruiters, and practitioners. Look instead for specifics of tools, techniques, methodologies, and outputs. That being said, this one falls in the “plausible” category, because it may actually be true in some circumstances, while false in others.
Myth #4: Data Science curricula are well-defined and consistent.
We recommend checking this one out for yourself! A quick google search for bootcamps, master’s degree programs, and online courses will reveal that different organizations teach different things. There is no commonly accepted framework for teaching data science! Some focus more on the engineering, others focus more on machine learning, some think deep learning is foundational, and some prefer to use R.
Our curriculum was built through employer interviews, practitioner interviews, market research, and company partnerships. But we’re based in Texas! A bootcamp in New York might follow the same process and end up with a different syllabus. Keep in mind, whatever your learning path, that there will be gaps in your learning. The most important thing is to recognize those gaps.
Myth #5: If I want to be a data scientist, I just need to learn Python or R.
This one is common and dangerous! Just like statistics, programming languages like Python and R are tools. They’re just pieces of a larger puzzle! Knowing Python without understanding the data science pipeline is like knowing how to build a floor without having a floor plan. Of course, these are valuable technical skills that give you a leg up, but they’re second in importance to asking the right questions, knowing what tools to use when, and communicating your findings.
Still have questions? Reach out to us! We’re always here to help.