Harvard Business Review named Data Scientist the “Sexiest Job of the 21st Century,” Glassdoor ranked it the #1 Best Job in America, and LinkedIn placed it among the top 10 most in-demand skills. But to many people outside of tech, the term data science doesn’t mean much. Welcome to Codeup Data Science, where we teach you the truth beneath the hype. You might not know it, but you interact with data and the outputs of data science every day!
First of all, data science is a method of producing actionable intelligence from data using math, statistics, programming, and business expertise. Like any scientific method, it involves gathering data, identifying a problem, forming a hypothesis, and running tests. More specifically, data scientists follow a process of gathering and cleaning data (wrangling), investigation (exploratory data analysis), building automation using machine learning (feature engineering, model development, and deployment), delivering results (visualizations, reporting, storytelling), and maintenance. Practitioners typically spend 70-80% of their time on wrangling and exploration, about 20% on machine learning models, and whatever remains on maintenance. Most importantly, this whole process should result in a valuable action or insight for the end user, i.e., a business or customer!
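To make that process a little more concrete, here is a minimal sketch in Python of the wrangle, explore, and model steps on a tiny data set. Everything in it is an assumption made for illustration: the revenue figures are invented, the function names (wrangle, explore, fit_line, predict) are hypothetical, and a straight-line fit stands in for the far richer models used in practice.

```python
from statistics import mean

# Hypothetical raw records of (month, revenue). The values are made up for
# illustration and include the kind of mess real data has: a missing entry
# and stray whitespace.
raw = [(1, "100"), (2, "120"), (3, None), (4, "160"), (5, " 180 ")]


def wrangle(records):
    """Wrangling: drop rows with missing revenue and convert text to numbers."""
    return [(month, float(rev.strip())) for month, rev in records if rev is not None]


def explore(rows):
    """Exploration: summarize the cleaned data before modeling it."""
    revenues = [rev for _, rev in rows]
    return {
        "count": len(revenues),
        "mean": mean(revenues),
        "min": min(revenues),
        "max": max(revenues),
    }


def fit_line(rows):
    """Modeling: fit revenue = slope * month + intercept by least squares."""
    xs = [month for month, _ in rows]
    ys = [rev for _, rev in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, month):
    """Delivering results: turn the fitted line into a forecast."""
    slope, intercept = model
    return slope * month + intercept


if __name__ == "__main__":
    rows = wrangle(raw)
    print(explore(rows))
    print(predict(fit_line(rows), 6))
```

Even in this toy version, most of the care goes into cleaning the records before any modeling happens, which mirrors how practitioners spend their time.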
At its core, data is just information: names, dates, times, dollar amounts, etc. Data scientists work with large collections of this information to draw conclusions. For example, they might use financial data to predict the seasonality of revenue generation or use application events (like logins, clicks, or downloads) to detect security threats or fraud. Data science typically deals with ‘Big Data,’ which is too large and complex to manage on a local computer (see 4 Vs of Big Data for more). People interact with and create data like this every day: using smartphones, buying houses, rating movies, and more. You can thank data scientists (and the teams supporting them) for guiding you to your favorite Netflix series and helping optimize your workouts.
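As a rough illustration of using application events to spot a possible security threat, the sketch below counts failed logins per user and flags anyone at or above a threshold. The event log, the "login_failed" label, the flag_suspicious function, and the threshold of 3 are all assumptions invented for this example; real fraud detection relies on far more data and more sophisticated models.

```python
from collections import Counter

# Hypothetical event log of (user, event_type) pairs, made up for illustration.
events = [
    ("alice", "login_ok"),
    ("bob", "login_failed"),
    ("bob", "login_failed"),
    ("carol", "login_failed"),
    ("bob", "login_failed"),
    ("carol", "login_ok"),
    ("bob", "login_failed"),
]


def flag_suspicious(events, max_failures=3):
    """Return users with at least `max_failures` failed logins, sorted by name."""
    failures = Counter(user for user, kind in events if kind == "login_failed")
    return sorted(user for user, count in failures.items() if count >= max_failures)


if __name__ == "__main__":
    print(flag_suspicious(events))
```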
There’s always more than one way to eat an Oreo, and data scientists use dozens of different tools. At Codeup, data science tools include Excel, Python (programming language), SQL (databases), Hadoop (distributed data), Jupyter Notebooks (virtual notebooks for doing data science), and Tableau (visualizations). On top of that, data scientists draw on concepts from math, stats, and programming to write functions, create charts and graphs, and model patterns.
Data science isn’t magic, and it isn’t rocket science either. At Codeup, data science is a skill set just like web development. Demand for this discipline is growing rapidly across all industries in San Antonio, and employers are looking for new talent. If you’re hungry to join the data revolution, reach out to us today!