Dive Deeper Into Data Science

Check out our curriculum below!

PROGRAM DESCRIPTION
Codeup Data Science is an 18-week, immersive career accelerator that prepares individuals for entry-level and mid-level jobs as data scientists, data analysts, data engineers, or other data-related positions. Students will learn to collect, clean, analyze, model, and communicate data using mathematics, statistics, and programming through a project-based learning method. Codeup will assist with job placement to help graduates find work within 6 months, or we’ll refund 50% of tuition.

JOB TITLES
Data scientist, data analyst, data engineer, business intelligence analyst, data architect, machine learning engineer, business analyst, etc.

MODULE DESCRIPTIONS

DS-1 FUNDAMENTALS 
(Overview of Data Science) (hrs: 16 lec/17 lab) – Students will learn the fundamentals of data science, including the data science pipeline, machine learning methodologies, and the types of questions answered by data scientists. They will be introduced to the tools data scientists use to achieve their diverse goals. These include statistical and mathematical concepts, Python, Jupyter Notebooks, the Linux/macOS terminal, SQL, Microsoft Excel & Google Sheets, Tableau, Git, and Hadoop technologies. Prerequisite: Admission to program.

DS-2 STATISTICS 
(Fundamentals of Applied Statistics) (hrs: 6 lec/7 lab) – Students will learn the fundamentals of applied statistics in data science using Excel, including measures of central tendency, tests of significance, common distributions, and variable independence. Prerequisite: DS-1

DS-3 SQL 
(Querying Structured Data Using SQL) (hrs: 17 lec/17 lab) – Students will learn to use a relational database management system, such as MySQL, to gather, parse, clean, aggregate, and store data. They will learn skills such as reading data from SQL databases; performing basic joins, aggregates, and group-bys; writing tables; exporting data; and exploring database structure and schema. Prerequisite: DS-2

DS-4 PYTHON 
(Python for Data Science) (hrs: 33 lec/34 lab) – Python is a programming language often used for statistical and scientific computing. Students will learn how to use Python to achieve data science goals, including data wrangling, data analysis, data visualization, and machine learning, using packages such as pandas, NumPy, SciPy, scikit-learn, and Matplotlib. Prerequisite: DS-3

DS-5 REGRESSION 
(Supervised Machine Learning – Regression) (hrs: 16 lec/17 lab) – Students will be introduced to various regression algorithms and learn why and when to use them. They will learn how to develop a regression model for predicting numerical values. They will build on their SQL knowledge to gather and prepare structured data stored in a relational database. They will then use pandas to parse and further prepare the data. They will use the statsmodels package to analyze the data, Matplotlib to visualize it, and scikit-learn to model it. In doing this, they will learn skills like indexing, selecting, plotting, and linear regression. Students will also learn methods for evaluating the performance of regression models. They will deliver a final report and model predictions from a practical use case in this module. Prerequisite: DS-4

DS-6 CLASSIFICATION
(Supervised Machine Learning – Classification) (hrs: 20 lec/20 lab) – Students will be introduced to various classification algorithms and learn why and when to use them. They will gather data from multiple source types, including CSV and SQL. They will learn how to develop a classification model for predicting categorical events. They will build on their pandas knowledge with topics such as grouping and aggregation, computational tools, text data, missing data, and DataFrame objects. Building on their Matplotlib knowledge, they will learn intermediate skills related to plotting and data visualization. They will continue to use scikit-learn, focusing on the classification algorithms in this module, to develop models. They will also learn methods for evaluating the performance of classification models. They will deliver a final report and model predictions from a practical use case in this module. Prerequisite: DS-5

DS-7 CLUSTERING
(Unsupervised Machine Learning – Clustering) (hrs: 20 lec/20 lab) – Students will be introduced to various clustering algorithms and learn why and when to use them. They will learn how to use clustering methods to identify similar groups using Python (scikit-learn). They will learn how these clusters can then be used for further analysis and modeling. Prerequisite: DS-6

DS-8 TIME SERIES ANALYSIS
(Supervised Machine Learning – Time Series) (hrs: 17 lec/17 lab) – Students will learn why, when, and how to employ forecasting methods for predicting events over time. They will work through a practical application with raw data to learn how to develop a model, evaluate it, and improve its performance. They will understand how time series models differ from other regression models, including the concept of time series dependency, accounting for seasonality, and how to effectively split the data into training and test sets. They will develop a time series model using Python and its supporting packages and deliver a final report, a model, and predictions. Prerequisite: DS-7

DS-9 ANOMALY DETECTION
(Detecting Anomalies using Stats & Machine Learning) (hrs: 10 lec/10 lab) – Students will learn methods for detecting rare or anomalous events. They will learn how to account for class imbalance and will study various statistical and machine learning methods for detecting anomalies. They will practice building an anomaly detection model using Python. Also in this module, students will be introduced to streaming and unstructured data (e.g., logs), regular expressions, and the applicability of data science to cybersecurity. Prerequisite: DS-8

DS-10 NATURAL LANGUAGE PROCESSING 
(Foundations of NLP) (hrs: 20 lec/20 lab) – Students will learn how to access data available through a public API, such as Twitter’s, and will use it to gather text data in JSON format from the web. They will use Natural Language Processing techniques such as word2vec, TF-IDF, and n-grams to perform common tasks such as sentiment analysis and topic modeling. They will use Python’s NLTK package or an equivalent to analyze the sentiment of tweets related to a particular subject. Prerequisite: DS-9

DS-11 DISTRIBUTED MACHINE LEARNING 
(Working with Distributed Data) (hrs: 33 lec/34 lab) – Students will learn how to access distributed data from cloud platforms. They will work through the data science pipeline, from gathering data through deploying a machine learning model. They will apply some of the previously covered machine learning methodologies to distributed data using technologies such as Spark and Hive. They will understand the Hadoop framework and how to access data within it to develop data products, such as a machine learning model. Prerequisite: DS-10

DS-12 ADVANCED TOPICS
(Introductions to Less Common or Advanced Topics) (hrs: 32 lec/32 lab) – Students will be given an introduction to advanced data science topics such as model pipelines, A/B testing in machine learning, graph analysis, recommendation engines, R, deep learning, deployment of production models to the cloud, and NoSQL databases. They will learn about use cases, key concepts, and resources for diving deeper into these topics. Prerequisite: DS-11

DS-13 STORYTELLING WITH DATA
(Presentation of Data Products) (hrs: 14 lec/15 lab) – The ability to present findings is fundamental to working as a data scientist. Students will learn best practices for storytelling, visualizations, presentations, calls to action, and more. They will learn how to adapt and appeal to various types of audiences and the important factors to consider in doing so, such as the design of the visuals and the tools used to create them. Students will use tools such as the Matplotlib and Seaborn packages, JavaScript’s D3 library, R’s ggplot2 library, and Tableau. They will deliver a presentation advocating a recommendation based on findings. Prerequisite: DS-12

DS-14 DOMAIN EXPERTISE DEVELOPMENT
(Adaptable in a Fluid Field) (hrs: 3 lec/4 lab) – Data science is often defined as the intersection of programming, mathematics & statistics, and business or domain expertise. However, most data scientists will switch industries during their careers, and the ability to adapt quickly to a new domain is critical to maintaining the data science trifecta. In this module, students will learn frameworks for identifying which domain knowledge is most relevant to their work and for quickly acquiring the skills needed to start adding value. Prerequisite: DS-13

DS-15 CAREER SIMULATION & PREPARATION
(Preparing for a Successful Career) (hrs: 7 lec/8 lab) – This module is spread throughout the program and covers a variety of skills related to career preparation, including soft skills training such as team-building and communication, and career development training such as resume writing, online branding, and interviewing. We will also teach best practices in project/time management, ethics, big data architecture, and portfolio development on Kaggle, data.world, and GitHub. Prerequisite: DS-1

DS-16 CAPSTONE PROJECT
(hrs: 33 lec/34 lab) – Students will work in small teams to complete a real-world capstone project synthesizing the skills learned throughout the course.

More About the Data Science Program