Welcome to Florence's Data Scientist Day!

Please make sure to register for the event here if you haven’t done so already.

Thank you for joining us as we showcase the talent of our Florence cohort! We’re so proud of all that these graduates have achieved over the last 6 months. Since beginning their journeys into data science on March 15th, 2021, they’ve gained hands-on experience in Applied Statistics, SQL, Python, Pandas, Matplotlib, Machine Learning, Natural Language Processing, Data Storytelling, Git, Jupyter Notebooks, Tableau, and Seaborn. They put all of these skills to use to develop their capstone: an end-to-end data science project with actionable insights. As you watch, you can learn more about each grad by clicking their headshot to visit their Alumni Portal profile.

The event will start promptly at 2:00 pm in the video below! If the video doesn’t auto-play for you, please click the play button.


Project Danger Zone

In 2020, Bexar County had nearly 50,000 motor vehicle collisions, with a reported 16,780 injuries and 200 deaths. Using vehicle crash statistics for 2021, Project Danger Zone is working to discover the drivers of increased injury rates among motorists, with an eye toward providing insights and recommending data-driven action to appropriate agencies, such as TxDOT, DPS, and local governments, in an effort to minimize loss of life and injury and save taxpayer dollars.

Houston, We Have A Pay Gap


Despite being a touchy subject, salary figures are a critical data point for organizations that value diversity, equity, and inclusion. Using data acquired from The Texas Tribune, our goal is to create a regression model that predicts a government employee’s annual salary based on demographic information. In doing so, we will provide a methodology for companies and organizations that seek to analyze their own salary data. Possible outcomes of this type of analysis include eliminating arbitrary biases in salary decisions, improving employee satisfaction, and reducing turnover.

Tech Blues: Mental Health in Tech

This project uses mental health survey data from Open Sourcing Mental Illness to find drivers for mental health issues in the tech workplace. We used general best practices for data science to explore and analyze survey results. We then used our results to provide actionable business solutions for employers to help increase the mental well-being and quality of life of their employees.

Attention Walmart Shoppers

The goal of our capstone project was to turn historical Walmart sales data and macroeconomic indicators into a predictive model that would enable a stakeholder to accurately forecast sales demand one week in advance. In our project, we converted the time series data into a regression problem and identified key weekly sales drivers through feature engineering. We constructed multiple machine learning models, but ultimately found that our Polynomial Regression model, which used Recursive Feature Elimination to select features, performed the best. Based on our RMSE scores, we outperformed our Baseline (predictions equal to last year’s sales numbers) by roughly 29%, with an R-squared of 0.987.
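
For readers curious how a time series can be framed as a regression problem and scored against a baseline, here is a minimal sketch in plain Python. It is not the team's code: the function names, the lag choices, and the sample numbers are all hypothetical, and it assumes the "roughly 29%" figure means a relative reduction in RMSE. It only shows the general idea of building lagged features and comparing RMSE scores, and leaves out the polynomial model and feature elimination.

```python
from math import sqrt


def make_lagged_rows(series, lags):
    """Turn a weekly series into (features, target) rows.

    Each row's features are the values observed `lag` weeks before
    the target week, so an ordinary regressor can be fit on them.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        features = [series[t - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def improvement_over_baseline(baseline_rmse, model_rmse):
    """Fractional RMSE reduction relative to the baseline (0.29 means 29%)."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Made-up weekly sales, for illustration only (not Walmart data).
weekly_sales = [100.0, 110.0, 105.0, 120.0, 115.0, 130.0]

# Features: sales one and two weeks before each target week.
rows = make_lagged_rows(weekly_sales, lags=[1, 2])
```

With a long enough history, a single 52-week lag would give the same values a "last year's sales" baseline predicts, which is one way to score such a baseline with the same `rmse` function as the model.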

Impressed? Start hiring today!

Found a great candidate to interview? Let us know!
Need some guidance? We can help you find the best fit for your team, for free!
Email partners@codeup.com and you’ll have a new data scientist in no time!