Welcome to Florence's Data Scientist Day!
Please make sure to register for the event here if you haven’t done so already.
Thank you for joining us as we showcase the talent of our Florence cohort! We’re so proud of all that these graduates have achieved over the last 6 months. Since beginning their journeys into data science on March 15th, 2021, they’ve gained hands-on experience in Applied Statistics, SQL, Python, Pandas, Matplotlib, Machine Learning, Natural Language Processing, Data Storytelling, Git, Jupyter Notebooks, Tableau, and Seaborn. They put all of these skills to use to develop their capstones: end-to-end data science projects with actionable insights. As you watch, you can learn more about each grad by clicking their headshot to visit their Alumni Portal profile.
The event will start promptly at 2:00 pm in the video below! If the video doesn’t auto-play for you, please click the play button.
Project Danger Zone
Houston, We Have A Pay Gap
Although salary is a touchy subject, salary figures are a critical data point for organizations that value diversity, equity, and inclusion. Using data acquired from The Texas Tribune, our goal is to create a regression model that predicts a government employee’s annual salary based on demographic information. In doing so, we will provide a methodology for companies and organizations that seek to analyze their own salary data. Possible outcomes of this type of analysis include eliminating arbitrary biases in salary decisions, improving employee satisfaction, and reducing turnover.
Tech Blues: Mental Health in Tech
This project uses mental health survey data from Open Sourcing Mental Illness to identify drivers of mental health issues in the tech workplace. We followed general data science best practices to explore and analyze the survey results. We then used our findings to provide actionable business solutions that help employers improve the mental well-being and quality of life of their employees.
Attention Walmart Shoppers
The goal of our capstone project was to combine historical Walmart sales data and macroeconomic indicators in a predictive model that would enable a stakeholder to accurately forecast sales demand one week in advance. We converted the time series data into a regression problem and identified key weekly sales drivers through feature manipulation and engineering. We constructed multiple machine learning models, but ultimately found that our Polynomial Regression model, which used Recursive Feature Elimination, performed the best. Measured by RMSE, it outperformed our Baseline (predictions equal to last year’s sales numbers) by roughly 29%, with an R-squared of 0.987.
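For readers curious how a time series can be reframed as a regression problem and how a model can be scored against a baseline using RMSE, here is a minimal sketch. It is not the team's actual code: the function names (`make_lagged_rows`, `rmse`, `improvement_over_baseline`) are hypothetical, and the sales figures are invented purely for illustration.

```python
import math


def make_lagged_rows(series, n_lags):
    """Turn a series into (features, target) pairs for regression.

    Each target is one observation; its features are the n_lags
    observations that immediately precede it.
    """
    return [
        (series[i - n_lags:i], series[i])
        for i in range(n_lags, len(series))
    ]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_over_baseline(actual, baseline_pred, model_pred):
    """Fractional reduction in RMSE of the model relative to the baseline."""
    base = rmse(actual, baseline_pred)
    return (base - rmse(actual, model_pred)) / base


# Invented weekly sales figures (illustration only, not project data).
actual = [100.0, 120.0, 90.0, 110.0]
last_year = [90.0, 130.0, 100.0, 100.0]    # baseline: same week last year
model_pred = [95.0, 125.0, 95.0, 105.0]    # hypothetical model output

rows = make_lagged_rows(actual, n_lags=2)
gain = improvement_over_baseline(actual, last_year, model_pred)
print(f"Lagged rows: {rows}")
print(f"RMSE improvement over baseline: {gain:.0%}")
```

In this framing, a reported improvement of roughly 29% would mean the model's RMSE is about 29% lower than the RMSE obtained by reusing last year's sales as the prediction.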
Impressed? Start hiring today!
Need some guidance? We can help you find the best fit for your team, for free!
Email email@example.com and you’ll have a new data scientist in no time!