Welcome to Curie's Data Scientist Day!
Thank you for joining us as we celebrate the graduation of our Curie cohort! We’re so proud of all that they have achieved over the last 5 months. Since beginning their journeys into data science on February 3rd, they’ve gained hands-on experience in Applied Statistics, SQL, Python, Pandas, Matplotlib, Machine Learning, Natural Language Processing, Data Storytelling, Git, Jupyter Notebooks, Tableau and Seaborn. They put all of these skills to use to develop their capstone project – a predictive model with actionable insights. As you watch, you can also learn more about each grad by clicking their name to visit their Alumni Portal profile.
Please enjoy our Virtual Red Carpet Pre-Show from 2:23 to 2:35pm as we allow time for everyone to get settled and solve any technical difficulties they might have. The event will start promptly at 2:35pm! If you’re unable to see the video, please watch from one of our social media pages, which are linked below.
We Came, We SAWS, We Conquered
This project attempts to predict the root cause of sewer overflow events in the city of San Antonio by incorporating data from the San Antonio Water System (SAWS) and the National Oceanic and Atmospheric Administration. A better understanding of these events will help the City of San Antonio equip SAWS to handle them in the future. It’s important to identify the root cause of these events as the city spends over $100 million a year correcting issues with the sewers.
This analysis incorporates data pooled from several federal and state sources to determine whether socioeconomic factors, such as poverty and access to food, can explain the prevalence of COVID-19 cases in Texas counties. The unique data set allowed the team to focus on smaller counties and help identify their vulnerability to COVID-19. It’s crucial to identify vulnerable counties before their healthcare systems are crippled by this social virus.
Data and Urban Development
This team created a machine learning model that can predict which multifamily housing markets will increase in value over the next two years. Using historical data from the U.S. Census Bureau’s building permit surveys, they’ve modeled future multifamily housing construction and identified emerging markets. As a real estate developer, being established in a market before it booms is key to maximizing returns on investment. However, accurately identifying markets that will experience significant growth usually requires decades of domain expertise. With their research, it is now possible to know with a high degree of accuracy which markets will turn a profit over the next two years. Additionally, this knowledge will help stakeholders mitigate losses and opportunity costs.
This machine learning model will predict the probability of survival in the Intensive Care Unit (ICU). With global data on over 130,000 hospital ICU visits derived from MIT’s GOSSIS data community initiative and the Harvard Privacy Lab, this model would allow hospitals to prioritize high-risk patients – especially when they are near capacity. Additionally, it would help in identifying unexpected patient outcomes so hospitals can improve clinical decision making.