Building data-powered solutions to human-centered problems.
Data projects only succeed when they deliver solutions that people can actually use. I bring a human-centered approach to data science by leveraging my deep background in humanities research and education:
Empathetic project design - Applying humanistic empathy to define problems and design solutions that meet the wants and needs of the people they aim to benefit.
Critical data analysis - Exercising critical thinking to detect bias in the data and find creative ways to extract insights, build models, and make predictions.
Professional data storytelling - Drawing on humanistic narrative techniques to tell compelling stories with data that effectively communicate project results and recommendations to stakeholders.
An analysis of accessibility to the regional park system in the San Francisco Bay Area.
Parks are a form of critical infrastructure, like schools and clean water, that provide essential benefits to the health, climate, and economy of their surrounding communities. However, recent research shows that access to public parks remains highly inequitable in major metropolitan areas across the United States. This project analyzes the accessibility of 125,000 acres of parklands administered by the East Bay Regional Park District to the 2.8 million residents living within its service area. By applying UrbanAccess, Pandana, and other open-source Python libraries to various types of data collected from the park district, local transit agencies, OpenStreetMap, and the American Community Survey, my analysis identifies 117 neighborhoods currently in need of better access to the regional park system.
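The core idea behind a network accessibility analysis like this one can be sketched in a few lines. The example below is only an illustration in plain Python, not the project's actual UrbanAccess/Pandana pipeline: the toy graph, the node names, and the 30-minute threshold are all hypothetical. It computes the travel time from every node to its nearest park entrance and flags neighborhoods that fall beyond the threshold.

```python
import heapq

def travel_time_to_nearest_park(graph, park_nodes):
    """Multi-source Dijkstra: minutes from each reachable node to its nearest park entrance.

    `graph` maps node -> {neighbor: minutes}. Edges are assumed symmetric here,
    so searching outward from the parks gives the same times as travelling to them.
    """
    best = {node: 0.0 for node in park_nodes}
    queue = [(0.0, node) for node in park_nodes]
    heapq.heapify(queue)
    while queue:
        minutes_so_far, node = heapq.heappop(queue)
        if minutes_so_far > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, minutes in graph.get(node, {}).items():
            candidate = minutes_so_far + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best

def underserved(graph, park_nodes, neighborhoods, max_minutes):
    """Return neighborhoods whose nearest park is farther than `max_minutes` or unreachable."""
    times = travel_time_to_nearest_park(graph, park_nodes)
    return sorted(n for n in neighborhoods if times.get(n, float("inf")) > max_minutes)

# Hypothetical toy network: edge weights are travel minutes on foot or transit.
GRAPH = {
    "park_gate": {"A": 10},
    "A": {"park_gate": 10, "B": 15},
    "B": {"A": 15, "C": 20},
    "C": {"B": 20},
    "D": {},  # not connected to the network at all
}

if __name__ == "__main__":
    print(underserved(GRAPH, ["park_gate"], ["A", "B", "C", "D"], max_minutes=30))
```

In a real analysis the graph would be built from street and transit data rather than written by hand, and the threshold would depend on the travel mode being studied.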
A research project to identify neighborhoods at risk of displacement in the wake of COVID-19.
The COVID-19 recession featured massive unemployment among low-wage workers, exacerbating already precarious housing conditions and putting millions more at risk of losing their homes. The Housing Precarity Risk Model aims to identify which neighborhoods in major metropolitan areas have the most urgent need for assistance and resources from local, state, and federal agencies. I helped the team to forecast eviction rates and other indicators of housing precarity at the neighborhood level by training machine learning models on large datasets from both government and commercial sources.
A volunteer project to make local air quality data accessible to a disadvantaged community.
The air quality in West Oakland is worse than in any other neighborhood of Oakland, California. The mission of the WOAQ project at OpenOakland is to make air quality data collected by a local advocacy group accessible to concerned members of the community. I helped to visualize the data by using GeoPandas, Folium, and Mapbox to build prototypes for an interactive web map showing pollutant levels by time and location.
A study of the current state of climate equity in cities worldwide.
In cities around the world, climate hazards from hurricanes to heat waves have a disproportionate impact on low-income households, migrants, minorities, the elderly, and other vulnerable populations. Based on data reported by more than 500 city governments to CDP (formerly the Carbon Disclosure Project) in 2020, I developed key performance indicators (KPIs) to measure how well cities are incorporating social equity and inclusion into their responses to climate change. My analysis finds that cities are becoming increasingly aware of the uneven social impact of climate hazards, but this growing awareness has not yet translated into a corresponding increase in action aimed at protecting vulnerable populations.
A web application that uses machine learning to recommend online courses to jobseekers.
Finding online courses that are relevant to a job search usually entails manually extracting a list of required skills from job descriptions and then searching for courses that teach each of those skills. I automated this process by training a machine learning model that takes any job description and uses natural language processing to match it directly with the most relevant courses available from online learning platforms. I also made the model accessible to users by deploying it as a web application.
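One simple way to match a job description against course descriptions is TF-IDF weighting with cosine similarity. The sketch below uses only the Python standard library and is an assumption about the general approach, not the model actually deployed in the application; the course catalog, the job text, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency, so no term gets a zero or undefined weight.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in doc_freq.items()}
    return [{term: tf * idf[term] for term, tf in Counter(tokens).items()} for tokens in tokenized]

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(job_description, courses, top_k=2):
    """Rank courses (title -> description) by similarity to the job description."""
    titles = list(courses)
    vectors = tfidf_vectors([job_description] + [courses[t] for t in titles])
    job_vec, course_vecs = vectors[0], vectors[1:]
    scored = [(title, cosine(job_vec, vec)) for title, vec in zip(titles, course_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical catalog and job posting, for illustration only.
COURSES = {
    "Intro to SQL": "learn sql queries joins and relational databases",
    "Python for Data Analysis": "python pandas data analysis and visualization",
    "Watercolor Basics": "painting techniques with watercolor and brushes",
}
JOB = "Analyst role requiring python, pandas, and data visualization skills"

if __name__ == "__main__":
    for title, score in recommend(JOB, COURSES):
        print(f"{title}: {score:.3f}")
```

A production system would more likely rely on a library vectorizer or learned text embeddings, but the ranking step stays the same: score every course against the job description and return the top few.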
A team project to identify areas at risk for the convergence of COVID-19 and natural disasters.
In areas with severe outbreaks of COVID-19, responding to natural disasters like floods, fires, and hurricanes requires advance preparations. I examined which parts of the country were most likely to have severe outbreaks over the next few months by critically analyzing COVID forecast data from multiple scientific research centers. Combining this COVID analysis with a similar analysis of natural disasters, my colleagues and I identified three regions of particular concern and submitted our findings to New Light Technologies, an organization that provides disaster response solutions to government agencies.