Why Diversity Matters When Building a Data Science Team
Correlation One recently partnered with the City of San Jose, California, to help the Mayor’s Office on Technology and Innovation (MOTI) achieve their data and equity objectives.
In this series of interviews, we talk to Christine Keung, Chief Data Officer for the City of San Jose, and Shane Wilson, Director at Correlation One, about how they worked together to address the City’s data literacy and diversity goals. Fellows Matthew Do and Casey Kongpanickul share their experience working on data analysis projects to ensure the 311 system is providing equitable access to all of the city services.
San Jose is a Reflection of the Future of America
San Jose’s residents come from all corners of the world: over 40% were born outside of the United States, and in 57% of homes English is not the primary spoken language. In San Jose, diverse talent fuels the innovation economy at the heart of Silicon Valley, while the government faces the challenge of balancing incredible growth with socio-economic equity.
Big Data Transformation in San Jose
There are two core initiatives for MOTI during the current Mayor’s administration: 1) become more data-driven; 2) become an exemplary diverse employer. Christine Keung explains:
“Through this data transformation project, we are committed to using the city’s data ethically and in ways that drive equitable outcomes for constituents.”
Data Science For All / Empowerment
Data Science For All (DS4A) / Empowerment is a 13-week applied data fellowship program for students and professionals who identify as Black, Latinx, and LGBTQ+.
The program is free for Fellows, who gain applied data training, connections with professional mentors, and access to job opportunities with hiring partners.
DS4A / Empowerment is at its core a data literacy program — it focuses on training learners with skills that they can immediately apply to real-world problems. Shane Wilson describes why he saw this as an opportunity to help the City of San Jose meet its objectives:
“What Christine explained to me, and her goals of practical data literacy improvements, aligned perfectly with the curriculum that Professor Natesh Pillai has designed and the work that our teaching assistants have done with learners throughout the program: to connect what they’re learning in the classroom with the workforce.
Because this is a program that is exclusively focused on diverse talent, I thought it’s a perfect match in both data literacy goals as well as diversity goals that the government is focused on.”
Meet the Fellows who Joined The City of San Jose’s Diverse Data Science Team
Matthew Do and Casey Kongpanickul worked as data science interns for the City of San Jose. They were responsible for analyzing data from the 311 system and making recommendations to improve it.
“I was born and raised right here in the Bay Area, specifically San Jose. Growing up I attended schools that were predominantly Asian and Hispanic, so I’m no stranger to being introduced to new ideas and new cultures.” — Matthew Do
“I was born and raised in the Bay Area, which is where I am now. When I was growing up, my parents had always owned a restaurant. I’m the first person to graduate from college in my family.
I ended up graduating with a bachelor’s degree in mechanical engineering, but it wasn’t until much later in life that I discovered data science and programming in Python. [It] really sparked my interest and that kind of drove me back to self-learning and academia to pursue a career change to data science.” — Casey Kongpanickul
Eliminating Bias Through Data
Matthew describes their responsibilities at the City of San Jose:
“Diving deeper into our responsibilities as data scientists, we [presented] our findings and analyses to city leaders and those who worked with the 311 system. We were also tasked with giving suggestions [to] improve the system. More importantly, we were to look at the data itself and to how well the city is handling service requests from citizens, looking at the time to close a ticket and to see if a ticket is actually resolved.
Coming into this project, our biggest challenge was not the data analysis itself — surprisingly it was more about dealing with data quality issues. We were dealing with lots of non-values [and] really non-descriptive columns. Lots of our effort and time was spent on setting up meetings with people who worked on the system to give us better data sets and better quality data.”
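For readers curious what this kind of work can look like in practice, here is a minimal Python sketch of the two checks Matthew mentions: counting missing values per column, and measuring time to close and resolution by area. It is an illustration under stated assumptions, not the Fellows' actual analysis. The sample records, the field names (`area`, `opened`, `closed`, `status`), and the helper functions are all hypothetical and do not reflect the city's 311 schema.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical sample tickets; field names and values are assumptions
# made for illustration, not the City of San Jose's real 311 data.
TICKETS = [
    {"id": 1, "area": "A", "opened": "2021-03-01 09:00", "closed": "2021-03-02 09:00", "status": "resolved"},
    {"id": 2, "area": "A", "opened": "2021-03-01 10:00", "closed": "2021-03-04 10:00", "status": "resolved"},
    {"id": 3, "area": "B", "opened": "2021-03-02 08:00", "closed": None, "status": "open"},
    {"id": 4, "area": "B", "opened": "2021-03-02 12:00", "closed": "2021-03-07 12:00", "status": "closed_unresolved"},
    {"id": 5, "area": None, "opened": "2021-03-03 09:00", "closed": "2021-03-03 15:00", "status": "resolved"},
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M"


def missing_counts(rows):
    """Count missing (None or empty) values per field: a basic data quality check."""
    counts = defaultdict(int)
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                counts[field] += 1
    return dict(counts)


def hours_to_close(row):
    """Return hours between opening and closing, or None if either timestamp is missing."""
    if not row["opened"] or not row["closed"]:
        return None
    opened = datetime.strptime(row["opened"], TIMESTAMP_FORMAT)
    closed = datetime.strptime(row["closed"], TIMESTAMP_FORMAT)
    return (closed - opened).total_seconds() / 3600


def summarize_by_area(rows):
    """Per area: ticket count, median hours to close, and share marked resolved."""
    groups = defaultdict(list)
    for row in rows:
        # Tickets with no area are grouped under "unknown" rather than dropped.
        groups[row["area"] or "unknown"].append(row)

    summary = {}
    for area, items in groups.items():
        durations = [h for h in map(hours_to_close, items) if h is not None]
        resolved = sum(1 for item in items if item["status"] == "resolved")
        summary[area] = {
            "tickets": len(items),
            "median_hours_to_close": median(durations) if durations else None,
            "resolved_share": resolved / len(items),
        }
    return summary


if __name__ == "__main__":
    print(missing_counts(TICKETS))
    for area, stats in summarize_by_area(TICKETS).items():
        print(area, stats)
```

Comparing the per-area summaries side by side is one simple way to see whether some communities wait longer for service than others. Keeping tickets with a missing area in their own group, rather than discarding them, makes the data quality gaps visible alongside the results.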
Creating Equitable Access to City Services Through the 311 System
The team at MOTI understood that while pursuing a big data transformation, they also needed to make sure its benefits reach all citizens regardless of background, race, or socio-economic class. Casey explains:
“It’s important that the city is able to meet the needs of all communities regardless of gender, race, or religion. That’s why utilizing the data sets is so important — it allows the city to have some insights on how it can best allocate funds and resources for the various city services to these different communities.”
Christine’s work ensures this is built into their processes at the Mayor’s office:
“We created a framework that helps departments define equity objectives and measure progress over time.”
As they reflect on the work they did, Matthew says:
“I hope our project and findings have made San Jose a cleaner and safer place, as well as giving an insight to city leaders on how they can improve the 311 system to better handle the needs of our citizens.”
“I really hope that all communities of San Jose benefit from this project in regards to getting equitable access to all of the city services.
Second, I hope that there is enough transparency in this project such that the residents of San Jose can really see that the Mayor’s office is taking everybody into account to make sure that everybody has equitable access to all city’s programs.
Lastly, I hope that San Jose can be used as kind of a gold standard for other cities across the nation that are striving to reach this equitable state as well.”
DS4A / Empowerment gives governments one way to make sure that, at the very least, the teams building their data-driven products are diverse. A diverse team is one of the best ways to spot biases before they are reproduced algorithmically. It’s crucial for every government that’s thinking about big data to ask itself these questions now.
If you are interested in being part of the Data Science for All mission, and hiring talent from our programs, get in touch today!
*This article was originally posted on C1 Insights.