Data for Good

How Big Data Can Help Build Smarter and More Inclusive Cities

The sheer volume of data in cities makes them a natural hub for technical breakthroughs and socially impactful innovations such as machine learning.

January 19, 2016

In 2011, for example, New York City announced that it would open its war chest of datasets, aiming to unlock all public data in the city by 2018. The city has since opened more than 1,300 datasets, containing more than 600 million rows of data, across 60 agencies.

In 2015, the New York Taxi and Limousine Commission (T.L.C.) released data on every cab ride taken since 2014, revealing such astounding details as the fact that 337 cab rides take place every minute. This is already shaping discussions of how to best promote mobility in cities, a key factor in how economically inclusive a city’s development will be.

Commenting on the insights that such data can make possible, Ben Wellington of IQuantNY, an NGO devoted to using data to improve public policy, says: “This helps us provide better and more efficient public services. The way we think of street design, traffic optimization, permitting and ticketing will all be improved through data collection and analysis.”

Data in cities are also helping to support technological advances such as the practical use of machine learning. Machine learning is an area of computer science that focuses on developing programs that go through a process of iterative learning as they are exposed to new data.  These programs become adaptive and look for hidden patterns without being explicitly programmed to do so.
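To make "iterative learning" concrete, here is a minimal sketch of a program that adjusts itself one example at a time. It is an illustration only: the class name, the two features and the toy data stream are all made up for this example and do not come from any city's system. The technique shown is logistic regression trained with stochastic gradient descent.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


class OnlineLogisticRegression:
    """Hypothetical minimal model that learns from one example at a time."""

    def __init__(self, n_features: int, learning_rate: float = 0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return sigmoid(z)

    def update(self, x, y):
        # One gradient step on the log-loss for a single (x, y) example.
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


# Made-up stream: features are [had_prior_violation, had_recent_complaint].
# In this toy data the label simply follows the first feature; the program
# is never told that rule and has to pick it up from the examples.
stream = [
    ([1.0, 0.0], 1),
    ([0.0, 0.0], 0),
    ([1.0, 1.0], 1),
    ([0.0, 1.0], 0),
] * 200

model = OnlineLogisticRegression(n_features=2)
for features, label in stream:
    model.update(features, label)

print(model.predict_proba([1.0, 0.0]))  # above 0.5 after training
print(model.predict_proba([0.0, 0.0]))  # below 0.5 after training
```

Each call to `update` nudges the weights slightly, so the model keeps adapting as long as new data arrives, which is the behavior the paragraph above describes.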

City officials in Chicago teamed up with Allstate Insurance Company and Civic Consulting Alliance to use machine learning to analyze and predict the results of restaurant health inspections. Each year, the city inspects 9,822 food establishments. Unfortunately, its handful of health inspectors is outnumbered by the city’s restaurants by a ratio of 470 to 1. To help officials better target the workload, the city applied machine learning to records of more than 100,000 of its health inspections. In a two-month trial, the city found establishments with critical violations seven days earlier on average than with traditional approaches.
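The "days earlier" figure is easier to picture with a small worked example. The sketch below is not Chicago's model or data: the six establishments, their risk scores and the one-inspection-per-day schedule are invented purely to show how inspecting in order of predicted risk moves the discovery of violations forward.

```python
def mean_discovery_day(order):
    """Average day on which a critical violation is found,
    assuming one inspection per day in the given order."""
    days = [
        day
        for day, establishment in enumerate(order, start=1)
        if establishment["critical_violation"]
    ]
    return sum(days) / len(days)


# Hypothetical establishments with made-up predicted risk scores.
establishments = [
    {"name": "A", "risk": 0.10, "critical_violation": False},
    {"name": "B", "risk": 0.80, "critical_violation": True},
    {"name": "C", "risk": 0.30, "critical_violation": False},
    {"name": "D", "risk": 0.20, "critical_violation": False},
    {"name": "E", "risk": 0.90, "critical_violation": True},
    {"name": "F", "risk": 0.60, "critical_violation": True},
]

# Traditional approach in this toy setup: inspect in the listed order.
baseline = mean_discovery_day(establishments)

# Data-driven approach: inspect the highest predicted risk first.
by_risk = sorted(establishments, key=lambda e: e["risk"], reverse=True)
prioritized = mean_discovery_day(by_risk)

days_earlier = baseline - prioritized
print(f"Violations found {days_earlier:.1f} days earlier on average")
```

The same number of inspections is carried out either way; only the order changes. That is why a good risk prediction can help a small team of inspectors without adding staff.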

Perhaps more remarkable still, in Beijing, an advanced application of machine learning is being tested to predict air pollution a full 72 hours in advance, a promising tool for proactive responses to an environmental issue that, by one estimate, results in 1,000,000 deaths a year in China.

Other developments that support the widespread integration of data into cities’ planning and management include Open Data Census, a tool that allows citizens and government officials to analyze and compare cities and countries around the world on data-transparency metrics, including transportation, pollution and health performance. In addition to machine learning, tools like Socrata, Hadoop and Alooma can each help governments and policymakers get new data initiatives off the ground by providing the infrastructure and software needed to process and analyze monumental civic datasets.

In many places, the makings of a movement are there. By making data public, cities invite more transparency and scrutiny by public agents, which can then broaden the conventional applications of a particular set of data and allow for new and original uses. And, as these examples demonstrate, private sector expertise, combined with the foresight of public officials and the insights of activists and data scientists, can help catalyze a global Smart Cities movement.