Data for Good
Imagine this scene. A man with an iPad is standing in a small plot of maize, talking with a farmer in Odisha in eastern India. The plot is the farmer’s only source of income. In a good year, he can earn just enough to feed his family but not enough to live above subsistence level.
The farmer is nodding yes to a series of questions. Are you using the herbicide we discussed? Did you use the hybrid seeds? Are you irrigating? The surveyor eyes the plot. The farmer’s answers seem to be truthful, given the health of the crop. The man with the tablet enters the answers and other data into a software program and tells the farmer to watch for a text message saying when to harvest to get the best price.
The man with the tablet works for eKutir, a social enterprise that is turning everyday data into information that both helps farmers across India increase yields and prices and helps lenders judge a farmer’s credit-worthiness. These smallholders have rarely taken out loans, making it hard for financial institutions to judge risk and harder still for the farmers to break out of poverty.
Now imagine another scenario, this one in a village in Uttar Pradesh. The team at Simpa Networks is reading applications from local families for a small solar panel. One family wants the panel, which generates electricity for their home, so their children can study in the evening. In another case, a woman wants to use her sewing machine at night to add to her income.
If chosen, the family can “unlock” the stored electricity by making small payments as needed each month. After a few years of steady payments, they will own the panel outright. Simpa Networks, a for-profit technology company, teamed up with DataKind, a non-profit that connects mission-driven organizations with data scientists who donate their time to pro bono projects. The data scientists developed an algorithm that uses past data to predict which applicants are the best bets. (Disclosure: MasterCard Advisors funded the project; read the full case study.)
Their question: are people predictable, and if so, what are the “tells” we can use to judge whether they’re good bets or not?
In both scenarios, poor rural families want access to services that can improve their standard of living and their productivity. In both, the data teams are working to collect data (not an easy task in itself) and use it to assess risk. Both are arriving at interesting findings.
Lesson #1: Behavior is a good indicator of who is most likely to be a reliable customer
For the solar customers, making the first three payments on schedule increased the odds that the family would repay. For the farmers, being truthful in answering the questions was a good sign. “If he’s listening to the advice on herbicides and such, and acting on the recommendations,” says Michael Turner, president of PERC, a nonprofit organization working with eKutir that uses information solutions to drive financial inclusion, “then that says a lot about the farmer.”
“You follow instructions, you see a bigger picture,” he says. “This notion of compliance—it’s actually where a lot of data analytics is headed.”
Lesson #2: Household assets alone are not always a good indicator
“Very few asset-related things were good predictors,” says Kush Varshney, a researcher at IBM’s TJ Watson Research Center and co-director of its new Social Good Fellowship program. Varshney volunteered for six months with DataKind for the project. Rather than income or assets, he thinks motivations might be a better clue. Families who owned a battery pack, for example, typically had more money, since the packs are not cheap. But they were less likely to repay. “Probably,” he says, “it’s because they have an alternative source of light at night, so there’s less incentive.” He and his colleagues published their findings in a working paper (see Table 5).
Lesson #3: Success relies on collecting the right data
Both teams are confirming the adage in information tech: junk in, junk out. Capturing the right data upfront will be key to success. Simpa Networks has worked to fine-tune and train its algorithm based on its early findings, says Varshney. On a scale of one to ten, Varshney’s estimate is that the model is a seven or eight for accuracy. The model reduced the share of poor risks from 18 percent to 12.5 percent. With better data, predictions will get better as well.
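To put those two percentages in perspective, here is a quick back-of-the-envelope calculation in Python. It is only a sketch: the function name is ours, invented for illustration, and the only inputs are the two figures reported above (18 percent before the model, 12.5 percent after).

```python
def risk_reduction(before_pct, after_pct):
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration; not part of any project code.
    """
    absolute = before_pct - after_pct          # percentage points
    relative = absolute / before_pct * 100     # share of the original rate
    return absolute, relative


# Figures quoted in the article: 18 percent before, 12.5 percent after.
absolute, relative = risk_reduction(18.0, 12.5)
print(f"Absolute drop: {absolute:.1f} percentage points")
print(f"Relative drop: {relative:.1f} percent")
```

In other words, a drop of 5.5 percentage points means the share of poor risks fell by roughly 30 percent relative to where it started.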
For Turner, that means paying more attention to how information is collected and converted into data. “The lesson is that there needs to be more focus digitizing the data, identifying where financially excluded people transact, and then making sure the point of sale is captured digitally [not just manually] and the data is accessible. There’s been a lot of fascination with the apps and platforms, but digital financial services will remain stuck unless they have access to high-quality predictive data.”
In a world where “big data” seems increasingly ubiquitous, the projects point to the untapped potential to harness data for social good. “Even in very rural, disconnected parts of the world, these advanced technologies actually do make people’s lives better,” says Varshney.
Featured image photo courtesy of Simpa Networks.