Responsible (development) data rule No. 1: know thy data

Yesterday, the Center for Global Development published a data-savvy critique of MCC’s control of corruption selection indicator. The authors bring to bear some serious empirical analysis, and after reminding the reader that the indicator is a hard hurdle that acts as the sole difference between passing or failing the MCC scorecard for some countries, they raise a number of tough questions about why we use the data that we do. They point to the difficulties in measuring corruption accurately, to empirical work that shows weak correlation between corruption and development outcomes, and to the indicator’s slow, opaque relationship with policy reform efforts. They conclude that MCC should deeply question how it can rely on this data as a hard hurdle.

I love this. Seriously.

In January, I promised I would discuss what constitutes a responsible use of data for development or foreign assistance purposes. This is a perfect opportunity to talk about the most fundamental principle: know thy data. 

The CGD paper is constructive because it unpacks what is actually rolled up in the data that we rely on for the corruption hurdle. It does so objectively and with no assertion that this is particularly unfair to any individual country. Rather, the authors are talking about fundamental data content and behavior. It's technical and it's detailed. It requires math. It’s the stuff most people would prefer to skip over.

But if decision making about a country rests on that data, and if you care about real progress on the measured issue itself, the math matters. 

I have been working with this data for years now, and understanding what is and isn't measured—what annual composite data can and can't tell us about any one country—has been a critical part of building a holistic approach to investigating and briefing MCC’s Board of Directors on anti-corruption and accountable governance in candidate countries. That’s not unique to this dataset. What we do now is something we would need to do for any new or improved indicator measuring corruption or accountability. 

Which is another reason I am glad to see this paper: It suggests alternative data sources we could look at and is upfront that none of the suggested data is yet available for every country. That isn't just a problem for us. For MCC to use a data set as a hard hurdle, or for others to seriously consider using a data set to measure progress against global development goals, that data set must actually cover all low- and lower-middle-income countries at a decent (preferably annual) frequency. At present, very few anti-corruption measures or proxies do. As people debate the possibility of a governance-focused goal on the Post-2015 Development Agenda, that's a question the world needs to come back to: Why do we still have the same predictable gaps in governance data? And it's a topic you'll hear more about from us.

In the meantime, we have built a practice around making sure MCC remains a responsible user of development data. If you look at the annual Selection Criteria and Methodology Reports, you will see that the section on supplemental information has grown over time. In 2012, we introduced a public guide to supplemental information that includes references to country performance on international initiatives (like the Extractive Industries Transparency Initiative or Open Government Partnership) that weren't fully operational when MCC got started. And if you look at our approach to corruption, you will see we've built a thoughtful methodology for tracking corruption concerns.

My colleagues and I sincerely welcome the questions raised by this paper and look forward to participating in the conversation.