TREDD carefully: How MCC balances open data with responsible data to maximize accountability and learning
May 27, 2020
Since its inception in 2004, MCC has led the way among U.S. Government agencies in evidence-based decision-making, transparency, and data sharing, as demonstrated by Publish What You Fund’s Aid Transparency Index and Results for America’s rankings. Why is this a priority for MCC? Because transparency is one tool for achieving accountability and learning, paving the way for more efficient and effective programming to achieve results. Our emphasis on transparency means that MCC publishes a broad range of data and documentation, including information related to financial activities, project-related indicators, and cost-benefit analysis.
MCC also invests in sharing the rich primary-source data, and its documentation, that underlie its independent evaluations. Since these studies often collect personally identifiable information (PII) and/or sensitive data on individuals, entities, or communities, how the data is shared, and with whom, must be carefully considered. For example, a breach of data collected on refugees as part of an independent evaluation could place vulnerable populations at risk of physical harm from hostile groups, either in host countries or in the countries they fled. In other evaluations, a data breach could harm certain groups financially if it became widely known that they had benefitted monetarily from a program and others targeted them as a result.
What is TREDD?
Considering these potential risks alongside the benefits of open data has informed how MCC defines open and responsible data and documentation: it must be transparent, reproducible, and ethical data and documentation (TREDD):
• Transparent means that MCC discloses the data and documentation involved in design, implementation, and dissemination of these types of studies. For example, all MCC-funded independent evaluations—such as the Tanzania Transmission and Distribution evaluation—publish their design reports, questionnaires, analysis reports, data, and other information in the MCC Evaluation Catalog.
• Reproducible means that MCC makes it easier for other analysts (including MCC staff, country partners, or other researchers) to access the data and documentation required to reproduce its analyses with minimal effort. As discussed below, making the data and documentation available (to the extent feasible) is just the last step; ensuring the analysis is reproducible requires contractors to think about reproducibility early and often.
• Ethical means that MCC and its contractors observe research ethics principles and practices, with an emphasis on informed consent, independent ethics review, and proper data de-identification before data is shared.
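To make the de-identification idea in the last bullet concrete, here is a minimal sketch of one common step: dropping direct identifiers and replacing linkable IDs with salted hashes. This is not MCC's actual procedure; the field names, the salt, and the hashing approach are illustrative assumptions, and real de-identification also requires assessing indirect identifiers and overall disclosure risk.

```python
import hashlib

# Illustrative survey record; all field names are hypothetical.
record = {
    "respondent_name": "Jane Doe",    # direct identifier: must be dropped
    "national_id": "1234567890",      # direct identifier: pseudonymize for linkage
    "village": "Example Village",     # indirect identifier: needs risk review
    "household_income": 1520,         # analysis variable: keep
}

DIRECT_IDENTIFIERS = {"respondent_name"}
PSEUDONYMIZE = {"national_id"}
SALT = "project-specific-secret"  # kept off the public file, never released

def deidentify(rec):
    """Drop direct identifiers; replace linkable IDs with salted hashes."""
    out = {}
    for key, value in rec.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # never appears in the shared file
        if key in PSEUDONYMIZE:
            digest = hashlib.sha256((SALT + str(value)).encode()).hexdigest()
            out[key] = digest[:12]  # stable pseudonym; not reversible without the salt
        else:
            out[key] = value
    return out

public_record = deidentify(record)
print(sorted(public_record))  # respondent_name is gone; national_id is pseudonymized
```

Because the pseudonym is deterministic, records for the same individual still link within the shared dataset, while the salt keeps the mapping back to real IDs with the evaluation team.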
Establishing systems and tools for TREDD
To achieve TREDD, MCC has developed systems and tools to ensure consistent evaluation, methodical scrutiny, and careful consideration of privacy concerns. In 2013, MCC created a Disclosure Review Board (DRB) to (i) establish principles and procedures for handling MCC-funded independent evaluations that involve PII or sensitive data; and (ii) review and clear contractor proposals for this type of data sharing in the MCC Evaluation Catalog.
On February 21, 2020, the DRB released the fourth version of its TREDD guidelines. Previous versions of the TREDD guidelines focused heavily on the data-sharing component. The updated procedures, however, take a big-picture approach to ensure TREDD principles and practices are woven into the life cycle of a relevant study. From design through data collection, analysis, and publication, studies that involve PII or sensitive data are governed by these guidelines to mitigate risk while providing the most comprehensive and useful information possible.
TREDD Guidelines emphasize three points
- MCC does not share all the data collected. TREDD emphasizes the need to balance open data with the confidentiality commitments made during the data collection process. For example, for the Jordan Water Infrastructure evaluation, the data collected on refugees will never be shared outside the independent evaluation team, due to strict promises of confidentiality and agreements between MCC and the independent evaluator to mitigate the risk of a data breach for this vulnerable population. Since 2013, the DRB has identified 35 evaluations for which the underlying data can never be published due to similar disclosure risk mitigation efforts.
- Transparency is a necessary but not sufficient condition for reproducibility. Since 2013, MCC has required that both the data and the statistical analysis code used in independent evaluations be submitted as standard deliverables, with the aim of publishing both in its Evaluation Catalog. However, the analysis code was often written against identifiable data, which meant anyone wishing to use the code to reproduce the analysis needed access to the identifiable data, not the de-identified, publicly available data. For this reason, the TREDD Guidelines specifically direct contractors to keep the data underlying the analysis as close as possible to the data that can be shared (as feasible). In addition, contractors must now document in their Transparency Statements the extent to which the final analysis can be reproduced using the shared data (as in the Philippines Community Development evaluation) or not (as in the Indonesia Community-based Health and Nutrition evaluation, which requires access to a restricted-use file to reproduce results).
- There is no one-size-fits-all approach to ethical data management and sharing. Every study’s procedures and informed consent statement must be tailored to the unique context, vulnerabilities, and risks of that study. This carries through from design to the data-sharing stage. In line with this, every DRB review of a data package focuses on the unique context and risks when considering the contractor’s proposals for data de-identification, in deciding whether data can be published as public use, restricted access, or not at all.
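The reproducibility point above, keeping the analysis data as close as possible to the shareable data, can be sketched as a simple pipeline: de-identify first, then run the published analysis only on fields present in the shareable file, so anyone with the public data and code can rerun the result. All function and field names here are hypothetical, not MCC's actual deliverable structure.

```python
# Sketch of a reproducible analysis pipeline (illustrative names throughout).

def deidentify_rows(rows, drop=("respondent_name",)):
    """Remove direct identifiers before any analysis is run."""
    return [{k: v for k, v in r.items() if k not in drop} for r in rows]

def mean_income(rows):
    """The 'published' analysis: touches only fields kept in the public file."""
    values = [r["household_income"] for r in rows]
    return sum(values) / len(values)

raw = [
    {"respondent_name": "A", "household_income": 1000},
    {"respondent_name": "B", "household_income": 3000},
]

public = deidentify_rows(raw)
# Because the analysis uses no dropped field, re-running it on the public
# file gives the same figure as running it on the raw file.
assert mean_income(public) == mean_income(raw) == 2000.0
print(mean_income(public))  # → 2000.0
```

When the analysis does depend on identifiable fields, as in the restricted-use case mentioned above, this equality breaks, which is exactly what a Transparency Statement is meant to disclose.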
This process remains an ever-evolving one as MCC continues to learn and to develop new collection, analysis, and evaluation methods that strike a balance between usability and protection of confidentiality. Across the social sciences, in the intergovernmental community, and within the U.S. Government, continual monitoring of best practices and research tools will inform our methods and help us achieve both open and responsible data.
Results of TREDD in action at MCC
- To date, MCC has published 93 independent evaluation reports and over 120 data packages in the Evaluation Catalog. These documents inform learning and accountability on the extent to which MCC investments achieved expected results.
- Independent researchers also use MCC-funded, third-party evaluation data on topics as wide-ranging as gender roles and water sources, the health impacts of air pollution, and varying project impacts among disadvantaged groups.
- Academic institutions have used agency data sources to build students’ capacity to conduct research and develop analytical skills.