First, a special thanks to Mark DeLong for many fruitful discussions on data management. This post grows out of a series of discussions with Mark this spring about the evolving nature of data management, as well as a forthcoming publication on data sharing in academic journals due out this fall.
In the wake of the recent LaCour research scandal, it seems an especially appropriate time to discuss changes in the standards of data-based evidence and data management planning. A recent confluence of demands for increased data transparency and data access from funding agencies, scholarly communities, and academic journals promises to reshape existing norms for data-based research and improve access to secondary data.
Most recent discussions about data sharing and data management for grant-funded research began with a revision to the National Science Foundation (NSF) grant guidelines. In 2011, NSF mandated that all grant applications include a data management plan (DMP) detailing how funded projects would produce, describe, and share data. The NSF DMP requirement reinvigorated discussions about the proper way to share and describe research data and increased scholarly attention to the role of data as evidence in academic research. The DMP requirement also influenced related data management plans at other granting agencies, such as the National Endowment for the Humanities (NEH) Office of Digital Humanities. In 2013, the White House Office of Science and Technology Policy (OSTP) Data Access Plan expanded the scope of the discussion around data sharing and reuse by requiring federal agencies with R&D budgets over $100 million to develop plans for sharing data produced by these funds. The OSTP policy expanded the conversation about proper data management among both funders and researchers. As of fall 2015, nearly all federally funded grants will require some form of data sharing policy or plan.
While changes in grant policies raised awareness of data sharing and transparency, research communities in both the sciences and social sciences have also begun to advocate for an increase in data sharing and better data management. Noting concerns with the ability to replicate the results of many published articles (Lupia and Elman, 2014), groups such as the American Political Science Association have passed policies designed to improve data access and research transparency (DA-RT). The DA-RT initiative aims to improve the level of evidence provided in published research by specifying a framework for presenting acceptable evidence in both qualitative and quantitative research. In the sciences, groups such as the Center for Open Science have advocated for similar guidelines and sought to provide both research tools and guidance that foster transparency and data sharing.
While most journals failed to even mention research data in their guidelines for supplementary materials ten years ago, an increasing number of journals across both the social sciences and sciences now mandate data sharing as part of their publication policy. Some journals host their own websites for article-related data, while others have partnered with data archives like the Dataverse and Dryad repositories to store related data content. In some journals, such as the American Journal of Political Science (AJPS), the data sharing and validation practices are even more stringent: articles relying on numeric data are now required to pass a replication test as part of the review process before publication.
The Duke Environment
At Duke, changes in data management planning for grants, community expectations for data sharing, and increasingly prescriptive journal policies have resulted in an increased awareness of data sharing and management on campus. The 2011 NSF data management requirement sparked a number of requests for consultations on data management plans. In Duke Libraries, we see more faculty requesting identifiers for their data collections and advice on the proper way to cite and license their data to receive credit for their work. We also consult with more researchers seeking advice on locating an appropriate place to host their data for long-term access and visibility. Questions on how to appropriately manage restricted data have also increased in the last few years.
What can I do?
Duke offers a range of consulting, tools, and training designed to make data management and data sharing easier. Whether you are seeking advice on a grant application, the best way to store and share data, or guidance on designing reproducible research, support is available on campus to help Duke researchers thrive in a data-driven research environment. Overall, the confluence of scholarly interest in data management and sharing promises to transform research. As data publication, citation, and sharing become a normal part of the research process, data-based research becomes more rewarding, more extensible, and more transparent for all.
Joel Herndon, PhD (firstname.lastname@example.org)
This is the second of an occasional series of posts relating to data management. Joel Herndon, PhD, serves as the Head of Data and Visualization Services at Duke University Libraries.
Lupia, Arthur, and Colin Elman. 2014. “Openness in Political Science: Data Access and Research Transparency.” PS: Political Science & Politics 47 (01). Cambridge University Press: 19–42. doi:10.1017/S1049096513001716.