Measuring and Anticipating the Impact of Data Reuse.

Abstract

In this dissertation, I examined data citations in the social sciences, measured the scholarly impact of data reuse, and explored factors associated with whether a dataset is reused. Two guiding questions frame the dissertation: What is the scholarly impact of data reuse? How can stakeholders anticipate the impact that the data they fund, create, or curate will have? I addressed these questions in three parts. First, to quantify the scholarly impact of data reuse, I identified reuse through data citation patterns. My study extends previous studies by taking a more nuanced view of how social scientists use citations to acknowledge others’ prior work on which they are building. Second, I developed a suite of impact metrics for data. By testing these metrics on a varied group of social science datasets, I demonstrated their use and shed light on how these datasets can be high impact in different ways. Finally, I explored what factors correlate with reuse and with high impact. Examining data reuse in the social science literature showed that reusers of data regularly cite data producers’ publications, rather than citing data directly or crediting the data provider. Where they cite the data provider, they typically do so in addition to citing the data producer. This finding suggests that data reusers distinguish between the contributions producers make when they create data and when they share it: in essence, data reusers use citations to credit both actions. The four measures of reuse impact I developed highlighted different aspects of impact for data; no datasets were high-impact across the board, and few were consistently low-impact. The three metrics based on citations were especially divergent, suggesting that data can have an impact in multiple and varying ways. Finally, I showed that two characteristics of data are particularly related to whether the data are reused or not: the size of the data and how actively used they are.
Together, these findings indicate that sharing data contributes to scholarship above and beyond the initial contribution a scientist makes when she creates data and publishes from them.

PhD
Information
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/102481/1/kfear_1.pd