15 research outputs found

    Building a repository for record linkage

    ICPSR is building LinkageLibrary, a repository and community space for researchers involved in linking and combining datasets, as a collaboration between social, statistical, and computer scientists. Unlike surveys or experiments where causal and outcome variables are measured in tandem, it is often necessary when working with organic, non-design data to link to other measures. This makes linkage methodologies particularly important when conducting analyses using administrative data. A common benchmarking repository of linkage methodologies will propel the field to the next level of rigor by facilitating comparison of different algorithms, understanding which types of algorithms work best under different conditions and problem domains, promoting transparency and replicability of research, and encouraging proper citation of methodological contributions and their resulting datasets. It will bring together the diverse scholarly communities (e.g., computer scientists, statisticians, and social, behavioral, economic, and health (SBEH) scientists) who are currently addressing these challenges in disparate ways that do not build on one another’s work. Improving linkage methodologies is critical to the production of representative samples, and thus to unbiased estimates of a wide variety of social and economic phenomena. The repository will accelerate the development of new record linkage algorithms and evaluation methods, improve the reproducibility of analyses conducted on integrated data, allow comparisons on the same and different data, and move forward the provision of privacy-aware integrated data. The presentation will focus on lessons learned while building the repository and the community, and introduce the LinkageLibrary website.

    Building on the Rich Metadata from Decades of Health Behavior Studies: The Potential for Common Data Elements (CDEs) to Enhance the Identification of Health Data Across Different Research Projects

    Continued analyses of key datasets are extremely important to building understanding of the underlying causes of substance use and addiction, and multiply the benefits of our nation’s investment in this science. ICPSR and the National Addiction and HIV Data Archive Program (NAHDAP) disseminate data from hundreds of NIH-funded research studies, as well as data collected with support from other agencies and foundations, many with questions about health outcomes or status that are not easily discovered with current search protocols, which can be either too narrow or too broad. With funding from NIDA, we are working to increase the use of these extant data for health research by making these variables easier to identify. This is of great benefit to the research community, providing improved discoverability of relevant health concepts within and, more importantly, across the multiple studies maintained in our repositories.
    OBSSR/NIH
    https://deepblue.lib.umich.edu/bitstream/2027.42/145467/1/IASSIST2018_CDE_Poster.pdf
    Description of IASSIST2018_CDE_Poster.pdf: Poster presentation at IASSIST & CARTO 2018 Annual Meeting

    Demography and Environment in Grassland Settlement: Using Linked Longitudinal and Cross-Sectional Data to Explore Household and Agricultural Systems

    The Demography and Environment in Grassland Settlement project (DEGS) is a study of the relationship between population and environment in Kansas during its settlement and conversion from grassland to grain cultivation and rangeland. The research team involved in this project had as its goal to bring together data about farms and farm families in order to understand the core transformations in land use and family dynamics that took place during the process of settling and developing an agricultural landscape. For reasons we will explain later, the state of Kansas – located near the centre of the U.S. in a grassland ecosystem – is ideally suited for this study by virtue of its location, history and the documents that exist about it. In order to capture the environmental variability of Kansas, we are assembling a linked database of farm and family census records for twenty-five townships scattered across the state. This paper is about the process of choosing that sample, about the data we have accumulated and about the process we are undertaking to link records about families and farms through time and to attempt to find their locations in space.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/60442/1/sylvester_etal.demography and environment.pd

    Land Use and Transfer Plans in the U.S. Great Plains

    In the next decades, aging farmers in the United States will make decisions that affect almost 1 billion acres of land. The future of this land will become more uncertain as farm transfer becomes more difficult, potentially changing the structure of agriculture through farm consolidation, changes in farm ownership and management, or taking land out of production. The Great Plains Population and Environment Project interviewed farmers and their spouses between 1997 and 1999. Farm Family Survey participants were ambiguous about their plans to leave farming, transfer land to others, and even long-term land use, largely due to concerns about the continued economic viability of farming. Participants living far from metropolitan areas expected to sell or rent to other farmers, while those near residential real-estate markets expected to sell to developers. Delays in planning for retirement and succession were common, further threatening the success of intergenerational transitions.

    The effects of wealth, occupation, and immigration on epidemic mortality from selected infectious diseases and epidemics in Holyoke township, Massachusetts, 1850−1912

    Background: Previous research suggests individual-level socioeconomic circumstances and resources may be especially salient influences on mortality within the broader context of social, economic, and environmental factors affecting urban 19th century mortality. Objective: We sought to test individual-level socioeconomic effects on mortality from infectious and often epidemic diseases in the context of an emerging New England industrial mill town. Methods: We use mortality data from comprehensive death records, and a sample of death records linked to census data, for an emergent industrial New England town to analyze infectious mortality and model socioeconomic effects using Poisson rate regression. Results: Despite our expectations that individual resources might be especially salient in the harsh mortality setting of a crowded, rapidly growing, emergent, industrial mill town with high levels of impoverishment, infectious mortality was not significantly lowered by individual socio-economic status or resources.

    A3: Innovations in Data Linkage

    Moderator: Luiza Antoine
    Presenters:
    Ian Crandell: Record Linkage Reconciliation of Arlington Department of Human Services Administrative Data Using Potts Models
    Dean M. Resnick: Simulation Approach to Assess the Precision of Estimates Derived from Linking Survey and Administrative Records
    Marc Roemer: An assessment of using frequency weights for record linkage
    Susan Hautaniemi Leonard: Building a repository for record linkage
