67 research outputs found

    Linkage Quality Assessment for Anonymously linked Administrative Data.

    Introduction Linked datasets are important resources for research, but linkage errors can lead to incorrect results. When personal identifiers are linked anonymously for data security and privacy reasons, it is difficult to assess the quality of the linked dataset. We describe the method we used to assess linkage quality.
    Objectives and Approach We explored how to check the quality of linkages while preserving the privacy of individuals. We also adopted an approach that minimized the time and burden on the data providers involved in physical verification by using randomly generated samples of appropriate size. To validate these linkages, data providers were given random samples of 50 unique records from both linked and unlinked individuals across two other Government programs. Data providers were asked to look at the records associated with those individuals in their original datasets. Three types of linkage results were validated: cross-program linkages, cross-program non-linkages, and within-program linkages. Proportions of false-matches and missed-matches were estimated.
    Results Twenty data providers checked their samples against two other programs, which gave us a sample of 2000 individuals. The linkage process, based on anonymized personal identifiers, resulted in high true positive and high true negative rates. Agreement between human judges and the linkage software was strong. Results of this exercise and other linkage validation examinations provided confidence in the accuracy of the linkage process. With false matches occurring only approximately 3% of the time and virtually no missed-matches occurring, no adjustments were deemed necessary. Although linkage rates were reassuring, the sample sizes used for comparison were small, so substantial sampling variation is expected around this 3% estimate; caution is advised in its use.
    Conclusion/Implications Proportions of false-matches and missed-matches determine linkage quality, which underpins research when linkages are performed anonymously. A low proportion of false-matches and an absence of missed-matches indicated robust linkages.
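
The two error proportions named in this abstract, and the caution about small samples, can be illustrated with a minimal Python sketch. It is not the authors' method: the function names are hypothetical, the counts in the usage lines are illustrative rather than taken from the study, and the Wilson score interval is one common choice for a binomial proportion that the abstract does not say was used.

```python
import math


def linkage_error_proportions(false_matches, linked_checked,
                              missed_matches, unlinked_checked):
    """Estimate error proportions from manually verified samples.

    false_matches: sampled linked pairs the reviewer judged to be wrong.
    missed_matches: sampled unlinked records the reviewer judged should
    have been linked. All argument names are hypothetical.
    """
    return {
        "false_match": false_matches / linked_checked,
        "missed_match": missed_matches / unlinked_checked,
    }


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% when z = 1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Illustrative counts only (not from the study): the same 3% rate
# observed in a small and in a larger verification sample.
small = wilson_interval(3, 100)
large = wilson_interval(30, 1000)
print(linkage_error_proportions(3, 100, 0, 100))
print(small, large)
```

With these illustrative counts the interval around 3% is several times wider for the sample of 100 than for the sample of 1000, which is the kind of variation the abstract warns about.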

    SAGE: supporting secondary data analysis and expediting knowledge mobilization with linked administrative, service delivery, and research data

    Introduction The cultural revolution of data sharing is becoming a global movement. It allows for scientific replication and verification of research results, avoids research duplication, and enables greater transparency and knowledge mobilization at relatively low cost. However, privacy protection and data security are critical concerns when sharing human-subject data.
    Objectives and Approach To facilitate data sharing and engage various stakeholders to better inform policy and practice while protecting privacy, SAGE (Secondary Analysis to Generate Evidence) was established by PolicyWise for Children and Families. It is a collaborative data repository platform that connects stakeholders through secondary use of data. SAGE was built to link, manage, and share research data, community service data, and administrative data related to health and social well-being. Governance and technical processes are in place to ensure that data depositors are involved in decision-making, that data accessed by collaborators are secured, and that re-identification risks are assessed.
    Results SAGE has been in operation for over a year. Through engagement with the research and non-profit communities, SAGE now offers ten data assets. Discovery is facilitated by well-documented metadata in NADA and Dataverse. Six new collaborative projects have been initiated through SAGE. SAGE is working actively with local non-profits to liberate data, generate evidence, and collaborate with each other on common goals. SAGE has helped these organizations understand the legal and legislative barriers to data sharing and build the technical capacity to further this goal. Discussions are underway with Alberta public entities on how SAGE can support the linkage and governance processes in the use of administrative data.
    Conclusion/Implications SAGE is putting the governance processes and security practices in place to fill a need for a facilitated data sharing model for sensitive data. SAGE is supporting the cultural shift towards data sharing and reuse by fostering trust and collaboration among researchers, non-profits, and government ministries.

    Hepatitis B Seroprevalence in the U.S. Military and its Impact on Potential Screening Strategies

    INTRODUCTION: Knowledge of the contemporary epidemiology of hepatitis B virus (HBV) infection among military personnel can inform potential Department of Defense (DoD) screening policy and infection and disease control strategies. MATERIALS AND METHODS: HBV infection status at accession and following deployment was determined by evaluating reposed serum from 10,000 service members recently deployed to combat operations in Iraq and Afghanistan in the period from 2007 to 2010. A cost model was developed from the perspective of the Department of Defense for a program to integrate HBV infection screening of applicants for military service into the existing program of screening new accessions for vaccine-preventable infections. RESULTS: The prevalence of chronic HBV infection at accession was 2.3/1,000 (95% CI: 1.4, 3.2); most cases (16/21, 76%) identified after deployment were present at accession. There were 110 military service-related HBV infections identified. Screening accessions who are identified as HBV susceptible with HBV surface antigen followed by HBV surface antigen neutralization for confirmation offered no cost advantage over not screening and resulted in a net annual increase in cost of $5.78 million. However, screening would exclude as many as 514 HBV cases each year from accession. CONCLUSIONS: Screening for HBV infection at service entry would potentially reduce chronic HBV infection in the force, decrease the threat of transfusion-transmitted HBV infection in the battlefield blood supply, and lead to earlier diagnosis and linkage to care; however, applicant screening is not cost saving. Service-related incident infections indicate a durable threat and the need for improved laboratory-based surveillance tools, and mandate review of immunization policy and practice.
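
As a rough check on the reported prevalence interval, the sketch below computes a normal-approximation (Wald) 95% confidence interval in Python. It rests on two assumptions that the abstract does not state: that the denominator is the full 10,000 sera, and that 2.3/1,000 therefore corresponds to about 23 cases. The authors may have used a different interval method; the function name is hypothetical.

```python
import math


def wald_interval_per_1000(cases, n, z=1.96):
    """Prevalence and Wald (normal-approximation) interval, per 1,000 people."""
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a binomial proportion
    return p * 1000, (p - z * se) * 1000, (p + z * se) * 1000


# Assumed inputs: 23 cases among 10,000 sera (inferred, not given in the abstract).
rate, lower, upper = wald_interval_per_1000(23, 10_000)
print(f"{rate:.1f} per 1,000 (95% CI: {lower:.1f}, {upper:.1f})")
```

Under these assumptions the result rounds to 2.3 per 1,000 with an interval of 1.4 to 3.2, consistent with the figures quoted above.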

    TRY plant trait database - enhanced coverage and open access

    Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.

    The number of tree species on Earth.

    One of the most fundamental questions in ecology is how many species inhabit the Earth. However, due to massive logistical and financial challenges and taxonomic difficulties connected to the species concept definition, the global numbers of species, including those of important and well-studied life forms such as trees, still remain largely unknown. Here, based on global ground-sourced data, we estimate the total tree species richness at global, continental, and biome levels. Our results indicate that there are ∼73,000 tree species globally, among which ∼9,000 tree species are yet to be discovered. Roughly 40% of undiscovered tree species are in South America. Moreover, almost one-third of all tree species to be discovered may be rare, with very low populations and limited spatial distribution (likely in remote tropical lowlands and mountains). These findings highlight the vulnerability of global forest biodiversity to anthropogenic changes in land use and climate, which disproportionately threaten rare species and thus, global tree richness

    Evenness mediates the global relationship between forest productivity and richness

    1. Biodiversity is an important component of natural ecosystems, with higher species richness often correlating with an increase in ecosystem productivity. Yet, this relationship varies substantially across environments, typically becoming less pronounced at high levels of species richness. However, species richness alone cannot reflect all important properties of a community, including community evenness, which may mediate the relationship between biodiversity and productivity. If the evenness of a community correlates negatively with richness across forests globally, then a greater number of species may not always increase overall diversity and productivity of the system. Theoretical work and local empirical studies have shown that the effect of evenness on ecosystem functioning may be especially strong at high richness levels, yet the consistency of this remains untested at a global scale.
    2. Here, we used a dataset of forests from across the globe, which includes composition, biomass accumulation and net primary productivity, to explore whether productivity correlates with community evenness and richness in a way that evenness appears to buffer the effect of richness. Specifically, we evaluated whether low levels of evenness in speciose communities correlate with the attenuation of the richness–productivity relationship.
    3. We found that tree species richness and evenness are negatively correlated across forests globally, with highly speciose forests typically comprising a few dominant and many rare species. Furthermore, we found that the correlation between diversity and productivity changes with evenness: at low richness, uneven communities are more productive, while at high richness, even communities are more productive.
    4. Synthesis. Collectively, these results demonstrate that evenness is an integral component of the relationship between biodiversity and productivity, and that the attenuating effect of richness on forest productivity might be partly explained by low evenness in speciose communities. Productivity generally increases with species richness, until reduced evenness limits the overall increases in community diversity. Our research suggests that evenness is a fundamental component of biodiversity–ecosystem function relationships, and is of critical importance for guiding conservation and sustainable ecosystem management decisions.
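
To make the distinction between richness and evenness concrete, here is a minimal Python sketch. The abstract does not say which evenness index the study used; Pielou's evenness (Shannon diversity divided by the log of richness) is assumed here as one widely used option, and the function names and abundance values are purely illustrative.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species that are present."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(abundances):
    """Pielou's J = H' / ln(S), where S is the number of species present."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two species")
    return shannon_index(abundances) / math.log(richness)


# Two illustrative plots with the same richness (4 species) but
# very different evenness: equal abundances vs. one dominant species.
even_plot = [10, 10, 10, 10]
dominated_plot = [97, 1, 1, 1]
print(pielou_evenness(even_plot), pielou_evenness(dominated_plot))
```

Both plots have the same richness, yet the dominated plot scores far lower on evenness, which is why richness alone may not capture the diversity pattern described above.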