2,225 research outputs found

    Place of dogmatic theology in the Indian Church

    Generating synthetic identifiers to support development and evaluation of data linkage methods

    Introduction: Careful development and evaluation of data linkage methods are limited by researcher access to personal identifiers. One solution is to generate synthetic identifiers, which do not pose equivalent privacy concerns but can form a 'gold-standard' training dataset for linkage algorithms. Such data could help inform choices about appropriate linkage strategies in different settings. Objectives: We aimed to develop and demonstrate a framework for generating synthetic identifier datasets to support development and evaluation of data linkage methods. We evaluated whether replicating associations between attributes and identifiers improved the utility of the synthetic data for assessing linkage error. Methods: We determined the steps required to generate synthetic identifiers that replicate the properties of real-world data collection. We then generated synthetic versions of a large UK cohort study (the Avon Longitudinal Study of Parents and Children; ALSPAC), according to the quality and completeness of identifiers recorded over several waves of the cohort. We evaluated the utility of the synthetic identifier data in terms of assessing linkage quality (false matches and missed matches). Results: Comparing data from two collection points in ALSPAC, we found within-person disagreement in identifiers (differences in recording due to both natural change and non-valid entries) in 18% of surnames and 12% of forenames. Rates of disagreement varied by maternal age and ethnic group. Synthetic data provided accurate estimates of linkage quality metrics compared with the original data (within 0.13-0.55% for missed matches and 0.00-0.04% for false matches). Incorporating associations between identifier errors and maternal age/ethnicity improved synthetic data utility. Conclusions: We show that replicating dependencies between attribute values (e.g. ethnicity), values of identifiers (e.g. name), identifier disagreements (e.g. missing values, errors or changes over time), and their patterns and distribution structure enables generation of realistic synthetic data that can be used for robust evaluation of linkage methods.
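
    The framework above is tied to ALSPAC, but the core idea (synthesising identifiers whose errors and changes depend on attributes, then scoring a linkage strategy against known true matches) can be illustrated in a few lines. The Python sketch below is a minimal, hypothetical illustration, not the paper's method: the names, attribute groups, disagreement rates and the exact-match linkage rule are all assumptions. It generates two collection waves with attribute-dependent identifier disagreement and reports missed-match and false-match rates.

```python
# Minimal sketch (not the authors' ALSPAC-based framework): generate two "waves"
# of synthetic identifiers in which within-person disagreement depends on an
# attribute group, then score a naive exact-match linkage for missed and false
# matches. All names, rates and the linkage rule are illustrative assumptions.
import random
import string

random.seed(42)

SURNAMES = ["Smith", "Jones", "Taylor", "Brown", "Khan", "Patel", "Evans", "Wilson"]
FORENAMES = ["Emma", "Oliver", "Sophie", "Jack", "Aisha", "Daniel", "Chloe", "Mohammed"]

# Assumed disagreement rates per attribute group (purely illustrative).
SURNAME_CHANGE = {"A": 0.12, "B": 0.24}   # e.g. name change between waves
FORENAME_ERROR = {"A": 0.08, "B": 0.16}   # e.g. typo or non-valid entry

def corrupt(name: str) -> str:
    """Introduce a single-character typo to mimic a recording error."""
    i = random.randrange(len(name))
    return name[:i] + random.choice(string.ascii_lowercase) + name[i + 1:]

def make_person(pid: int) -> dict:
    return {"id": pid,
            "group": random.choice(["A", "B"]),   # stand-in attribute, e.g. age band
            "forename": random.choice(FORENAMES),
            "surname": random.choice(SURNAMES)}

def second_wave(person: dict) -> dict:
    """Re-record the person, with attribute-dependent identifier disagreement."""
    rec = dict(person)
    if random.random() < SURNAME_CHANGE[person["group"]]:
        rec["surname"] = random.choice(SURNAMES)
    if random.random() < FORENAME_ERROR[person["group"]]:
        rec["forename"] = corrupt(person["forename"])
    return rec

wave1 = [make_person(i) for i in range(5000)]
wave2 = [second_wave(p) for p in wave1]

# Naive deterministic linkage: exact agreement on forename + surname.
index: dict = {}
for rec in wave2:
    index.setdefault((rec["forename"], rec["surname"]), []).append(rec["id"])

missed = false = 0
for rec in wave1:
    candidates = index.get((rec["forename"], rec["surname"]), [])
    if rec["id"] not in candidates:
        missed += 1                                        # true pair not linked
    false += sum(1 for c in candidates if c != rec["id"])  # wrong pairs linked

print(f"missed-match rate:        {missed / len(wave1):.3f}")
print(f"false matches per record: {false / len(wave1):.3f}")
```

    Because the true person identifiers are known by construction, the same process that produces the synthetic records also yields the gold-standard answer needed to score any candidate linkage algorithm.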

    Economic Value of Water-Oriented Recreation Quality

    Towards a unified approach to formal risk of bias assessments for causal and descriptive inference

    Statistics is sometimes described as the science of reasoning under uncertainty. Statistical models provide one view of this uncertainty, but what is frequently neglected is the invisible portion of uncertainty: that assumed not to exist once a model has been fitted to some data. Systematic errors, i.e. bias, in data relative to some model and inferential goal can seriously undermine research conclusions, and qualitative and quantitative techniques have been created across several disciplines to quantify and generally appraise such potential biases. Perhaps best known are so-called risk of bias assessment instruments used to investigate the likely quality of randomised controlled trials in medical research. However, the logic of assessing the risks caused by various types of systematic error to statistical arguments applies far more widely. This logic applies even when statistical adjustment strategies for potential biases are used, as these frequently make assumptions (e.g. data missing at random) that can never be guaranteed in finite samples. Mounting concern about such situations can be seen in the increasing calls for greater consideration of biases caused by nonprobability sampling in descriptive inference (i.e. survey sampling), and the statistical generalisability of in-sample causal effect estimates in causal inference, both of which relate to the consideration of model-based and wider uncertainty when presenting research conclusions from models. Given that model-based adjustments are never perfect, we argue that qualitative risk of bias reporting frameworks for both descriptive and causal inferential arguments should be further developed and made mandatory by journals and funders. It is only through clear statements of the limits to statistical arguments that consumers of research can fully judge their value for any specific application.
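
    The point that adjustment assumptions such as "missing at random" can never be verified from the observed data alone is easy to demonstrate with a toy simulation. The Python sketch below is an illustrative, assumption-laden example and is not taken from the paper: when missingness depends on the outcome itself, a complete-case estimate remains biased no matter how large the sample.

```python
# Toy illustration (assumptions only, not the paper's material): when data are
# missing NOT at random, a complete-case estimate stays biased however large
# the sample, so the "missing at random" adjustment assumption is doing real,
# unverifiable work.
import random

random.seed(1)
n = 100_000
y = [random.gauss(0.0, 1.0) for _ in range(n)]   # true population mean is 0

# Higher values are more likely to be observed, i.e. missingness depends on y.
observed = [v for v in y if random.random() < (0.8 if v > 0 else 0.2)]

print(f"full-sample mean:   {sum(y) / len(y): .3f}")               # close to 0
print(f"complete-case mean: {sum(observed) / len(observed): .3f}")  # biased well above 0
```

    No amount of additional data removes the bias here; only knowledge of the missingness mechanism, which the observed data cannot supply, would.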

    The Ursinus Weekly, February 19, 1962

    Color Day ceremonies to feature message by President's mother • Church official new college veep • Pi Nu to sponsor music month here • Scholar sought by Scottish society • ACES again offer banquet, lecture • Clinic day planned for delinquents by Varsity Club • May Day petition circulation slows • Lorelei success hailed; Whitians, king honored • Newman Club, Chi Alpha begin semester schedule • Young Republicans list club schedule • Two guest speakers discuss topics related to religion here last week • Cub and Key group invites applications from juniors • Editorial: Free floating displeasure department; Anecdote • Sears gives grant; Footland recipient • Ursinus in the past • Future Pfahler feature film schedule revealed • Aero-space medicine topic at pre-med club meeting • More about Italy: Ravenna visited • Letters to the editor • Demas holds lead in intramural play • Blue Jays, Blue Hens wrestling flocks plundered by U.C.'s marauding matmen • Last moment lapse gives PMC win; Swarthmore beats Bears easily Sat. • Greek gleanings • Historical Society to hear Dr. William T. Parsons • Marine, Air Force representatives to visit Ursinus campus this week • Graduate grants

    Prototype biodiversity digital twin: prioritisation of DNA metabarcoding sampling locations

    Advancements in environmental DNA (eDNA) metabarcoding have revolutionised our capacity to assess biodiversity, especially for cryptic or less-studied organisms, such as fungi, bacteria and micro-invertebrates. Despite its cost-effectiveness, the spatial selection of sampling sites remains a critical challenge due to the considerable time and resources required for processing and analysing eDNA samples. This study introduces a Biodiversity Digital Twin Prototype, aimed at optimising the selection and prioritisation of eDNA sampling locations. Leveraging available eDNA data and integrating user-defined criteria, this digital twin facilitates informed decision-making in selecting future sampling sites. Through the development of an associated data formatting tool, we also improve the accessibility and utility of DNA metabarcoding data for broader conservation efforts. This prototype will serve multiple end-users, from researchers and monitoring initiatives to commercial enterprises, by providing an intuitive interface for interactive exploration and prioritisation, based on estimated complementarity of future samples. The prototype offers a scalable approach to biodiversity sampling. Ultimately, this tool aims to refine our understanding of global biodiversity patterns and support targeted conservation strategies through efficient eDNA sampling.
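
    The prototype's implementation is not described at code level in the abstract, but the underlying idea of prioritising sites by estimated complementarity can be sketched as a simple greedy selection. In the hypothetical Python example below, the site names and taxon sets are invented, and the rule (pick the site expected to add the most taxa not yet covered) is only one plausible notion of complementarity:

```python
# Minimal sketch (not the prototype's implementation): greedy prioritisation of
# candidate eDNA sampling sites by estimated complementarity, i.e. each step
# picks the site expected to add the most taxa not yet covered by the sites
# already selected. Site names and taxon sets are illustrative assumptions.

# Taxa expected at each candidate site (e.g. from existing metabarcoding data
# or a predictive model).
candidate_sites = {
    "site_A": {"t1", "t2", "t3", "t4"},
    "site_B": {"t3", "t4", "t5"},
    "site_C": {"t6", "t7"},
    "site_D": {"t1", "t6", "t8", "t9"},
}

def prioritise(sites: dict[str, set[str]], k: int) -> list[str]:
    """Return up to k sites, ordered by their marginal contribution of new taxa."""
    covered: set[str] = set()
    order: list[str] = []
    remaining = dict(sites)
    for _ in range(min(k, len(remaining))):
        # Pick the site adding the largest number of not-yet-covered taxa.
        best = max(remaining, key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:   # no site adds anything new
            break
        order.append(best)
        covered |= remaining.pop(best)
    return order

print(prioritise(candidate_sites, k=3))   # ['site_A', 'site_D', 'site_B'] for these toy sets
```

    Greedy selection of this kind is a common heuristic for set-cover-style problems; a user-facing tool would additionally weight sites by cost, accessibility or other user-defined criteria, as the abstract notes.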

    We need to talk about nonprobability samples

    In most circumstances, probability sampling is the only way to ensure unbiased inference about population quantities where a complete census is not possible. As we enter the era of ‘big data’, however, nonprobability samples, whose sampling mechanisms are unknown, are undergoing a renaissance. We explain why the use of nonprobability samples can lead to spurious conclusions, and why seemingly large nonprobability samples can be (effectively) very small. We also review some recent controversies surrounding the use of nonprobability samples in biodiversity monitoring. These points notwithstanding, we argue that nonprobability samples can be useful, provided that their limitations are assessed, mitigated where possible and clearly communicated. Ecologists can learn much from other disciplines on each of these fronts.
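
    The claim that a seemingly large nonprobability sample can be effectively very small is easy to illustrate with a simulation. The Python sketch below uses an invented population and selection mechanism (both assumptions, not material from the paper): a roughly 50,000-record sample whose inclusion is correlated with the outcome estimates the population mean worse than a simple random sample of 500.

```python
# Toy illustration (invented population and selection mechanism): a large sample
# whose inclusion is correlated with the outcome can be beaten by a far smaller
# probability sample, because its error is dominated by bias rather than noise.
import random

random.seed(7)
N = 1_000_000
population = [random.gauss(0.0, 1.0) for _ in range(N)]
true_mean = sum(population) / N

# Nonprobability sample: positive values are much more likely to be included.
nonprob = [v for v in population if random.random() < (0.07 if v > 0 else 0.03)]

# Probability sample: a simple random sample of just 500 units.
srs = random.sample(population, 500)

print(f"true mean:                    {true_mean: .3f}")
print(f"nonprobability (n={len(nonprob)}):   {sum(nonprob) / len(nonprob): .3f}")  # biased
print(f"simple random sample (n=500): {sum(srs) / len(srs): .3f}")                  # noisy but unbiased
```

    The large sample's error is dominated by bias, which does not shrink as the sample grows; the small probability sample's error is pure sampling noise, which does.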

    Descriptive inference using large, unrepresentative nonprobability samples: an introduction for ecologists

    Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and non-sampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of “auxiliary variables”. A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment (subsampling, quasi-randomisation, poststratification, superpopulation modelling, a “doubly robust” procedure, and multilevel regression and poststratification) to a simple two-part biodiversity monitoring problem. The first part was to estimate mean occupancy of the plant Calluna vulgaris in Great Britain in two time-periods (1987-1999 and 2010-2019); the second was to estimate the difference between the two (i.e. the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared to the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported.
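
    Of the six adjustments listed above, poststratification is the simplest to show concretely. The Python sketch below is a toy illustration with invented strata, landscape shares and occupancy values (not the Calluna vulgaris analysis): stratum means from an unrepresentative sample are reweighted by each stratum's known share of the landscape rather than its share of the sample.

```python
# Toy poststratification sketch (invented numbers, not the paper's analysis):
# reweight stratum means from an unrepresentative sample by the known share of
# each stratum in the landscape, rather than its share of the sample.

# Known proportion of the landscape in each auxiliary-variable stratum
# (e.g. a land-cover class that drives both sampling effort and occupancy).
landscape_share = {"upland": 0.30, "lowland": 0.70}

# Unrepresentative sample: upland sites (where the plant is common) are
# heavily over-sampled relative to their share of the landscape.
sample = [("upland", 1)] * 70 + [("upland", 0)] * 10 + \
         [("lowland", 1)] * 4 + [("lowland", 0)] * 16

# Unadjusted (naive) estimate of mean occupancy.
naive = sum(occ for _, occ in sample) / len(sample)

# Poststratified estimate: weight each stratum mean by its landscape share.
post = 0.0
for stratum, share in landscape_share.items():
    values = [occ for s, occ in sample if s == stratum]
    post += share * (sum(values) / len(values))

print(f"naive estimate:          {naive:.2f}")   # 0.74, pulled up by over-sampled upland
print(f"poststratified estimate: {post:.2f}")    # 0.40 under these toy numbers
```

    The adjustment only helps to the extent that the auxiliary variable (here the stratum) really is a common cause of sample inclusion and occupancy, which echoes the caveat about residual bias above.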