81,513 research outputs found

    Impact of variance components on reliability of absolute quantification using digital PCR

    Background: Digital polymerase chain reaction (dPCR) is an increasingly popular technology for detecting and quantifying target nucleic acids. Its advertised strength is high-precision absolute quantification without the need for reference curves. The standard data analytic approach follows a seemingly straightforward theoretical framework but ignores sources of variation in the data generating process. These stem from both technical and biological factors, where we distinguish features that are 1) hard-wired in the equipment, 2) user-dependent and 3) provided by manufacturers but adaptable by the user. The impact of the corresponding variance components on the accuracy and precision of target concentration estimators presented in the literature is studied through simulation. Results: We reveal how system-specific technical factors influence accuracy as well as precision of concentration estimates. We find that a well-chosen sample dilution level and modifiable settings such as the fluorescence cut-off for target copy detection have a substantial impact on reliability and can be adapted to the sample analysed in ways that matter. User-dependent technical variation, including pipette inaccuracy and specific sources of sample heterogeneity, leads to a steep increase in uncertainty of estimated concentrations. Users can discover this through replicate experiments and derived variance estimation. Finally, detection performance can be improved by optimizing the fluorescence intensity cut point, as suboptimal thresholds reduce the accuracy of concentration estimates considerably. Conclusions: Like any other technology, dPCR is subject to variation induced by natural perturbations, systematic settings and user-dependent protocols. The corresponding uncertainty may be controlled with an adapted experimental design. Our findings point to modifiable key sources of uncertainty that form an important starting point for the development of guidelines on dPCR design and data analysis with correct precision bounds. Besides clever choices of sample dilution levels, experiment-specific tuning of machine settings can greatly improve results. Well-chosen, data-driven fluorescence intensity thresholds in particular result in major improvements in target presence detection. We call on manufacturers to provide sufficiently detailed output data that allow users to maximize the potential of the method in their setting and obtain high precision and accuracy for their experiments.
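
    The absolute quantification the abstract refers to is typically based on a Poisson model of partition occupancy. Below is a minimal sketch of that estimator together with a toy simulation of one user-dependent variance component (pipetting/partition-volume error); the partition counts, volumes and error sizes are illustrative assumptions, not the paper's simulation design.

    import numpy as np

    rng = np.random.default_rng(42)

    def dpcr_estimate(n_positive, n_partitions, partition_volume_nl):
        """Standard Poisson estimator: copies per nanolitre from the
        fraction of positive partitions."""
        p_hat = n_positive / n_partitions
        return -np.log(1.0 - p_hat) / partition_volume_nl

    # Hypothetical settings (not taken from the paper).
    true_conc = 0.5        # copies per nanolitre
    n_partitions = 20000
    nominal_volume = 0.85  # nanolitres per partition

    estimates = []
    for _ in range(1000):
        # User-dependent variance component: a 5% relative perturbation of the
        # effective partition volume per run (an assumption for illustration).
        volume = nominal_volume * rng.normal(1.0, 0.05)
        lam = true_conc * volume                      # expected copies per partition
        positives = rng.binomial(n_partitions, 1.0 - np.exp(-lam))
        # The user only knows the nominal volume, so the error propagates.
        estimates.append(dpcr_estimate(positives, n_partitions, nominal_volume))

    print("mean estimate:", np.mean(estimates), "sd:", np.std(estimates))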

    Cue Phrase Classification Using Machine Learning

    Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (Cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique not only for automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods. Comment: 42 pages, uses jair.sty, theapa.bst, theapa.sty
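
    Cgrendel and C4.5 themselves are not reproduced here, but the workflow the abstract describes (inducing a classification model from pre-classified cue phrases and their features) can be sketched with a modern decision-tree learner standing in for C4.5. The feature names and training examples below are hypothetical stand-ins for the textual and prosodic features used in the paper.

    from sklearn.tree import DecisionTreeClassifier, export_text
    from sklearn.feature_extraction import DictVectorizer

    # Hypothetical occurrences of cue phrases with hand-labelled senses.
    examples = [
        ({"token": "now",  "preceded_by_pause": True,  "position": "initial"}, "discourse"),
        ({"token": "now",  "preceded_by_pause": False, "position": "medial"},  "sentential"),
        ({"token": "well", "preceded_by_pause": True,  "position": "initial"}, "discourse"),
        ({"token": "well", "preceded_by_pause": False, "position": "medial"},  "sentential"),
    ]

    features, labels = zip(*examples)
    vectorizer = DictVectorizer(sparse=False)
    X = vectorizer.fit_transform(features)

    # CART here plays the role that C4.5 plays in the paper.
    model = DecisionTreeClassifier(max_depth=3).fit(X, labels)
    print(export_text(model, feature_names=list(vectorizer.get_feature_names_out())))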

    Partially Identified Prevalence Estimation under Misclassification using the Kappa Coefficient

    We discuss a new strategy for prevalence estimation in the presence of misclassification. Our method is applicable when misclassification probabilities are unknown but independent replicate measurements are available. This yields the kappa coefficient, which indicates the agreement between the two measurements. From this information, a direct correction for misclassification is not feasible due to non-identifiability. However, it is possible to derive estimation intervals relying on the concept of partial identification. These intervals give interesting insights into the possible bias due to misclassification. Furthermore, confidence intervals can be constructed. Our method is illustrated in several theoretical scenarios and in an example from oral health, where the prevalence of caries in children is estimated.
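
    A minimal sketch of the first step described above: computing Cohen's kappa from two independent replicate binary measurements of the same subjects. The partial-identification intervals derived in the paper are not reproduced, and the replicate caries assessments below are invented.

    import numpy as np

    def cohens_kappa(a, b):
        """Cohen's kappa for two binary measurement vectors."""
        a, b = np.asarray(a), np.asarray(b)
        p_obs = np.mean(a == b)                   # observed agreement
        p1, p2 = a.mean(), b.mean()
        p_exp = p1 * p2 + (1 - p1) * (1 - p2)     # agreement expected by chance
        return (p_obs - p_exp) / (1 - p_exp)

    # Hypothetical replicate caries assessments (1 = caries detected).
    replicate1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
    replicate2 = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
    print("apparent prevalence (replicate 1):", np.mean(replicate1))
    print("kappa:", round(cohens_kappa(replicate1, replicate2), 3))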

    Partially Identifying Treatment Effects with an Application to Covering the Uninsured

    We extend the nonparametric literature on partially identified probability distributions and use our analytical results to provide sharp bounds on the impact of universal health insurance on provider visits and medical expenditures. Our approach accounts for uncertainty about the reliability of self-reported insurance status as well as uncertainty created by unknown counterfactuals. We construct health insurance validation data using detailed information from the Medical Expenditure Panel Survey. Imposing relatively weak nonparametric assumptions, we estimate that under universal coverage monthly per capita provider visits and expenditures would rise by less than 8% and 16%, respectively, across the nonelderly population.
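
    The bounding strategy described above is in the spirit of nonparametric worst-case bounds for unobserved counterfactuals. The sketch below shows that generic idea for a bounded outcome; it is not the paper's estimator, which additionally accounts for misreported insurance status, and the visit counts are invented.

    import numpy as np

    def worst_case_bounds(y_treated, p_treated, y_min, y_max):
        """Worst-case bounds on E[Y(1)] for the whole population when Y(1)
        is only observed for the treated share p_treated of the population."""
        observed_mean = np.mean(y_treated)
        lower = p_treated * observed_mean + (1 - p_treated) * y_min
        upper = p_treated * observed_mean + (1 - p_treated) * y_max
        return lower, upper

    # Hypothetical monthly provider visits among the insured, with the
    # counterfactual for the uninsured bounded between 0 and 10 visits.
    visits_insured = [0, 1, 0, 2, 3, 0, 1, 1]
    print(worst_case_bounds(visits_insured, p_treated=0.8, y_min=0, y_max=10))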

    Second-generation p-values: improved rigor, reproducibility, & transparency in statistical analyses

    Verifying that a statistically significant result is scientifically meaningful is not only good scientific practice, it is a natural way to control the Type I error rate. Here we introduce a novel extension of the p-value - a second-generation p-value - that formally accounts for scientific relevance and leverages this natural Type I error control. The approach relies on a pre-specified interval null hypothesis that represents the collection of effect sizes that are scientifically uninteresting or practically null. The second-generation p-value is the proportion of data-supported hypotheses that are also null hypotheses. As such, second-generation p-values indicate when the data are compatible with null hypotheses, or with alternative hypotheses, or when the data are inconclusive. Moreover, second-generation p-values provide a proper scientific adjustment for multiple comparisons and reduce false discovery rates. This is an advance for data-rich environments, where traditional p-value adjustments are needlessly punitive. Second-generation p-values promote transparency, rigor and reproducibility of scientific results by specifying a priori which candidate hypotheses are practically meaningful and by providing a more reliable statistical summary of when the data are compatible with alternative or null hypotheses. Comment: 29 pages, 29-page Supplement
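
    A minimal sketch of the overlap calculation described above: the second-generation p-value is, roughly, the fraction of an interval estimate that lies inside the pre-specified interval null, returning 0 for incompatibility with the null, 1 for full compatibility, and 1/2 for uninformative data. The handling of very wide interval estimates below is an assumption and should be checked against the paper's exact definition; the interval null used in the example is hypothetical.

    def second_generation_p(est_lo, est_hi, null_lo, null_hi):
        """Proportion of the interval estimate [est_lo, est_hi] that overlaps
        the interval null [null_lo, null_hi], with very wide interval estimates
        capped so that uninformative data give 1/2 (assumed correction)."""
        overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
        est_len = est_hi - est_lo
        null_len = null_hi - null_lo
        if est_len > 2 * null_len:      # interval estimate too wide to be informative
            return 0.5 * overlap / null_len
        return overlap / est_len

    # Interval null: effects in (-0.2, 0.2) are treated as practically null.
    print(second_generation_p(0.3, 0.9, -0.2, 0.2))   # 0.0 -> incompatible with the null
    print(second_generation_p(-0.1, 0.1, -0.2, 0.2))  # 1.0 -> compatible only with the null
    print(second_generation_p(-1.0, 1.0, -0.2, 0.2))  # 0.5 -> inconclusive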

    Experimental analysis of computer system dependability

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: the design phase, the prototype phase, and the operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by a discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools, including FIAT, FERRARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
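
    Of the statistical techniques listed above, importance sampling lends itself to a short sketch: a rare failure probability is estimated by sampling from a proposal distribution under which failures are common and reweighting each sample by the likelihood ratio. The exponential lifetime model and the numbers below are illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)

    # Probability that an exponential(rate=1) lifetime exceeds t: a rare event.
    t = 12.0                      # exact answer: exp(-12) ~ 6.1e-6
    n = 100_000

    # Naive Monte Carlo almost never observes the event at this sample size.
    naive = np.mean(rng.exponential(1.0, n) > t)

    # Importance sampling: draw from a heavier-tailed exponential (mean 12)
    # and reweight by the likelihood ratio f(x)/g(x).
    proposal_scale = 12.0
    x = rng.exponential(proposal_scale, n)
    weights = np.exp(-x) / (np.exp(-x / proposal_scale) / proposal_scale)
    is_estimate = np.mean((x > t) * weights)

    print("naive:", naive, "importance sampling:", is_estimate, "exact:", np.exp(-t))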

    Cerebral atrophy in mild cognitive impairment and Alzheimer disease: rates and acceleration.

    OBJECTIVE: To quantify the regional and global cerebral atrophy rates and assess acceleration rates in healthy controls, subjects with mild cognitive impairment (MCI), and subjects with mild Alzheimer disease (AD). METHODS: Using 0-, 6-, 12-, 18-, 24-, and 36-month MRI scans of controls and subjects with MCI and AD from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, we calculated volume change of whole brain, hippocampus, and ventricles between all pairs of scans using the boundary shift integral. RESULTS: We found no evidence of acceleration in whole-brain atrophy rates in any group. There was evidence that hippocampal atrophy rates in MCI subjects accelerate by 0.22%/year² on average (p = 0.037). There was evidence of acceleration in rates of ventricular enlargement in subjects with MCI (p = 0.001) and AD (p < 0.001), with rates estimated to increase by 0.27 mL/year² (95% confidence interval 0.12, 0.43) and 0.88 mL/year² (95% confidence interval 0.47, 1.29), respectively. A post hoc analysis suggested that the acceleration of hippocampal loss in MCI subjects was mainly driven by the MCI subjects that were observed to progress to clinical AD within 3 years of baseline, with this group showing hippocampal atrophy rate acceleration of 0.50%/year² (p = 0.003). CONCLUSIONS: The small acceleration rates suggest a long period of transition to the pathologic losses seen in clinical AD. The acceleration in hippocampal atrophy rates in MCI subjects in the ADNI seems to be driven by those MCI subjects who concurrently progressed to a clinical diagnosis of AD.
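
    The acceleration figures above (in %/year² or mL/year²) can be read as twice the quadratic coefficient when volume change is modelled as a quadratic function of time. The toy sketch below extracts such a term from one hypothetical subject's serial volumes; the actual analysis uses the boundary shift integral and group-level longitudinal models, which are not reproduced here.

    import numpy as np

    # Hypothetical ventricular volumes (mL) at ADNI-style visit times (years).
    times = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0])
    volumes = np.array([40.0, 41.1, 42.4, 43.9, 45.6, 49.6])

    # Fit volume = a*t^2 + b*t + c; the acceleration of enlargement is 2*a.
    a, b, c = np.polyfit(times, volumes, deg=2)
    print("rate at baseline:", b, "mL/year")
    print("acceleration:", 2 * a, "mL/year^2")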

    Can the Heinrich ratio be used to predict harm from medication errors?

    The purpose of this study was to establish whether, for medication errors, there exists a fixed Heinrich ratio between the number of incidents that did not result in harm, the number that caused minor harm, and the number that caused serious harm. If this were the case, it would be very useful for estimating any changes in harm following an intervention. Serious harm resulting from medication errors is relatively rare, so it can take a great deal of time and resources to detect a significant change. If the Heinrich ratio exists for medication errors, it would be possible, and far easier, to measure the much more frequent incidents that did not result in harm and the extent to which they changed following an intervention; any reduction in harm could then be extrapolated from this.
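
    A worked toy example of the extrapolation logic described above, assuming (purely for illustration) a fixed Heinrich-style ratio of 300 no-harm to 29 minor-harm to 1 serious-harm incidents: a measured fall in no-harm incidents then scales directly to an implied fall in serious harm. All counts are invented.

    # Hypothetical fixed ratio of no-harm : minor-harm : serious-harm incidents.
    ratio = {"no_harm": 300, "minor": 29, "serious": 1}

    # Observed no-harm medication incidents before and after an intervention.
    no_harm_before, no_harm_after = 1500, 1200

    # Under a fixed ratio, serious-harm incidents scale in proportion.
    scale = ratio["serious"] / ratio["no_harm"]
    print("implied serious-harm incidents before:", no_harm_before * scale)
    print("implied serious-harm incidents after:", no_harm_after * scale)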