    HIV with contact-tracing: a case study in Approximate Bayesian Computation

    Missing data is a recurrent issue in epidemiology, where the infection process may be only partially observed. Approximate Bayesian Computation (ABC), an alternative to data imputation methods such as Markov Chain Monte Carlo (MCMC) integration, is proposed for making inference in epidemiological models. It is a likelihood-free method that relies exclusively on numerical simulations. ABC consists of computing a distance between simulated and observed summary statistics and weighting the simulations according to this distance. We propose an original extension of ABC to path-valued summary statistics, corresponding to the cumulated number of detections as a function of time. For a standard compartmental model with Susceptible, Infectious and Recovered individuals (SIR), we show that the posterior distributions obtained with ABC and MCMC are similar. In a refined SIR model well-suited to the HIV contact-tracing data in Cuba, we perform a comparison between ABC with full and binned detection times. For the Cuban data, we evaluate the efficiency of the detection system and predict the evolution of the HIV-AIDS disease. In particular, the percentage of undetected infectious individuals is found to be of the order of 40%.
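
The simulate, compare and weight loop described in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the toy chain-binomial SIR simulator, the uniform prior on `beta`, the Euclidean distance between cumulative-case paths, the Epanechnikov weights, and all function and parameter names are assumptions introduced here.

```python
import math
import random

def simulate_cumulative_cases(beta, gamma=0.2, n=100, i0=2, steps=25, rng=None):
    """Toy chain-binomial SIR (assumed model); returns cumulative infections per step."""
    rng = rng or random.Random()
    s, i, cumulative = n - i0, i0, i0
    path = []
    for _ in range(steps):
        p_inf = 1.0 - math.exp(-beta * i / n)  # per-susceptible infection probability
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s -= new_inf
        i += new_inf - new_rec
        cumulative += new_inf
        path.append(cumulative)
    return path

def path_distance(a, b):
    """Euclidean distance between two path-valued summary statistics."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def abc_posterior_mean(observed, n_sims=1000, n_keep=50, seed=0):
    """Draw beta from the prior, simulate, and weight the closest simulations."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        beta = rng.uniform(0.0, 1.5)  # assumed uniform prior on the infection rate
        simulated = simulate_cumulative_cases(beta, rng=rng)
        draws.append((path_distance(simulated, observed), beta))
    draws.sort()
    kept = draws[:n_keep]        # the n_keep simulations nearest to the data
    tolerance = kept[-1][0]      # largest retained distance
    if tolerance > 0:
        # Epanechnikov kernel: closer simulations receive larger weights
        weights = [1.0 - (d / tolerance) ** 2 for d, _ in kept]
    else:
        weights = [1.0] * len(kept)
    total = sum(weights)
    if total == 0:               # every kept distance equals the tolerance
        weights, total = [1.0] * len(kept), float(len(kept))
    mean = sum(w * b for w, (_, b) in zip(weights, kept)) / total
    return mean, [b for _, b in kept]

# Usage: pretend a simulated path is the observed data, then run ABC on it.
observed_path = simulate_cumulative_cases(0.6, rng=random.Random(1))
posterior_mean, accepted_betas = abc_posterior_mean(observed_path)
print(posterior_mean, len(accepted_betas))
```

Keeping a fixed number of nearest simulations, rather than a fixed distance threshold, is one common way to set the tolerance; the abstract does not say which choice the authors made.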

    Functional diversity metrics can perform well with highly incomplete data sets

    Characterising changes in functional diversity at large spatial scales provides insight into the impact of human activity on ecosystem structure and function. However, the approach is often based on trait data sets that are incomplete and unrepresentative, with uncertain impacts on functional diversity estimates. To address this knowledge gap, we simulated random and biased removal of data from three empirical trait data sets: an avian data set (9579 species), a plant data set (2185 species) and a crocodilian data set (25 species). For these data sets, we assessed whether functional diversity metrics were robust to data incompleteness with and without using imputation to fill data gaps. We compared two metrics, each calculated with two methods: functional richness (calculated with convex hulls and trait probability densities) and functional divergence (calculated with distance-based Rao and trait probability densities). Without imputation, estimates of functional diversity (richness and divergence) for birds and plants were robust when 20%–70% of species had missing data for four out of 11 and two out of six continuous traits, respectively, depending on the severity of bias and method used. However, when missing traits were imputed, functional diversity metrics consistently remained representative of the true value when 70% of bird species were missing data for four out of 11 traits and when 50% of plant species were missing data for two out of six traits. Trait probability densities and distance-based Rao were particularly robust to missingness and bias when combined with imputation. Convex hull-based estimations of functional richness were less reliable. When applied to a smaller data set (crocodilians, 25 species), all functional diversity metrics were much more sensitive to missing data. Expanding global morphometric data sets to represent more taxa and traits, and to quantify intraspecific variation, remains a priority. In the meantime, our results show that widely used methods can successfully quantify large-scale functional diversity even when data are missing for half of species, provided that missing traits are estimated using imputation. We recommend the use of trait probability densities or distance-based Rao when working with large incomplete data sets and filling data gaps with imputation.

    Adaptive imputation of missing values for incomplete pattern classification

    In the classification of incomplete patterns, the missing values can either play a crucial role in determining the class or have little influence (possibly none) on the classification result, depending on the context. We propose a credal classification method for incomplete patterns with adaptive imputation of missing values, based on belief function theory. First, we try to classify the object (incomplete pattern) based only on the available attribute values. As the underlying principle, we assume that the missing information is not crucial for the classification if a specific class for the object can be found using only the available information. In this case, the object is committed to this particular class. However, if the object cannot be classified without ambiguity, the missing values play a major role in achieving an accurate classification. In this case, the missing values are imputed using the K-nearest neighbor (K-NN) and self-organizing map (SOM) techniques, and the edited pattern with the imputed values is then classified. The (original or edited) pattern is classified with respect to each training class, and the classification results, represented by basic belief assignments, are fused with proper combination rules to make the credal classification. The object is allowed to belong, with different masses of belief, to the specific classes and to meta-classes (which are particular disjunctions of several single classes). The credal classification captures the uncertainty and imprecision of classification well, and effectively reduces the misclassification rate thanks to the introduction of meta-classes. The effectiveness of the proposed method with respect to other classical methods is demonstrated through several experiments using artificial and real data sets.
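
The "classify first, impute only if ambiguous" idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the credal method itself: belief functions, SOM, meta-classes and the fusion step are all omitted, and the nearest-centroid classifier, the distance-ratio ambiguity test, the `ambiguity` threshold and every function name below are assumptions made for illustration. Missing attributes are encoded as `None`, and at least two classes are assumed.

```python
import math

def partial_distance(x, y):
    """Euclidean distance over the attributes that are observed (not None) in x."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y) if a is not None))

def class_centroids(training, labels):
    """Mean vector of each training class (training rows are assumed complete)."""
    groups = {}
    for row, lab in zip(training, labels):
        groups.setdefault(lab, []).append(row)
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}

def knn_impute(x, training, k=3):
    """Fill each missing attribute with the mean over the k nearest training rows."""
    neighbours = sorted(training, key=lambda row: partial_distance(x, row))[:k]
    return [a if a is not None else sum(r[j] for r in neighbours) / len(neighbours)
            for j, a in enumerate(x)]

def classify_adaptive(x, training, labels, k=3, ambiguity=0.8):
    """Return (label, pattern); the pattern is imputed only when x is ambiguous."""
    centroids = class_centroids(training, labels)

    def ranked(pattern):
        return sorted((partial_distance(pattern, c), lab)
                      for lab, c in centroids.items())

    (d1, best), (d2, _) = ranked(x)[:2]
    if d2 > 0 and d1 / d2 < ambiguity:
        return best, x                    # clear decision from available attributes
    filled = knn_impute(x, training, k)   # ambiguous: impute, then classify again
    return ranked(filled)[0][1], filled

# Usage: the first attribute alone separates the two classes, so nothing is imputed.
training = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["a", "a", "b", "b"]
print(classify_adaptive([0.5, None], training, labels))
```

The ratio of the two smallest centroid distances is just one possible ambiguity measure; the abstract instead relies on basic belief assignments to decide when the available attributes are insufficient.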

    Application of Multiple imputation in Analysis of missing data in a study of Health-related quality of life

    When a new treatment has similar efficacy to standard therapy in medical or social studies, health-related quality of life (HRQL) becomes the main concern of health care professionals and can be the basis for decisions in patient management. The National Surgical Adjuvant Breast and Bowel Project (NSABP) C-06 clinical trial compared two therapies in the treatment of colon cancer: intravenous (IV) fluorouracil (FU) plus leucovorin (LV), and oral uracil/ftorafur (UFT) plus LV. However, there was a high proportion of missing values among the HRQL measurements: only 481 (59.8%) UFT patients and 421 (52.4%) FU patients submitted the forms at all time points. Ignoring the missing data issue often leads to inefficient and sometimes biased estimates. The primary objective of this thesis is to evaluate the impact of missing data on the estimated treatment effect. In this thesis, we analyzed the HRQL data with missing values by multiple imputation. Both model-based and nearest-neighbor hot-deck imputation methods were applied. Confidence intervals for the estimated treatment effect were generated based on the pooled imputation analysis. The results based on multiple imputation indicated that missing data did not introduce major bias in the earlier analyses. However, multiple imputation was worthwhile, since most estimates from the imputed datasets were more efficient than those from the incomplete data. These findings have public health importance: they have implications for the development of health policies and the planning of interventions to improve the health-related quality of life of patients with colon cancer.
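
A pooled imputation analysis of the kind mentioned in this abstract is commonly carried out with Rubin's rules. The abstract does not state which pooling rule the thesis used, so the sketch below is an assumption, as are the function name `pool_rubin` and the example numbers. It combines one point estimate and one squared standard error per imputed dataset into a pooled estimate, a pooled standard error and the classical degrees of freedom.

```python
import math
import statistics

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between     # total variance
    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = math.inf                          # imputations agree exactly
    return q_bar, math.sqrt(total), df

# Usage with made-up numbers: treatment-effect estimates from three imputed
# datasets and their squared standard errors.
estimate, std_error, dof = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, std_error, dof)
```

A confidence interval would then be the pooled estimate plus or minus a Student t quantile with `dof` degrees of freedom times the pooled standard error; the quantile is left out here because the Python standard library does not provide it.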