3,842 research outputs found

    The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data.

    Multiple imputation can be used as a tool in the process of constructing prediction models in medical and epidemiological studies with missing covariate values. Such models can be used to make predictions for model performance assessment, but the task is made more complicated by the multiple imputation structure. We summarize various predictions constructed from covariates, including multiply imputed covariates, and either the set of imputation-specific prediction model coefficients or the pooled prediction model coefficients. We further describe approaches for using the predictions to assess model performance. We distinguish between ideal model performance and pragmatic model performance, where the former refers to the model's performance in an ideal clinical setting where all individuals have fully observed predictors and the latter refers to the model's performance in a real-world clinical setting where some individuals have missing predictors. The approaches are compared through an extensive simulation study based on the UK700 trial. We determine that measures of ideal model performance can be estimated within imputed datasets and subsequently pooled to give an overall measure of model performance. Alternative methods to evaluate pragmatic model performance are required and we propose constructing predictions either from a second set of covariate imputations which make no use of observed outcomes, or from a set of partial prediction models constructed for each potential observed pattern of covariates. Pragmatic model performance is generally lower than ideal model performance. We focus on model performance within the derivation data, but describe how to extend all the methods to a validation dataset.
    Angela Wood was part supported by MRC grant G0701619. Ian White is from the MRC Biostatistics Unit, with unit programme number U105260558. This is the final version. It was first published by Wiley at http://onlinelibrary.wiley.com/doi/10.1002/bimj.201400004/abstract;jsessionid=144424FA52D50041821329D8A7741BFD.f02t0

    Correcting for optimistic prediction in small data sets.

    The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991-2003), Edinburgh (1999-2003), and Cambridge (1990-2006), as well as Scottish national pregnancy discharge data (2004-2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation.
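
    As a reminder of the quantity being corrected, the C statistic can be read as the share of case/non-case pairs in which the case receives the higher predicted score. The sketch below is only an illustration of that definition, not the authors' code; the function name `c_statistic`, the 0/1 outcome coding, and the convention of counting tied scores as one half are assumptions made here. It computes the apparent (uncorrected) value and does not implement any of the optimism adjustments compared in the abstract.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Apparent C statistic for binary outcomes coded 1 (case) and 0 (non-case).

    Counts the case/non-case pairs in which the case has the higher score;
    tied scores contribute one half (an assumed, commonly used convention).
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score, non_case_score in product(cases, non_cases):
        if case_score > non_case_score:
            concordant += 1.0
        elif case_score == non_case_score:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical scores: three of the four case/non-case pairs are concordant.
print(c_statistic([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

    Because every case/non-case pair enters the calculation, a pair-based resampling scheme such as the leave-pair-out cross-validation named above maps naturally onto this definition.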

    Multiple imputation for an incomplete covariate that is a ratio.

    We are concerned with multiple imputation of the ratio of two variables, which is to be used as a covariate in a regression analysis. If the numerator and denominator are not missing simultaneously, it seems sensible to make use of the observed variable in the imputation model. One such strategy is to impute missing values for the numerator and denominator, or the log-transformed numerator and denominator, and then calculate the ratio of interest; we call this 'passive' imputation. Alternatively, missing ratio values might be imputed directly, with or without the numerator and/or the denominator in the imputation model; we call this 'active' imputation. In two motivating datasets, one involving body mass index as a covariate and the other involving the ratio of total to high-density lipoprotein cholesterol, we assess the sensitivity of results to the choice of imputation model and, as an alternative, explore fully Bayesian joint models for the outcome and incomplete ratio. Fully Bayesian approaches using WinBUGS were unusable in both datasets because of computational problems. In our first dataset, multiple imputation results are similar regardless of the imputation model; in the second, results are sensitive to the choice of imputation model. Sensitivity depends strongly on the coefficient of variation of the ratio's denominator. A simulation study demonstrates that passive imputation without transformation is risky because it can lead to downward bias when the coefficient of variation of the ratio's denominator is larger than about 0.1. Active imputation or passive imputation after log-transformation is preferable.
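
    The two quantities this abstract turns on can be sketched in a few lines: the ratio derived 'passively' from separately filled-in numerator and denominator values, and the coefficient of variation of the denominator. This is an illustration only, not the authors' procedure; the function names and the example values are hypothetical, and no imputation model is fitted here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def passive_ratio(numerators, denominators):
    """Form the ratio from numerator and denominator values that were
    filled in separately (the 'passive' route described above)."""
    return [n / d for n, d in zip(numerators, denominators)]


# Hypothetical values: weight in kg and squared height in m^2, giving BMI.
weights = [62.0, 70.0, 85.0]
heights_squared = [1.60 ** 2, 1.75 ** 2, 1.80 ** 2]

bmi = passive_ratio(weights, heights_squared)
cv = coefficient_of_variation(heights_squared)

# The abstract reports a risk of downward bias for untransformed passive
# imputation once this value exceeds about 0.1.
print(bmi, cv)
```

    Checking the denominator's coefficient of variation before choosing between passive and active imputation is one simple way to act on the sensitivity finding reported above.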

    Non-modifiable risk factors for stress fractures in military personnel undergoing training: A systematic review

    A fracture, being an acquired rupture or break of the bone, is a significant and debilitating injury commonly seen among athletes and military personnel. Stress fractures, which have a repetitive stress aetiology, are highly prevalent among military populations, especially those undergoing training. The primary aim of this review is to identify non-modifiable risk factors for stress fractures in military personnel undergoing training. A systematic search was conducted of three major databases to identify studies that explored risk factors for stress fractures in military trainees. Critical appraisal, data extraction, and a narrative synthesis were conducted. Sixteen articles met the eligibility criteria for the study. Key non-modifiable risk factors identified were prior stress fracture and menstrual dysfunction, while advancing age and race other than black race may be risk factors. To reduce the incidence of stress fractures in military trainees, mitigating modifiable risk factors among individuals with non-modifiable risk factors (e.g., optimising conditioning for older trainees) or better accommodating non-modifiable factors (for example, extending training periods and reducing intensity to facilitate recovery and adaptation) is suggested, with a focus on groups at increased risk identified in this review.

    CRISPR-Cas defense system and potential prophages in cyanobacteria associated with the coral black band disease

    Understanding how pathogens maintain their virulence is critical to developing tools to mitigate disease in animal populations. We sequenced and assembled the first draft genome of Roseofilum reptotaenium AO1, the dominant cyanobacterium underlying pathogenicity of the virulent coral black band disease (BBD), and analyzed parts of the BBD-associated Geitlerinema sp. BBD_1991 genome in silico. Both cyanobacteria are equipped with an adaptive, heritable clustered regularly interspaced short palindromic repeats (CRISPR)-Cas defense system type I-D and have potential virulence genes located within several prophage regions. The defense system helps to prevent infection by viruses and mobile genetic elements via identification of short fingerprints of the intruding DNA, which are stored as templates in the bacterial genome, in so-called CRISPRs. Analysis of CRISPR target sequences (protospacers) revealed an unusually high number of self-targeting spacers in R. reptotaenium AO1 and extraordinarily long CRISPR arrays of up to 260 spacers in Geitlerinema sp. BBD_1991. The self-targeting spacers are unlikely to be a form of autoimmunity; instead these target an incomplete lysogenic bacteriophage. Lysogenic virus induction experiments with mitomycin C and UV light did not reveal an actively replicating virus population in R. reptotaenium AO1 cultures, suggesting that phage functionality is compromised or excision could be blocked by the CRISPR-Cas system. Potential prophages were identified in three regions of R. reptotaenium AO1 and five regions of Geitlerinema sp. BBD_1991, containing putative BBD-relevant virulence genes, such as an NAD-dependent epimerase/dehydratase (a homolog in terms of functionality to the third and fourth most expressed genes in BBD), lysozyme/metalloendopeptidases and other lipopolysaccharide modification genes. To date, viruses have not been considered to be a component of the BBD consortium or a contributor to the virulence of R. reptotaenium AO1 and Geitlerinema sp. BBD_1991. We suggest that the presence of virulence genes in potential prophage regions, and the CRISPR-Cas defense systems are evidence of an arms race between the respective cyanobacteria and their bacteriophage predators. The presence of such a defense system likely reduces the number of successful bacteriophage infections and mortality in the cyanobacteria, facilitating the progress of BBD.

    Local-Circuit Phenotypes of Layer 5 Neurons in Motor-Frontal Cortex of YFP-H Mice

    Layer 5 pyramidal neurons comprise an important but heterogeneous group of cortical projection neurons. In motor-frontal cortex, these neurons are centrally involved in the cortical control of movement. Recent studies indicate that local excitatory networks in mouse motor-frontal cortex are dominated by descending pathways from layer 2/3 to 5. However, those pathways were identified in experiments involving unlabeled neurons in wild-type mice. Here, to explore the possibility of class-specific connectivity in this descending pathway, we mapped the local sources of excitatory synaptic input to a genetically labeled population of cortical neurons: YFP-positive layer 5 neurons of YFP-H mice. We found, first, that in motor cortex, YFP-positive neurons were distributed in a double blade, consistent with the idea of layer 5B having greater thickness in frontal neocortex. Second, whereas unlabeled neurons in upper layer 5 received their strongest inputs from layer 2, YFP-positive neurons in the upper blade received prominent layer 3 inputs. Third, YFP-positive neurons exhibited distinct electrophysiological properties, including low spike frequency adaptation, as reported previously. Our results with this genetically labeled neuronal population indicate the presence of distinct local-circuit phenotypes among layer 5 pyramidal neurons in mouse motor-frontal cortex, and present a paradigm for investigating local circuit organization in other genetically labeled populations of cortical neurons.