262,535 research outputs found

    Added predictive value of high-throughput molecular data to clinical data, and its validation

    Hundreds of "molecular signatures" have been proposed in the literature to predict patient outcome in clinical settings from high-dimensional data, many of which eventually failed to be validated. Validation of such molecular research findings is thus becoming an increasingly important branch of clinical bioinformatics. Moreover, in practice well-known clinical predictors are often already available. From a statistical and bioinformatics point of view, little attention has been given to evaluating the added predictive value of a molecular signature when clinical predictors are available. This article reviews procedures that assess and validate the added predictive value of high-dimensional molecular data. It critically surveys various approaches for the construction of combined prediction models using both clinical and molecular data, for validating added predictive value based on independent data, and for assessing added predictive value using a single data set.

    The use of diatom records to establish reference conditions for UK lakes subject to eutrophication

    Knowledge of pre-disturbance conditions is important for setting realistic restoration targets for lakes. For European waters this is now a requirement of the European Council Water Framework Directive, under which ecological status must be assessed based on the degree to which present-day conditions deviate from reference conditions. Here, we employ palaeolimnological techniques, principally inferences of total phosphorus from diatom assemblages (DI-TP) and classification of diatom composition data from the time slice in sediment cores dated to ~1850 AD, to define chemical and ecological reference conditions, respectively, for a range of UK lake types. The DI-TP results from 169 sites indicate that reference TP values for low alkalinity lakes are typically [...] (> 3 m mean depth) generally had lower reference TP concentrations than the shallow sites. A small group of shallow marl lakes had concentrations of ~30 µg L⁻¹. Cluster analysis of diatom composition data from 106 lakes where the key pressure of interest was eutrophication identified three clusters, each associated with particular lake types, suggesting that the typology has ecological relevance, although poor cross-matching of the diatom groups and the lake typology at type boundaries highlights the value of a site-specific approach to defining reference conditions. Finally, the floristic difference between the reference and present-day (surface sample) diatom assemblages of each site was estimated using the squared chord distance dissimilarity coefficient. Only 25 of the 106 lakes experienced insignificant change and the findings indicate that eutrophication has impacted all lake types, with > 50% of sites exhibiting significant floristic change. The study illustrates the role of the sediment record in determining both chemical and ecological reference conditions, and assessing deviation from the latter. Whilst restoration targets may require modification in the future to account for climate-induced alterations, the long temporal perspective offered by palaeolimnology ensures that such changes are assessed against a sound baseline.
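
    The squared chord distance named in this abstract is a standard dissimilarity coefficient: for two assemblages expressed as proportions, it sums the squared differences of the square-rooted proportions, giving 0 for identical assemblages and 2 for assemblages with no taxa in common. The following is a minimal sketch of that calculation; the function names and the taxon counts are illustrative assumptions and are not taken from the study.

```python
import math


def to_proportions(counts):
    """Convert raw taxon counts to proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def squared_chord_distance(p, q):
    """Squared chord distance between two assemblages given as proportions.

    Both lists must cover the same taxa in the same order.
    """
    if len(p) != len(q):
        raise ValueError("assemblages must list the same taxa")
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


# Hypothetical counts for four taxa (illustrative only, not data from the study).
reference = to_proportions([60, 30, 10, 0])  # e.g. a core sample from the reference time slice
surface = to_proportions([10, 20, 30, 40])   # e.g. a present-day surface sample

identical = squared_chord_distance(reference, reference)
changed = squared_chord_distance(reference, surface)
```

    Whether a given distance counts as significant floristic change depends on a threshold, which this sketch deliberately does not set.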

    Two close large quasar groups of size ∼ 350 Mpc at

    The Clowes & Campusano large quasar group (LQG) at inline image has been re-examined using the quasar data from the DR7QSO catalogue of the Sloan Digital Sky Survey. In the 1991 discovery, the LQG impinged on the northern, southern and eastern limits of the survey. In the DR7QSO data, the western, northern and southern boundaries of the LQG remain essentially the same, but an extension eastwards of ∼2° is indicated. In the DR7QSO data, the LQG has 34 members, with inline image. A new group of 38 members is indicated at inline image and within ∼2° of the Clowes & Campusano LQG. The characteristic sizes of these two LQGs, ∼350–400 Mpc, appear to be only marginally consistent with the scale of homogeneity in the concordance cosmology. In addition to their intrinsic interest, these two LQGs provide locations in which to investigate early large-scale structure in galaxies and to identify high-z clusters. A method is presented for assessing the statistical significance and overdensity of groups found by linkage of points.
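
    The abstract does not say which linkage rule is used to find groups. As a hedged illustration of what "linkage of points" can mean, the sketch below uses single linkage (often called friends-of-friends): any two points closer than a chosen linking length join the same group. The function name, the coordinates and the linking length are assumptions made for the example, and the sketch does not implement the paper's significance or overdensity assessment.

```python
import math


def link_groups(points, linking_length):
    """Group points by single linkage using a small union-find structure.

    Two points belong to the same group if a chain of neighbours, each
    within linking_length of the next, connects them.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    # Largest groups first; each group is a list of point indices.
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical comoving positions in Mpc (illustrative only).
positions = [(0, 0, 0), (50, 0, 0), (100, 0, 0), (500, 0, 0)]
groups = link_groups(positions, linking_length=60)
```

    The pairwise loop is quadratic in the number of points, which is fine for a sketch but would usually be replaced by a spatial index for a full catalogue.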

    Over-optimism in bioinformatics: an illustration

    In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method's characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.

    Prevalence and diagnosis of congenital uterine anomalies in women with reproductive failure: a critical appraisal

    BACKGROUND: The prevalence of congenital uterine anomalies in women with reproductive failure remains unclear, largely due to methodological bias. The aim of this review is to assess the diagnostic accuracy of different methodologies and estimate the prevalence of congenital uterine anomalies in women with infertility and recurrent miscarriage (RM). METHODS: Studies from 1950–2007 were identified through a computer MEDLINE search; all relevant references were further reviewed. RESULTS: The most accurate diagnostic procedures are combined hysteroscopy and laparoscopy, sonohysterography (SHG) and possibly three-dimensional ultrasound (3D US). Two-dimensional ultrasound (2D US) and hysterosalpingography (HSG) are less accurate and are thus inadequate for diagnostic purposes. Preliminary studies (n=24) suggest magnetic resonance imaging (MRI) is a relatively sensitive tool. A critical analysis of studies suggests that the prevalence of congenital uterine anomalies is ~6.7% [95% confidence interval (CI), 6.0–7.4] in the general population, ~7.3% (95% CI, 6.7–7.9) in the infertile population and ~16.7% (95% CI, 14.8–18.6) in the RM population. The arcuate uterus is the commonest anomaly in the general and RM populations. In contrast, the septate uterus is the commonest anomaly in the infertile population, suggesting a possible association. CONCLUSION: Women with RM have a high prevalence of congenital uterine anomalies and should be thoroughly investigated. HSG and/or 2D US can be used as an initial screening tool. Combined hysteroscopy and laparoscopy, SHG and 3D US can be used for a definitive diagnosis. The accuracy and practicality of MRI remain unclear.