
    False discovery rates in somatic mutation studies of cancer

    The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In this article we discuss statistical methods for the analysis of somatic mutation frequency data generated in these studies. We place special emphasis on a two-stage study design introduced by Sjöblom et al. [Science 314 (2006) 268-274]. In this context, we describe and compare statistical methods for constructing scores that can be used to prioritize candidate genes for further investigation and to assess the statistical significance of the candidates thus identified. Controversy has surrounded the reliability of the false discovery rate estimates provided by the approximations used in early cancer genome studies. To address these concerns, we develop a semiparametric Bayesian model that provides an accurate fit to the data. We use this model to generate a large collection of realistic scenarios and evaluate alternative approaches on this collection. Our assessment is impartial, in that the model used for generating data is not used by any of the approaches compared, and objective, in that the scenarios are generated by a model that fits the data. Our results quantify the conservative control of the false discovery rate with the Benjamini and Hochberg method compared to the empirical Bayes approach and the multiple testing method proposed in Storey [J. R. Stat. Soc. Ser. B Stat. Methodol. 64 (2002) 479-498]. Simulation results also show a negligible departure from the target false discovery rate for the methodology used in Sjöblom et al. [Science 314 (2006) 268-274]. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/10-AOAS438.
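    As context for this comparison, the following is a minimal Python sketch of two of the frequentist procedures discussed above: the Benjamini and Hochberg step-up rule for FDR control and Storey's (2002) estimate of the null proportion pi0, which underlies his multiple testing method. The significance level and p-value inputs are placeholders, not values from the study.

    import numpy as np

    def benjamini_hochberg(pvals, alpha=0.10):
        """Boolean mask of hypotheses rejected by the BH step-up procedure."""
        p = np.asarray(pvals, dtype=float)
        m = p.size
        order = np.argsort(p)
        thresholds = alpha * np.arange(1, m + 1) / m
        below = p[order] <= thresholds
        reject = np.zeros(m, dtype=bool)
        if below.any():
            k = np.nonzero(below)[0].max()   # largest k with p_(k) <= k * alpha / m
            reject[order[:k + 1]] = True
        return reject

    def storey_pi0(pvals, lam=0.5):
        """Storey (2002) estimate of the proportion of true null hypotheses."""
        p = np.asarray(pvals, dtype=float)
        return min(1.0, np.mean(p > lam) / (1.0 - lam))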

    Mixed Effect Modeling of Dose and Linear Energy Transfer Correlations With Brain Image Changes After Intensity Modulated Proton Therapy for Skull Base Head and Neck Cancer

    Purpose: Intensity modulated proton therapy (IMPT) can yield high linear energy transfer (LET) in critical structures and an increased biological effect. For head and neck cancers at the skull base, this could potentially result in radiation-associated brain image change (RAIC). The purpose of the current study was to investigate voxel-wise dose and LET correlations with RAIC after IMPT.
    Methods and Materials: For 15 patients with RAIC after IMPT, contrast enhancement observed on T1-weighted magnetic resonance imaging was contoured and coregistered to the planning computed tomography. Monte Carlo calculated dose and dose-averaged LET (LETd) distributions were extracted at the voxel level, and associations with RAIC were modelled using uni- and multivariate mixed effect logistic regression. Model performance was evaluated using the area under the receiver operating characteristic curve and the precision-recall curve.
    Results: A statistically significant association of RAIC with dose and LETd was found in both the uni- and multivariate analyses. Patient heterogeneity was considerable, with standard deviations of the random effects of 1.81 (1.30-2.72) for dose and 2.68 (1.93-4.93) for LETd. The area under the receiver operating characteristic curve was 0.93 and 0.95 for the univariate dose-response model and the multivariate model, respectively. Analysis of the LETd effect demonstrated an increased risk of RAIC with increasing LETd for the majority of patients. The estimated probability of RAIC with LETd = 1 keV/µm was 4% (95% confidence interval, 0%-44%) and 29% (95% confidence interval, 1%-92%) for 60 and 70 Gy, respectively. TD15, the dose with an estimated 15% probability of RAIC, was estimated to be 63.6 and 50.1 Gy for LETd equal to 2 and 5 keV/µm, respectively.
    Conclusions: Our results suggest that the LETd effect could be of clinical significance for some patients; LETd assessment in clinical treatment plans should therefore be taken into consideration.
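    To make the voxel-wise model concrete, here is a minimal Python sketch of a mixed-effect logistic dose/LETd model of the kind described above. All coefficient values (beta, u) are illustrative placeholders, not the study's fitted estimates.

    import numpy as np

    def raic_probability(dose, letd, beta=(-12.0, 0.15, 0.8), u=(0.0, 0.0, 0.0)):
        """Voxel-wise probability of RAIC under a mixed-effect logistic model.

        beta holds the fixed effects (intercept, dose, LETd); u holds one
        patient's random effects, added to the corresponding fixed effects.
        All values here are placeholders, not fitted estimates.
        """
        b0, b1, b2 = (b + r for b, r in zip(beta, u))
        eta = b0 + b1 * np.asarray(dose) + b2 * np.asarray(letd)
        return 1.0 / (1.0 + np.exp(-eta))

    # Example: predicted RAIC probability for a voxel at 60 Gy and LETd = 2 keV/um.
    print(raic_probability(60.0, 2.0))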

    Rescaled bootstrap confidence intervals for the population variance in the presence of outliers or spikes in the distribution of a variable of interest

    Confidence intervals for the population variance in the presence of outliers or spikes in the distribution of a variable of interest have not previously been investigated in depth. Results from a first Monte Carlo simulation study reveal the limitations of the customary confidence interval for the population variance when the underlying assumptions are violated, which justifies the use of alternative confidence intervals. We suggest confidence intervals based on the rescaled bootstrap method for several reasons. First, it is a simple technique that can be easily applied in practice. Second, it does not rely on distributional assumptions. Finally, it can be easily applied to finite populations and to samples selected under complex sampling designs. Results from a second Monte Carlo simulation study indicate that the suggested confidence intervals achieve the desired coverage rates with smaller average widths. Accordingly, an advantage of the suggested confidence intervals is that they offer a good compromise between simplicity and desirable properties. The various simulation studies are based on different scenarios that may arise in practice, such as the presence of outliers or spikes and the violation of the underlying assumptions of the customary confidence interval.
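    As a rough illustration of the bootstrap approach, here is a minimal Python sketch of a percentile bootstrap confidence interval for the population variance. This is the plain i.i.d. bootstrap, used as a stand-in: the paper's rescaled bootstrap additionally rescales the resampled values to accommodate finite populations and complex sampling designs.

    import numpy as np

    rng = np.random.default_rng(0)

    def bootstrap_variance_ci(x, n_boot=5000, alpha=0.05):
        """Percentile bootstrap CI for the variance of the population behind x."""
        x = np.asarray(x, dtype=float)
        idx = rng.integers(0, x.size, size=(n_boot, x.size))  # resample with replacement
        boot_vars = x[idx].var(axis=1, ddof=1)                # variance of each resample
        return tuple(np.quantile(boot_vars, [alpha / 2, 1 - alpha / 2]))

    # Example with a heavy-tailed sample, where normal-theory intervals misbehave.
    sample = rng.standard_t(df=3, size=200)
    print(bootstrap_variance_ci(sample))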

    Continuous testing for Poisson process intensities: A new perspective on scanning statistics

    We propose a novel continuous testing framework to test the intensities of Poisson processes. This framework allows a rigorous definition of the complete testing procedure, from an infinite number of hypotheses to joint error rates. Our work extends traditional procedures based on scanning windows by controlling the family-wise error rate and the false discovery rate in a non-asymptotic and continuous manner. The decision rule is based on a p-value process that can be estimated by a Monte Carlo procedure. We also propose new test statistics based on kernels. Our method is applied in neuroscience and genomics through the standard test of homogeneity and the two-sample test.
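    As a simplified illustration of the Monte Carlo estimation of p-values mentioned above, the sketch below tests the homogeneity of a single Poisson process: conditional on the number of events N, a homogeneous process on [0, T] has event times distributed as N i.i.d. Uniform(0, T) order statistics, so a Kolmogorov-Smirnov-type statistic can be calibrated by simulation. This is one classical test, not the paper's continuous multiple-testing framework.

    import numpy as np

    rng = np.random.default_rng(1)

    def homogeneity_pvalue(event_times, T, n_mc=2000):
        """Monte Carlo p-value for homogeneity of a Poisson process on [0, T]."""
        u_obs = np.sort(np.asarray(event_times, dtype=float)) / T
        n = u_obs.size
        grid = np.arange(1, n + 1) / n

        def ks(u):
            u = np.sort(u)
            # Two-sided KS distance between the empirical CDF and Uniform(0, 1).
            return np.max(np.maximum(grid - u, u - (grid - 1.0 / n)))

        obs = ks(u_obs)
        sims = np.array([ks(rng.uniform(size=n)) for _ in range(n_mc)])
        return (1 + np.sum(sims >= obs)) / (n_mc + 1)  # add-one Monte Carlo p-value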

    Algorithmic Analysis Techniques for Molecular Imaging

    This study addresses image processing techniques for two medical imaging modalities, Positron Emission Tomography (PET) and Magnetic Resonance Imaging (MRI), which can be used to study human body function and anatomy non-invasively. In PET, the so-called partial volume effect (PVE) is caused by the low spatial resolution of the modality. The efficiency of a set of PVE-correction methods is evaluated in the present study; these methods use information about tissue borders acquired with MRI. In addition, a novel method is proposed for MRI brain image segmentation. A standard approach in brain MRI segmentation is to use spatial prior information. While this works for adults and healthy neonates, the large anatomical variation in premature infants precludes its direct application. The proposed technique can be applied to both healthy and non-healthy premature infant brain MR images. Diffusion Weighted Imaging (DWI) is an MRI-based technique that can be used to create images for measuring physiological properties of cells at the structural level. We optimise the scanning parameters of DWI so that the required acquisition time can be reduced while still maintaining good image quality. In the present work, PVE correction methods and physiological DWI models are also evaluated in terms of the repeatability of their results, which gives information on the reliability of the measures produced by the methods. The evaluations are performed using physical phantom objects, correlation measurements against expert segmentations, computer simulations with realistic noise modelling, and repeated measurements conducted on real patients. In PET, the applicability and selection of a suitable partial volume correction method was found to depend on the target application. For MRI, data-driven segmentation offers an alternative when using a spatial prior is not feasible. For DWI, the distribution of b-values turns out to be a central factor affecting the time-quality ratio of the DWI acquisition, and an optimal b-value distribution was determined. This helps to shorten the imaging time without hampering diagnostic accuracy.
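    For background on the b-value optimisation discussed above, here is a minimal Python sketch of the standard mono-exponential DWI signal model S(b) = S0 * exp(-b * ADC), fitted by log-linear least squares. Which set of b-values to acquire is exactly the design question the thesis optimises; the b-values and signals below are placeholders.

    import numpy as np

    def fit_adc(bvalues, signals):
        """Fit S(b) = S0 * exp(-b * ADC) by least squares on log-signals."""
        b = np.asarray(bvalues, dtype=float)      # b-values in s/mm^2
        y = np.log(np.asarray(signals, dtype=float))
        slope, intercept = np.polyfit(b, y, 1)    # log S = log S0 - b * ADC
        return np.exp(intercept), -slope          # (S0, ADC in mm^2/s)

    # Example with synthetic signals at placeholder b-values.
    b = [0, 200, 400, 600, 800, 1000]
    s = [1000.0 * np.exp(-bi * 1e-3) for bi in b]
    print(fit_adc(b, s))  # recovers S0 = 1000, ADC = 1e-3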

    Ownership and Financial Performance in the German Hospital Sector

    This paper considers the role of ownership form in the financial performance of German acute care hospitals and its development over time. We measure financial performance by a hospital-specific yearly probability of default (PD). Using a panel of hospital data, our models allow for state dependence in the PD as well as unobserved individual heterogeneity. We find that private ownership is more likely to be associated with sound financial performance than public ownership. Moreover, state dependence in the PD is substantial, albeit not ownership-specific. Finally, our evidence suggests that overall efficiency may be enhanced most by closing down some loss-making public hospitals rather than by restructuring them, especially because the German hospital market has substantial excess capacities.
    Keywords: hospital ownership, financial performance, state dependence
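    To illustrate what a state-dependent PD model looks like, the following Python sketch fits a pooled dynamic logit in which this year's default indicator depends on last year's. It is a simplified stand-in: the paper's models additionally account for unobserved hospital-level heterogeneity, and the column names here are hypothetical.

    import pandas as pd
    import statsmodels.api as sm

    def dynamic_pd_logit(df):
        """Pooled dynamic logit of a yearly default indicator.

        Expects hypothetical columns: hospital (id), year, default (0/1),
        private (0/1 ownership dummy). The lagged default term captures
        state dependence in the PD.
        """
        df = df.sort_values(["hospital", "year"]).copy()
        df["default_lag"] = df.groupby("hospital")["default"].shift(1)
        df = df.dropna(subset=["default_lag"])
        X = sm.add_constant(df[["default_lag", "private"]])
        return sm.Logit(df["default"], X).fit(disp=0)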

    Uncertainty in Machine Learning: A Safety Perspective on Biomedical Applications

    Uncertainty is an inevitable and essential aspect of the world we live in and a fundamental aspect of human decision-making. It is no different in the realm of machine learning. Just as humans seek out additional information and perspectives when faced with uncertainty, machine learning models must also be able to account for and quantify the uncertainty in their predictions. However, uncertainty quantification in machine learning models is often neglected. By acknowledging and incorporating uncertainty quantification into machine learning models, we can build more reliable and trustworthy systems that are better equipped to handle the complexity of the world and to support clinical decision-making. This thesis addresses the broad issue of uncertainty quantification in machine learning, covering the development and adaptation of uncertainty quantification methods, their integration into the machine learning development pipeline, and their practical application in clinical decision-making. Original contributions include methods that support practitioners in developing more robust and interpretable models, accounting for different sources of uncertainty across the core components of the machine learning pipeline: the data, the machine learning model, and its outputs. Moreover, these machine learning models are designed with abstaining capabilities, enabling them to accept or reject predictions based on the level of uncertainty present; this underlines the importance of classification with a rejection option in clinical decision support systems. The effectiveness of the proposed methods was evaluated on databases of physiological signals from medical diagnosis and human activity recognition. The results support the importance of uncertainty quantification for more reliable and robust model predictions. By addressing these topics, this thesis aims to improve the reliability and trustworthiness of machine learning models and to foster the adoption of machine-assisted clinical decision-making. The ultimate goal is to enhance the trust in and accuracy of model predictions and to increase transparency and interpretability, ultimately leading to better decision-making across a range of applications.
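    As a minimal illustration of the abstaining behaviour described above, the Python sketch below accepts a prediction only when the model's top class probability clears a confidence threshold and otherwise rejects it for human review. The threshold value is an illustrative placeholder; the thesis derives rejection rules from richer uncertainty estimates.

    import numpy as np

    def predict_with_rejection(proba, threshold=0.8):
        """Return class labels, or -1 where the classifier abstains.

        proba: array of shape (n_samples, n_classes) holding predicted
        class probabilities, e.g. from any probabilistic classifier.
        """
        proba = np.asarray(proba, dtype=float)
        labels = proba.argmax(axis=1)
        confident = proba.max(axis=1) >= threshold
        return np.where(confident, labels, -1)

    # Example: the second case is rejected and flagged for human review.
    print(predict_with_rejection(np.array([[0.05, 0.95], [0.55, 0.45]])))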