188 research outputs found

    On the future of astrostatistics: statistical foundations and statistical practice

    This paper summarizes a presentation for a panel discussion on "The Future of Astrostatistics" held at the Statistical Challenges in Modern Astronomy V conference at Pennsylvania State University in June 2011. I argue that the emerging needs of astrostatistics may both motivate and benefit from fundamental developments in statistics. I highlight some recent work within statistics on fundamental topics relevant to astrostatistical practice, including the Bayesian/frequentist debate (and ideas for a synthesis), multilevel models, and multiple testing. As an important direction for future work in statistics, I emphasize that astronomers need a statistical framework that explicitly supports unfolding chains of discovery, with acquisition, cataloging, and modeling of data not seen as isolated tasks, but rather as parts of an ongoing, integrated sequence of analyses, with information and uncertainty propagating forward and backward through the chain. A prototypical example is surveying of astronomical populations, where source detection, demographic modeling, and the design of survey instruments and strategies all interact. Comment: 8 pp, 2 figures. To appear in "Statistical Challenges in Modern Astronomy V" (Lecture Notes in Statistics, Vol. 209), ed. Eric D. Feigelson and G. Jogesh Babu; publication planned for Sep 2012; see http://www.springer.com/statistics/book/978-1-4614-3519-
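    The multilevel modelling highlighted above is easy to illustrate. Below is a minimal sketch (not from the paper) of a two-level model for a surveyed population: each source's true value is drawn from a population distribution, each measurement adds known noise, and the population parameters are inferred after marginalising the latent values. The model, grid, and all numbers are illustrative assumptions.

```python
# Hedged sketch of a two-level (hierarchical) model for a surveyed population:
# true source values ~ Normal(mu, tau); each measurement adds known noise sigma_i.
# Marginalising the latent values gives obs_i ~ Normal(mu, sqrt(tau^2 + sigma_i^2)),
# so the population parameters can be inferred on a simple grid.  Synthetic data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu_true, tau_true = 1.0, 0.5                      # population mean and scatter (assumed)
sigma = rng.uniform(0.2, 0.6, 50)                 # per-source measurement errors
obs = rng.normal(rng.normal(mu_true, tau_true, 50), sigma)   # noisy "catalogue" values

mu_grid = np.linspace(0.0, 2.0, 201)
tau_grid = np.linspace(0.05, 1.5, 150)
M, T = np.meshgrid(mu_grid, tau_grid, indexing="ij")
loglike = sum(stats.norm.logpdf(o, M, np.sqrt(T**2 + s**2)) for o, s in zip(obs, sigma))

i, j = np.unravel_index(np.argmax(loglike), loglike.shape)
print(f"population mean ~ {mu_grid[i]:.2f}, population scatter ~ {tau_grid[j]:.2f}")
```

    Carrying the full per-source uncertainties into the population fit, rather than treating catalogued values as exact, is the kind of forward-and-backward information flow the abstract describes.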

    Bayesian astrostatistics: a backward look to the future

    This perspective chapter briefly surveys: (1) past growth in the use of Bayesian methods in astrophysics; (2) current misconceptions about both frequentist and Bayesian statistical inference that hinder wider adoption of Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian modeling as a major future direction for research in Bayesian astrostatistics, exemplified in part by presentations at the first ISI invited session on astrostatistics, commemorated in this volume. It closes with an intentionally provocative recommendation for astronomical survey data reporting, motivated by the multilevel Bayesian perspective on modeling cosmic populations: that astronomers cease producing catalogs of estimated fluxes and other source properties from surveys. Instead, summaries of likelihood functions (or marginal likelihood functions) for source properties should be reported (not posterior probability density functions), including nontrivial summaries (not simply upper limits) for candidate objects that do not pass traditional detection thresholds. Comment: 27 pp, 4 figures. A lightly revised version of a chapter in "Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed., Springer, New York, forthcoming in 2012), the inaugural volume for the Springer Series in Astrostatistics. Version 2 has minor clarifications and an additional reference.
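    A toy illustration of the reporting recommendation (my sketch, not the chapter's): for a faint candidate, tabulate the likelihood of its flux given the raw counts instead of publishing a thresholded estimate or an upper limit. The Poisson counting model, background, exposure, and cut-off below are assumptions made for the example.

```python
# Hedged sketch of per-source likelihood reporting: tabulate the likelihood of the
# source flux given the raw counts, rather than a detection-thresholded estimate.
# Poisson counts with a known background and exposure are an illustrative assumption.
import numpy as np
from scipy import stats

counts = 7            # observed counts in the source aperture (synthetic)
background = 5.2      # expected background counts (assumed known)
exposure = 1.0e3      # converts flux to expected source counts (illustrative)

flux_grid = np.linspace(0.0, 0.02, 401)
loglike = stats.poisson.logpmf(counts, background + exposure * flux_grid)
loglike -= loglike.max()                     # normalise so the peak is at 0

# A compact likelihood summary: grid points within 4 log-likelihood units of the peak.
keep = loglike > -4.0
summary = np.column_stack([flux_grid[keep], loglike[keep]])
print(summary[:5])   # rows of (flux, relative log-likelihood) to report, even sub-threshold
```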

    Sequential design of computer experiments for the estimation of a probability of failure

    This paper deals with the problem of estimating the volume of the excursion set of a function $f:\mathbb{R}^d \to \mathbb{R}$ above a given threshold, under a probability measure on $\mathbb{R}^d$ that is assumed to be known. In the industrial world, this corresponds to the problem of estimating a probability of failure of a system. When only an expensive-to-simulate model of the system is available, the budget for simulations is usually severely limited and therefore classical Monte Carlo methods ought to be avoided. One of the main contributions of this article is to derive SUR (stepwise uncertainty reduction) strategies from a Bayesian decision-theoretic formulation of the problem of estimating a probability of failure. These sequential strategies use a Gaussian process model of $f$ and aim at performing evaluations of $f$ as efficiently as possible to infer the value of the probability of failure. We compare these strategies to other strategies also based on a Gaussian process model for estimating a probability of failure. Comment: This is an author-generated postprint version. The published version is available at http://www.springerlink.co
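    As a rough illustration of the setting (not the paper's algorithm), the sketch below fits a Gaussian process to a few evaluations of a cheap stand-in for $f$, sequentially evaluates the point whose classification relative to the threshold is most uncertain, and then estimates the failure probability from the GP mean. The paper's actual SUR criteria are more principled than this simple acquisition; the test function, threshold, and input distribution are illustrative assumptions.

```python
# Hedged sketch of sequential, GP-based estimation of P[f(X) > u].  The acquisition
# (probability of misclassifying a point relative to the threshold) is a simplified
# stand-in for the paper's SUR criteria; f, u, and the input law are toy assumptions.
import numpy as np
from scipy import stats
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)

def f(x):                                            # "expensive" simulator (toy)
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 0] ** 2

u = 1.0                                              # failure threshold
X_mc = rng.uniform(-2, 2, size=(5000, 1))            # Monte Carlo sample of the input law

X = rng.uniform(-2, 2, size=(5, 1))                  # small initial design
y = f(X)
for _ in range(15):                                  # sequential evaluations of f
    gp = GaussianProcessRegressor(RBF(length_scale=0.5), alpha=1e-8).fit(X, y)
    m, s = gp.predict(X_mc, return_std=True)
    p_misclass = stats.norm.cdf(-np.abs(m - u) / np.maximum(s, 1e-12))
    x_new = X_mc[[np.argmax(p_misclass)]]            # most ambiguous point so far
    X, y = np.vstack([X, x_new]), np.append(y, f(x_new))

gp = GaussianProcessRegressor(RBF(length_scale=0.5), alpha=1e-8).fit(X, y)
print("estimated probability of failure ~", np.mean(gp.predict(X_mc) > u))
```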

    Improving the in silico assessment of pathogenicity for compensated variants

    Understanding the functional sequelae of amino-acid replacements is of fundamental importance in medical genetics. Perhaps the most intuitive way to assess the potential pathogenicity of a given human missense variant is by measuring the degree of evolutionary conservation of the substituted amino-acid residue, a feature that generally serves as a good proxy metric for the functional/structural importance of that residue. However, the presence of putatively compensated variants as the wild-type alleles in orthologous proteins of other mammalian species not only challenges this classical view of amino-acid essentiality but also precludes the accurate evaluation of the functional impact of this type of missense variant using currently available bioinformatic prediction tools. Compensated variants constitute at least 4% of all known missense variants causing human inherited disease and hence represent an important potential source of error, in that they are likely to be disproportionately misclassified as benign variants. The consequent under-reporting of compensated variants is exacerbated in the context of next-generation sequencing, where their inappropriate exclusion is an unfortunate natural consequence of the filtering and prioritization of the very large number of variants generated. Here we demonstrate the reduced performance of currently available pathogenicity prediction tools when applied to compensated variants and propose an alternative machine-learning approach to assess likely pathogenicity for this particular type of variant.
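    A hedged sketch of the general idea of training a dedicated classifier for this class of variant is given below. The feature names (conservation score, substitution severity, structural context), the random forest, and the synthetic labels are illustrative assumptions; they are not the authors' actual feature set, training data, or model.

```python
# Hedged sketch of a machine-learning classifier trained specifically on compensated
# missense variants, instead of relying on a general conservation-based tool.
# Features, labels, and data are synthetic stand-ins for a curated training set.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 400
variants = pd.DataFrame({
    "conservation_score": rng.uniform(0, 1, n),      # e.g. a phyloP-like score (assumed)
    "substitution_severity": rng.uniform(0, 1, n),   # e.g. a Grantham-like distance (assumed)
    "structural_context": rng.uniform(0, 1, n),      # e.g. buried vs. exposed residue (assumed)
})
# Synthetic labels: 1 = disease-causing, 0 = benign
labels = (0.6 * variants["conservation_score"]
          + 0.4 * variants["substitution_severity"]
          + rng.normal(0, 0.2, n) > 0.55).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, variants, labels, cv=5).mean())
```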

    Data harmonisation for information fusion in digital healthcare: A state-of-the-art systematic review, meta-analysis and future research directions

    Removing the bias and variance of multicentre data has always been a challenge in large-scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single-modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, a comprehensive checklist that summarises common practices for data harmonisation studies is proposed to guide researchers to report their research findings more effectively. Finally, flowcharts presenting possible ways for methodology and metric selection are proposed, and the limitations of different methods are surveyed to guide future research.
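    One of the simplest harmonisation strategies such reviews cover is a per-centre location/scale adjustment, in the spirit of (though much simpler than) empirical-Bayes methods such as ComBat. The sketch below applies it to a single synthetic feature; the centres, offsets, and scales are assumptions for illustration only.

```python
# Hedged sketch of per-centre location/scale harmonisation of one clinical feature.
# Each centre's mean and variance are removed, then mapped onto the pooled statistics.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
data = pd.DataFrame({
    "centre": np.repeat(["A", "B", "C"], 100),
    # same underlying biology, but each scanner/protocol adds its own offset and scale
    "feature": np.concatenate([rng.normal(10, 1, 100),
                               rng.normal(13, 2, 100),
                               rng.normal(8, 0.5, 100)]),
})

def harmonise(df, col, by):
    """Remove per-centre mean/variance, then rescale to the pooled mean/variance."""
    z = df.groupby(by)[col].transform(lambda x: (x - x.mean()) / x.std())
    return z * df[col].std() + df[col].mean()

data["feature_harmonised"] = harmonise(data, "feature", "centre")
print(data.groupby("centre")["feature_harmonised"].agg(["mean", "std"]).round(2))
```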

    PRIMAGE project: predictive in silico multiscale analytics to support childhood cancer personalised evaluation empowered by imaging biomarkers

    PRIMAGE is one of the largest and most ambitious research projects dealing with medical imaging, artificial intelligence and cancer treatment in children. It is a 4-year European Commission-financed project that has 16 European partners in the consortium, including the European Society for Paediatric Oncology, two imaging biobanks, and three prominent European paediatric oncology units. The project is constructed as an observational in silico study involving high-quality anonymised datasets (imaging, clinical, molecular, and genetics) for the training and validation of machine learning and multiscale algorithms. The open cloud-based platform will offer precise clinical assistance for phenotyping (diagnosis), treatment allocation (prediction), and patient endpoints (prognosis), based on the use of imaging biomarkers, tumour growth simulation, advanced visualisation of confidence scores, and machine-learning approaches. The decision support prototype will be constructed and validated on two paediatric cancers: neuroblastoma and diffuse intrinsic pontine glioma. External validation will be performed on data recruited from independent collaborative centres. Final results will be available for the scientific community at the end of the project, and ready for translation to other malignant solid tumours.

    Estrogen-like activity of seafood related to environmental chemical contaminants

    BACKGROUND: A wide variety of environmental pollutants occur in surface waters, including estuarine and marine waters. Many of these contaminants are recognised as endocrine disrupting chemicals (EDCs), which can adversely affect the male and female reproductive system by binding the estrogen receptor and exhibiting hormone-like activities. In this study the estrogenic activity of extracts of edible marine organisms for human consumption from the Mediterranean Sea was assayed. METHODS: Marine organisms were collected in two different areas of the Mediterranean Sea. The estrogenic activity of tissues was assessed using an in vitro yeast reporter gene assay (S. cerevisiae RMY 326 ER-ERE). Concentrations of polychlorinated biphenyls (PCBs) (congeners 28, 52, 101, 118, 138, 153, 180) in fish tissue were also evaluated. RESULTS: Thirty-eight percent of extracts showed a hormone-like activity higher than 10% of the activity elicited by 10 nM 17β-estradiol (E2), used as a control. Total PCB concentrations ranged from 0.002 up to 1.785 ng/g wet weight. Chemical analyses detected different levels of contamination among the species collected in the two areas, with the ones collected in the Adriatic Sea showing concentrations significantly higher than those collected in the Tyrrhenian Sea (p < 0.01). CONCLUSION: The most frequent combination of chemicals in the samples that showed higher estrogenic activity was PCB 28, PCB 101, PCB 153, PCB 180. The content of PCBs and estrogenic activity did not reveal any significant correlation.
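    The closing statement is a straightforward correlation check; a minimal sketch with synthetic placeholder numbers (only the reported concentration range is taken from the abstract) is shown below.

```python
# Hedged sketch: test whether total PCB content correlates with estrogenic activity
# across samples.  The values are synthetic placeholders, not the study's measurements.
import numpy as np
from scipy import stats

total_pcb_ng_g = np.array([0.002, 0.15, 0.42, 0.80, 1.10, 1.785])      # ng/g wet weight (synthetic)
estrogenic_activity = np.array([0.05, 0.30, 0.12, 0.08, 0.25, 0.18])   # fraction of E2 response (synthetic)

rho, p_value = stats.spearmanr(total_pcb_ng_g, estrogenic_activity)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.2f}")   # no significant correlation expected here
```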

    Bayesian Probability and Statistics in Management Research: A New Horizon

    This special issue is focused on how a Bayesian approach to estimation, inference, and reasoning in organizational research might supplement—and in some cases supplant—traditional frequentist approaches. Bayesian methods are well suited to address the increasingly complex phenomena and problems faced by 21st-century researchers and organizations, where very complex data abound and the validity of knowledge and methods is often seen as contextually driven and constructed. Traditional modeling techniques and a frequentist view of probability and method are challenged by this new reality.
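    A minimal, self-contained illustration of Bayesian estimation contrasted with a frequentist point estimate, using the simplest conjugate case: a Beta prior updated by binomial data (e.g. endorsements in a small employee survey). The prior and the numbers are assumptions for illustration.

```python
# Hedged sketch of conjugate Bayesian updating (Beta prior, binomial data) next to the
# corresponding frequentist point estimate.  Survey numbers and prior are illustrative.
from scipy import stats

successes, n = 12, 40                 # synthetic survey outcome
prior = stats.beta(2, 2)              # weakly informative prior belief about the rate

posterior = stats.beta(2 + successes, 2 + (n - successes))
print("frequentist point estimate:", successes / n)
print("posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```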