74 research outputs found

    A novel approach to the clustering of microarray data via nonparametric density estimation

    Background: Cluster analysis is a crucial tool in several biological and medical studies dealing with microarray data. Such studies pose challenging statistical problems due to dimensionality issues, since the number of variables can be much higher than the number of observations.
    Results: Here, we present a general framework to deal with the clustering of microarray data, based on a three-step procedure: (i) gene filtering; (ii) dimensionality reduction; (iii) clustering of observations in the reduced space. Via a nonparametric model-based clustering approach we obtain promising results both in simulated and real data.
    Conclusions: The proposed algorithm is a simple and effective tool for the clustering of microarray data, in an unsupervised setting.
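The three-step procedure named in this abstract can be sketched roughly as follows. The abstract does not say which filter, which reduction method, or which density estimator the authors use, so every choice here is an assumption made for illustration: a variance filter for step (i), projection onto the first principal component (found by power iteration) for step (ii), and cutting the one-dimensional scores at valleys of a Gaussian kernel density estimate for step (iii). All function names and the toy data are hypothetical.

```python
import math
import random


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def filter_genes(samples, keep):
    """Step (i), assumed rule: keep the `keep` genes with the largest variance."""
    n_genes = len(samples[0])
    ranked = sorted(range(n_genes),
                    key=lambda g: sample_variance([row[g] for row in samples]),
                    reverse=True)
    kept = sorted(ranked[:keep])
    return [[row[g] for g in kept] for row in samples], kept


def first_component_scores(samples, iterations=100, seed=0):
    """Step (ii), assumed method: scores on the first principal component."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    # Power iteration: repeatedly apply X^T X to v and renormalize.
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in centered]


def density_valley_labels(scores, bandwidth, grid_size=200):
    """Step (iii), assumed method: cut 1-D scores at local minima of a Gaussian KDE."""
    lo, hi = min(scores), max(scores)
    grid = [lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size)]
    # Unnormalized density; the normalizing constant does not move the minima.
    density = [sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in scores)
               for g in grid]
    cuts = [grid[k] for k in range(1, grid_size - 1)
            if density[k] < density[k - 1] and density[k] <= density[k + 1]]
    # A sample's label is the number of cut points lying below its score.
    return [sum(s > c for c in cuts) for s in scores]


# Toy data: 10 samples x 4 genes; only the first two genes separate the groups.
group_a = [[0.1 * i, 0.1 * (4 - i), 5.0 + 0.01 * i, 5.0] for i in range(5)]
group_b = [[10.0 + 0.1 * i, 10.0 + 0.1 * (4 - i), 5.0 + 0.01 * i, 5.0] for i in range(5)]
samples = group_a + group_b

reduced, kept = filter_genes(samples, keep=2)
scores = first_component_scores(reduced)
labels = density_valley_labels(scores, bandwidth=1.0)
```

The bandwidth is passed explicitly because the number of valleys, and therefore the number of clusters, depends strongly on it; the value used here suits only the toy data.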

    A Comprehensive and Universal Method for Assessing the Performance of Differential Gene Expression Analyses

    The number of methods for pre-processing and analysis of gene expression data continues to increase, often making it difficult to select the most appropriate approach. We present a simple procedure for comparative estimation of a variety of methods for microarray data pre-processing and analysis. Our approach is based on the use of real microarray data in which controlled fold changes are introduced into 20% of the data to provide a metric for comparison with the unmodified data. The data modifications can be easily applied to raw data measured with any technological platform and retain all the complex structures and statistical characteristics of the real-world data. The power of the method is illustrated by its application to the quantitative comparison of different methods of normalization and analysis of microarray data. Our results demonstrate that the method of controlled modifications of real experimental data provides a simple tool for assessing the performance of data preprocessing and analysis methods.
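A minimal sketch of the idea of introducing controlled fold changes into 20% of a data set is given below. The abstract does not specify how the genes are chosen, which samples are altered, or what fold values are used, so the random selection, the `treated_samples` argument, the two-fold default, and the function name `spike_fold_changes` are all assumptions for illustration.

```python
import random


def spike_fold_changes(expression, treated_samples, fraction=0.2, fold=2.0, seed=0):
    """Return a modified copy of `expression` and the indices of the altered genes.

    expression: list of genes, each a list of per-sample intensities (linear scale).
    treated_samples: column indices whose values are multiplied by `fold` (assumed design).
    """
    rng = random.Random(seed)
    n_genes = len(expression)
    n_spiked = round(fraction * n_genes)
    spiked = sorted(rng.sample(range(n_genes), n_spiked))
    spiked_set = set(spiked)
    treated = set(treated_samples)
    modified = [
        [v * fold if (g in spiked_set and s in treated) else v
         for s, v in enumerate(row)]
        for g, row in enumerate(expression)
    ]
    return modified, spiked


# Made-up intensities: 10 genes x 4 samples; samples 2 and 3 receive the change.
data = [[100.0 + g, 110.0 + g, 90.0 + g, 105.0 + g] for g in range(10)]
modified, truth = spike_fold_changes(data, treated_samples=[2, 3])
```

Because `truth` records exactly which genes were altered, any pre-processing or analysis method run on `modified` can be scored by how many of those genes it recovers.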

    Assessment and prevention of acute health effects of weather conditions in Europe, the PHEWE project: background, objectives, design

    Background: The project "Assessment and prevention of acute health effects of weather conditions in Europe" (PHEWE) had the aim of assessing the association between weather conditions and acute health effects, during both warm and cold seasons in 16 European cities with widely differing climatic conditions and to provide information for public health policies.
    Methods: The PHEWE project was a three-year pan-European collaboration between epidemiologists, meteorologists and experts in public health. Meteorological, air pollution and mortality data from 16 cities and hospital admission data from 12 cities were available from 1990 to 2000. The short-term effect on mortality/morbidity was evaluated through city-specific and pooled time series analysis. The interaction between weather and air pollutants was evaluated and health impact assessments were performed to quantify the effect on the different populations. A heat/health watch warning system to predict oppressive weather conditions and alert the population was developed in a subgroup of cities and information on existing prevention policies and of adaptive strategies was gathered.
    Results: Main results were presented in a symposium at the conference of the International Society of Environmental Epidemiology in Paris on September 6th 2006 and will be published as scientific articles. The present article introduces the project and includes a description of the database and the framework of the applied methodology.
    Conclusion: The PHEWE project offers the opportunity to investigate the relationship between temperature and mortality in 16 European cities, representing a wide range of climatic, socio-demographic and cultural characteristics; the use of a standardized methodology allows for direct comparison between cities.

    Mining epidemiological time series: an approach based on dynamic regression

    In epidemiology, time-series regression models are especially suitable for evaluating short-term effects of time-varying exposures to pollution. To summarize findings from different studies on different cities, the techniques of designed meta-analyses have been employed. In this context, city-specific findings are summarized by an ‘effect size’ measured on a common scale. Such effects are then pooled together on a second hierarchy of analysis. The objective of this article is to exploit exploratory analysis of city-specific time series. In fact, when dealing with many sources of data, that is, many cities, an exploratory analysis becomes almost unaffordable. Our idea is to explore the time series by fitting complete dynamic regression models. These models are easier to fit than models usually employed and allow implementation of very fast automated model selection algorithms. The idea is to highlight the common features across cities through this analysis, which might then be used to design the meta-analysis. The proposal is illustrated by analysing data on the relationship between daily nonaccidental deaths and air pollution in the 20 largest US cities.
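The pooling of city-specific effect sizes mentioned in this abstract can be illustrated with a fixed-effect, inverse-variance weighted average. This is a common and deliberately simple second-stage summary, used here as a stand-in; the abstract does not say which pooling model is meant, and a hierarchical or random-effects model would differ. The function name and all numbers are made up for illustration.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted mean of city-specific estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical effect sizes on a common scale, one per city.
city_estimates = [0.10, 0.30, 0.20]
city_std_errors = [0.10, 0.10, 0.05]
pooled, pooled_se = pool_fixed_effect(city_estimates, city_std_errors)
```

Cities with smaller standard errors receive larger weights, so the third city dominates the pooled value in this toy example.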

    Partially parametric interval estimation of Pr(Y>X)

    Let X and Y be two independent continuous random variables. Three techniques to obtain confidence intervals for ρ = Pr(Y > X) are discussed in a partially parametric framework. One method relies on the asymptotic normality of an estimator for ρ; the remaining methods involve empirical likelihood and combine it with maximum likelihood estimation and with full parametric likelihood, respectively. Finite-sample accuracy of the confidence intervals is assessed through a simulation study. An illustration is given using a data set on the detection of carriers of Duchenne Muscular Dystrophy.
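For orientation, the quantity Pr(Y > X) can be estimated from two samples by the proportion of pairs in which the Y value exceeds the X value. This is the plain nonparametric (Mann-Whitney type) point estimate, shown only as a reference; it is not one of the partially parametric interval methods the abstract describes, and the function name and data are hypothetical.

```python
def prob_y_greater_x(x, y):
    """Proportion of (x_i, y_j) pairs with y_j > x_i; ties count as zero."""
    wins = sum(1 for xi in x for yj in y if yj > xi)
    return wins / (len(x) * len(y))


# Made-up samples; with continuous variables ties have probability zero.
x = [1.0, 2.0, 3.0]
y = [2.5, 3.5, 4.5]
rho_hat = prob_y_greater_x(x, y)
```

Here 8 of the 9 pairs have the Y value larger, so the estimate is 8/9.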