13 research outputs found

    Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data

    Background: The synthesis of information across microarray studies has been performed by combining statistical results of individual studies (as in a mosaic), or by combining data from multiple studies into a large pool to be analyzed as a single data set (as in a melting pot of data). Specific issues relating to data heterogeneity across microarray studies, such as differences within and between labs or differences among experimental conditions, could lead to equivocal results in a melting pot approach. Results: We applied statistical theory to determine the specific effect of different means and heteroskedasticity across 19 groups of microarray data on the sign and magnitude of gene-to-gene Pearson correlation coefficients obtained from the pool of 19 groups. We quantified the biases of the pooled coefficients and compared them to the biases of correlations estimated by an effect-size model. Mean differences across the 19 groups were the main factor determining the magnitude and sign of the pooled coefficients, which showed largest values of bias as they approached ±1. Only heteroskedasticity across the pool of 19 groups resulted in less efficient estimations of correlations than did a classical meta-analysis approach of combining correlation coefficients. These results were corroborated by simulation studies involving either mean differences or heteroskedasticity across a pool of N > 2 groups. Conclusions: The combination of statistical results is best suited for synthesizing the correlation between expression profiles of a gene pair across several microarray studies.
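    The effect of mean differences on a pooled correlation can be illustrated with a minimal sketch. The toy data, the function names, and the use of a Fisher z-transform with weights n - 3 as the "combined" estimate are all assumptions made for illustration; they are not taken from the paper, whose effect-size model may differ.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_combined(rs, ns):
    """Combine per-group correlations via Fisher's z (assumed weights: n - 3)."""
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return math.tanh(z_bar)

# Hypothetical expression values for one gene pair in two groups.
# Within each group the genes are negatively correlated, but the
# second group has much higher means for both genes.
group_a = ([1, 2, 3, 4, 5], [5, 3, 4, 2, 1])
group_b = ([11, 12, 13, 14, 15], [15, 13, 14, 12, 11])

r_a = pearson(*group_a)
r_b = pearson(*group_b)

# "Melting pot": concatenate the groups and correlate the pool.
pooled_r = pearson(group_a[0] + group_b[0], group_a[1] + group_b[1])

# "Mosaic": combine the per-group coefficients instead.
combined_r = fisher_combined([r_a, r_b], [len(group_a[0]), len(group_b[0])])

print("within-group:", r_a, r_b)
print("pooled:", pooled_r)
print("combined:", combined_r)
```

    In this toy case the pooled coefficient is positive even though both within-group coefficients are negative, because the between-group mean difference dominates the pooled covariance; the combined estimate keeps the within-group sign.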

    A Novel Protein Kinase-Like Domain in a Selenoprotein, Widespread in the Tree of Life

    Selenoproteins serve important functions in many organisms, usually providing essential oxidoreductase enzymatic activity, often for defense against toxic xenobiotic substances. Most eukaryotic genomes possess a small number of these proteins, usually not more than 20. Selenoproteins belong to various structural classes, often related to oxidoreductase function, yet a few of them are completely uncharacterised.

    Restructuring as a tool for improving the operating efficiency of enterprises in a regional investment and construction complex (the case of the Republic of Tatarstan): abstract of a dissertation for the degree of Candidate of Economic Sciences: specialty 08.00.05 - Economics and Management of the National Economy (regional economics)

    One innovative solution to traffic congestion is to use real-time data and Intelligent Transportation Systems (ITSs) to optimize the existing transportation system. To address this need, we propose an algorithm for real-time automatic congestion identification that uses speed probe data and the corresponding weather and visibility to build a unified model. Based on traffic flow theory, the algorithm assumes three traffic states: congestion, speed-at-capacity, and free-flow. Our algorithm assumes that speed is drawn from a mixture of three components, whose means are functions of weather and visibility and defined using a linear regression of their predictors. The parameters of the model were estimated using three empirical datasets from Virginia, California, and Texas. The fitted model was used to calculate the speed cut-off between congestion and speed-at-capacity by minimizing either the Bayesian classification error or the false positive (congestion) rate. The test results showed promising congestion identification performance.
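    The cut-off step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the component distributions, so Gaussian components are assumed here, and every coefficient, weight, and variable name is hypothetical. With known parameters, the speed that minimizes the Bayesian classification error between two components is the point between their means where the weighted densities are equal.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution (assumed component form)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def component_mean(coefs, precipitation, visibility):
    """Component mean as a linear function of the weather predictors."""
    b0, b1, b2 = coefs
    return b0 + b1 * precipitation + b2 * visibility

def bayes_cutoff(w_lo, mu_lo, s_lo, w_hi, mu_hi, s_hi, tol=1e-9):
    """Speed between the two means where the weighted densities cross.

    Found by bisection on the difference of weighted densities.
    """
    def diff(x):
        return w_lo * normal_pdf(x, mu_lo, s_lo) - w_hi * normal_pdf(x, mu_hi, s_hi)

    lo, hi = mu_lo, mu_hi
    if not (diff(lo) > 0.0 > diff(hi)):
        raise ValueError("no single crossing between the component means")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical regression coefficients (intercept, precipitation, visibility).
congestion_coefs = (20.0, -0.5, 1.0)
capacity_coefs = (50.0, -1.0, 2.0)

precipitation, visibility = 2.0, 5.0
mu_congestion = component_mean(congestion_coefs, precipitation, visibility)
mu_capacity = component_mean(capacity_coefs, precipitation, visibility)

# Equal weights and equal spreads are assumed for this toy case.
cutoff = bayes_cutoff(0.5, mu_congestion, 6.0, 0.5, mu_capacity, 6.0)
print("cut-off speed:", cutoff)
```

    Because the component means depend on the weather predictors, the cut-off shifts with conditions; with equal weights and equal spreads, as assumed here, it falls at the midpoint of the two means.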