15 research outputs found

    Incorporating Nonlinear Relationships in Microarray Missing Value Imputation

    Microarray gene expression data often contain missing values. Accurate estimation of the missing values is important for downstream data analyses that require complete data. Nonlinear relationships between gene expression levels have not been well utilized in missing value imputation. We propose an imputation scheme based on nonlinear dependencies between genes. Using simulations based on real microarray data, we show that incorporating nonlinear relationships could improve the accuracy of missing value imputation, both in terms of normalized root mean squared error and in terms of the preservation of the list of significant genes in statistical testing. In addition, we studied the impact of artificial dependencies introduced by data normalization on the simulation results. Our results suggest that methods relying on global correlation structures may yield overly optimistic simulation results when the data have been subjected to row-wise (gene-wise) mean removal.
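The abstract above evaluates imputation accuracy by normalized root mean squared error (NRMSE). As a minimal sketch of how such a score can be computed on artificially masked entries, the Python function below divides the RMSE by the population standard deviation of the true values. That normalization is an assumption (one common convention), since the listing does not state which definition the authors used, and the function name `nrmse` is hypothetical.

```python
import math
import statistics


def nrmse(true_values, imputed_values):
    """RMSE over the masked entries, divided by the population standard
    deviation of the true values (an assumed, commonly used convention)."""
    if len(true_values) != len(imputed_values) or len(true_values) < 2:
        raise ValueError("need two equal-length sequences with at least 2 entries")
    squared_errors = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    spread = statistics.pstdev(true_values)
    if spread == 0:
        raise ValueError("true values are constant; NRMSE is undefined")
    return math.sqrt(statistics.fmean(squared_errors)) / spread


# Entries that were masked out, and two candidate imputations for them.
truth = [1.0, 2.0, 3.0, 4.0]
perfect = list(truth)
mean_fill = [statistics.fmean(truth)] * len(truth)

score_perfect = nrmse(truth, perfect)
score_mean_fill = nrmse(truth, mean_fill)
```

Under this convention a perfect imputation scores 0, and filling every masked entry with the mean of the true values scores 1, so values below 1 indicate an improvement over mean filling.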

    Hierarchical Clustering of High-Throughput Expression Data Based on General Dependences



    BUS Vignette

    GOAL: The BUS package allows the computation of two types of similarities (correlation [Sokal, 2003] and mutual information [Cover, 2001]) for two different goals: (i) identification of the similarity among the activity of molecules sampled across different experiments (we name this option Unsupervised, U), (ii) identification of the similarity between such molecules and other types of information (clinical, anagraphical, etc, we name this optio
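The two similarity types named in this entry, correlation and mutual information, respond differently to nonlinear dependence. The Python sketch below illustrates the contrast on a toy quadratic relationship. It is not the BUS package's API: the function names, the equal-width binning, and the bin count are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def equal_width_bins(values, bins):
    """Map each value to a bin index in 0..bins-1 (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=4):
    """Plug-in mutual information estimate (in nats) from binned counts."""
    bx, by = equal_width_bins(xs, bins), equal_width_bins(ys, bins)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# A symmetric quadratic relationship: strongly dependent, yet uncorrelated.
xs = list(range(-10, 11))
ys = [x * x for x in xs]

r = pearson(xs, ys)
mi = mutual_information(xs, ys)
```

Here `r` is essentially zero because the relationship is symmetric around zero, while `mi` is clearly positive. Note that the plug-in estimate depends on the number of bins and is biased for small samples.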

    MeDiA: Mean Distance Association and Its Applications in Nonlinear Gene Set Analysis

    Probabilistic association discovery aims at identifying the association between random vectors, regardless of number of variables involved or linear/nonlinear functional forms. Recently, applications in high-dimensional data have generated rising interest in probabilistic association discovery. We developed a framework based on functions on the observation graph, named MeDiA (Mean Distance Association). We generalize its property to a group of functions on the observation graph. The group of functions encapsulates major existing methods in association discovery, e.g. mutual information and Brownian Covariance, and can be expanded to more complicated forms. We conducted numerical comparison of the statistical power of related methods under multiple scenarios. We further demonstrated the application of MeDiA as a method of gene set analysis that captures a broader range of responses than traditional gene set analysis methods.

    Network interaction for celiac disease pathways.

    A red edge indicates that the interaction between the connected pathways is amplified in diseased individuals. A blue edge indicates that the interaction is suppressed in diseased individuals.

    Gene sets associated with the two-dimensional clinical outcome based on MeDiA.

    * Superscripts by the GO terms are for easy reference from the main text.

    Comparison between the independent bivariate normal distribution and mixture normal distribution in Fig 1.

    Comparison between the independent bivariate normal distribution and mixture normal distribution in Fig 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0124620#pone.0124620.g001).