    Visualization of TCGA CNV data with inferred maximum likelihood simplicial complex structure.

    The inferred structure of three arms sharing a point corresponds to a phylogeny with one most recent common ancestor and three branches. Data points, corresponding to distinct tumor samples plotted in principal component space, are color-coded by immunohistological subtype (red circle: Her2+, purple plus: ER/PR+, blue asterisk: triple-negative).

    Spearman correlation from PyClone clusters derived from TCGA breast cancer CNV data to clinical labels of samples.

    P-values for the correlations appear in parentheses. Significant values (p < 0.01) are marked in bold.
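The kind of correlation reported in such a table can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it computes Spearman's rho as the Pearson correlation of average ranks and estimates a two-sided p-value with a permutation test. The permutation test is an assumption, since the caption does not say how the p-values were obtained, and the function names and toy data are hypothetical.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for rho (an assumed choice of test)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one cluster fraction per sample and a binary label.
fractions = [0.10, 0.35, 0.20, 0.80, 0.65, 0.90]
labels = [0, 0, 0, 1, 1, 1]
rho = spearman_rho(fractions, labels)
p = permutation_p_value(fractions, labels)
print(f"rho = {rho:.3f} (p = {p:.3f})")
```

A value would be marked in bold under the caption's convention only when `p` falls below 0.01.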

    Visualization of TCGA RNA-Seq data with inferred maximum likelihood simplicial complex structure.

    Note that the inferred tetrahedron was evaluated alongside other simplices and simplicial complexes and was found to be the most likely. The data are enclosed in the tetrahedron and, as such, can be approximated as mixtures of the vertices. Data points, corresponding to distinct tumor samples plotted in principal component space, are color-coded by immunohistological subtype (red circle: Her2+, purple plus: ER/PR+, blue asterisk: triple-negative).
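The statement that enclosed points can be approximated as mixtures of the vertices can be made concrete with barycentric coordinates. The sketch below assumes a non-degenerate tetrahedron in three principal-component dimensions and solves for the mixture fractions of a single point. The function names and the example vertices are hypothetical and are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate simplex: vertices are not affinely independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mixture_fractions(point, vertices):
    """Barycentric coordinates of `point` with respect to a simplex.

    `vertices` holds k+1 points in k dimensions (4 points in 3-D for a
    tetrahedron). The fractions always sum to 1, and they are all
    non-negative exactly when the point lies inside or on the simplex.
    """
    k = len(point)
    # One equation per coordinate, plus one forcing the fractions to sum to 1.
    a = [[v[d] for v in vertices] for d in range(k)] + [[1.0] * (k + 1)]
    b = list(point) + [1.0]
    return solve_linear(a, b)


def is_inside(fractions, tol=1e-9):
    """True when every fraction is non-negative (up to a small tolerance)."""
    return all(f >= -tol for f in fractions)


# Hypothetical tetrahedron and sample point in 3-D principal component space.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
sample = (0.2, 0.3, 0.1)
fractions = mixture_fractions(sample, vertices)
print(fractions, is_inside(fractions))
```

A point outside the tetrahedron still gets coordinates that sum to 1, but at least one of them is negative, which is why enclosure matters for reading the coordinates as mixture fractions.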

    Top DAVID term enrichment results for RNA expression deconvolution.

    The table provides the ten most significantly enriched terms, identified by source repository and term, Benjamini-corrected p-values, and associated vertices of the inferred simplicial complex.
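For readers unfamiliar with the correction, the sketch below shows how adjusted p-values of this kind are typically computed. It assumes that "Benjamini-corrected" refers to the Benjamini-Hochberg step-up procedure. It is a generic illustration with a hypothetical function name and made-up input values, not DAVID's own code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```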

    Overview of the full analysis pipeline: Input samples are represented by collections of copy number (CN) call files and/or RNA expression measurements, which are converted to a matrix format.

    These matrix inputs are passed to our simplicial complex inference code, which infers a mixed membership model of the data and an associated model likelihood. The inference proceeds in five steps: (1) principal components analysis (PCA) for dimensionality reduction and denoising of geometric structure; (2) medoidshift pre-clustering to identify low-dimensional submanifolds corresponding to distinct submixtures of the data; (3) dimensionality inference via sliver estimation to estimate the likely number of mixture components needed to model each submixture; (4) unmixing on each substructure to identify preliminary mixture decompositions of the submixtures; and (5) a K-nearest-neighbor (KNN)-based reconciliation model to identify likely shared vertices between submanifolds. Each of these steps is explained in more detail in the main text. The inferred low-dimensional subspaces may be partially intersecting or non-intersecting. We require, however, that the subspaces form a continuous structure, and we merge disconnected subspaces using a maximum likelihood model. The inferred mixture components are then used in downstream functional annotation to identify dysregulated pathways or term associations.
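Step (1) of the pipeline can be illustrated in isolation. The sketch below is a generic PCA by power iteration on the sample covariance matrix, written with the standard library only. It is not the authors' implementation, it recovers only the leading component, and all function names and data are hypothetical. The later steps (medoidshift pre-clustering, sliver estimation, unmixing, and KNN-based reconciliation) are specific to the method and are not reproduced here.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row of the sample matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (features x features) of centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of `cov` by power iteration.

    Assumes the data are not constant, so the covariance matrix is non-zero.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Coordinates of each sample along the given unit direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]


# Hypothetical samples-by-features matrix whose rows lie on the line y = 2x.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(samples)
direction = first_component(covariance(centered))
scores = project(centered, direction)
print(direction, scores)
```

In practice a library routine based on singular value decomposition would be the usual choice; power iteration is used here only to keep the example dependency-free.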

    Spearman correlation values for simplicial complex unmixing fractional estimates from TCGA breast cancer RNA-Seq data with TCGA-provided clinical subtypes.

    P-values for the comparisons appear in parentheses. Significant values (p < 0.01) are marked in bold.

    Pseudocode for the merging protocol that selects the most likely model from a set of candidate models, provided none are simplicial complexes.

    Spearman correlation of simplicial complex mixture fractions derived from TCGA breast cancer CNV data to clinical labels of samples.

    P-values for the comparisons appear in parentheses. Significant values (p < 0.01) are marked in bold.

    Spearman correlation values (rho) between vertices inferred by our simplicial complex unmixing method and subpopulation clusters derived from PyClone, both applied to TCGA breast cancer CNV data.

    The Py prefix denotes PyClone clusters; the V prefix denotes our estimates. P-values for the comparisons appear in parentheses. Significant values (p < 0.01) are marked in bold.