Spearman correlation from PyClone clusters derived from TCGA breast cancer CNV data to clinical labels of samples.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

P-values for the correlations appear in parentheses. Significant values (p<0.01) are marked in bold.
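As a rough illustration of the statistic this table reports, the sketch below computes a Spearman rank correlation in plain Python. The function names and the sample numbers are hypothetical and are not taken from the record; the p-value shown in the table is not computed here.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up example: one cluster's per-sample fraction vs. a binary clinical label.
cluster_fraction = [0.10, 0.35, 0.20, 0.80, 0.55]
clinical_label = [0, 0, 0, 1, 1]
print(spearman_rho(cluster_fraction, clinical_label))
```

A statistics library would normally be used to obtain the p-value as well; this version only shows how the coefficient itself is formed.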

FigShare

Overview of the full analysis pipeline: Input samples are represented by collections of copy number (CN) call files and/or RNA expression measurements, which are converted to a matrix format.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

These matrix inputs are passed to our simplicial complex inference code, which infers a mixed membership model of the data and associated model likelihood. The inference is computed by (1) principal components analysis (PCA) to perform dimensionality reduction and denoising of geometric structure; (2) medoidshift pre-clustering to identify low-dimensional sub-manifolds corresponding to distinct submixtures of the data; (3) dimensionality inference via sliver estimation to estimate likely numbers of mixture components needed to model each submixture; (4) unmixing on each substructure to identify preliminary mixture decompositions of the submixtures; and (5) a K-nearest-neighbor (KNN-based) reconciliation model to identify likely shared vertices between submanifolds. Each of these steps is explained in more detail in the main text. The inferred low-dimensional subspaces may be partially or non-intersecting. We require, however, that the subspaces form a continuous structure, and merge disconnected subspaces using a maximum likelihood model. The inferred mixture components are then used in downstream functional annotation to identify dysregulated pathways or term associations.

FigShare

Modeling Effects of RNA on Capsid Assembly Pathways via Coarse-Grained Stochastic Simulation

Author: Gregory R. Smith (3165048)
Lu Xie (55571)
Russell Schwartz (39843)
Publication date: 31/05/2016

The environment of a living cell is vastly different from that of an in vitro reaction system, an issue that presents great challenges to the use of in vitro models, or computer simulations based on them, for understanding biochemistry in vivo. Virus capsids make an excellent model system for such questions because they typically have few distinct components, making them amenable to in vitro and modeling studies, yet their assembly can involve complex networks of possible reactions that cannot be resolved in detail by any current experimental technology. We previously fit kinetic simulation parameters to bulk in vitro assembly data to yield a close match between simulated and real data, and then used the simulations to study features of assembly that cannot be monitored experimentally. The present work seeks to project how assembly in these simulations fit to in vitro data would be altered by computationally adding features of the cellular environment to the system, specifically the presence of nucleic acid about which many capsids assemble. The major challenge of such work is computational: simulating fine-scale assembly pathways on the scale and in the parameter domains of real viruses is far too computationally costly to allow for explicit models of nucleic acid interaction. We bypass that limitation by applying analytical models of nucleic acid effects to adjust kinetic rate parameters learned from in vitro data to see how these adjustments, singly or in combination, might affect fine-scale assembly progress. The resulting simulations exhibit surprising behavioral complexity, with distinct effects often acting synergistically to drive efficient assembly and alter pathways relative to the in vitro model. The work demonstrates how computer simulations can help us understand how assembly might differ between the in vitro and in vivo environments and what features of the cellular environment account for these differences.

Directory of Open Access Journals

FigShare

Visualization of TCGA CNV data with inferred maximum likelihood simplicial complex structure.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

The inferred structure of three arms sharing a point corresponds to a phylogeny with one most recent common ancestor and three branches. Data points, corresponding to distinct tumor samples plotted in principal component space, are color coded by immunohistological subtype (red circle: Her2+, purple plus: ER/PR+, blue asterisk: triple-negative).

FigShare

Top DAVID term enrichment results for RNA expression deconvolution.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

The table provides the ten most significantly enriched terms, identified by source repository and term, Benjamini-corrected p-values, and associated vertices of the inferred simplicial complex.
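For readers unfamiliar with this kind of correction, here is a minimal sketch of a multiple-testing adjustment. It assumes that "Benjamini-corrected" refers to the Benjamini-Hochberg procedure, which the record itself does not state; the function name and input values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical enrichment terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

The backward pass keeps the adjusted values monotone in the raw p-values, which is what lets a single threshold be applied to the corrected column.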

FigShare

Mass fraction plots for CCMV capsid assembly with combinations of two RNA effects.

Author: Gregory R. Smith (3165048)
Lu Xie (55571)
Russell Schwartz (39843)

Combinations are described by a four-digit binary code as explained in Table 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0156547#pone.0156547.t001), where a 1 means an effect has been turned on and a 0 means an effect has been turned off. The first digit represents RNA-RNA, the second Compression, the third RNA-protein, and the fourth Concentration. (A) is 1010, (B) is 1001, (C) is 0110, (D) is 0101.
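The binary code described in this caption can be decoded mechanically. The following is a small sketch; the names `EFFECTS`, `PANELS`, and `decode_effects` are illustrative and do not come from the paper.

```python
from typing import List

# Digit order as given in the caption: RNA-RNA, Compression, RNA-protein, Concentration.
EFFECTS = ("RNA-RNA", "Compression", "RNA-protein", "Concentration")

# Panel codes as listed in the caption.
PANELS = {"A": "1010", "B": "1001", "C": "0110", "D": "0101"}


def decode_effects(code: str) -> List[str]:
    """Return the names of the effects switched on in a four-digit binary code."""
    if len(code) != len(EFFECTS) or set(code) - {"0", "1"}:
        raise ValueError(f"expected a four-digit binary code, got {code!r}")
    return [name for digit, name in zip(code, EFFECTS) if digit == "1"]


for panel, code in PANELS.items():
    print(panel, code, decode_effects(code))
```

Under this reading, panel (A) combines RNA-RNA with RNA-protein and panel (D) combines Compression with Concentration.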

FigShare

Spearman correlation values for simplicial complex unmixing fractional estimates from TCGA breast cancer RNA-Seq data with TCGA-provided clinical subtypes.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

P-values for the comparisons appear in parentheses. Significant values (p<0.01) are marked in bold.

FigShare

Visualization of TCGA RNA-Seq data with inferred maximum likelihood simplicial complex structure.

Author: Lu Xie (55571)
Russell Schwartz (39843)
Theodore Roman (4528969)

Note that the inferred tetrahedron was evaluated alongside other simplices and simplicial complexes and was judged the most likely structure. The data are enclosed in the tetrahedron, and as such can be approximated as mixtures of the vertices. Data points, corresponding to distinct tumor samples plotted in principal component space, are color coded by immunohistological subtype (red circle: Her2+, purple plus: ER/PR+, blue asterisk: triple-negative).
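To make "mixtures of the vertices" concrete, the sketch below forms a point inside a tetrahedron as a convex combination of its four vertices. The coordinates and the `mix` helper are hypothetical; the authors' unmixing code solves the harder inverse problem of recovering vertices and fractions from data.

```python
from typing import List, Sequence


def mix(vertices: Sequence[Sequence[float]], fractions: Sequence[float]) -> List[float]:
    """Return the convex combination sum_i fractions[i] * vertices[i]."""
    if len(vertices) != len(fractions):
        raise ValueError("need exactly one fraction per vertex")
    if any(f < 0 for f in fractions) or abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must be non-negative and sum to 1")
    dim = len(vertices[0])
    return [sum(f * v[d] for f, v in zip(fractions, vertices)) for d in range(dim)]


# Made-up tetrahedron in a three-dimensional principal component space.
tetrahedron = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A sample drawn equally from all four components sits at the centroid.
print(mix(tetrahedron, [0.25, 0.25, 0.25, 0.25]))
```

Because the fractions are non-negative and sum to one, every such point lies inside or on the tetrahedron, which is why enclosed data points admit this kind of description.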

FigShare

Frequency matrix plots for CCMV capsid assembly with individual RNA effects.

Author: Gregory R. Smith (3165048)
Lu Xie (55571)
Russell Schwartz (39843)

Frequency matrix plots averaged over 200 simulation runs for CCMV capsid assembly upon applying: (A) RNA-RNA, (B) Compression, (C) RNA-protein, (D) Concentration. In each plot, each row corresponds to a product size and each column to reactant sizes that produce that product. Pixel color in each position corresponds to the frequency with which the given reactant size is used to produce the given product size. Insets within each plot expand the upper-left corner of the main plot, corresponding to products of size 20 or smaller, to better visualize pathways involved in production of small oligomers.
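One way to picture how such a matrix could be tallied is sketched below. It assumes each run is recorded as a list of binding events `(a, b)` whose product has size `a + b`; that event format, the averaging scheme, and the function name are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def frequency_matrix(
    runs: Sequence[Sequence[Tuple[int, int]]], max_size: int
) -> List[List[float]]:
    """Average, over runs, how often each reactant size yields each product size.

    Rows index product size and columns index reactant size.
    """
    if not runs:
        raise ValueError("need at least one simulation run")
    n_runs = len(runs)
    freq = [[0.0] * (max_size + 1) for _ in range(max_size + 1)]
    for events in runs:
        for a, b in events:
            product = a + b
            if product > max_size:
                raise ValueError(f"product size {product} exceeds max_size")
            # Each reactant counts once; a dimerization (a == b) counts that size twice.
            freq[product][a] += 1.0 / n_runs
            freq[product][b] += 1.0 / n_runs
    return freq


# Two made-up runs of binding events.
runs = [
    [(1, 1), (2, 1)],
    [(1, 1), (1, 1), (2, 2)],
]
matrix = frequency_matrix(runs, max_size=4)
print(matrix[2][1], matrix[3][1], matrix[3][2], matrix[4][2])
```

Restricting the printed or plotted rows to small product sizes would correspond to the insets described in the caption.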

FigShare

Simulated light scattering curves for CCMV capsid assembly under different representative combinations of RNA effects.

Author: Gregory R. Smith (3165048)
Lu Xie (55571)
Russell Schwartz (39843)

Plot comparing simulated light scattering curves for CCMV capsid assembly averaged over 200 individual simulation trajectories for four representative combinations of RNA effects: hollow capsid (no effects considered), the two negative RNA effects (Compression + RNA-RNA), the two positive RNA effects (RNA-Protein + Concentration), and the combination of all four RNA effects. Time on the x axis is shown on a log scale.

FigShare