28 research outputs found
Clustering of Cases from Different Subtypes of Breast Cancer Using a Hopfield Network Built from Multi-omic Data
Graduation thesis (Master's in Computing), Instituto Tecnológico de Costa Rica, Escuela de Computación, 2018
Despite scientific advances, breast cancer still constitutes a major worldwide cause of death among women. Given the great heterogeneity between cases, distinct classification schemes have emerged. The intrinsic molecular subtype classification (luminal A, luminal B, HER2-enriched and basal-like) accounts for the molecular characteristics and prognosis of tumors, which provides valuable input for choosing optimal treatment. Also, recent advancements in molecular biology have provided scientists with high-quality and diverse omic-like data, opening up the possibility of creating computational models for improving and validating current subtyping systems. In this study, a Hopfield Network model for breast cancer subtyping and characterization was created using data from The Cancer Genome Atlas repository. Novel aspects include the use of the network as a clustering mechanism and the integrated use of several molecular types of data (gene mRNA expression, miRNA expression and copy number variation). The results showed clustering capabilities for the network, but even so, deriving a biological model from a Hopfield Network might be difficult given the mirror attractor phenomenon (every cluster might end up with an opposite). As a methodological aspect, the Hopfield Network was compared with the k-means and OPTICS clustering algorithms. The latter, surprisingly, hints at the possibility of creating a high-precision model that differentiates between luminal, HER2-enriched and basal samples using only 10 genes. The normalization procedure of dividing gene expression values by their corresponding gene copy number appears to have contributed to the results. This opens up the possibility of exploring this kind of prediction model for implementing diagnostic tests at a lower cost
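The mirror attractor phenomenon mentioned in this abstract can be illustrated with a minimal sketch. It assumes a classical binary Hopfield network with Hebbian weights and synchronous updates, which may differ from the thesis's actual model; the two toy patterns and all function names are hypothetical and stand in for real omic profiles.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def update(weights, state):
    """One synchronous update; a unit with zero input keeps its value."""
    new_state = []
    for i, row in enumerate(weights):
        field = sum(w * s for w, s in zip(row, state))
        if field == 0:
            new_state.append(state[i])
        else:
            new_state.append(1 if field > 0 else -1)
    return new_state


def recall(weights, state, max_iters=50):
    """Iterate until the state stops changing (or max_iters is reached)."""
    state = list(state)
    for _ in range(max_iters):
        nxt = update(weights, state)
        if nxt == state:
            break
        state = nxt
    return state


# Two toy, mutually orthogonal "profiles" (hypothetical, not real omic data).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([p1, p2])

mirror = [-v for v in p1]
noisy = [-1] + p1[1:]  # p1 with its first unit flipped

print(recall(weights, p1) == p1)          # stored pattern is stable
print(recall(weights, mirror) == mirror)  # so is its sign-flipped mirror
print(recall(weights, noisy) == p1)       # a noisy input falls back to p1
```

Because the update rule depends only on the sign of a weighted sum, negating every unit negates every sum, so each stored pattern brings its opposite along as a second attractor. This is why a cluster found this way may have an "opposite" with no obvious biological meaning.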
Computational analysis on the effects of variations in T and B cells. Primary immunodeficiencies and cancer neoepitopes
Computational approaches are essential to study the effects of inborn and somatic variations. Results from such studies contribute to better diagnosis and therapies. Primary immunodeficiencies (PIDs) are rare inborn defects of key immune response genes. Somatic variations are main drivers of most cancers. Large and diverse data on PID genes and proteins can enable systems biology studies on their dynamic effects on T and B cells. Amino acid substitutions (AASs) are somatic variations that drive cancers. However, AASs also cause cancer-associated antigens that are recognized by lymphocytes as non-self, and are called neoantigens. Detailed analysis of these neoantigens can be performed due to the availability of cancer data from many consortia. The purpose of this thesis was to investigate the effects of PIDs on T and B cells and to explore features of neoepitopes in cancers. The object of the first study was to detect the central T cell-specific protein network. The purpose of the second and third studies was to reconstruct the T and B cell network models and simulate the dynamic effects of PID perturbations. The aim of the fourth study was to characterize neoepitopes from pan-cancer datasets. The immunome interactome was reconstructed, and the links weighted with gene expression correlation of integrated, time series data (Paper I). The significance of the weighted links was computed with the Global Statistical Significance (GloSS) method, and the weighted interactome network was filtered to obtain the central T cell network. Next, the T cell network model was reconstructed from literature mining and the core T cell protein interaction network (Paper II). The B cell network model was reconstructed by mining the literature for central B cell interactions (Paper III). The normalized HillCube software was used to study the dynamic effects of PID perturbations in T and B cells. Proteome-wide AASs in putatively derived 8-, 9-, 10-, and 11-mer neoepitopes in 30 cancer types were analyzed with the NetMHC 4.0 software (Paper IV). The interconnectedness of the major T cell pathways is maintained in the central T cell PPI network. Empirical evidence from Gene Ontology term and essential gene enrichment analyses supported the central T cell network. The T and B cell simulations for several knockout PIDs correspond to previous results. In the T cell model, simulations for TCR, PTPRC, LCK, ZAP70 and ITK indicated profound disruption in network dynamics. BCL10, CARD11, MALT1, NEMO and MAP3K14 simulations showed significant effects. In the B cell model, the simulations for LYN, BTK, STIM1, ORAI1, CD19, CD21 and CD81 indicated profound changes to many proteins in the network. Severe effects were observed in the BCL10, CARD11, MALT1, NEMO, IKKB and WIPF1 knockout simulations. No major effects were observed for constitutively active PID proteins. The most likely epitopes are those which are detected by several major histocompatibility complexes (MHCs) and at several peptide lengths. 0.17% of all variants yield more than 100 neoepitopes. Amino acid distributions indicate that variants at all positions in neoepitopes of any length are, on average, more hydrophobic compared to the wild-type. The core T cell network approach is general and applicable to any system with adequate data. The T and B cell models enable the understanding of the dynamic effects of PID disease processes and reveal several novel proteins that may be of interest when diagnosing and treating immunological defects. The neoepitope characteristics can be employed for targeted cancer vaccine applications in personalized therapies
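As a rough illustration of how candidate 8- to 11-mer neoepitopes can be derived from a single amino acid substitution, the sketch below enumerates every peptide window that covers the substituted residue. The function name, the 0-based position convention and the toy sequence are assumptions made for this example. It does not call NetMHC and performs no binding prediction, which would be a separate step on the resulting peptides.

```python
def neoepitope_windows(protein, pos, alt, lengths=(8, 9, 10, 11)):
    """Return all peptides of the given lengths that contain the substitution.

    protein: wild-type sequence (one-letter amino acid codes)
    pos:     0-based index of the substituted residue (assumed convention)
    alt:     the substituting amino acid
    """
    mutant = protein[:pos] + alt + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # The window start must keep `pos` inside [start, start + k).
        first_start = max(0, pos - k + 1)
        last_start = min(pos, len(mutant) - k)
        for start in range(first_start, last_start + 1):
            peptides.append(mutant[start:start + k])
    return peptides


# Toy sequence (hypothetical): 30 alanines with a tryptophan substituted in.
candidates = neoepitope_windows("A" * 30, 15, "W")
print(len(candidates))
```

For a substitution far from either end of the protein, a window of length k can start at k different positions, so one variant yields 8 + 9 + 10 + 11 = 38 candidate peptides before any filtering. Variants near a terminus yield fewer.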
Processes on the emergent landscapes of biochemical reaction networks and heterogeneous cell population dynamics: differentiation in living matters.
The notion of an attractor has been widely employed in thinking about the nonlinear dynamics of organisms and biological phenomena as systems and as processes. The notion of a landscape with valleys and mountains encoding multiple attractors, however, has a rigorous foundation only for closed, thermodynamically non-driven, chemical systems, such as a protein. Recent advances in the theory of nonlinear stochastic dynamical systems and its applications to mesoscopic reaction networks, one reaction at a time, have provided a new basis for a landscape of open, driven biochemical reaction systems under sustained chemostat. The theory is equally applicable not only to intracellular dynamics of biochemical regulatory networks within an individual cell but also to tissue dynamics of heterogeneous interacting cell populations. The landscape for an individual cell, applicable to a population of isogenic non-interacting cells under the same environmental conditions, is defined on the counting space of intracellular chemical composition
Dynamic sporulation gene co-expression networks for Bacillus subtilis 168 and the food-borne isolate Bacillus amyloliquefaciens: a transcriptomic model
Sporulation is a survival strategy adopted by bacterial cells in response to harsh environmental conditions. The adaptation potential differs between strains, and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed enriched modules with coherent functional groups associated with sporulation. On the basis of the GCNs and the time evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes
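The abstract does not state which inference method was used. One common and simple way to build a co-expression network is to connect genes whose expression profiles are strongly correlated, and the sketch below does that under stated assumptions: Pearson correlation, an arbitrary cutoff of 0.9, and invented gene names and values over six time points.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values at six sporulation stages.
profiles = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 6, 8, 10, 12],  # rises in step with geneA
    "geneC": [3, 1, 4, 1, 5, 2],    # no clear relation to geneA or geneB
}
edges = coexpression_edges(profiles)
print(edges)
```

With only six time points per gene, correlations are noisy, so a real analysis would need a significance test or a more robust measure. The threshold here is illustrative only.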
Dynamics of embryonic stem cell differentiation inferred from single-cell transcriptomics show a series of transitions through discrete cell states
The complexity of gene regulatory networks that lead multipotent cells to acquire different cell fates makes a quantitative understanding of differentiation challenging. Using a statistical framework to analyze single-cell transcriptomics data, we infer the gene expression dynamics of early mouse embryonic stem (mES) cell differentiation, uncovering discrete transitions across nine cell states. We validate the predicted transitions across discrete states using flow cytometry. Moreover, using live-cell microscopy, we show that individual cells undergo abrupt transitions from a naïve to a primed pluripotent state. Using the inferred discrete cell states to build a probabilistic model for the underlying gene regulatory network, we further predict and experimentally verify that these states have unique responses to perturbations, thus defining them functionally. Our study provides a framework to infer the dynamics of differentiation from single-cell transcriptomics data and to build predictive models of the gene regulatory networks that drive the sequence of cell fate decisions during development. DOI: http://dx.doi.org/10.7554/eLife.20487.00