345 research outputs found

    ProbCD: enrichment analysis accounting for categorization uncertainty

    As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high-throughput datasets, current enrichment methods largely ignore this probabilistic information, since they are mainly based on variants of the Fisher Exact Test. We developed ProbCD, an open-source R package for probabilistic categorical data analysis that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning the enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of the high-throughput experimental techniques and (ii) probabilistic gene annotation.
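
The idea of building a contingency table from expectations can be illustrated with a minimal Python sketch. It is not ProbCD's implementation (ProbCD is an R package whose internals are not described in the abstract). It assumes each gene has an independent probability of belonging to the study set and an independent probability of carrying the annotation; the function name `expected_contingency` and both argument names are hypothetical.

```python
def expected_contingency(p_selected, p_annotated):
    """Expected 2x2 table when set membership and annotation are uncertain.

    Assumption (not from the abstract): for gene i, selection and annotation
    are independent Bernoulli variables with probabilities p_selected[i] and
    p_annotated[i]. By linearity of expectation, each cell is a sum of
    products of probabilities.
    """
    if len(p_selected) != len(p_annotated):
        raise ValueError("both probability lists must have one entry per gene")

    n11 = n10 = n01 = n00 = 0.0
    for p, q in zip(p_selected, p_annotated):
        if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
            raise ValueError("probabilities must lie in [0, 1]")
        n11 += p * q                # selected and annotated
        n10 += p * (1.0 - q)        # selected, not annotated
        n01 += (1.0 - p) * q        # not selected, annotated
        n00 += (1.0 - p) * (1.0 - q)  # neither
    # Rows: selected / not selected. Columns: annotated / not annotated.
    return [[n11, n10], [n01, n00]]
```

With probabilities restricted to 0 and 1 the table reduces to the ordinary count table used by Fisher-type tests. The expected cells are generally non-integer, so a test applied to them has to accept real-valued counts.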

    A generalizable data-driven multicellular model of pancreatic ductal adenocarcinoma.

    BACKGROUND: Mechanistic models, when combined with pertinent data, can improve our knowledge regarding important molecular and cellular mechanisms found in cancer. These models make the prediction of tissue-level response to drug treatment possible, which can lead to new therapies and improved patient outcomes. Here we present a data-driven multiscale modeling framework to study molecular interactions between cancer, stromal, and immune cells found in the tumor microenvironment. We also develop methods to use molecular data available in The Cancer Genome Atlas to generate sample-specific models of cancer. RESULTS: By combining published models of different cells relevant to pancreatic ductal adenocarcinoma (PDAC), we built an agent-based model of the multicellular pancreatic tumor microenvironment, formally describing cell type-specific molecular interactions and cytokine-mediated cell-cell communications. We used an ensemble-based modeling approach to systematically explore how variations in the tumor microenvironment affect the viability of cancer cells. The results suggest that the autocrine loop involving EGF signaling is a key interaction modulator between pancreatic cancer and stellate cells. EGF is also found to be associated with previously described subtypes of PDAC. Moreover, the model allows a systematic exploration of the effect of possible therapeutic perturbations; our simulations suggest that reducing bFGF secretion by stellate cells will have, on average, a positive impact on cancer apoptosis. CONCLUSIONS: The developed framework allows model-driven hypotheses to be generated regarding therapeutically relevant PDAC states, with potential molecular and cellular drivers indicating specific intervention strategies.

    Simcluster: clustering enumeration gene expression data on the simplex space

    Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. As with other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, in which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space.

    Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster.

    Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
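
To make the contrast with Euclidean distance concrete, here is a minimal Python sketch of the Aitchison distance, a standard dissimilarity from compositional data analysis computed through the centered log-ratio (clr) transform. The abstract does not say which distance Simcluster uses or how it handles zero counts, so the choice of this distance, the function names, and the `pseudocount` parameter are all assumptions made for illustration.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a vector of tag counts.

    The pseudocount is a hypothetical way to avoid log(0); pass 0.0 when
    every count is strictly positive.
    """
    shifted = [c + pseudocount for c in counts]
    if any(v <= 0.0 for v in shifted):
        raise ValueError("all shifted counts must be strictly positive")
    logs = [math.log(v) for v in shifted]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y, pseudocount=0.5):
    """Euclidean distance between the clr transforms of two count vectors."""
    if len(x) != len(y):
        raise ValueError("both libraries must cover the same tags")
    cx, cy = clr(x, pseudocount), clr(y, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))
```

Without a pseudocount the distance depends only on the relative proportions, so two libraries that differ only in sequencing depth, such as `[1, 2, 3]` and `[10, 20, 30]`, are at distance zero, whereas their plain Euclidean distance is large.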

    Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study

    Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is widely considered a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models based on sample system executions and have shown that the learned models can sometimes be useful. There are, however, many questions to be answered. For instance, how much should we generalize from the observed samples, and how fast would learning converge? Or, would the analysis result based on the learned model be more accurate than the estimate we could have obtained by sampling many system executions within the same amount of time? In this work, we investigate existing algorithms for learning probabilistic models for model checking, propose an evolution-based approach for better controlling the degree of generalization, and conduct an empirical study in order to answer these questions. One of our findings is that the effectiveness of learning may sometimes be limited. Comment: 15 pages, plus 2 reference pages, accepted by FASE 2017 in ETAP

    CRI iAtlas: an interactive portal for immuno-oncology research.

    The Cancer Research Institute (CRI) iAtlas is an interactive web platform for data exploration and discovery in the context of tumors and their interactions with the immune microenvironment. iAtlas allows researchers to study immune response characterizations and patterns for individual tumor types, tumor subtypes, and immune subtypes. iAtlas supports computation and visualization of correlations and statistics among features related to the tumor microenvironment, cell composition, immune expression signatures, tumor mutation burden, cancer driver mutations, adaptive cell clonality, patient survival, expression of key immunomodulators, and tumor infiltrating lymphocyte (TIL) spatial maps. iAtlas was launched to accompany the release of the TCGA PanCancer Atlas and has since been expanded to include new capabilities such as (1) user-defined loading of sample cohorts, (2) a tool for classifying expression data into immune subtypes, and (3) integration of TIL mapping from digital pathology images. We expect that the CRI iAtlas will accelerate discovery and improve patient outcomes by providing researchers access to standardized immunogenomics data to better understand the tumor immune microenvironment and its impact on patient responses to immunotherapy.

    A self-organized model for cell-differentiation based on variations of molecular decay rates

    Systemic properties of living cells are the result of molecular dynamics governed by so-called genetic regulatory networks (GRN). These networks capture all possible features of cells and are responsible for the immense levels of adaptation characteristic of living systems. At any point in time only small subsets of these networks are active. Any active subset of the GRN leads to the expression of particular sets of molecules (expression modes). The subsets of active networks change over time, leading to the observed complex dynamics of expression patterns. Understanding these dynamics is becoming increasingly important in systems biology and medicine. While the importance of transcription rates and catalytic interactions has been widely recognized in modeling genetic regulatory systems, the understanding of the role of degradation of biochemical agents (mRNA, protein) in regulatory dynamics remains limited. Recent experimental data suggest that there is a functional relation between mRNA and protein decay rates and expression modes. In this paper we propose a model for the dynamics of successions of sequences of active subnetworks of the GRN. The model is able to reproduce key characteristics of molecular dynamics, including homeostasis, multi-stability, periodic dynamics, alternating activity, differentiability, and self-organized critical dynamics. Moreover, the model offers a natural explanation of the mechanism behind the relation between decay rates and expression modes. The model explains recent experimental observations that decay rates (or turnovers) vary between differentiated tissue classes at a general systemic level, and it highlights the role of intracellular decay rate control mechanisms in cell differentiation. Comment: 16 pages, 5 figure

    Modeling Stochasticity and Variability in Gene Regulatory Networks

    Modeling stochasticity in gene regulatory networks is an important and complex problem in molecular systems biology. To elucidate intrinsic noise, several modeling strategies such as the Gillespie algorithm have been used successfully. This paper contributes an alternative to these classical settings. We work within the discrete paradigm, where genes, proteins, and other molecular components of gene regulatory networks are modeled as discrete variables and are assigned logical rules describing their regulation through interactions with other components. Stochasticity is modeled at the biological function level, under the assumption that even if the expression levels of the input nodes of an update rule guarantee activation or degradation, there is a probability that the process will not occur due to stochastic effects. This approach allows a finer analysis of discrete models and provides a natural setup for cell population simulations to study cell-to-cell variability. We applied our methods to two of the most studied regulatory networks: the outcome of lambda phage infection of bacteria and the p53-mdm2 complex. Comment: 23 pages, 8 figure
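
The update scheme described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it uses Boolean rather than multi-level variables, a synchronous update, and a single pair of propensities shared by all nodes, whereas the paper may assign them per node. The function `stochastic_step`, the parameters `p_activate` and `p_degrade`, and the two-node toy network `TOY_RULES` are hypothetical.

```python
import random


def stochastic_step(state, rules, p_activate, p_degrade, rng):
    """One synchronous update in which each rule may fail to fire.

    state: dict mapping node name to bool.
    rules: dict mapping node name to a function of the current state.
    A node whose rule calls for activation switches on only with probability
    p_activate; one whose rule calls for degradation switches off only with
    probability p_degrade. Otherwise the node keeps its current value.
    """
    new_state = dict(state)
    for node, rule in rules.items():
        target = bool(rule(state))
        if target and not state[node]:
            if rng.random() < p_activate:
                new_state[node] = True
        elif not target and state[node]:
            if rng.random() < p_degrade:
                new_state[node] = False
    return new_state


# Hypothetical toy network: B represses A, and A activates B.
TOY_RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: s["A"],
}


def simulate_population(n_cells, n_steps, p_activate, p_degrade, seed=0):
    """Run independent cells from the same start and return their end states."""
    rng = random.Random(seed)
    final_states = []
    for _ in range(n_cells):
        state = {"A": False, "B": False}
        for _ in range(n_steps):
            state = stochastic_step(state, TOY_RULES, p_activate, p_degrade, rng)
        final_states.append(state)
    return final_states
```

Setting both probabilities to 1 recovers the deterministic logical model, while values below 1 let identical cells diverge, which is how a population simulation of this kind can expose cell-to-cell variability.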

    Forum on immune digital twins: a meeting report

    Medical digital twins are computational models of human biology relevant to a given medical condition, which can be tailored to an individual patient, thereby predicting the course of disease and individualized treatments, an important goal of personalized medicine. The immune system, which has a central role in many diseases, is highly heterogeneous between individuals, and thus poses a major challenge for this technology. If medical digital twins are to faithfully capture the characteristics of a patient's immune system, we need to answer many questions, such as: What do we need to know about the immune system to build mathematical models that reflect features of an individual? What data do we need to collect across the different scales of immune system action? What are the right modeling paradigms to properly capture immune system complexity? In February 2023, an international group of experts convened in Lake Nona, FL for two days to discuss these and other questions related to digital twins of the immune system. The group consisted of clinicians, immunologists, biologists, and mathematical modelers, representative of the interdisciplinary nature of medical digital twin development. A video recording of the entire event is available. This paper presents a synopsis of the discussions and brief descriptions of ongoing digital twin projects at different stages of progress. It also proposes a 5-year action plan for further developing this technology. The main recommendations are to identify and pursue a small number of promising use cases, to develop stimulation-specific assays of immune function in a clinical setting, and to develop a database of existing computational immune models, as well as advanced modeling technology and infrastructure.