855 research outputs found

    Modeling cumulative biological phenomena with Suppes-Bayes Causal Networks

    Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes with respect to wild-type conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with competition, cooperation and parasitism among distinct cellular clones. Recently, we presented a mathematical framework to model these phenomena, based on a combination of Bayesian inference and Suppes' theory of probabilistic causation, depicted in graphical structures dubbed Suppes-Bayes Causal Networks (SBCNs). SBCNs are generative probabilistic graphical models that recapitulate the potential ordering of accumulation of such DNA changes during the progression of the disease. Such models can be inferred from data by exploiting likelihood-based model-selection strategies with regularization. In this paper we discuss the theoretical foundations of our approach and investigate in depth how the model-selection task is influenced by: (i) the poset based on Suppes' theory and (ii) different regularization strategies. Furthermore, we provide an example of application of our framework to HIV genetic data highlighting the valuable insights provided by the inferred
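As a rough illustration of the "poset based on Suppes' theory" mentioned in the abstract above, the sketch below filters ordered pairs of events by two Suppes-style conditions: temporal priority, approximated here by marginal frequency, P(c) > P(e), and probability raising, P(e | c) > P(e | not c). The function name `suppes_candidate_edges`, the binary-matrix encoding and the toy data are assumptions made for this example only; this is not the authors' implementation, and it omits the likelihood-based model selection and regularization steps entirely.

```python
from itertools import permutations


def suppes_candidate_edges(samples):
    """Return ordered pairs (c, e) that satisfy both Suppes-style conditions.

    samples: list of equal-length 0/1 lists, one row per sample and one
             column per event (a hypothetical toy encoding).
    """
    n = len(samples)
    m = len(samples[0])
    edges = set()
    for c, e in permutations(range(m), 2):
        n_c = sum(row[c] for row in samples)
        n_e = sum(row[e] for row in samples)
        n_ce = sum(row[c] and row[e] for row in samples)
        # Skip degenerate cases where a conditional probability is undefined.
        if n_c == 0 or n_c == n:
            continue
        p_e_given_c = n_ce / n_c
        p_e_given_not_c = (n_e - n_ce) / (n - n_c)
        # Temporal priority, approximated by marginal frequency: P(c) > P(e).
        temporal_priority = n_c > n_e
        # Probability raising: P(e | c) > P(e | not c).
        probability_raising = p_e_given_c > p_e_given_not_c
        if temporal_priority and probability_raising:
            edges.add((c, e))
    return edges


# Toy cross-sectional data: columns are three hypothetical events A, B, C.
toy = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 0, 1],
]

print(suppes_candidate_edges(toy))
```

In this toy data only the pair (A, B) passes both filters: A is more frequent than B, and B occurs only in samples that also carry A. A model-selection step would then search for a network restricted to such candidate edges.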

    Data-driven learning how oncogenic gene expression locally alters heterocellular networks

    Developing drugs increasingly relies on mechanistic modeling and simulation. Models that capture causal relations among genetic drivers of oncogenesis, functional plasticity, and host immunity complement wet experiments. Unfortunately, formulating such mechanistic cell-level models currently relies on hand curation, which can bias how data is interpreted or the priority of drug targets. In modeling molecular-level networks, rules and algorithms are employed to limit a priori biases in formulating mechanistic models. Here we combine digital cytometry with Bayesian network inference to generate causal models of cell-level networks linking an increase in gene expression associated with oncogenesis with alterations in stromal and immune cell subsets from bulk transcriptomic datasets. We predict how increased Cell Communication Network factor 4, a secreted matricellular protein, alters the tumor microenvironment using data from patients diagnosed with breast cancer and melanoma. Predictions are then tested using two immunocompetent mouse models for melanoma, which provide consistent experimental results

    Time-delayed models of gene regulatory networks

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After a discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with a discussion of some open problems

    Towards knowledge-based gene expression data mining

    The field of gene expression data analysis has grown in the past few years from being purely data-centric to integrative, aiming at complementing microarray analysis with data and knowledge from diverse available sources. In this review, we report on the plethora of gene expression data mining techniques and focus on their evolution toward knowledge-based data analysis approaches. In particular, we discuss recent developments in gene expression-based analysis methods used in association and classification studies, phenotyping and reverse engineering of gene networks

    A temporal prognostic model based on dynamic Bayesian networks: mining medical insurance data

    A prognostic model is a formal combination of multiple predictors from which the risk probability of a specific diagnosis can be modelled for patients. Prognostic models have become essential instruments in medicine. The models are used for prediction: to guide doctors in making a diagnosis and patient-specific decisions, or to help plan the utilization of resources for patient groups who have similar prognostic paths. Dynamic Bayesian networks (DBNs) theoretically provide a very expressive and flexible model to solve temporal problems in medicine. However, this involves various challenges due both to the nature of the clinical domain and to the nature of the DBN modelling and inference process itself. The challenges from the clinical domain include insufficient knowledge of temporal interactions of processes in the medical literature, the sparse nature and variability of medical data collection, and the difficulty of preparing and abstracting clinical data in a suitable format without losing valuable information in the process. Challenges in the DBN methodology and implementation include the lack of tools that allow easy modelling of temporal processes. Overcoming this challenge will help to solve various clinical temporal reasoning problems. In this thesis, we addressed these challenges while building a temporal network with explanations of the effects of predisposing factors, such as age and gender, and the progression information of all diagnoses using claims data from an insurance company in Kenya. We showed that our network could differentiate the probability of exposure to a diagnosis given age and gender, and the possible paths given a patient's history. We also presented evidence that the more patient history is provided, the better the prediction of future diagnosis

    Discovering novel cancer bio-markers in acquired lapatinib resistance using Bayesian methods.

    Signalling transduction pathways (STPs) are commonly hijacked by many cancers for their growth and malignancy, but demystifying their underlying mechanisms is difficult. Here, we developed methodologies with a fully Bayesian approach for discovering novel driver bio-markers in aberrant STPs given high-throughput gene expression (GE) data. This project, namely 'PathTurbEr' (Pathway Perturbation Driver), uses the GE dataset derived from lapatinib (an EGFR/HER dual inhibitor) sensitive and resistant samples from breast cancer cell lines (SKBR3). Differential expression analysis revealed 512 differentially expressed genes (DEGs), and their pathway enrichment revealed 13 highly perturbed signalling pathways in lapatinib resistance, including the PI3K-AKT, Chemokine, Hippo and TGF-β signalling pathways. Next, the aberration in the TGF-β STP was modelled as a causal Bayesian network (BN) using three MCMC sampling methods, i.e. the Neighbourhood sampler (NS), the Hit-and-Run (HAR) sampler and the Metropolis-Hastings (MH) sampler, that potentially yield robust inference with lower chances of getting stuck at local optima and faster convergence compared to other state-of-the-art methods. Next, we examined the structural features of the optimal BN as a statistical process that generates the global structure using the p1-model, a special class of Exponential Random Graph Models (ERGMs), and MCMC methods for their hyper-parameter sampling. This step enabled the identification of key drivers of the aberration within the perturbed BN structure of the STP, and yielded 34, 34 and 23 perturbation driver genes out of 80 constituent genes of three perturbed STP models of TGF-β signalling inferred by the NS, HAR and MH sampling methods, respectively. Functional-relevance and disease-relevance analyses suggested their significant associations with breast cancer progression/resistance