31 research outputs found

    QuateXelero : an accelerated exact network motif detection algorithm

    Finding motifs in biological, social, technological, and other types of networks has become a widespread method for gaining knowledge about these networks’ structure and function. However, this task is computationally very demanding because it is closely tied to graph isomorphism, a problem in NP that is not known to be in P or to be NP-complete. Accordingly, this research aims to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a quaternary tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. Experimental results on some standard model networks confirm the overall superiority of QuateXelero over two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast at constructing the central data structure of the algorithm from scratch based on the input network.
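
The abstract names ESU as the enumeration scheme that QuateXelero builds on. As a rough illustration only, the following is a minimal sketch of ESU-style enumeration of connected vertex sets of size k in an undirected graph. The function name `esu` and the adjacency-dictionary input format are assumptions made for this example; this is not the authors' implementation and it omits the isomorphism classification step entirely.

```python
def esu(adj, k):
    """Enumerate every connected vertex set of size k exactly once.

    adj: dict mapping each vertex (comparable, e.g. int) to a set of
         neighbours; the graph is assumed undirected (hypothetical format).
    """
    results = []

    def extend(sub, ext, v):
        if len(sub) == k:
            results.append(frozenset(sub))
            return
        ext = set(ext)  # work on a copy so the caller's set is untouched
        while ext:
            w = ext.pop()
            # Vertices already in the subgraph or adjacent to it.
            closed = set(sub) | {n for u in sub for n in adj[u]}
            # Exclusive neighbours of w with a label larger than the root v.
            new_ext = ext | {u for u in adj[w] if u > v and u not in closed}
            extend(sub | {w}, new_ext, v)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return results


# Small example: a triangle (0, 1, 2) with a pendant vertex 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triples = sorted(sorted(s) for s in esu(graph, 3))
```

Each enumerated set would then still need to be assigned to an isomorphism class, which is the step the abstract identifies as the most expensive.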

    Aristotle: Stratified Causal Discovery for Omics Data

    Background: There has been a simultaneous increase in demand for and accessibility of genomics, transcriptomics, proteomics and metabolomics data, known as omics data. This has encouraged widespread application of omics data in the life sciences, from personalized medicine to the discovery of the underlying pathophysiology of diseases. Causal analysis of omics data may provide important insight into the underlying biological mechanisms. Existing causal analysis methods yield promising results when identifying potential general causes of an observed outcome based on omics data. However, they may fail to discover causes that are specific to a particular stratum of individuals and absent from others. Methods: To fill this gap, we introduce the problem of stratified causal discovery and propose a method, Aristotle, for solving it. Aristotle addresses two challenges intrinsic to omics data: high dimensionality and hidden stratification. It employs existing biological knowledge and a state-of-the-art patient stratification method to tackle these challenges and applies a quasi-experimental design method to each stratum to find stratum-specific potential causes. Results: Evaluation based on synthetic data shows better performance for Aristotle in discovering true causes under different conditions compared to existing causal discovery methods. Experiments on a real dataset on Anthracycline Cardiotoxicity indicate that Aristotle’s predictions are consistent with the existing literature. Moreover, Aristotle makes additional predictions that suggest further investigations.

    Probabilistic graphical models for the analysis of omics heterogeneity

    One of the biggest challenges in the diagnosis, prognosis, and treatment of complex diseases like cancer is the heterogeneity of the underlying disease mechanisms. This challenge has rendered conventional, evidence-based medicine ineffective, as a common remedy does not cure every patient with the same complex disease. The new paradigm in medicine, called precision or personalized medicine, aims to use new data collection technologies, such as high-throughput DNA sequencing, together with computational resources and algorithms, such as machine learning, to enable scientists and physicians to understand the specifics of diseases in individuals and to provide treatment strategies based on their personal characteristics. In this thesis, we provide probabilistic graphical models to decipher the heterogeneity of diseases, with an emphasis on cancer, using the recently available omics data from patients. We model the heterogeneity at two levels. First, we propose unsupervised and supervised biclustering methods for detecting heterogeneity at the level of a population of patients based on their genomic, transcriptomic and clinical characteristics. The provided frameworks are also theoretically applicable to other omics data types. Second, we provide a phylogenetic analysis method to analyze the heterogeneity of a population of cells of a tumor, i.e. intra-tumor heterogeneity, based on genomic data. By transferring the evolutionary information across different tumors, this method leverages the inter-tumor heterogeneity information to infer the intra-tumor heterogeneity of individual tumors with more certainty. The proposed methods show promising performance when compared with the state of the art on both synthetic and real data.

    Uncovering the Subtype-Specific Temporal Order of Cancer Pathway Dysregulation

    Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and the order of their dysregulation is key to understanding and treating cancer. However, the heterogeneity of mutations between individuals makes this challenging and requires that cancer progression be studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called the Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes, pathways, and the order of dysregulation of the pathways within each subtype. Experiments with synthetic data indicate that SPM is more robust to problem specifics, including noise, than an existing method. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show that SPM’s results are consistent with existing knowledge and superior to those of an existing method in certain cases. The implementation of our method is available at https://github.com/Dalton386/SPM

    CytoKavosh: a cytoscape plug-in for finding network motifs in large biological networks.

    Network motifs are small connected sub-graphs that have recently attracted much attention as a means of discovering the structural behavior of large and complex networks. Finding motifs of any size is one of the most important problems in large and complex networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies some well-known algorithmic features and includes subtle techniques that make it efficient in this field. CytoKavosh is a Cytoscape plug-in that finds motifs of a given size in a network (directed or undirected) that has previously been loaded into the Cytoscape workspace. The high performance of CytoKavosh is achieved by dynamically linking the highly optimized C++ functions of Kavosh to the Cytoscape Java program, which makes this plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limit on motif size. CytoKavosh is implemented in Cytoscape, a visual environment that makes it convenient for users to interact with the tool and create visual options for analyzing the structural behavior of a network. The plug-in can work on any given network, is very simple to use, and generates graphical results of the discovered motifs with any required details. No existing Cytoscape plug-in is dedicated to finding network motifs based on the original concept. We therefore introduce CytoKavosh as the first such plug-in, and we hope that it can be improved to cover other options and become the best motif-analyzing tool.

    The concept of Equality Point.

    Positive and negative equality points are illustrated in the left and right charts, respectively. The vertical axis t indicates the total time of the algorithms and the horizontal axis r shows the number of random networks used for motif detection.
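
The caption gives no formula for the equality point. As a rough illustration only, suppose (hypothetically) that each algorithm's total time grows linearly with the number of random networks, t = a + b·r, where a is a fixed cost and b is the cost per random network. Under that assumption, the value of r at which two algorithms take the same total time can be sketched as follows; the function name and parameters are invented for this example.

```python
def equality_point(a1, b1, a2, b2):
    """Return r where a1 + b1*r == a2 + b2*r, or None if no single crossing.

    Assumes a linear time model t = a + b*r for each algorithm; this model
    is an assumption of the sketch, not something stated in the caption.
    """
    if b1 == b2:
        return None  # parallel lines: never equal, or equal everywhere
    return (a2 - a1) / (b1 - b2)


# Algorithm 1: fixed cost 10, 1 time unit per random network.
# Algorithm 2: fixed cost 2, 3 time units per random network.
r_star = equality_point(10, 1, 2, 3)
```

With these made-up numbers the two totals coincide at r = 4, since 10 + 1·4 = 2 + 3·4 = 14.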

    Steps taken to search the quaternary tree during expanding (enumerating) a sample subgraph.

    In this figure, −1 indicates a one-way connection from the existing vertex to the added vertex, 0 indicates no connection between them, 1 stands for a one-way connection in the reverse direction, and 2 shows a two-way connection. The numbers in the input string appear in the same order in which the corresponding vertices are added while expanding the subgraph (that is, 1, 2, 3, and then 4 in this example).
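
The caption describes a string of codes drawn from {−1, 0, 1, 2} that steers a search through the quaternary tree. A minimal sketch of such a tree is given below, assuming that each node has at most one child per code and that the node reached at the end of the string stores a visit count. The names `Node`, `QuaternaryTree`, and `visit` are hypothetical, and the counting behaviour is an assumption of this sketch rather than the published data structure.

```python
CODES = (-1, 0, 1, 2)  # connection codes as described in the caption


class Node:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}  # code -> Node, at most four entries
        self.count = 0      # number of times a code string ended here


class QuaternaryTree:
    def __init__(self):
        self.root = Node()

    def visit(self, codes):
        """Walk the tree along `codes`, creating missing nodes on the way.

        Returns (node, is_new), where is_new is True the first time this
        exact code string ends at `node`.
        """
        node = self.root
        for c in codes:
            if c not in CODES:
                raise ValueError(f"invalid connection code: {c!r}")
            child = node.children.get(c)
            if child is None:
                child = Node()
                node.children[c] = child
            node = child
        is_new = node.count == 0
        node.count += 1
        return node, is_new


tree = QuaternaryTree()
leaf_a, first_seen = tree.visit([1, 0, 2])
leaf_b, seen_again = tree.visit([1, 0, 2])
leaf_c, other_new = tree.visit([1, 0, -1])
```

In a design like this, an expensive classification step would presumably only be needed when `is_new` is True, since repeated code strings reach a node that has already been visited.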

    QuateXelero (QX) vs. G-Tries in larger motifs.

    Five random networks were used in all experiments. Bold italic values for the Yeast network are estimated with respect to the results in Table 5 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0068073#pone-0068073-t005).