    Integration and visualisation of clinical-omics datasets for medical knowledge discovery

    In recent decades, the rise of various omics fields has flooded life sciences with unprecedented amounts of high-throughput data, which have transformed the way biomedical research is conducted. This trend will only intensify in the coming decades, as the cost of data acquisition will continue to decrease. Therefore, there is a pressing need to find novel ways to turn this ocean of raw data into waves of information and finally distil those into drops of translational medical knowledge. This is particularly challenging because of the incredible richness of these datasets, the humbling complexity of biological systems and the growing abundance of clinical metadata, which makes the integration of disparate data sources even more difficult. Data integration has proven to be a promising avenue for knowledge discovery in biomedical research. Multi-omics studies allow us to examine a biological problem through different lenses using more than one analytical platform. These studies not only present tremendous opportunities for the deep and systematic understanding of health and disease, but they also pose new statistical and computational challenges. The work presented in this thesis aims to alleviate this problem with a novel pipeline for omics data integration. Modern omics datasets are extremely feature rich and in multi-omics studies this complexity is compounded by a second or even third dataset. However, many of these features might be completely irrelevant to the studied biological problem or redundant in the context of others. Therefore, in this thesis, clinical metadata driven feature selection is proposed as a viable option for narrowing down the focus of analyses in biomedical research. Our visual cortex has been fine-tuned through millions of years to become an outstanding pattern recognition machine. 
To leverage this incredible resource of the human brain, we need to develop advanced visualisation software that enables researchers to explore these vast biological datasets through illuminating charts and interactivity. Accordingly, a substantial portion of this PhD was dedicated to implementing truly novel visualisation methods for multi-omics studies. Open Access

    A systematic pathway-based network approach for in silico drug repositioning

    Drug repositioning, the method of finding new uses for existing drugs, holds the potential to reduce the cost and time of drug development. Successful drug repositioning strategies depend heavily on the availability and aggregation of different drug and disease databases. Moreover, to yield greater understanding of drug prioritisation approaches, it is necessary to objectively assess (benchmark) and compare different methods. Data aggregation requires extensive curation of non-standardised drug nomenclature. To overcome this, we used a graph-theoretic approach to construct a drug synonym resource that collected drug identifiers from a range of publicly available sources, establishing missing links between databases. Thus, we could systematically assess the performance of available in silico drug repositioning methodologies with increased power for scoring true positive drug-disease pairs. We developed a novel pathway-based drug repositioning pipeline, based on a bipartite network of pathway- and drug-gene set correlations that captured functional relationships. To prioritise drugs, we used our bipartite network and the differentially expressed pathways in a given disease that formed a disease signature. We then took the cumulative network correlation between disease pathway and drug signatures to generate a drug prioritisation score. We prioritised drugs for three case studies: juvenile idiopathic arthritis, Alzheimer's and Parkinson's disease. We explored the use of different true positive lists in the evaluation of drug repositioning performance, providing insight into the most appropriate benchmark designs. We have identified several promising drug candidates and showed that our method successfully prioritises disease-modifying treatments over drugs offering symptomatic relief. We have compared the pipeline’s performance to an alternative well-established method and showed that our method has increased sensitivity to current treatment trends. 
The successful translation of drug candidates identified in this thesis has the potential to speed up the drug-discovery pipeline and thus more rapidly and efficiently deliver disease-modifying treatments to patients.
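
The scoring step described in this abstract (a cumulative correlation between disease pathways and drug signatures) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `prioritise_drugs`, the nested-dictionary layout of the bipartite network, and the choice to rank by a plain sum are all hypothetical and are not taken from the thesis, which may weight, filter, or sign the correlations differently.

```python
def prioritise_drugs(correlations, disease_pathways):
    """Rank drugs by the summed pathway-drug correlation over a disease signature.

    correlations: dict mapping pathway -> {drug: correlation}, i.e. the edges
        of a bipartite pathway-drug network (hypothetical layout).
    disease_pathways: iterable of pathways forming the disease signature.
    Returns a list of (drug, score) pairs, highest cumulative score first.
    """
    scores = {}
    for pathway in disease_pathways:
        # Pathways with no edges in the network contribute nothing.
        for drug, corr in correlations.get(pathway, {}).items():
            scores[drug] = scores.get(drug, 0.0) + corr
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy network: three pathways, two drugs (made-up numbers for illustration).
toy_network = {
    "pathway_1": {"drug_a": 0.5, "drug_b": -0.25},
    "pathway_2": {"drug_a": 0.25, "drug_b": 0.5},
    "pathway_3": {"drug_a": -1.0},
}
ranking = prioritise_drugs(toy_network, ["pathway_1", "pathway_2"])
```

Only pathways in the disease signature contribute, so `pathway_3` is ignored in the toy ranking. Whether a high or a low cumulative score marks a promising candidate depends on how the correlations are defined, which the abstract does not specify.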

    Nestedness and Modularity in Bipartite Networks

    Bipartite networks are a useful way of representing interactions between two sets of entities. Understanding the underlying structures of such networks may give insights into the functionality and behaviour of the systems they represent. Two important structural patterns identified in bipartite networks are nestedness and modularity. Nestedness describes a hierarchical ordering of nodes such that more specialised nodes have interactions with a subset of the partners with which the more generalised nodes interact. Modularity captures the community structure of a network as distinct clusters of interactions, such that there are more connections within communities than between communities. While these network architectures are easy to describe in writing, their quantitative measurement for a given network is a difficult task. Several different methods have been proposed in each case and it is currently unclear which of them should be used in practice. This thesis considers the use, measurement and interpretation of nestedness and modularity in bipartite networks. First, it is shown how bipartite networks can be an effective tool for linking data and theory in community ecology, through use of a coevolutionary model of virus-bacteria interactions. Next, a series of studies is presented that push towards clarification of the best procedures to measure nestedness and modularity in bipartite networks. Robustness of nestedness measures is tested on a synthetic ensemble of networks, showing that apparent nestedness depends strongly on the choice of measure, null model and effect size statistics. Recommendations for measuring nestedness are made with relation to individual and cross-network comparisons. Additionally, a new algorithm for identifying weighted modularity is proposed that can be shown to outperform existing methods. 
Crucially, it is shown that quantitative modular structures differ from traditional binary modular structures with implications for how modularity is reported and used. Improving the way in which nestedness and modularity are measured is a necessary step for integrating data and theory in bipartite networks. University of Exeter
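
To make the two structural patterns concrete, here is a minimal sketch of one nestedness measure and one modularity score for a bipartite incidence matrix. The abstract does not name the measures it evaluates, so the choices here are assumptions: NODF (nestedness based on overlap and decreasing fill) for nestedness, and Barber's bipartite modularity for a partition that is supplied by hand. The sketch only scores a given partition; it does not search for modules and is not the algorithm proposed in the thesis.

```python
from itertools import combinations


def _paired_overlap(vectors):
    """Sum the NODF paired overlap over all pairs of binary vectors."""
    total = 0.0
    for a, b in combinations(vectors, 2):
        deg_a, deg_b = sum(a), sum(b)
        # Equal-degree pairs and pairs with an empty vector score zero.
        if deg_a == deg_b or min(deg_a, deg_b) == 0:
            continue
        shared = sum(1 for x, y in zip(a, b) if x and y)
        # Fraction of the sparser vector's links shared with the denser one.
        total += shared / min(deg_a, deg_b)
    return total


def nodf(matrix):
    """NODF nestedness (0 to 100) of a binary incidence matrix."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*rows)]
    n_pairs = len(rows) * (len(rows) - 1) / 2 + len(cols) * (len(cols) - 1) / 2
    if n_pairs == 0:
        return 0.0
    return 100.0 * (_paired_overlap(rows) + _paired_overlap(cols)) / n_pairs


def barber_modularity(matrix, row_modules, col_modules):
    """Barber's bipartite modularity Q for a given module assignment."""
    row_deg = [sum(r) for r in matrix]
    col_deg = [sum(c) for c in zip(*matrix)]
    m = sum(row_deg)  # total number (or weight) of links
    if m == 0:
        return 0.0
    q = 0.0
    for i, row in enumerate(matrix):
        for j, w in enumerate(row):
            if row_modules[i] == col_modules[j]:
                # Observed link minus its expectation under a degree-preserving null.
                q += w - row_deg[i] * col_deg[j] / m
    return q / m


nested = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
blocks = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]

nested_score = nodf(nested)
identity_score = nodf(identity)
block_q = barber_modularity(blocks, [0, 0, 1, 1], [0, 0, 1, 1])
```

The perfectly nested matrix reaches the maximum NODF of 100, while the identity matrix, in which every node has the same degree, scores 0. The two-block matrix with the matching partition gives Q = 0.5. Passing non-binary weights to `barber_modularity` yields one common weighted generalisation, which is again an assumption rather than the thesis's definition.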

    Statistical methods for gene selection and genetic association studies

    This dissertation comprises five chapters, summarised as follows. In Chapter One, we propose a signed bipartite genotype and phenotype network (GPN) by linking phenotypes and genotypes based on their statistical associations. It provides new insight into the genetic architecture among multiple correlated phenotypes and helps explore where phenotypes might be related at a higher level of cellular and organismal organization. We show that multiple-phenotype association studies based on the proposed network are improved by incorporating the genetic information into the phenotype clustering. In Chapter Two, we first apply the proposed GPN to GWAS summary statistics. Then, we assess contributions to constructing a well-defined GPN with a clear representation of genetic associations by comparing the network properties with those of a random network, including connectivity, centrality, and community structure. The network topology annotations based on the sparse representations of GPN can be used to understand the disease heritability for the highly correlated phenotypes. In applications of phenome-wide association studies, the proposed GPN can identify more significant pairs of genetic variant and phenotype categories. In Chapter Three, a powerful and computationally efficient gene-based association test is proposed, aggregating information from different gene-based association tests and also incorporating expression quantitative trait locus information. We show that the proposed method controls the type I error rates very well, has higher power in the simulation studies, and can identify more significant genes in the real data analyses. In Chapter Four, we develop six statistical selection methods based on the penalized regression for inferring target genes of a transcription factor (TF). 
In this study, the proposed selection methods combine statistics, machine learning, and convex optimization approaches, and show great efficacy in identifying the true target genes. The methods fill the gap left by the lack of appropriate methods for predicting target genes of a TF, and are instrumental for validating experimental results from ChIP-seq and DAP-seq and, conversely, for selecting and annotating TFs based on their target genes. In Chapter Five, we propose a gene selection approach that captures gene-level signals in network-based regression for case-control association studies with DNA sequence data or DNA methylation data, inspired by the popular gene-based association tests that use a weighted combination of genetic variants to capture the combined effect of individual genetic variants within a gene. We show that the proposed gene selection approach has higher true positive rates than traditional dimension reduction techniques in the simulation studies and selects potentially rheumatoid arthritis related genes that are missed by existing methods.

    Multi-level characterization and information extraction in directed and node-labeled functional brain networks

    Current research in computational neuroscience puts great emphasis on the computation and analysis of the functional connectivity of the brain. The methodological developments presented in this work are concerned with a group-specific comprehensive analysis of networks that represent functional interaction patterns. Four application studies are presented, in which functional brain network samples of different clinical backgrounds were analyzed in different ways, using combinations of established approaches and our own methodological developments. Study I is concerned with a sample-specific decomposition of the functional brain networks of depressed subjects and healthy controls into small, functionally important and recurring subnetworks (motifs) using our own developments. Study II investigates whether lithium treatment effects are reflected in the functional brain networks of HIV-positive subjects with diagnosed cognitive impairment. To this end, microscopic and macroscopic structural properties were analyzed. Study III explores spatially highly resolved functional brain networks with regard to a functional segmentation given by the identified module (community) structure. In addition, ground truth networks with known module structure were generated using our own methodological developments. They formed the basis of a comprehensive simulation study that quantified module structure quality and preservation in order to evaluate the effects of a novel approach for the identification of connectivity (lsGCI). Study IV tracks the time-evolution of module structure and introduces a newly developed approach for the determination of edge weight thresholds based on multicriteria optimization. This work and the presented results highlight the methodological challenges that underlie these different topological analyses, as well as the various opportunities to gain an improved understanding of neural information processing among brain areas.

    Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations

    The network structure (or topology) of a dynamical network is often unavailable or uncertain. Hence, we consider the problem of network reconstruction. Network reconstruction aims at inferring the topology of a dynamical network using measurements obtained from the network. In this technical note we define the notion of solvability of the network reconstruction problem. Subsequently, we provide necessary and sufficient conditions under which the network reconstruction problem is solvable. Finally, using constrained Lyapunov equations, we establish novel network reconstruction algorithms, applicable to general dynamical networks. We also provide specialized algorithms for specific network dynamics, such as the well-known consensus and adjacency dynamics. Comment: 8 pages