    Knowledge-based Biomedical Data Science 2019

    Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity. Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages with 3 table

    Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach

    Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. Thus, the growing realisation that Amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of the integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data. Here much of the emphasis is put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. A state-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature. To enable meta-analysis of biologically related transcriptomic data, a highly curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns (embedded with novel candidates) across large-scale AD transcriptomic data, a new approach to generate gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's disease, Parkinson's disease, and epilepsy.

    An information model for computable cancer phenotypes

    Epigenetic Regulation and Inference of Lifestyle Factors and Health

    Consequences of refining biological networks through detailed pathway information : From genes to proteoforms

    Biological networks can be used to model molecular processes, understand disease progression, and find new treatment strategies. This thesis investigated how refining the design of biological networks influences their structure, and how this can be used to improve the specificity of pathway analyses. First, we investigated the potential to use more detailed molecular data in current human biological pathways. We verified that there are enough proteoform annotations, i.e. information about proteins in specific post-translational states, for systematic analyses and characterized the structure of gene-centric versus proteoform-centric network representations of pathways. Next, we enabled the programmatic search and mining of pathways using different models for biomolecules, including proteoforms. We notably designed a generic proteoform matching algorithm enabling the flexible mapping of experimental data to the theoretical representation in reference databases. Finally, we constructed pathway-based networks using different degrees of detail in the representation of biochemical reactions. We included information overlooked in most standard network representations: small molecules, isoforms, and post-translational modifications. Structural properties such as network size, degree distribution, and connectivity in both global and local subnetworks were analysed to quantify the impact of the added molecular entities. Doctoral thesis