
    Computational Methods for the Analysis of Genomic Data and Biological Processes

    In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that carry out an analysis of these data with reliability and efficiency. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data, and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality.

    Strategies for the intelligent integration of genetic variance information in multiscale models of neurodegenerative diseases

    A more complete understanding of the genetic architecture of complex traits and diseases can maximize the utility of human genetics in disease screening, diagnosis, prognosis, and therapy. Undoubtedly, the identification of genetic variants linked to polygenic and complex diseases is of supreme interest for clinicians, geneticists, patients, and the public. Furthermore, determining how genetic variants affect an individual’s health and translating this knowledge into the development of new medicine can revolutionize the treatment of most common deleterious diseases. However, this requires the correlation of genetic variants with specific diseases, and accurate functional assessment of genetic variation in human DNA sequencing studies is still a nontrivial challenge in clinical genomics. Assigning functional consequences and clinical significance to genetic variants is an important step in human genome interpretation. The translation of genetic variants into functional molecular mechanisms is essential for understanding disease pathogenesis and, eventually, for therapy design. Although various statistical methods are helpful to short-list the genetic variants for fine-mapping investigation, demonstrating their role in a molecular mechanism requires knowledge of functional consequences. This undoubtedly requires comprehensive investigation. Experimental interpretation of all the observed genetic variants is still impractical. Thus, the prediction of functional and regulatory consequences of the genetic variants using in-silico approaches is an important step in the discovery of clinically actionable knowledge. Because the interactions between phenotypes and genotypes are multi-layered and biologically complex, such associations present several challenges and simultaneously offer many opportunities to design new protocols for in-silico variant evaluation strategies. 
This thesis presents a comprehensive protocol based on a causal reasoning algorithm that harvests and integrates multifaceted genetic and biomedical knowledge with various types of entities from several resources and repositories to understand how genetic variants perturb molecular interactions and initiate a disease mechanism. Firstly, as a case study of genetic susceptibility loci of Alzheimer’s disease, I reviewed and summarized all the existing methodologies for Genome Wide Association Studies (GWAS) interpretation, currently available algorithms, and computable modelling approaches. In addition, I formulated a new approach for modelling and simulations of genetic regulatory networks as an extension of the syntax of the Biological Expression Language (OpenBEL). This could allow the representation of genetic variation information in cause-and-effect models to predict the functional consequences of disease-associated genetic variants. Secondly, by using the new syntax of OpenBEL, I generated an OpenBEL model for Alzheimer’s disease (AD) together with genetic variants including their DNA, RNA or protein position, variant type and associated allele. To better understand the role of genetic variants in a disease context, I subsequently tried to predict the consequences of genetic variation based on the functional context provided by the network model. I further explained how genetic variation information could help to identify candidate molecular mechanisms for aetiologically complex diseases such as AD and Parkinson’s disease (PD). As integration of genetic variation information can enhance the evidence base for shared pathophysiology pathways in complex diseases, I addressed one of the key questions, namely the role of shared genetic variants in initiating shared molecular mechanisms between neurodegenerative diseases. 
I systematically analysed shared genetic variation information for AD and PD and mapped it to find shared molecular aetiology between neurodegenerative diseases. My methodology highlighted that a comprehensive understanding of genetic variation requires the integration and analysis of all omics data in order to build a joint model that captures all datasets concurrently. Moreover, genomic loci, rather than individual genetic variants, should be considered when investigating the effects of GWAS variants, because the effect of an individual variant is hard to predict in a biologically complex molecular mechanism, particularly when investigating shared pathology.

    Systems Analytics and Integration of Big Omics Data

    A “genotype” is essentially an organism's full hereditary information, which is obtained from its parents. A “phenotype” is an organism's actual observed physical and behavioral properties. These may include traits such as morphology, size, height, eye color, metabolism, etc. One of the pressing challenges in computational and systems biology is genotype-to-phenotype prediction. This is challenging given the amount of data generated by modern Omics technologies. This “Big Data” is so large and complex that traditional data processing applications are not up to the task. Challenges arise in collection, analysis, mining, sharing, transfer, visualization, archiving, and integration of these data. In this Special Issue, there is a focus on the systems-level analysis of Omics data, recent developments in gene ontology annotation, and advances in biological pathways and network biology. The integration of Omics data with clinical and biomedical data using machine learning is explored. This Special Issue covers new methodologies in the context of gene–environment interactions, tissue-specific gene expression, and how external factors or host genetics impact the microbiome.

    Generation and Applications of Knowledge Graphs in Systems and Networks Biology

    The acceleration in the generation of data in the biomedical domain has necessitated the use of computational approaches to assist in its interpretation. However, these approaches rely on the availability of high-quality, structured, formalized biomedical knowledge. This thesis has two goals: to improve methods for curation and semantic data integration to generate high-granularity biological knowledge graphs, and to develop novel methods for using prior biological knowledge to propose new biological hypotheses. The first two publications describe an ecosystem for handling biological knowledge graphs encoded in the Biological Expression Language throughout the stages of curation, visualization, and analysis. Further, the second two publications describe the reproducible acquisition and integration of high-granularity knowledge with low contextual specificity from structured biological data sources on a massive scale and support the semi-automated curation of new content at high speed and precision. After building the ecosystem and acquiring content, the last three publications in this thesis demonstrate three different applications of biological knowledge graphs in modeling and simulation. The first demonstrates the use of agent-based modeling for simulation of neurodegenerative disease biomarker trajectories using biological knowledge graphs as priors. The second applies network representation learning to prioritize nodes in biological knowledge graphs based on corresponding experimental measurements to identify novel targets. Finally, the third uses biological knowledge graphs and develops algorithms to deconvolute the mechanism of action of drugs, which could also serve to identify drug repositioning candidates. Ultimately, this thesis lays the groundwork for production-level applications of drug repositioning algorithms and other knowledge-driven approaches to analyzing biomedical experiments.

    Department of Computer Science Activity 1998-2004

    This report summarizes much of the research and teaching activity of the Department of Computer Science at Dartmouth College between late 1998 and late 2004. The material for this report was collected as part of the final report for NSF Institutional Infrastructure award EIA-9802068, which funded equipment and technical staff during that six-year period. This equipment and staff supported essentially all of the department's research activity during that period.

    Pacific Symposium on Biocomputing 2023

    The Pacific Symposium on Biocomputing (PSB) 2023 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. Presentations are rigorously peer reviewed and are published in an archival proceedings volume. PSB 2023 will be held on January 3-7, 2023 in Kohala Coast, Hawaii. Tutorials and workshops will be offered prior to the start of the conference. PSB 2023 will bring together top researchers from the US, the Asian Pacific nations, and around the world to exchange research results and address open issues in all aspects of computational biology. It is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. The PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are organized by leaders of research in biocomputing's 'hot topics.' In this way, the meeting provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field.

    Investigating the Transcriptome Signature of Depression: Employing Co-expression Network, Candidate Pathways and Machine Learning Approaches

    Depression is the leading cause of disability worldwide and is one of the major contributors to the overall global burden of disease. Despite significant advances in elucidating the neurobiology of depression in recent years, the molecular factors involved in the pathophysiology of depression remain poorly understood. Chapter 1: An overview of Major Depressive Disorder (MDD) from epidemiological and clinical perspectives with a summary of the current knowledge of the underlying biology is provided. A review of the major pathophysiological hypotheses of MDD highlights a need for a more comprehensive approach that allows studying complex molecular interactions involved in depression. Chapter 2: The transcriptome signature of depression was examined using the measure of replication at the individual gene level across different tissues and cell types in both brain and periphery. Fifty-seven replicated genes were reported as differentially expressed in the brain and 21 in peripheral tissues. In-silico functional characterisation of these genes was provided, implicating shared pathways in a comorbid phenotype of depression and cardiovascular disease. Chapter 3: The molecular basis of MDD was investigated using co-expression network analysis. Weighted Gene Co-expression Network Analysis (WGCNA) allowed for studying complex interactions between individual genes influencing biological pathways in MDD. Utilising the Sydney Memory and Aging Study (sMAS) and the Older Australian Twin Study (OATS) as discovery and replication cohorts respectively, it was found that the eigengenes of four clusters containing over 3,000 highly co-regulated genes are involved in 13 immune- and pathogen-related pathways and associated with recurrent MDD. However, the findings were not replicated in an independent cohort at the network level. Chapter 4: Using a machine learning (ML) approach, a predictive model was built to identify the genome-wide gene expression markers of recurrent MDD. 
Fuzzy Forests (FF) is a novel ML algorithm, which works in conjunction with WGCNA and was designed to reduce the bias seen in feature selection caused by the presence of correlated transcripts in transcriptome data. FF correctly classified 63% of recurrently depressed individuals in test data using the single top predictive feature (TFRC, which encodes the transferrin receptor). This suggests that TFRC can represent a putative marker for recurrent MDD. Chapter 5: Following the findings on immune-related pathways being associated with recurrent MDD in the elderly (Chapter 3), the role of these pathways in recurrent MDD was examined at the individual gene level in an independent cohort (OATS). To target the immune pathways, all known genes (KEGG) involved in these 13 pathways were selected and a differential expression analysis was conducted on 1,302 candidates between individuals with recurrent MDD and those without. We found that CD14 was significantly downregulated in recurrent MDD (FDR < 5%). Considering the key role of CD14 in facilitating the innate immune response, we suggest that CD14 can potentially serve as a peripheral marker of immune dysregulation in recurrent MDD. Chapter 6: A discussion of the findings is provided and future directions are outlined, with a particular focus on how co-expression network and machine learning approaches can enhance the translation of molecular findings into clinical practice. Thesis (Ph.D.) -- University of Adelaide, Adelaide Medical School, 201

    Responsiveness of genes to long-range transcriptional regulation

    Developmental genes are highly regulated at the level of transcription and exhibit complex spatial and temporal expression patterns. Key developmental loci are frequently spanned by clusters of conserved non-coding elements (CNEs), referred to as genomic regulatory blocks (GRBs), that have been subject to extreme levels of purifying selection during metazoan evolution. CNEs have been shown to function as long-range enhancers, activating transcription of their developmental target genes over vast genomic distances and bypassing more proximally located unresponsive genes (bystanders). Despite their role in the establishment of cell identity during development, many of these long-range regulatory landscapes remain poorly characterised. In this thesis, I develop a computational method for the genome-wide identification of regulatory enhancer-promoter associations in human and mouse, based on co-variation of enhancer and promoter transcriptional activity across a comprehensive set of tissues and cell types, in combination with chromatin contact data. Using this method, I demonstrate that previously predicted GRB target genes are amongst the genes with the highest level of enhancer responsiveness in the genome, and are frequently associated with extremely long-range enhancers. Remarkably, the activity of some previously predicted bystanders is also weakly but significantly associated with enhancer activity, challenging the notion that the promoters of bystanders are unresponsive to enhancers. Next, I systematically annotate human genes with elevated enhancer responsiveness and identify more than 600 putative target genes, associated with the regulation of a wide range of developmental processes, from pattern specification to axonogenesis, as well as with disease. 
The analysis performed in this thesis has facilitated the identification of hundreds of previously uncharacterised enhancer-responsive genes and their long-range regulatory landscapes, allowing the study of their unique properties. Open Acces

    Annual Report
