39 research outputs found

    Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach

    Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. Thus, the growing realisation that Amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of the integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data. Here much of the emphasis is put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. A state-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature.
To enable meta-analysis of biologically related transcriptomic data, a highly curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns, embedded with novel candidates, across large-scale AD transcriptomic data, a new approach to generate gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's disease, Parkinson's disease and epilepsy.

    Data integration, pathway analysis and mining for systems biology

    Post-genomic molecular biology embodies high-throughput experimental techniques and hence is a data-rich field. The goal of this thesis is to develop bioinformatics methods to utilise publicly available data in order to produce knowledge and to aid mining of newly generated data. As an example of knowledge or hypothesis generation, consider function prediction of biological molecules. Assignment of protein function is a non-trivial task because the same protein may be involved in different biological processes, depending on the state of the biological system and protein localisation. The function of a gene or a gene product may be provided as a textual description in a gene or protein annotation database. Such textual descriptions fail to convey the contextual meaning of the gene function. Therefore, we need ways to represent the meaning formally. Here we apply a data integration approach to provide a rich representation that enables context-sensitive mining of biological data in terms of integrated networks and conceptual spaces. Context-sensitive gene function annotation follows naturally from this framework, as a particular application. Next, knowledge that is already publicly available can be used to aid mining of new experimental data. We developed an integrative bioinformatics method that utilises publicly available knowledge of protein-protein interactions, metabolic networks and transcriptional regulatory networks to analyse transcriptomics data and predict altered biological processes. We applied this method to a study of the dynamic response of Saccharomyces cerevisiae to oxidative stress. The application of our method revealed dynamically altered biological functions in response to oxidative stress, which were validated by comprehensive in vivo metabolomics experiments. The results provided in this thesis indicate that integration of heterogeneous biological data facilitates advanced mining of the data.
The methods can be applied for gaining insight into functions of genes, gene products and other molecules, as well as for offering functional interpretation to transcriptomics and metabolomics experiments.

    The Pharmacoepigenomics Informatics Pipeline and H-GREEN Hi-C Compiler: Discovering Pharmacogenomic Variants and Pathways with the Epigenome and Spatial Genome

    Over the last decade, biomedical science has been transformed by the epigenome and spatial genome, but the discipline of pharmacogenomics, the study of the genetic underpinnings of pharmacological phenotypes like drug response and adverse events, has not. Scientists have begun to use omics atlases of increasing depth, and inferences relating to the bidirectional causal relationship between the spatial epigenome and gene expression, as a foundational underpinning for genetics research. The epigenome and spatial genome are increasingly used to discover causative regulatory variants in the significance regions of genome-wide association studies, for the discovery of the biological mechanisms underlying these phenotypes and the design of genetic tests to predict them. Such variants often have more predictive power than coding variants, but in the area of pharmacogenomics, such advances have been radically underapplied. The majority of pharmacogenomics tests are designed manually on the basis of mechanistic work with coding variants in candidate genes, and where genome-wide approaches are used, they are typically not interpreted with the epigenome. This work describes a series of analyses of pharmacogenomics association studies with the tools and datasets of the epigenome and spatial genome, undertaken with the intent of discovering causative regulatory variants to enable new genetic tests. It describes the potent regulatory variants discovered thereby to have a putative causative and predictive role in a number of medically important phenotypes, including analgesia and the treatment of depression, bipolar disorder, and traumatic brain injury with opiates, anxiolytics, antidepressants, lithium, and valproate, and in particular the tendency for such variants to cluster into spatially interacting, conceptually unified pathways which offer mechanistic insight into these phenotypes.
It describes the Pharmacoepigenomics Informatics Pipeline (PIP), an integrative multiple omics variant discovery pipeline designed to make this kind of analysis easier and cheaper to perform, more reproducible, and amenable to the addition of advanced features. It describes the successes of the PIP in rediscovering manually discovered gene networks for lithium response, as well as discovering a previously unknown genetic basis for warfarin response in anticoagulation therapy. It describes the H-GREEN Hi-C compiler, which was designed to analyze spatial genome data and discover the distant target genes of such regulatory variants, and its success in discovering spatial contacts not detectable by preceding methods and using them to build spatial contact networks that unite disparate TADs with phenotypic relationships. It describes a potential feature set of a future pipeline, using the latest epigenome research and the lessons of the previous pipeline. It describes my thinking about how to use the output of a multiple omics variant pipeline to design genetic tests that also incorporate clinical data. And it concludes by describing a long-term vision for a comprehensive pharmacophenomic atlas, to be constructed by applying a variant pipeline and machine learning test design system, such as is described, to thousands of phenotypes in parallel. Scientists struggled to assay genotypes for the better part of a century, and in the last twenty years, succeeded. The struggle to predict phenotypes on the basis of the genotypes we assay remains ongoing. The use of multiple omics variant pipelines and machine learning models with omics atlases, genetic association, and medical records data will be an increasingly significant part of that struggle for the foreseeable future. PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145835/1/ariallyn_1.pd

    Interactions and functionalities of the gut revealed by computational approaches

    The gastrointestinal tract is the subject of much research because of its gatekeeper role in an organism’s health. The tissue acts as a barrier to keep out harmful substances like pathogens and toxins while absorbing nutrients that arise from the digestion of dietary components in the lumen. There is a large population of microbiota that plays an important role in the functioning of the gut. All these sub-systems of the gastrointestinal tract contribute to the normal functioning of the gut. Due to its various functionalities, the gut is able to respond to different types of stimuli and bring the system back to homeostasis after perturbations. The work done in this thesis uses several bioinformatic tools to improve our understanding of the functioning of the gut. This was achieved with data from model animals (mice and pigs) that were subjected to changing environments before their gastrointestinal response was measured. Different types of stimuli were studied (e.g., antibiotic exposure, changing diets and infection with pathogens) in order to understand the response of the gut to varying environments. These data were analysed using different data integration techniques that provide a holistic view of the gut response. Vertical data integration techniques look for associations between different types of omics data to highlight possible interactions between the measured variables. Lateral integration techniques allow the study of one type of omics data over several time points or several experimental conditions. Using these techniques, we show evidence of interactions between different sub-systems of the gut and of the functional plasticity of the gut. Of the hypotheses generated in this thesis, we have validated several using existing literature and one using an in vitro system. Further validation of these hypotheses will increase understanding of the responses of the gut and the interactions involved.

    Visualization of modular structures in biological networks

    Evolutionary genomics : statistical and computational methods

    This open access book addresses the challenge of analyzing and understanding the evolutionary dynamics of complex biological systems at the genomic level, and elaborates on some promising strategies that would bring us closer to uncovering the vital relationships between genotype and phenotype. After a few educational primers, the book continues with sections on sequence homology and alignment, phylogenetic methods to study genome evolution, methodologies for evaluating selective pressures on genomic sequences as well as genomic evolution in light of protein domain architecture and transposable elements, population genomics and other omics, and discussions of current bottlenecks in handling and analyzing genomic data. Written for the highly successful Methods in Molecular Biology series, chapters include the kind of detail and expert implementation advice that lead to the best results. Authoritative and comprehensive, Evolutionary Genomics: Statistical and Computational Methods, Second Edition aims to serve both novices in biology with strong statistics and computational skills, and molecular biologists with a good grasp of standard mathematical concepts, in moving this important field of study forward.

    The application of cDNA and tissue microarray methods in the study of human carcinomas

    Currently, numerous high-throughput technologies are available for the study of human carcinomas. In the literature, many variations of these techniques have been described. The common denominator for these methodologies is the large amount of data obtained in a single experiment, in a short time period, and at a fairly low cost. However, several problems and limitations of these methods have also been described. The purpose of this study was to test the applicability of two selected high-throughput methods, cDNA and tissue microarrays (TMA), in cancer research. Two common human malignancies, breast and colorectal cancer, were used as examples. This thesis aims to present some practical considerations that need to be addressed when applying these techniques. cDNA microarrays were applied to screen aberrant gene expression in breast and colon cancers. Immunohistochemistry was used to validate the results and to evaluate the association of selected novel tumour markers with the outcome of the patients. The type of histological material used in immunohistochemistry was evaluated, especially considering the applicability of whole tissue sections and different types of TMAs. Special attention was paid to the methodological details in the cDNA microarray and TMA experiments. In conclusion, many potential tumour markers were identified in the cDNA microarray analyses. Immunohistochemistry could be applied to validate the observed gene expression changes of selected markers and to associate their expression change with patient outcome. In the current experiments, both TMAs and whole tissue sections could be used for this purpose. This study showed for the first time that securin and p120 catenin protein expression predict breast cancer outcome and that the immunopositivity of carbonic anhydrase IX associates with the outcome of rectal cancer.
The predictive value of these proteins was also statistically evident in multivariate analyses, with up to a 13.1-fold risk for cancer-specific death in a specific subgroup of patients. Transferred from Doria.

    The Role of Upstream Open Reading Frames in Regulating Neuronal Protein Synthesis

    Spatial and temporal control of protein synthesis in response to activity is required for neuronal function and plasticity. mRNA structure and sequence provide a powerful platform for such regulation, but how such information is utilized in neurons is incompletely understood. In my thesis, I explore how functional elements within 5’ leaders (traditionally termed 5’ UTR or untranslated region) of mRNAs act as cis-regulatory elements to influence basal and activity-dependent translation in neurons. First, I identified a specific role for upstream open reading frames (uORFs) in regulating mRNA translation during neuronal differentiation. uORFs are regions within the 5’ leader that undergo translation. Using ribosome profiling (RP), an emerging next-generation sequencing technique which utilizes a modified RNA-sequencing library preparation to detect regions of mRNA occupied by actively translating ribosomes, I identified thousands of uORFs in human neuroblastoma cells. A portion of these uORFs demonstrated clear usage shifts with differentiation. Highly conserved uORFs exhibited increased GC content and were associated with cumulatively repressed CDSs. Importantly, changes in the translational efficiency of these conserved uORFs across differentiation were inversely correlated with CDS translation on these same transcripts. These data demonstrate uORF usage is common in neuroblastoma cells and that specific uORFs act as regulators of cell state-specific translation in neuronal differentiation. Next, I investigated the function of CGG repeats in the 5’ leader of FMR1. All humans have a conserved CGG-trinucleotide repeat (typically 20-45 repeats) in FMR1 that can become unstable and expand intergenerationally. Large expansions (>200 CGG repeats) cause Fragile X Syndrome, a common cause of intellectual disability, by silencing FMR1, leading to loss of the fragile X protein, FMRP.
Intermediate (55-200 CGGs) expansions, in contrast, are transcribed and cause an age-related neurodegenerative condition known as Fragile X-Associated Tremor/Ataxia Syndrome (FXTAS). Our lab discovered that this repeat facilitates Repeat Associated Non-AUG translation (RANT), whereby ribosomes initiate at non-AUG codons upstream of the repeat to produce toxic homopolymeric proteins that drive pathogenesis in FXTAS. FMR1 avidly supports RANT at normal repeat sizes, suggesting that it might serve as a regulatory uORF to control FMRP synthesis. To address this, I expressed nanoluciferase reporters in rat hippocampal neurons. Using this strategy, I found that RANT exhibits a strong negative effect on FMRP synthesis at both normal and expanded repeats. FMRP is a key synaptic protein that is rapidly synthesized in response to mGluR activity. Importantly, preventing RANT or removing the repeat itself blocked this mGluR-induced response. This suggests that FMR1 relies on these two elements to appropriately scale synaptic FMRP synthesis. Using non-cleaving antisense oligonucleotides (ASOs) that target the RANT initiation sites, I found that blocking RANT could decrease toxic protein production and prevent neuronal death. In a line of iPSC-derived neurons from a patient with a large CGG repeat (>200) that still generates FMR1 mRNA but has deficits in FMRP, treatment with the ASO increased endogenous FMRP expression by 50%. These findings define a native function for RANT and CGG repeats in regulating FMRP synthesis, and delineate RANT as a therapeutic target in Fragile X-associated disorders. PhD, Neuroscience, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143943/1/ctln_1.pd