653 research outputs found

    The Environmental Conditions, Treatments, and Exposures Ontology (ECTO): connecting toxicology and exposure to human health and beyond.

    BACKGROUND: Evaluating the impact of environmental exposures on organism health is a key goal of modern biomedicine and is critically important in an age of greater pollution and chemicals in our environment. Environmental health utilizes many different research methods and generates a variety of data types. However, to date, no comprehensive database represents the full spectrum of environmental health data. Due to a lack of interoperability between databases, tools for integrating these resources are needed. In this manuscript we present the Environmental Conditions, Treatments, and Exposures Ontology (ECTO), a species-agnostic ontology focused on exposure events that occur as a result of natural and experimental processes, such as diet, work, or research activities. ECTO is intended for use in harmonizing environmental health data resources to support cross-study integration and inference for mechanism discovery. METHODS AND FINDINGS: ECTO is an ontology designed for describing organismal exposures such as toxicological research, environmental variables, dietary features, and patient-reported data from surveys. ECTO utilizes the base model established within the Exposure Ontology (ExO). ECTO is developed using a combination of manual curation and Dead Simple OWL Design Patterns (DOSDP), and contains over 2700 environmental exposure terms, and incorporates chemical and environmental ontologies. ECTO is an Open Biological and Biomedical Ontology (OBO) Foundry ontology that is designed for interoperability, reuse, and axiomatization with other ontologies. ECTO terms have been utilized in axioms within the Mondo Disease Ontology to represent diseases caused or influenced by environmental factors, as well as for survey encoding for the Personalized Environment and Genes Study (PEGS). 
CONCLUSIONS: We constructed ECTO to meet Open Biological and Biomedical Ontology (OBO) Foundry principles to increase translation opportunities between environmental health and other areas of biology. ECTO has a growing community of contributors consisting of toxicologists, public health epidemiologists, and health care providers to provide the necessary expertise for areas that have been identified previously as gaps.

    GOPHER, an HPC framework for large scale graph exploration and inference

    Biological ontologies, such as the Human Phenotype Ontology (HPO) and the Gene Ontology (GO), are extensively used in biomedical research to investigate the complex relationship between the phenome and the genome. Interpreting the encoded information requires methods that efficiently interoperate between multiple ontologies providing molecular details of disease-related features. To this end, we present GenOtype PHenotype ExplOrer (GOPHER), a framework that infers associations between HPO and GO terms by harnessing machine learning together with the large-scale parallelism and scalability of High-Performance Computing. The method maps genotypic features to phenotypic features, thus providing a valid tool for bridging functional and pathological annotations. GOPHER can improve the interpretation of molecular processes involved in pathological conditions and has a wide range of applications in biomedicine. This work has been developed with the support of the Severo Ochoa Program (SEV-2015-0493); the Spanish Ministry of Science and Innovation (TIN2015-65316-P); and the Joint Study Agreement no. W156463 under the IBM/BSC Deep Learning Center agreement. Peer reviewed. Postprint (author's final draft).

    From genomic variation to personalized medicine


    Using population biobanks to understand complex traits, rare diseases, and their shared genetic architecture

    The study of the role of genetic variability in common traits has led to a growing number of studies aimed at representing whole populations. These studies gather multiple layers of information on healthy and non-healthy individuals at large scale, constituting what are known as population biobanks. In this thesis I took advantage of the potential of these population biobanks to measure the influence of genetic variation on common and rare traits. I explored the mechanisms behind these traits by examining their interaction with conditions, physiological measurements, and habits in the general and healthy population. First, I used the Lifelines cohort, which holds genetic information on the Dutch population. Here, my colleagues and I explored traits with different levels of genetic influence: we uncovered associations of both blood type and dairy consumption with human gut microbiome function and composition, and we identified a protective factor for a rare type of cardiomyopathy with potential use in diagnosis. Additionally, within a global collaboration across worldwide biobanks totaling more than 2 million individuals, we demonstrated the robustness of the connections between genetic variation and 14 different diseases across populations. We also provided methodological guidance for combining the effects of genetic variation to calculate disease risk in studies that include biobanks with populations of different ethnic backgrounds. Overall, my PhD research contributed to identifying and validating which factors are relevant for potential clinical applications, and provided guidelines for future genetic studies of common traits and diseases at a global scale.

    Epidemiologic appraisal of single nucleotide polymorphisms & the relevance of epigenetic regulation in selected emphasis of the metabolic syndrome

    Complex diseases are characterized by the interplay of several genetic and environmental factors. Epigenetics studies changes to chromatin that do not alter the DNA sequence but respond to environmental factors and ageing processes and thereby influence gene expression. Modern epidemiological research seeks ways to understand complex gene-environment interactions. In the past, genome-wide studies achieved some groundbreaking successes. Nevertheless, there have so far been no satisfactory approaches for solving persistently recurring problems. The difficulties include the often small effects of individual polymorphisms, the heterogeneous aetiology of a complex disease, and study sizes too small to detect effects. A promising approach is the simultaneous investigation of epigenetic and genetic markers. In this work, a network of adaptive thermogenesis was chosen in which the interplay of environment and genetics is clearly apparent. The results of a systematic literature search, covering both association studies and studies of epigenetic modifications, showed that epigenetics plays a non-negligible role, whereas the studies of polymorphisms provided no clear evidence for their involvement in the development of overweight. Future research approaches in carefully planned studies will have to employ a variety of biomarkers. However, study sizes remain a problem here too. This will also serve the requirements of a modern public health sector. Furthermore, validated epigenetic markers have great potential to increase the prognostic reliability of genetic tests.
    A complex trait consists of the interplay of several networks, tissues, and environments. The epigenome includes all mechanisms and marks that alter chromatin without changing the DNA sequence. It is tissue and cell-type specific and varies as a function of time, environmental factors, and random processes. Human genome epidemiology offers frameworks for understanding joint impacts of genes and environments regarding disease risk. Genome-wide association studies have identified many genetic variations and provided insights into the genetic architecture of diseases. However, stringent genome-wide significance thresholds could not compensate for the modest effect sizes of common mutations or for inadequate power to overcome the heterogeneity of genetic effects. This work approached epigenetic influence on transcriptional regulation in a case study on adaptive thermogenesis and obesity at the network level. Results from a systematic literature search were compared to research findings from genome association studies on SNPs common in Caucasians. Whether these polymorphisms diminish energy expenditure remains unclear and warrants further study. Altogether, the results of the analysis suggest that both hereditary genetic regulation and environment-responsive regulation are involved. Integration of epigenetic markers in epidemiologic research could thus help to unravel multi-gene-environment interactions in the development of obesity. Future research strategies that simultaneously study biomarkers of exposure, susceptibility, and outcomes in appropriately planned, designed, and conducted studies will extend beyond current approaches. Nevertheless, sample size requirements remain an inescapable challenge. The integration of epigenetic markers in genetic testing should increase their predictive value for complex diseases with public health importance. Effective organization is vital for disseminating genetic tests to the health care system.

    Fetal programming and parent-of-origin effects of type 2 diabetes and insulin secretion

    Type 2 diabetes mellitus (T2DM) is a heterogeneous and complex disease defined by hyperglycemia. The pancreas and its islets are central for glucose homeostasis and healthy adipose tissue. In turn, lipid levels in the blood are crucial for glucose level stability. Both genetic and environmental factors and their interaction play a pivotal role in the risk and development of the disease. In this thesis we aim to better understand the effect of genetic and environmental factors by investigating parental effects manifesting from early life until adulthood. In papers I and II we examined gene expression alterations and associated epigenetic changes due to early pregnancy anemia and gestational diabetes (GDM). Moreover, we investigated associations between these changes and neonatal anthropometry. We identified several differentially expressed genes between early pregnancy anemia, GDM and controls. Most of these genes were accompanied by epigenetic changes that correlated with their expression patterns. Interestingly, we identified several differentially expressed genes associated with neonatal anthropometry, indicating their possible role in fetal programming and risk of T2DM in later life due to maternal exposure to early pregnancy anemia and GDM. In paper III we investigated whether genetic variants previously reported to be associated with lipid traits exert different effects on obesity and blood lipid traits depending on their parental origin. We examined these variants in two European family cohorts, where the parental origin of each variant was inferred and parent-specific association with obesity and blood lipid traits was analyzed. Our results corroborated previous reports and indicated that specific genetic variants show parent-of-origin specific effects.
Moreover, our results indicate possible sex-specific parental effects on some blood lipid traits. In paper IV we asked whether the parent-specific effects observed in paper III also manifest in early life. We therefore explored parent-of-origin effects on cardiometabolic and anthropometric traits in a birth cohort that was followed up from delivery until 18 years. Our results indicate that the parent-specific effects on cardiometabolic and anthropometric traits and associated genetic variants manifested in early life. Interestingly, however, not all parental effects were found to be fixed, and they seemed to transition over time, specifically during puberty. In paper V we examined the expression of imprinted genes to better understand their role in insulin secretion, beta-cell development, and function. First, we scrutinized gene expression data from adult pancreas, adult pancreatic islets, fetal pancreas, and single cell expression data. Next, we analyzed the association of these genes with glycemic traits. We identified imprinted genes that were specifically expressed in fetal pancreas at both the tissue and single cell level. Variants in two genes were associated with indices of insulin secretion, indicating their possible role in beta-cell development. Additionally, we identified imprinted genes enriched in both fetal and adult pancreas and associated with glucose and insulin traits in a parent-of-origin manner. This suggests a possible role for these genes in beta-cell function. In summary, in this thesis we investigate paternal and maternal effects as a function of fetal programming and parent-of-origin effects to better understand their influence on type 2 diabetes and insulin secretion.

    Prioritising genetic findings for drug target identification and validation

    The decreasing costs of high-throughput genetic sequencing and increasing abundance of sequenced genome data have paved the way for the use of genetic data in identifying and validating potential drug targets. However, the number of identified potential drug targets is often prohibitively large to experimentally evaluate in wet lab experiments, highlighting the need for systematic approaches for target prioritisation. In this review, we discuss principles of genetically guided drug development, specifically addressing loss-of-function analysis, colocalization and Mendelian randomisation (MR), and the contexts in which each may be most suitable. We subsequently present a range of biomedical resources which can be used to annotate and prioritise disease-associated proteins identified by these studies including 1) ontologies to map genes, proteins, and disease, 2) resources for determining the druggability of a potential target, 3) tissue and cell expression of the gene encoding the potential target, and 4) key biological pathways involving the potential target. We illustrate these concepts through a worked example, identifying a prioritised set of plasma proteins associated with non-alcoholic fatty liver disease (NAFLD). We identified five proteins with strong genetic support for involvement with NAFLD: CYB5A, NT5C, NCAN, TGFBI and DAPK2. All of the identified proteins were expressed in both liver and adipose tissues, with TGFBI and DAPK2 being potentially druggable. In conclusion, the current review provides an overview of genetic evidence for drug target identification, and how biomedical databases can be used to provide actionable prioritisation, fully informing downstream experimental validation

    Population Differences in Transcript-Regulator Expression Quantitative Trait Loci

    Gene expression quantitative trait loci (eQTL) are useful for identifying single nucleotide polymorphisms (SNPs) associated with diseases. At times, a genetic variant may be associated with a master regulator involved in the manifestation of a disease. The downstream target genes of the master regulator are typically co-expressed and share biological function. Therefore, it is practical to screen for eQTLs by identifying SNPs associated with the targets of a transcript-regulator (TR). We used a multivariate regression with the gene expression of known targets of TRs and SNPs to identify TReQTLs in European (CEU) and African (YRI) HapMap populations. A nominal p-value of <1×10⁻⁶ revealed 234 SNPs in CEU and 154 in YRI as TReQTLs. These represent 36 independent (tag) SNPs in CEU and 39 in YRI affecting the downstream targets of 25 and 36 TRs respectively. At a false discovery rate (FDR) = 45%, one cis-acting tag SNP (within 1 kb of a gene) in each population was identified as a TReQTL. In CEU, the SNP (rs16858621) in Pcnxl2 was found to be associated with the genes regulated by CREM whereas in YRI, the SNP (rs16909324) was linked to the targets of miRNA hsa-miR-125a. To infer the pathways that regulate expression, we ranked TReQTLs by connectivity within the structure of biological process subtrees. One TReQTL SNP (rs3790904) in CEU maps to Lphn2 and is associated (nominal p-value = 8.1×10⁻⁷) with the targets of the X-linked breast cancer suppressor Foxp3. The structure of the biological process subtree and a gene interaction network of the TReQTL revealed that tumor necrosis factor, NF-kappaB and variants in G-protein coupled receptors signaling may play a central role as communicators in Foxp3 functional regulation. The potential pleiotropic effect of the Foxp3 TReQTLs was gleaned from integrating mRNA-Seq data and SNP-set enrichment into the analysis.
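
    The abstract above reports hits at a stated FDR but does not say which procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the source), selecting hypotheses at a chosen FDR level could look like the following. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr):
    """Return the indices of hypotheses rejected at the given FDR level.

    Assumes the Benjamini-Hochberg step-up procedure; the source abstract
    does not state which FDR method the authors applied.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


# Hypothetical p-values for four SNP-regulator tests.
example_p = [0.001, 0.02, 0.03, 0.5]
rejected = benjamini_hochberg(example_p, fdr=0.05)
print(rejected)
```

    The step-up rule rejects every hypothesis up to the largest qualifying rank, even if an intermediate p-value misses its own threshold, which is why the loop keeps updating `k_max` rather than stopping at the first failure.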

    Human lifespan: recent trends and genetic determinants

    Human lifespan is determined by a complex interplay of genetics, environment, lifestyle and chance. In the UK, life expectancy has increased by roughly three years every decade, but despite longer lives, individuals also spend more years living with chronic disease. With populations greying and periods of morbidity becoming more prolonged, the burden of ageing and age-related disease is set to become a major healthcare challenge. Understanding the factors underlying trends in human lifespan could guide policy interventions to mitigate the burden of disease, while an understanding of the genetics of lifespan could provide insight into the ageing process. The latter could in turn reveal potential therapeutic targets to delay age-related disease and inform which individuals to target based on their genetic risk. In this thesis, I explore human lifespan from these two perspectives. First, I examined trends in mortality and morbidity in two million Scots using hospital admission and death records and found recent improvements in lifespan could be largely explained by improvements in the incidence and survival after hospitalisation of cancers and heart disease. However, I also found recent deteriorations in infectious disease, especially for individuals from lower socioeconomic classes, suggesting a need for a renewed public health focus in this area. Next, I performed a genome-wide association study (GWAS) to find genetic determinants of lifespan using DNA from 27 European cohorts and the lifespans of their parents (one million total). I identified 12 genomic regions affecting survival and found genetic variants across the genome, when aggregated into polygenic scores, could distinguish up to five years of survival between score deciles. 
Combining the lifespan GWAS with two other GWAS of lifespan-related traits, I identified 78 genes (some of which delay ageing in model organisms) that putatively influence both human lifespan and healthy years of life and that are enriched for haem metabolism. These findings present the most promising targets for therapeutic interventions to date, which may help delay the onset of age-related disease and extend the healthy years of life for all.
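
    Polygenic scores such as those mentioned in this abstract are commonly computed as a weighted sum of effect-allele dosages, with individuals then grouped into score deciles. The sketch below illustrates that general idea only; the function names, weights, and dosages are hypothetical and are not taken from the thesis.

```python
from statistics import quantiles


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies of the effect allele).

    `weights` are per-variant effect sizes, e.g. from a GWAS (hypothetical here).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def decile_of(score, all_scores):
    """Return the 1-based decile (1..10) of `score` within `all_scores`."""
    cuts = quantiles(all_scores, n=10)  # nine cut points between deciles
    return 1 + sum(score > c for c in cuts)


# Hypothetical example: three variants, one individual.
weights = [0.5, -0.2, 0.1]
dosages = [0, 1, 2]
print(polygenic_score(dosages, weights))
```

    Comparing outcomes between the top and bottom deciles, as the abstract describes for survival, then amounts to grouping individuals by the value returned from `decile_of`.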