58 research outputs found

    Creation and evaluation of full-text literature-derived, feature-weighted disease models of genetically determined developmental disorders

    There are >2500 different genetically determined developmental disorders (DD), which, as a group, show very high levels of both locus and allelic heterogeneity. This has led to the wide-spread use of evidence-based filtering of genome-wide sequence data as a diagnostic tool in DD. Determining whether the association of a filtered variant at a specific locus is a plausible explanation of the phenotype in the proband is crucial and commonly requires extensive manual literature review by both clinical scientists and clinicians. Access to a database of weighted clinical features extracted from rigorously curated literature would increase the efficiency of this process and facilitate the development of robust phenotypic similarity metrics. However, given the large and rapidly increasing volume of published information, conventional biocuration approaches are becoming impractical. Here, we present a scalable, automated method for the extraction of categorical phenotypic descriptors from the full-text literature. Papers identified through literature review were downloaded and parsed using the Cadmus custom retrieval package. Human Phenotype Ontology terms were extracted using MetaMap, with 76–84% precision and 65–73% recall. Mean terms per paper increased from 9 in title + abstract, to 68 using full text. We demonstrate that these literature-derived disease models plausibly reflect true disease expressivity more accurately than widely used manually curated models, through comparison with prospectively gathered data from the Deciphering Developmental Disorders study. The area under the curve for receiver operating characteristic (ROC) curves increased by 5–10% through the use of literature-derived models. This work shows that scalable automated literature curation increases performance and adds weight to the need for this strategy to be integrated into informatic variant analysis pipelines. Database URL: https://doi.org/10.1093/database/baac03
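As a minimal sketch of how extraction precision and recall of the kind quoted above are computed, the following compares a set of automatically extracted terms with a manually curated reference set. The function name and the two example term sets are assumptions made for illustration; they are not taken from the paper or from MetaMap output.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted terms against a gold standard."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: fraction of extracted terms that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of gold-standard terms that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Illustrative term identifiers only (assumed for this example).
extracted_terms = {"HP:0001250", "HP:0000252", "HP:0001263", "HP:0000486"}
gold_terms = {"HP:0001250", "HP:0000252", "HP:0001263", "HP:0004322", "HP:0000717"}

precision, recall = precision_recall(extracted_terms, gold_terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Three of the four extracted terms appear in the reference set, and three of the five reference terms were found, which is what the two ratios measure.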

    Creation and delivery of a complex 3D geological survey for the Glasgow area and its application to urban geology

    The Glasgow area has a combination of highly variable superficial deposits and a legacy of heavy industry, quarrying and mining. These factors create complex foundation and hydrological conditions, influencing the movement of contaminants through the subsurface and giving rise locally to unstable ground conditions. Digital geological three-dimensional models developed by the British Geological Survey are helping to resolve the complex geology underlying Glasgow, providing a key tool for planning and environmental management. The models, covering an area of 3200 km² to a depth of 1.2 km, include glacial and post-glacial deposits and the underlying, faulted Carboniferous igneous and sedimentary rocks. Control data, including 95,000 boreholes, digital mine plans and published geological maps, were used in model development. Digital outputs from the models include maps of depth to key horizons, such as rockhead or depth to mine workings. The models have formed the basis for the development of site-scale high-resolution geological models and provide input data for a wide range of other applications from groundwater modelling to stochastic lithological modelling.

    Grain Surface Models and Data for Astrochemistry

    The cross-disciplinary field of astrochemistry exists to understand the formation, destruction, and survival of molecules in astrophysical environments. Molecules in space are synthesized via a large variety of gas-phase reactions, and reactions on dust-grain surfaces, where the surface acts as a catalyst. A broad consensus has been reached in the astrochemistry community on how to suitably treat gas-phase processes in models, and also on how to present the necessary reaction data in databases; however, no such consensus has yet been reached for grain-surface processes. A team of ∼25 experts covering observational, laboratory and theoretical (astro)chemistry met in summer of 2014 at the Lorentz Center in Leiden with the aim to provide solutions for this problem and to review the current state-of-the-art of grain surface models, both in terms of technical implementation into models as well as the most up-to-date information available from experiments and chemical computations. This review builds on the results of this workshop and gives an outlook for future directions.

    The Science of Sungrazers, Sunskirters, and Other Near-Sun Comets

    This review addresses our current understanding of comets that venture close to the Sun, and are hence exposed to much more extreme conditions than comets that are typically studied from Earth. The extreme solar heating and plasma environments that these objects encounter change many aspects of their behaviour, thus yielding valuable information on both the comets themselves that complements other data we have on primitive solar system bodies, as well as on the near-solar environment which they traverse. We propose clear definitions for these comets: We use the term near-Sun comets to encompass all objects that pass sunward of the perihelion distance of planet Mercury (0.307 AU). Sunskirters are defined as objects that pass within 33 solar radii of the Sun’s centre, equal to half of Mercury’s perihelion distance, and the commonly-used phrase sungrazers to be objects that reach perihelion within 3.45 solar radii, i.e. the fluid Roche limit. Finally, comets with orbits that intersect the solar photosphere are termed sundivers. We summarize past studies of these objects, as well as the instruments and facilities used to study them, including space-based platforms that have led to a recent revolution in the quantity and quality of relevant observations. Relevant comet populations are described, including the Kreutz, Marsden, Kracht, and Meyer groups, near-Sun asteroids, and a brief discussion of their origins. The importance of light curves and the clues they provide on cometary composition are emphasized, together with what information has been gleaned about nucleus parameters, including the sizes and masses of objects and their families, and their tensile strengths. The physical processes occurring at these objects are considered in some detail, including the disruption of nuclei, sublimation, and ionisation, and we consider the mass, momentum, and energy loss of comets in the corona and those that venture to lower altitudes. 
The different components of comae and tails are described, including dust, neutral and ionised gases, their chemical reactions, and their contributions to the near-Sun environment. Comet-solar wind interactions are discussed, including the use of comets as probes of solar wind and coronal conditions in their vicinities. We address the relevance of work on comets near the Sun to similar objects orbiting other stars, and conclude with a discussion of future directions for the field and the planned ground- and space-based facilities that will allow us to address those science topics

    Automated workflow-based exploitation of pathway databases provides new insights into genetic associations of metabolite profiles

    Background: Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets. Results: Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html), confirmed previously identified hits and identified a new locus of human metabolic individuality, associating Aldehyde dehydrogenase family1 L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (Glycerol-3 phosphate acyltransferase) and CBS (Cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and relevance of some of these gene-metabolite pairs in disease development and progression. 
Conclusions: We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression
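The multiple testing penalty mentioned in the abstract above can be illustrated with a small worked calculation. This is a sketch under stated assumptions: it uses a simple Bonferroni correction, which the abstract does not say the authors applied, and the SNP, metabolite, and pathway-subset counts are hypothetical numbers chosen only to show the scale of the effect.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical study sizes (assumptions, not values from the paper).
n_snps = 500_000          # SNPs on a genotyping array
n_metabolites = 150       # metabolites measured
n_pathway_snps = 2_000    # SNPs kept for one metabolite after pathway filtering

# Testing every SNP against every metabolite.
full_scan = bonferroni_threshold(0.05, n_snps * n_metabolites)
# Testing only the pathway-derived SNP set for a single metabolite.
pathway_scan = bonferroni_threshold(0.05, n_pathway_snps)

print(f"full scan threshold:    {full_scan:.2e}")
print(f"pathway scan threshold: {pathway_scan:.2e}")
```

With these assumed numbers the per-test threshold relaxes by several orders of magnitude once the SNP set is restricted, which is the motivation the abstract gives for using prior pathway knowledge.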

    Unexplored therapeutic opportunities in the human genome

    A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development. © 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved

    Search for jet extinction in the inclusive jet-pT spectrum from proton-proton collisions at √s = 8 TeV

    Published by the American Physical Society under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. The first search at the LHC for the extinction of QCD jet production is presented, using data collected with the CMS detector corresponding to an integrated luminosity of 10.7 fb⁻¹ of proton-proton collisions at a center-of-mass energy of 8 TeV. The extinction model studied in this analysis is motivated by the search for signatures of strong gravity at the TeV scale (terascale gravity) and assumes the existence of string couplings in the strong-coupling limit. In this limit, the string model predicts the suppression of all high-transverse-momentum standard model processes, including jet production, beyond a certain energy scale. To test this prediction, the measured transverse-momentum spectrum is compared to the theoretical prediction of the standard model. No significant deficit of events is found at high transverse momentum. A 95% confidence level lower limit of 3.3 TeV is set on the extinction mass scale.

    The Influence of Age and Sex on Genetic Associations with Adult Body Size and Shape : A Large-Scale Genome-Wide Interaction Study

    Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR <5%) age-specific effects, of which 11 had larger effects in younger (<50y) than in older adults (≥50y). No sex-dependent effects were identified for BMI. For WHRadjBMI, we identified 44 loci (27 previously established for main effects, 17 novel) with sex-specific effects, of which 28 showed larger effects in women than in men, five showed larger effects in men than in women, and 11 showed opposite effects between sexes. No age-dependent effects were identified for WHRadjBMI. This is the first genome-wide interaction meta-analysis to report convincing evidence of age-dependent genetic effects on BMI. In addition, we confirm the sex-specificity of genetic effects on WHRadjBMI. These results may provide further insights into the biology that underlies weight change with age or the sexual dimorphism of body shape.

    Searches for electroweak neutralino and chargino production in channels with Higgs, Z, and W bosons in pp collisions at 8 TeV

    Searches for supersymmetry (SUSY) are presented based on the electroweak pair production of neutralinos and charginos, leading to decay channels with Higgs, Z, and W bosons and undetected lightest SUSY particles (LSPs). The data sample corresponds to an integrated luminosity of about 19.5 fb⁻¹ of proton-proton collisions at a center-of-mass energy of 8 TeV collected in 2012 with the CMS detector at the LHC. The main emphasis is neutralino pair production in which each neutralino decays either to a Higgs boson (h) and an LSP or to a Z boson and an LSP, leading to hh, hZ, and ZZ states with missing transverse energy (E_T^miss). A second aspect is chargino-neutralino pair production, leading to hW states with E_T^miss. The decays of a Higgs boson to a bottom-quark pair, to a photon pair, and to final states with leptons are considered in conjunction with hadronic and leptonic decay modes of the Z and W bosons. No evidence is found for supersymmetric particles, and 95% confidence level upper limits are evaluated for the respective pair production cross sections and for neutralino and chargino mass values.