32 research outputs found
Delineation of groundwater potential zones by means of ensemble tree supervised classification methods in the Eastern Lake Chad basin
This paper presents a machine learning method to map groundwater potential in crystalline domains. First, a spatially-distributed set of explanatory variables for groundwater occurrence is compiled into a geographic information system. Twenty machine learning classifiers are subsequently trained on a sample of 488 boreholes and excavated wells for a region of eastern Chad. This process includes collinearity, cross-validation, feature elimination and parameter fitting routines. Random forest and extra trees classifiers outperformed other algorithms (test score > 0.80, balanced score > 0.80, AUC > 0.87). Fracture density, slope, SAR coherence (interferometric correlation), topographic wetness index, basement depth, distance to channels and slope aspect proved the most relevant explanatory variables. Three major conclusions stem from this work: (1) using a large number of supervised classification algorithms is advisable in groundwater potential studies; (2) the choice of performance metrics constrains the relevance of explanatory variables; and (3) seasonal variations from satellite images contribute to successful groundwater potential mapping
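The abstract above reports a test score, a balanced score and an AUC side by side. A minimal sketch of how two of these metrics differ from plain accuracy on an imbalanced sample follows. It assumes that "balanced score" means balanced accuracy (the mean of per-class recall); the labels and scores are hypothetical toy values, not data from the study.

```python
# Toy illustration only: labels, predictions and scores are made up.
# Assumption: "balanced score" is read as balanced accuracy.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Two productive wells (1) and four unproductive ones (0), hypothetical.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]

print(accuracy(y_true, y_pred))           # plain accuracy
print(balanced_accuracy(y_true, y_pred))  # penalises the missed positive more
print(roc_auc(y_true, scores))            # ranking quality of the scores
```

On this toy sample, plain accuracy is higher than balanced accuracy because the majority class dominates, which is one reason the choice of metric can change which explanatory variables look relevant.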
Global monitoring of antimicrobial resistance based on metagenomics analyses of urban sewage
Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use metagenomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AMR genes between Europe/North-America/Oceania and Africa/Asia/South-America. Antimicrobial use data and bacterial taxonomy explain only a minor part of the AMR variation that we observe. We find no evidence for cross-selection between antimicrobial classes, or for an effect of air travel between sites. However, AMR gene abundance strongly correlates with socio-economic, health and environmental factors, which we use to predict AMR gene abundances in all countries in the world. Our findings suggest that global AMR gene diversity and abundance vary by region, and that improving sanitation and health could potentially limit the global burden of AMR. We propose metagenomic analysis of sewage as an ethically acceptable and economically feasible approach for continuous global surveillance and prediction of AMR.
Coding Variation in ANGPTL4, LPL, and SVEP1 and the Risk of Coronary Disease.
BACKGROUND: The discovery of low-frequency coding variants affecting the risk of coronary artery disease has facilitated the identification of therapeutic targets. METHODS: Through DNA genotyping, we tested 54,003 coding-sequence variants covering 13,715 human genes in up to 72,868 patients with coronary artery disease and 120,770 controls who did not have coronary artery disease. Through DNA sequencing, we studied the effects of loss-of-function mutations in selected genes. RESULTS: We confirmed previously observed significant associations between coronary artery disease and low-frequency missense variants in the genes LPA and PCSK9. We also found significant associations between coronary artery disease and low-frequency missense variants in the genes SVEP1 (p.D2702G; minor-allele frequency, 3.60%; odds ratio for disease, 1.14; P=4.2×10⁻¹⁰) and ANGPTL4 (p.E40K; minor-allele frequency, 2.01%; odds ratio, 0.86; P=4.0×10⁻⁸), which encodes angiopoietin-like 4. Through sequencing of ANGPTL4, we identified 9 carriers of loss-of-function mutations among 6924 patients with myocardial infarction, as compared with 19 carriers among 6834 controls (odds ratio, 0.47; P=0.04); carriers of ANGPTL4 loss-of-function alleles had triglyceride levels that were 35% lower than the levels among persons who did not carry a loss-of-function allele (P=0.003). ANGPTL4 inhibits lipoprotein lipase; we therefore searched for mutations in LPL and identified a loss-of-function variant that was associated with an increased risk of coronary artery disease (p.D36N; minor-allele frequency, 1.9%; odds ratio, 1.13; P=2.0×10⁻⁴) and a gain-of-function variant that was associated with protection from coronary artery disease (p.S447*; minor-allele frequency, 9.9%; odds ratio, 0.94; P=2.5×10⁻⁷).
CONCLUSIONS: We found that carriers of loss-of-function mutations in ANGPTL4 had triglyceride levels that were lower than those among noncarriers; these mutations were also associated with protection from coronary artery disease. (Funded by the National Institutes of Health and others.)
Supported by a career development award from the National Heart, Lung, and Blood Institute, National Institutes of Health (NIH) (K08HL114642 to Dr. Stitziel) and by the Foundation for Barnes–Jewish Hospital. Dr. Peloso is supported by the National Heart, Lung, and Blood Institute of the NIH (award number K01HL125751). Dr. Kathiresan is supported by a Research Scholar award from the Massachusetts General Hospital, the Donovan Family Foundation, grants from the NIH (R01HL107816 and R01HL127564), a grant from Fondation Leducq, and an investigator-initiated grant from Merck. Dr. Merlini was supported by a grant from the Italian Ministry of Health (RFPS-2007-3-644382). Drs. Ardissino and Marziliano were supported by Regione Emilia Romagna Area 1 Grants. Drs. Farrall and Watkins acknowledge the support of the Wellcome Trust core award (090532/Z/09/Z), the British Heart Foundation (BHF) Centre of Research Excellence. Dr. Schick is supported in part by a grant from the National Cancer Institute (R25CA094880). Dr. Goel acknowledges EU FP7 & Wellcome Trust Institutional strategic support fund. Dr. Deloukas’s work forms part of the research themes contributing to the translational research portfolio of Barts Cardiovascular Biomedical Research Unit, which is supported and funded by the National Institute for Health Research (NIHR). Drs. Webb and Samani are funded by the British Heart Foundation, and Dr. Samani is an NIHR Senior Investigator. Dr. Masca was supported by the NIHR Leicester Cardiovascular Biomedical Research Unit (BRU), and this work forms part of the portfolio of research supported by the BRU. Dr. Won was supported by a postdoctoral award from the American Heart Association (15POST23280019). Dr. McCarthy is a Wellcome Trust Senior Investigator (098381) and an NIHR Senior Investigator. Dr. Danesh is a British Heart Foundation Professor, European Research Council Senior Investigator, and NIHR Senior Investigator. Drs. Erdmann, Webb, Samani, and Schunkert are supported by the FP7 European Union project CVgenes@target (261123) and the Fondation Leducq (CADgenomics, 12CVD02). Drs. Erdmann and Schunkert are also supported by the German Federal Ministry of Education and Research e:Med program (e:AtheroSysMed and sysINFLAME), and Deutsche Forschungsgemeinschaft cluster of excellence “Inflammation at Interfaces” and SFB 1123. Dr. Kessler received a DZHK Rotation Grant. The analysis was funded, in part, by a Programme Grant from the BHF (RG/14/5/30893 to Dr. Deloukas). Additional funding is listed in the Supplementary Appendix. This is the author accepted manuscript. The final version is available from the Massachusetts Medical Society via http://dx.doi.org/10.1056/NEJMoa150765
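The odds ratio quoted in the abstract for ANGPTL4 loss-of-function carriers (9 carriers among 6924 patients with myocardial infarction versus 19 among 6834 controls) can be checked with a crude two-by-two calculation. This is a sketch only: the function name is hypothetical, and the study's own estimate may have come from a different statistical model, so agreement to two decimals is a sanity check and not a reproduction of the analysis.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Crude (unadjusted) odds ratio from carrier counts in cases and controls."""
    case_noncarriers = case_total - case_carriers
    control_noncarriers = control_total - control_carriers
    # odds in cases divided by odds in controls
    return (case_carriers * control_noncarriers) / (control_carriers * case_noncarriers)

# Counts as stated in the abstract.
angptl4_or = odds_ratio(9, 6924, 19, 6834)
print(round(angptl4_or, 2))
```

A value below 1 means carriers are under-represented among cases, which is consistent with the protective association the abstract describes.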
Comprehensive Cancer-Predisposition Gene Testing in an Adult Multiple Primary Tumor Series Shows a Broad Range of Deleterious Variants and Atypical Tumor Phenotypes.
Multiple primary tumors (MPTs) affect a substantial proportion of cancer survivors and can result from various causes, including inherited predisposition. Currently, germline genetic testing of MPT-affected individuals for variants in cancer-predisposition genes (CPGs) is mostly targeted by tumor type. We ascertained pre-assessed MPT individuals (with at least two primary tumors by age 60 years or at least three by 70 years) from genetics centers and performed whole-genome sequencing (WGS) on 460 individuals from 440 families. Despite previous negative genetic assessment and molecular investigations, pathogenic variants in moderate- and high-risk CPGs were detected in 67/440 (15.2%) probands. WGS detected variants that would not be (or were not) detected by targeted resequencing strategies, including low-frequency structural variants (6/440 [1.4%] probands). In most individuals with a germline variant assessed as pathogenic or likely pathogenic (P/LP), at least one of their tumor types was characteristic of variants in the relevant CPG. However, in 29 probands (42.2% of those with a P/LP variant), the tumor phenotype appeared discordant. The frequency of individuals with truncating or splice-site CPG variants and at least one discordant tumor type was significantly higher than in a control population (χ² = 43.642; p ≤ 0.0001). 2/67 (3%) probands with P/LP variants had evidence of multiple inherited neoplasia allele syndrome (MINAS) with deleterious variants in two CPGs. Together with variant detection rates from a previous series of similarly ascertained MPT-affected individuals, the present results suggest that first-line comprehensive CPG analysis in an MPT cohort referred to clinical genetics services would detect a deleterious variant in about a third of individuals.
JW is supported by a Cancer Research UK Cambridge Cancer Centre Clinical Research Training Fellowship. Funding for the NIHR BioResource – Rare diseases project was provided by the National Institute for Health Research (NIHR, grant number RG65966). ERM acknowledges support from the European Research Council (Advanced Researcher Award), NIHR (Senior Investigator Award and Cambridge NIHR Biomedical Research Centre), Cancer Research UK Cambridge Cancer Centre and Medical Research Council Infrastructure Award. The University of Cambridge has received salary support in respect of EM from the NHS in the East of England through the Clinical Academic Reserve. The views expressed are those of the authors and not necessarily those of the NHS or Department of Health. DGE is an NIHR Senior Investigator and is supported by the all Manchester NIHR Biomedical Research Centre
Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease
Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease.
This work was supported by The National Institute for Health Research England (NIHR) for the NIHR BioResource – Rare Diseases project (grant number RG65966). The Moorfields Eye Hospital cohort of patients and clinical and imaging data were ascertained and collected with the support of grants from the National Institute for Health Research Biomedical Research Centre at Moorfields Eye Hospital, National Health Service Foundation Trust, and UCL Institute of Ophthalmology, Moorfields Eye Hospital Special Trustees, Moorfields Eye Charity, the Foundation Fighting Blindness (USA), and Retinitis Pigmentosa Fighting Blindness. M.M. is a recipient of an FFB Career Development Award. E.M. is supported by UCLH/UCL NIHR Biomedical Research Centre. F.L.R. and D.G. are supported by Cambridge NIHR Biomedical Research Centre
Phenotypic Characterization of EIF2AK4 Mutation Carriers in a Large Cohort of Patients Diagnosed Clinically With Pulmonary Arterial Hypertension.
BACKGROUND: Pulmonary arterial hypertension (PAH) is a rare disease with an emerging genetic basis. Heterozygous mutations in the gene encoding the bone morphogenetic protein receptor type 2 (BMPR2) are the commonest genetic cause of PAH, whereas biallelic mutations in the eukaryotic translation initiation factor 2 alpha kinase 4 gene (EIF2AK4) are described in pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis. Here, we determine the frequency of these mutations and define the genotype-phenotype characteristics in a large cohort of patients diagnosed clinically with PAH. METHODS: Whole-genome sequencing was performed on DNA from patients with idiopathic and heritable PAH and with pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis recruited to the National Institute of Health Research BioResource-Rare Diseases study. Heterozygous variants in BMPR2 and biallelic EIF2AK4 variants with a minor allele frequency of <1:10 000 in control data sets and predicted to be deleterious (by combined annotation-dependent depletion, PolyPhen-2, and sorting intolerant from tolerant predictions) were identified as potentially causal. Phenotype data from the time of diagnosis were also captured. RESULTS: Eight hundred sixty-four patients with idiopathic or heritable PAH and 16 with pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis were recruited. Mutations in BMPR2 were identified in 130 patients (14.8%). Biallelic mutations in EIF2AK4 were identified in 5 patients with a clinical diagnosis of pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis. Furthermore, 9 patients with a clinical diagnosis of PAH carried biallelic EIF2AK4 mutations. These patients had a reduced transfer coefficient for carbon monoxide (Kco; 33% [interquartile range, 30%-35%] predicted) and younger age at diagnosis (29 years; interquartile range, 23-38 years) and more interlobular septal thickening and mediastinal lymphadenopathy on computed tomography of the chest compared with patients with PAH without EIF2AK4 mutations. However, radiological assessment alone could not accurately identify biallelic EIF2AK4 mutation carriers. Patients with PAH with biallelic EIF2AK4 mutations had a shorter survival. CONCLUSIONS: Biallelic EIF2AK4 mutations are found in patients classified clinically as having idiopathic and heritable PAH. These patients cannot be identified reliably by computed tomography, but a low Kco and a young age at diagnosis suggest the underlying molecular diagnosis. Genetic testing can identify these misclassified patients, allowing appropriate management and early referral for lung transplantation
Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.
Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole-genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype
Mitochondrial physiology
As the knowledge base and importance of mitochondrial physiology to evolution, health and disease expands, the necessity for harmonizing the terminology concerning mitochondrial respiratory states and rates has become increasingly apparent. The chemiosmotic theory establishes the mechanism of energy transformation and coupling in oxidative phosphorylation. The unifying concept of the protonmotive force provides the framework for developing a consistent theoretical foundation of mitochondrial physiology and bioenergetics. We follow the latest SI guidelines and those of the International Union of Pure and Applied Chemistry (IUPAC) on terminology in physical chemistry, extended by considerations of open systems and thermodynamics of irreversible processes. The concept-driven constructive terminology incorporates the meaning of each quantity and aligns concepts and symbols with the nomenclature of classical bioenergetics. We endeavour to provide a balanced view of mitochondrial respiratory control and a critical discussion on reporting data of mitochondrial respiration in terms of metabolic flows and fluxes. Uniform standards for evaluation of respiratory states and rates will ultimately contribute to reproducibility between laboratories and thus support the development of data repositories of mitochondrial respiratory function in species, tissues, and cells. Clarity of concept and consistency of nomenclature facilitate effective transdisciplinary communication, education, and ultimately further discovery