12 research outputs found

    Proteomic signatures improve risk prediction for common and rare diseases

    For many diseases there are delays in diagnosis due to a lack of objective biomarkers for disease onset. Here, in 41,931 individuals from the United Kingdom Biobank Pharma Proteomics Project, we integrated measurements of ~3,000 plasma proteins with clinical information to derive sparse prediction models for the 10-year incidence of 218 common and rare diseases (81–6,038 cases). We then compared prediction models developed using proteomic data with models developed using either basic clinical information alone or clinical information combined with data from 37 clinical assays. The predictive performance of sparse models including as few as 5 to 20 proteins was superior to the performance of models developed using basic clinical information for 67 pathologically diverse diseases (median delta C-index = 0.07; range = 0.02–0.31). Sparse protein models further outperformed models developed using basic information combined with clinical assay data for 52 diseases, including multiple myeloma, non-Hodgkin lymphoma, motor neuron disease, pulmonary fibrosis and dilated cardiomyopathy. For multiple myeloma, single-cell RNA sequencing from bone marrow in newly diagnosed patients showed that four of the five predictor proteins were expressed specifically in plasma cells, consistent with the strong predictive power of these proteins. External replication of sparse protein models in the EPIC-Norfolk study showed good generalizability for prediction of the six diseases tested. These findings show that sparse plasma protein signatures, including both disease-specific proteins and protein predictors shared across several diseases, offer clinically useful prediction of common and rare diseases

    Genome-wide association study of chronic sputum production implicates loci involved in mucus production and infection

    Background: chronic sputum production impacts on quality of life and is a feature of many respiratory diseases. Identification of the genetic variants associated with chronic sputum production in a disease agnostic sample could improve understanding of its causes and identify new molecular targets for treatment. Methods: we conducted a genome-wide association study (GWAS) of chronic sputum production in UK Biobank. Signals meeting genome-wide significance (p<5×10⁻⁸) were investigated in additional independent studies, were fine-mapped and putative causal genes identified by gene expression analysis. GWASs of respiratory traits were interrogated to identify whether the signals were driven by existing respiratory disease among the cases and variants were further investigated for wider pleiotropic effects using phenome-wide association studies (PheWASs). Results: from a GWAS of 9714 cases and 48 471 controls, we identified six novel genome-wide significant signals for chronic sputum production including signals in the human leukocyte antigen (HLA) locus, chromosome 11 mucin locus (containing MUC2, MUC5AC and MUC5B) and FUT2 locus. The four common variant associations were supported by independent studies with a combined sample size of up to 2203 cases and 17 627 controls. The mucin locus signal had previously been reported for association with moderate-to-severe asthma. The HLA signal was fine-mapped to an amino acid change of threonine to arginine (frequency 36.8%) in HLA-DRB1 (HLA-DRB1*03:147). The signal near FUT2 was associated with expression of several genes including FUT2, for which the direction of effect was tissue dependent. Our PheWAS identified a wide range of associations including blood cell traits, liver biomarkers, infections, gastrointestinal and thyroid-associated diseases, and respiratory disease. Conclusions: novel signals at the FUT2 and mucin loci suggest that mucin fucosylation may be a driver of chronic sputum production even in the absence of diagnosed respiratory disease and provide genetic support for this pathway as a target for therapeutic intervention.

    Genome-wide association study of chronic sputum production implicates loci involved in mucus production and infection

    Background: Chronic sputum production impacts on quality of life and is a feature of many respiratory diseases. Identification of the genetic variants associated with chronic sputum production in a disease agnostic sample could improve understanding of its causes and identify new molecular targets for treatment. Methods: We conducted a genome-wide association study (GWAS) of chronic sputum production in UK Biobank. Signals meeting genome-wide significance (P<5×10⁻⁸) were investigated in additional independent studies, were fine-mapped, and putative causal genes identified by gene expression analysis. GWAS of respiratory traits were interrogated to identify whether the signals were driven by existing respiratory disease amongst the cases and variants were further investigated for wider pleiotropic effects using phenome-wide association studies (PheWAS). Findings: From a GWAS of 9,714 cases and 48,471 controls, we identified six novel genome-wide significant signals for chronic sputum production including signals in the Human Leukocyte Antigen (HLA) locus, chromosome 11 mucin locus (containing MUC2, MUC5AC and MUC5B) and the FUT2 locus. The four common variant associations were supported by independent studies with a combined sample size of up to 2,203 cases and 17,627 controls. The mucin locus signal had previously been reported for association with moderate-to-severe asthma. The HLA signal was fine-mapped to an amino-acid change of threonine to arginine (frequency 36.8%) in HLA-DRB1 (HLA-DRB1*03:147). The signal near FUT2 was associated with expression of several genes including FUT2, for which the direction of effect was tissue dependent. Our PheWAS identified a wide range of associations. Interpretation: Novel signals at the FUT2 and mucin loci highlight mucin fucosylation as a driver of chronic sputum production even in the absence of diagnosed respiratory disease and provide genetic support for this pathway as a target for therapeutic intervention.
