70 research outputs found

    Knowledge Unlearning for Mitigating Privacy Risks in Language Models

    Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply applying the unlikelihood training objective to target token sequences is effective at forgetting them with little to no degradation of general language modeling performances; it sometimes even substantially improves the underlying LM with just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once and that unlearning is highly dependent on which kind of data (domain) is forgotten. By showing comparisons with a previous data preprocessing method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori while being orders of magnitude more computationally efficient. We release the code and dataset needed to replicate our results at https://github.com/joeljang/knowledge-unlearning
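The abstract above names an unlikelihood training objective applied to the target token sequences but does not spell out its form. As a rough illustration only, the sketch below assumes the common token-level unlikelihood term, -log(1 - p), averaged over the tokens to be forgotten. The function names and the probability lists are hypothetical and are not taken from the authors' released code.

```python
import math

def unlikelihood_loss(target_probs, eps=1e-12):
    """Mean of -log(1 - p_t) over the tokens of a sequence to forget.

    target_probs: model probabilities assigned to each target token
    (hypothetical inputs; a real model would supply these).
    """
    return -sum(math.log(max(1.0 - p, eps)) for p in target_probs) / len(target_probs)

def likelihood_loss(target_probs, eps=1e-12):
    """Standard negative log-likelihood, shown only for contrast."""
    return -sum(math.log(max(p, eps)) for p in target_probs) / len(target_probs)

# A memorized sequence gets high token probabilities, so its unlikelihood
# loss is large; minimizing that loss pushes the probabilities down.
memorized = [0.90, 0.95, 0.99]
forgotten = [0.10, 0.05, 0.01]
print(unlikelihood_loss(memorized) > unlikelihood_loss(forgotten))
```

Under this assumed form, minimizing the loss lowers the probability of the target tokens, which is the opposite direction from ordinary likelihood training.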

    The Type 2 Diabetes Knowledge Portal: an Open access Genetic Resource Dedicated to Type 2 Diabetes and Related Traits

    Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.

    Parallel genome-scale loss of function screens in 216 cancer cell lines for the identification of context-specific genetic dependencies

    Using a genome-scale, lentivirally delivered shRNA library, we performed massively parallel pooled shRNA screens in 216 cancer cell lines to identify genes that are required for cell proliferation and/or viability. Cell line dependencies on 11,000 genes were interrogated by 5 shRNAs per gene. The proliferation effect of each shRNA in each cell line was assessed by transducing a population of 11M cells with one shRNA-virus per cell and determining the relative enrichment or depletion of each of the 54,000 shRNAs after 16 population doublings using Next Generation Sequencing. All the cell lines were screened using standardized conditions to best assess differential genetic dependencies across cell lines. When combined with genomic characterization of these cell lines, this dataset facilitates the linkage of genetic dependencies with specific cellular contexts (e.g., gene mutations or cell lineage). To enable such comparisons, we developed and provided a bioinformatics tool to identify linear and nonlinear correlations between these features
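The abstract above describes measuring the relative enrichment or depletion of each shRNA by sequencing, without stating how the score is computed. One common way to express such a change is a log2 fold change of normalized read fractions between the starting and final populations. The sketch below assumes that scoring, with a pseudocount to avoid division by zero. The function name, the pseudocount, and the read counts are hypothetical illustrations, not the authors' pipeline.

```python
import math

def log2_fold_change(start_counts, end_counts, pseudocount=1.0):
    """Per-shRNA log2 ratio of read fractions (end vs. start).

    Negative values indicate depletion, positive values enrichment.
    Counts are dicts mapping a hypothetical shRNA id to a read count.
    """
    start_total = sum(start_counts.values())
    end_total = sum(end_counts.values())
    scores = {}
    for shrna, start in start_counts.items():
        start_frac = (start + pseudocount) / start_total
        end_frac = (end_counts.get(shrna, 0) + pseudocount) / end_total
        scores[shrna] = math.log2(end_frac / start_frac)
    return scores

# Made-up read counts for two shRNAs in one cell line.
start = {"shRNA_a": 100, "shRNA_b": 100}
end = {"shRNA_a": 10, "shRNA_b": 190}
scores = log2_fold_change(start, end)
print(scores["shRNA_a"] < 0, scores["shRNA_b"] > 0)
```

An shRNA that drops out of the population over the population doublings gets a negative score under this scheme, which would suggest that its target gene is needed for proliferation or viability in that cell line.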

    Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis

    Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3–5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.

    Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis

    Funding: GMP, PN, and CW are supported by NHLBI R01HL127564. GMP and PN are supported by R01HL142711. AG acknowledges support from the Wellcome Trust (201543/B/16/Z), European Union Seventh Framework Programme FP7/2007–2013 under grant agreement no. HEALTH-F2-2013–601456 (CVGenes@Target) & the TriPartite Immunometabolism Consortium [TrIC]-Novo Nordisk Foundation’s Grant number NNF15CC0018486. JMM is supported by American Diabetes Association Innovative and Clinical Translational Award 1–19-ICTS-068. SR was supported by the Academy of Finland Center of Excellence in Complex Disease Genetics (Grant No 312062), the Finnish Foundation for Cardiovascular Research, the Sigrid Juselius Foundation, and University of Helsinki HiLIFE Fellow and Grand Challenge grants. EW was supported by the Finnish innovation fund Sitra (EW) and Finska Läkaresällskapet. CNS was supported by American Heart Association Postdoctoral Fellowships 15POST24470131 and 17POST33650016. Charles N Rotimi is supported by Z01HG200362. Zhe Wang, Michael H Preuss, and Ruth JF Loos are supported by R01HL142302. NJT is a Wellcome Trust Investigator (202802/Z/16/Z), is the PI of the Avon Longitudinal Study of Parents and Children (MRC & WT 217065/Z/19/Z), is supported by the University of Bristol NIHR Biomedical Research Centre (BRC-1215–2001) and the MRC Integrative Epidemiology Unit (MC_UU_00011), and works within the CRUK Integrative Cancer Epidemiology Programme (C18281/A19169). Ruth E Mitchell is a member of the MRC Integrative Epidemiology Unit at the University of Bristol funded by the MRC (MC_UU_00011/1). Simon Haworth is supported by the UK National Institute for Health Research Academic Clinical Fellowship. Paul S. de Vries was supported by American Heart Association grant number 18CDA34110116. Julia Ramierz acknowledges support by the People Programme of the European Union’s Seventh Framework Programme grant n° 608765 and Marie Sklodowska-Curie grant n° 786833.
Maria Sabater-Lleal is supported by a Miguel Servet contract from the ISCIII Spanish Health Institute (CP17/00142) and co-financed by the European Social Fund. Jian Yang is funded by the Westlake Education Foundation. Olga Giannakopoulou has received funding from the British Heart Foundation (BHF) (FS/14/66/3129). CHARGE Consortium cohorts were supported by R01HL105756. Study-specific acknowledgements are available in the Additional file 32: Supplementary Note. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S. Department of Health and Human Services. Peer reviewed. Publisher PD

    Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis

    Publisher Copyright: © 2022, The Author(s). Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3–5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk. Peer reviewed
