Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Pretrained Language Models (LMs) memorize a vast amount of knowledge during
initial pretraining, including information that may violate the privacy of
personal lives and identities. Previous work addressing privacy issues for
language models has mostly focused on data preprocessing and differential
privacy methods, both requiring re-training the underlying LM. We propose
knowledge unlearning as an alternative method to reduce privacy risks for LMs
post hoc. We show that simply applying the unlikelihood training objective to
target token sequences is effective at forgetting them with little to no
degradation of general language modeling performance; it sometimes even
substantially improves the underlying LM with just a few iterations. We also
find that sequential unlearning is better than trying to unlearn all the data
at once and that unlearning is highly dependent on which kind of data (domain)
is forgotten. By showing comparisons with a previous data preprocessing method
known to mitigate privacy risks for LMs, we show that unlearning can give a
stronger empirical privacy guarantee in scenarios where the data vulnerable to
extraction attacks are known a priori while being orders of magnitude more
computationally efficient. We release the code and dataset needed to replicate
our results at https://github.com/joeljang/knowledge-unlearning
The Type 2 Diabetes Knowledge Portal: an Open access Genetic Resource Dedicated to Type 2 Diabetes and Related Traits
Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results
Parallel genome-scale loss of function screens in 216 cancer cell lines for the identification of context-specific genetic dependencies
Using a genome-scale, lentivirally delivered shRNA library, we performed massively parallel pooled shRNA screens in 216 cancer cell lines to identify genes that are required for cell proliferation and/or viability. Cell line dependencies on 11,000 genes were interrogated by 5 shRNAs per gene. The proliferation effect of each shRNA in each cell line was assessed by transducing a population of 11M cells with one shRNA-virus per cell and determining the relative enrichment or depletion of each of the 54,000 shRNAs after 16 population doublings using Next Generation Sequencing. All the cell lines were screened using standardized conditions to best assess differential genetic dependencies across cell lines. When combined with genomic characterization of these cell lines, this dataset facilitates the linkage of genetic dependencies with specific cellular contexts (e.g., gene mutations or cell lineage). To enable such comparisons, we developed and provided a bioinformatics tool to identify linear and nonlinear correlations between these features
Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis
Background: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. Results: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3–5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. Conclusions: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk
Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis
Funding: GMP, PN, and CW are supported by NHLBI R01HL127564. GMP and PN are supported by R01HL142711. AG acknowledges support from the Wellcome Trust (201543/B/16/Z), European Union Seventh Framework Programme FP7/2007–2013 under grant agreement no. HEALTH-F2-2013–601456 (CVGenes@Target) & the TriPartite Immunometabolism Consortium [TrIC]-Novo Nordisk Foundation’s Grant number NNF15CC0018486. JMM is supported by American Diabetes Association Innovative and Clinical Translational Award 1–19-ICTS-068. SR was supported by the Academy of Finland Center of Excellence in Complex Disease Genetics (Grant No 312062), the Finnish Foundation for Cardiovascular Research, the Sigrid Juselius Foundation, and University of Helsinki HiLIFE Fellow and Grand Challenge grants. EW was supported by the Finnish innovation fund Sitra (EW) and Finska Läkaresällskapet. CNS was supported by American Heart Association Postdoctoral Fellowships 15POST24470131 and 17POST33650016. Charles N Rotimi is supported by Z01HG200362. Zhe Wang, Michael H Preuss, and Ruth JF Loos are supported by R01HL142302. NJT is a Wellcome Trust Investigator (202802/Z/16/Z), is the PI of the Avon Longitudinal Study of Parents and Children (MRC & WT 217065/Z/19/Z), is supported by the University of Bristol NIHR Biomedical Research Centre (BRC-1215–2001) and the MRC Integrative Epidemiology Unit (MC_UU_00011), and works within the CRUK Integrative Cancer Epidemiology Programme (C18281/A19169). Ruth E Mitchell is a member of the MRC Integrative Epidemiology Unit at the University of Bristol funded by the MRC (MC_UU_00011/1). Simon Haworth is supported by the UK National Institute for Health Research Academic Clinical Fellowship. Paul S. de Vries was supported by American Heart Association grant number 18CDA34110116. Julia Ramierz acknowledges support by the People Programme of the European Union’s Seventh Framework Programme grant n° 608765 and Marie Sklodowska-Curie grant n° 786833. 
Maria Sabater-Lleal is supported by a Miguel Servet contract from the ISCIII Spanish Health Institute (CP17/00142) and co-financed by the European Social Fund. Jian Yang is funded by the Westlake Education Foundation. Olga Giannakopoulou has received funding from the British Heart Foundation (BHF) (FS/14/66/3129). CHARGE Consortium cohorts were supported by R01HL105756. Study-specific acknowledgements are available in the Additional file 32: Supplementary Note. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S. Department of Health and Human Services.
Leveraging type 1 diabetes human genetic and genomic data in the T1D knowledge portal.
To address the challenge of translating genetic discoveries for type 1 diabetes (T1D) into mechanistic insight, we have developed the T1D Knowledge Portal (T1DKP), an open-access resource for hypothesis development and target discovery in T1D
A glomerular transcriptomic landscape of apolipoprotein L1 in Black patients with focal segmental glomerulosclerosis.
Apolipoprotein L1 (APOL1)-associated focal segmental glomerulosclerosis (FSGS) is the dominant form of FSGS in Black individuals. There are no targeted therapies for this condition, in part because the molecular mechanisms underlying APOL1's pathogenic contribution to FSGS are incompletely understood. Studying the transcriptomic landscape of APOL1 FSGS in patient kidneys is an important way to discover genes and molecular behaviors that are unique or most relevant to the human disease. With the hypothesis that the pathology driven by the high-risk APOL1 genotype is reflected in alteration of gene expression across the glomerular transcriptome, we compared expression and co-expression profiles of 15,703 genes in 16 Black patients with FSGS at high-risk vs 14 Black patients with a low-risk APOL1 genotype. Expression data from APOL1-inducible HEK293 cells and normal human glomeruli were used to pursue genes and molecular pathways uncovered in these studies. We discovered increased expression of APOL1 and nine other significant differentially expressed genes in high-risk patients. This included stanniocalcin, which has a role in mitochondrial and calcium-related processes along with differential correlations between high- and low-risk APOL1 and metabolism pathway genes. There were similar correlations with extracellular matrix- and immune-related genes, but significant loss of co-expression of mitochondrial genes in high-risk FSGS, and an NF-κB-downregulating gene, NKIRAS1, as the most significant hub gene with strong differential correlations with NDUF family (mitochondrial respiratory genes) and immune-related (JAK-STAT) genes. Thus, differences in mitochondrial gene regulation appear to underlie many differences observed between high- and low-risk Black patients with FSGS
- …