54 research outputs found
A proposed architecture and method of operation for improving the protection of privacy and confidentiality in disease registers
BACKGROUND: Disease registers aim to collect information about all instances of a disease or condition in a defined population of individuals. Traditionally, methods of operating disease registers have required that notifications of cases be identified by unique identifiers such as social security number or national identification number, or by ensembles of non-unique identifying data items, such as name, sex and date of birth. However, growing concern over the privacy and confidentiality aspects of disease registers may hinder their future operation. Technical solutions to these legitimate concerns are needed. DISCUSSION: An alternative method of operation is proposed which involves splitting the personal identifiers from the medical details at the source of notification, and separately encrypting each part using asymmetrical (public key) cryptographic methods. The identifying information is sent to a single Population Register, and the medical details to the relevant disease register. The Population Register uses probabilistic record linkage to assign a unique personal identification (UPI) number to each person notified to it, although not necessarily everyone in the entire population. This UPI is shared only with a single trusted third party whose sole function is to translate between this UPI and separate series of personal identification numbers which are specific to each disease register. SUMMARY: The system proposed would significantly improve the protection of privacy and confidentiality, while still allowing the efficient linkage of records between disease registers, under the control and supervision of the trusted third party and independent ethics committees. The proposed architecture could accommodate genetic databases and tissue banks as well as a wide range of other health and social data collections.
It is important that proposals such as this are subject to widespread scrutiny by information security experts, researchers and interested members of the general public alike.
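The data flow described in this abstract can be illustrated with a minimal sketch. It assumes a simple lookup-table translator and leaves out the public-key encryption and the probabilistic record linkage entirely; the names `split_notification` and `TrustedThirdParty`, the field names, and the register names are hypothetical and do not come from the paper.

```python
import secrets


def split_notification(notification, identifying_fields):
    """Split one case notification into identifying and medical parts."""
    identifiers = {k: v for k, v in notification.items() if k in identifying_fields}
    medical = {k: v for k, v in notification.items() if k not in identifying_fields}
    return identifiers, medical


class TrustedThirdParty:
    """Hypothetical translator between UPIs and register-specific IDs."""

    def __init__(self):
        self._to_register_id = {}  # (register, upi) -> register-specific ID
        self._to_upi = {}          # (register, register-specific ID) -> upi

    def register_id_for(self, register, upi):
        """Return the ID used for this person inside one disease register."""
        key = (register, upi)
        if key not in self._to_register_id:
            # Random ID, so it carries no information about the UPI itself.
            new_id = secrets.token_hex(8)
            self._to_register_id[key] = new_id
            self._to_upi[(register, new_id)] = upi
        return self._to_register_id[key]

    def translate(self, from_register, register_id, to_register):
        """Map an ID from one register to the same person's ID in another."""
        upi = self._to_upi[(from_register, register_id)]
        return self.register_id_for(to_register, upi)


# Illustrative values only.
identifiers, medical = split_notification(
    {"name": "A. Example", "date_of_birth": "1970-01-01", "diagnosis": "C50"},
    identifying_fields={"name", "date_of_birth"},
)
ttp = TrustedThirdParty()
cancer_id = ttp.register_id_for("cancer", upi=1001)
diabetes_id = ttp.translate("cancer", cancer_id, "diabetes")
```

In this sketch each register sees only its own series of IDs, and linkage between two registers has to go through the translator, which is the role the abstract gives to the trusted third party.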
Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease
Funding Information: The work of the contributing biobanks was supported by numerous grants from governmental and charitable bodies. Biobank-specific acknowledgments and more detailed acknowledgments are included in Data S2. Initiative management, S.B.C. J.C. N.J.C. M.J.D. E.E.K. A.R.M. B.M.N. Y.O. A.V.P. D.A.v.H. R.G.W. C.J.W. W.Z. and S.Z.; individual biobank analysis, A.B. Y.B. B.M.B. C.D.B. S.C. T.-T.C. K.C. S.M.D. M.D. G.H.d.B. Y.D. N.J.D. M.-J.F. Y.-C.A.F. S.F. V.L.F. L.G.F. E.R.G. T.R.G. D.H.G. C.R.G. G.G.-A. S.E.G. L.A.G. C.H. J.B.H. W.E.H. H.H. K.H. N.I. A.I. R.J. M. Kurki, J.K. N.K. E.E.K. J.T.K. M. Kanai, T.L. K.L. M.H.L. S.L. K.L. Y.-F.L. V.L.F. R.J.F.L. E.A.L.-M. A.R.-M. S.M.-G. R.M. R.E.M. H.C.M. A.R.M. Y.M. H.M. S.E.M. I.Y.M. B.M. S.M. K.N. S.N. M.A.N.-A. K.N. Y.O. P.P. A.L.-P. A.P. B.P. S.P. M.H.P. D.J.R. N.R. M.D.R. A.R. C.S. S.S. S.S.S. J.A.S. P.S. I.S. T.T. R.T. K.T. J.U. D.A.v.H. B.V. M.V. Y.V. J.M.V. R.G.W. Y.W. S.J.W. B.N.W. K.-H.H.W. M.Z. X.Z. and S.Z.; individual biobank management, N.A. A.A.T. K.M.A.-D. P.A. K.C.B. M. Boehnke, M. Boezen, C.D.B. A.C. Z.C. C.-Y.C. J.C. N.J.C. S.M.D. S.F. Y.-C.A.F. S.F. E.F. T.G. C.R.G. C.J.G. Y.G. H.H. K.A.H. K.H. S.I.I. N.M.J. N.K. E.E.K. J.T.K. C.L. M.H.L. M.T.M.L. L.L. K.L. Y.-F.L. R.J.F.L. J.L. S.M. Y.M. K.M. I.Y.M. Y.O. C.M.O. A.V.P. B.P. D.J.P. D.J.R. M.D.R. S.S. J.W.S. H.S. K.S. T.T. U.T. R.C.T. D.A.v.H. M.V. R.G.W. D.C.W. C.W. J.W. M.Z. X.Z. and S.Z.; study design and interpretation of results, A.B. M. Boehnke, M. Boezen, B.M.B. T.-T.C. C.-Y.C. M.J.D. G.D.S. N.J.D. S.F. M.-J.F. H.K.F. E.R.G. A.G. T.G. J.B.H. J.H. K.H. R.J. M.K. E.E.K. T.K. C.M.L. V.L.F. E.A.L.-M. A.R.M. S.N. B.M.N. C.M.O. J.J.P. B.P. N.R. H.R. J.A.S. I.S. K.T. D.A.v.H. R.G.W. Y.W. D.C.W. S.J.W. C.J.W. B.N.W. J.W. K.-H.H.W. M.Z. H.Z. J.Z. W.Z. X.Z. and S.Z.; drafted and edited the paper, A.B. M. Boehnke, M. Boezen, M.J.D. G.H.d.B. N.J.D. T.R.G. J.B.H. N.I. N.M.J. M.K. V.L.F. S.M. A.R.M. H.M. S.N. B.M.N. C.M.O. B.P. H.R. C.S. J.A.S. 
J.W.S. K.T. Y.W. D.C.W. C.J.W. K.-H.H.W. H.Z. J.Z. W.Z. and S.Z.; primary meta-analysis and quality control, M.J.D. H.K.F. M. Kanai, J.K. J.T.K. M. Kurki, M.M. B.M.N. C.J.W. K.-H.H.W. and W.Z.; drug discovery: S.N. T.K. K.-H.H.W. W.Z. and Y.O.; fine mapping, M. Kanai, W.Z. M.J.D. and H.K.F.; polygenic risk score, Y.W. S.N. E.A.L.-M. S.K. K.T. K.L. M. Kanai, W.Z. K.W. M.-J.F. L.B. P.A. P.D. V.L.F. R.M. Y.M. B.B. S.S. J.U. E.R.G. N.J.C. I.S. Y.O. A.R.M. and J.B.H.; proteome-wide Mendelian randomization, H.Z. H.R. A.B. G.H. G.D.S. B.M.B. W.Z. B.M.N. T.R.G. and J.Z.; transcriptome-wide association study, A.B. J.B.H. W.Z. J.Z. M. Kanai, B.P. E.R.G. and N.J.C.; asthma, K.T. W.Z. Y.W. M. Kanai, S.N. Y.O. B.M.N. M.J.D. and A.R.M.; heart failure, K.-H.H.W. N.J.D. B.N.W. I.S. S.E.G. J.B.H. N.J.C. M.P. R.J.F.L. M.J.D. B.M.N. W.Z. W.E.H. and C.J.W.; idiopathic pulmonary fibrosis, J.J.P. W.Z. M.J.D. J.T.K. N.J.C. and J.B.H.; primary open-angle glaucoma, V.L.F. A.B. W.Z. Y.W. K.L. M. Kanai, E.A.L.-M. P.S. R.T. X.Z. S.N. S.S. Y.O. N.I. S.M. H.S. I.S. C.W. A.R.M. E.R.G. N.M.J. N.J.C. and J.B.H.; stroke, I.S. K.-H.H.W. W.H. B.N.W. W.Z. J.E.H. A.P. B.B. A.H.S. M.E.G. R.G.W. K.H. C.K. S.Z. M.J.D. B.M.N. and C.J.W.; venous thromboembolism, B.N.W. I.S. K.-H.H.W. B.B. V.L.F. K.T. M.D. B.N. W.Z. J.A.S. and C.J.W. All authors reviewed the manuscript. M.J.D. is a founder of Maze Therapeutics. B.M.N. is a member of the scientific advisory board at Deep Genomics and a consultant for Camp4 Therapeutics, Takeda Pharmaceutical, and Biogen. The spouse of C.J.W. works at Regeneron Pharmaceuticals. C.-Y.C. is employed by Biogen. C.R.G. owns stock in 23andMe, Inc. T.R.G. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. E.E.K. has received speaker fees from Regeneron, Illumina, and 23andMe and is a member of the advisory board for Galateo Bio. R.E.M. 
has received speaker fees from Illumina and is a scientific advisor to the Epigenetic Clock Development Foundation. G.D.S. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. K.S. and U.T. are employed by deCODE Genetics/Amgen, Inc. J.Z. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. S.M. is a co-founder of and holds stock in Seonix Bio. Publisher Copyright: © 2022
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits. Peer reviewed
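The abstract does not say which model GBMI uses to combine per-biobank summary statistics. As an illustration only, the sketch below shows a fixed-effect inverse-variance-weighted meta-analysis, one common way to pool effect estimates for a single variant; the function name `ivw_meta` and the example numbers are assumptions.

```python
import math


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per biobank.
    Returns (pooled_beta, pooled_se).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Made-up effect sizes for one variant in two biobanks.
pooled_beta, pooled_se = ivw_meta([(0.10, 0.05), (0.20, 0.05)])
```

With equal standard errors the pooled estimate is the plain average of the two effects, and the pooled standard error is smaller than either input, which is the gain in power that pooling biobanks is meant to deliver.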
Additional file 2 of Predicting the presence of coronary plaques featuring high-risk characteristics using polygenic risk scores and targeted proteomics in patients with suspected coronary artery disease
Additional file 2: Table S4. Summary level data of number of risk factors, proteomics and GPSMult stratified by case/control status
Predicting the presence of coronary plaques featuring high-risk characteristics using polygenic risk scores and targeted proteomics in patients with suspected coronary artery disease
Background: The presence of coronary plaques with high-risk characteristics is strongly associated with adverse cardiac events beyond the identification of coronary stenosis. Testing by coronary computed tomography angiography (CCTA) enables the identification of high-risk plaques (HRP). Referral for CCTA is presently based on pre-test probability estimates including clinical risk factors (CRFs); however, proteomics and/or genetic information could potentially improve patient selection for CCTA and, hence, identification of HRP. We aimed to (1) identify proteomic and genetic features associated with HRP presence and (2) investigate the effect of combining CRFs, proteomics, and genetics to predict HRP presence. Methods: Consecutive chest pain patients (n = 1462) undergoing CCTA to diagnose obstructive coronary artery disease (CAD) were included. Coronary plaques were assessed using a semi-automatic plaque analysis tool. Measurements of 368 circulating proteins were obtained with targeted Olink panels, and DNA genotyping was performed in all patients. Imputed genetic variants were used to compute a multi-trait multi-ancestry genome-wide polygenic score (GPSMult). HRP presence was defined as plaques with two or more high-risk characteristics (low attenuation, spotty calcification, positive remodeling, and napkin ring sign). Prediction of HRP presence was performed using the glmnet algorithm with repeated fivefold cross-validation, using CRFs, proteomics, and GPSMult as input features. Results: HRPs were detected in 165 (11%) patients, and 15 input features were associated with HRP presence. Prediction of HRP presence based on CRFs yielded a mean area under the receiver operating characteristic curve (AUC) ± standard error of 73.2 ± 0.1, versus 69.0 ± 0.1 for proteomics and 60.1 ± 0.1 for GPSMult.
Combining CRFs with GPSMult increased prediction accuracy (AUC 74.8 ± 0.1, P = 0.004), while the inclusion of proteomics provided no significant improvement to either the CRF (AUC 73.2 ± 0.1, P = 1.00) or the CRF + GPSMult (AUC 74.6 ± 0.1, P = 1.00) models. Conclusions: In patients with suspected CAD, incorporating genetic data with either clinical or proteomic data improves the prediction of high-risk plaque presence. Trial registration: https://clinicaltrials.gov/ct2/show/NCT02264717 (September 2014)
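The results above are reported as a mean AUC ± standard error over cross-validation folds. The sketch below shows how such a summary could be computed in principle. It is not the authors' glmnet pipeline, the abstract does not state how the standard error was derived, and the helper names and numbers are assumptions. The AUC here is on a 0 to 1 scale, whereas the abstract reports it multiplied by 100.

```python
import math
import statistics


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def mean_and_se(values):
    """Mean and standard error of the mean across folds."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# Made-up predictions for one fold, then made-up AUCs for three folds.
fold_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
mean_auc, se_auc = mean_and_se([0.70, 0.74, 0.72])
```

The pairwise form of the AUC is quadratic in the number of patients, which is acceptable for a sketch; a rank-based formula gives the same value more efficiently on larger cohorts.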
Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts
Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. Here, we used data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era. Peer reviewed
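To make the P + T idea concrete, here is a minimal sketch under simplifying assumptions: greedy pruning by pairwise LD, a single p-value threshold, and a score that is the sum of effect size times allele dosage. The function names, thresholds, variant IDs, and numbers are all hypothetical, and real pipelines rely on dedicated tools and an LD reference panel rather than a hand-written dictionary.

```python
def prune_and_threshold(variants, ld_r2, p_max=5e-8, r2_max=0.1):
    """Greedy P + T selection.

    Keeps variants with p <= p_max, most significant first, skipping any
    variant in LD (r2 > r2_max) with one that is already kept.

    variants: dict of variant id -> (beta, p_value)
    ld_r2: dict of frozenset({id_a, id_b}) -> r2; missing pairs count as 0.
    """
    kept = []
    for vid in sorted(variants, key=lambda v: variants[v][1]):
        if variants[vid][1] > p_max:
            break  # sorted by p-value, so all remaining variants fail too
        if all(ld_r2.get(frozenset((vid, k)), 0.0) <= r2_max for k in kept):
            kept.append(vid)
    return kept


def polygenic_score(dosages, variants, kept):
    """Sum of effect size times allele dosage over the kept variants."""
    return sum(variants[v][0] * dosages.get(v, 0) for v in kept)


# Made-up summary statistics: variant id -> (beta, p_value).
variants = {
    "v1": (0.20, 1e-10),
    "v2": (0.15, 1e-9),   # in strong LD with v1, so it is pruned
    "v3": (-0.10, 1e-8),
    "v4": (0.30, 0.01),   # fails the p-value threshold
}
ld_r2 = {frozenset(("v1", "v2")): 0.8}
kept = prune_and_threshold(variants, ld_r2)
score = polygenic_score({"v1": 2, "v2": 1, "v3": 1}, variants, kept)
```

PRS-CS, the other method named in the abstract, instead shrinks all effect sizes jointly under a continuous shrinkage prior and is not represented by this sketch.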
Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease.
Funder: Biogen
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.