
41 research outputs found

Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps

Author: Gerstein Mark
Hu Xihao
Kim Philip M
Sidhu Sachdev S
Sitwell Simon
Turk Benjamin E
Utz Lukas
Yip Kevin Y
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Peptide Recognition Domains (PRDs) are commonly found in signaling proteins. They mediate protein-protein interactions by recognizing and binding short motifs in their ligands. Although a great deal is known about PRDs and their interactions, prediction of PRD specificities remains largely an unsolved problem. Results: We present a novel approach to identifying these Specificity Determining Residues (SDRs). Our algorithm generalizes earlier information theoretic approaches to coevolution analysis, to become applicable to this problem. It leverages the growing wealth of binding data between PRDs and large numbers of random peptides, and searches for PRD residues that exhibit strong evolutionary covariation with some positions of the statistical profiles of bound peptides. The calculations involve only information from sequences, and thus can be applied to PRDs without crystal structures. We applied the approach to PDZ, SH3 and kinase domains, and evaluated the results using both residue proximity in co-crystal structures and verified binding specificity maps from mutagenesis studies. Discussion: Our predictions were found to be strongly correlated with the physical proximity of residues, demonstrating the ability of our approach to detect physical interactions of the binding partners. Some high-scoring pairs were further confirmed to affect binding specificity using previous experimental results. Combining the covariation results also allowed us to predict binding profiles with higher reliability than two other methods that do not explicitly take residue covariation into account. Conclusions: The general applicability of our approach to the three different domain families demonstrated in this paper suggests its potential in predicting binding targets and assisting the exploration of binding mechanisms.

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Whole-Genome Sequencing analysis of Human Metabolome in Multi-Ethnic Populations

Author: Alkis Taryn
Brown Michael R
Chen Han
Feofanova Elena V
Gerszten Robert E
Grieser Charles
Kelly Rachel S
Larson Martin G
Lasky-Su Jessica
Lemaitre Rozenn N
Li Xihao
Li Zilin
Manuel Astrid M
Mendez Kevin M
Morrison Alanna C
Qi Qibin
Tahir Usman A
Wong Kari E
Yu Bing
Zhao Zhongming
Publication venue: DigitalCommons@TMC
Publication date: 30/05/2023

Circulating metabolite levels may reflect the state of the human organism in health and disease; however, the genetic architecture of metabolites is not fully understood. We have performed a whole-genome sequencing association analysis of both common and rare variants in up to 11,840 multi-ethnic participants from five studies with up to 1666 circulating metabolites. We have discovered 1985 novel variant-metabolite associations, and validated 761 locus-metabolite associations reported previously. Seventy-nine novel variant-metabolite associations have been replicated, including three genetic loci on the X chromosome, demonstrating its involvement in metabolic regulation. Gene-based analyses have provided further support for seven metabolite-replicated loci pairs and their biologically plausible genes. Among those novel replicated variant-metabolite pairs, follow-up analyses have revealed that 26 metabolites have colocalized with 21 tissues, seven metabolite-disease outcome associations have been putatively causal, and seven metabolites might be regulated by plasma protein levels. Our results have depicted the genetic contribution to circulating metabolite levels, providing additional insights into understanding human disease.

DigitalCommons@The Texas Medical Center

Type 2 Diabetes Modifies the Association of CAD Genomic Risk Variants With Subclinical Atherosclerosis

Author: Becker Lewis C
Bertoni Alain G
Bielak Lawrence F
Bis Joshua C
Blangero John
Bon Jessica
Bowden Donald W
Brody Jennifer A
Carr John J
Carson April P
Chen Han
Curran Joanne E
de Vries Paul S
Di Corpo Daniel
Duggirala Ravindranath
Fornage Myriam
Freedman Barry I
Gabriel Stacey
Gibbs Richard A
Guo Xiuqing
Gupta Namrata
Hasbani Natalie R
Heard-Costa Nancy
Jacobs David R
Kalyani Rita R
Kardia Sharon L R
Kinney Gregory L
Kral Brian G
Kwak Soo Heon
Lange Leslie A
Li Xihao
Lin Xihong
Mahaney Michael C
Malhotra Rajeev
Manning Alisa K
Meigs James B
Mitchell Braxton D
Momin Zeineen
Montasser May E
Morrison Alanna C
Newman Anne B
Palmer Nicholette D
Peyser Patricia A
Post Wendy S
Pratte Katherine
Psaty Bruce M
Raffield Laura M
Rich Stephen S
Rotter Jerome I
Sarnowski Chloè
Smith Jennifer A
Taylor Kent D
Terry James G
Vasan Ramachandran S
Viaud-Martinez Karine A
Wessel Jennifer
Westerman Kenneth E
Wu Joseph C
Wu Peitao
Yanek Lisa R
Young Kendra A
Publication venue: DigitalCommons@TMC
Publication date: 01/12/2023

BACKGROUND: Individuals with type 2 diabetes (T2D) have an increased risk of coronary artery disease (CAD), but questions remain about the underlying pathology. Identifying which CAD loci are modified by T2D in the development of subclinical atherosclerosis (coronary artery calcification [CAC], carotid intima-media thickness, or carotid plaque) may improve our understanding of the mechanisms leading to the increased CAD in T2D. METHODS: We compared the common and rare variant associations of known CAD loci from the literature on CAC, carotid intima-media thickness, and carotid plaque in up to 29 670 participants, including up to 24 157 normoglycemic controls and 5513 T2D cases leveraging whole-genome sequencing data from the Trans-Omics for Precision Medicine program. We included first-order T2D interaction terms in each model to determine whether CAD loci were modified by T2D. The genetic main and interaction effects were assessed using a joint test to determine whether a CAD variant, or gene-based rare variant set, was associated with the respective subclinical atherosclerosis measures and then further determined whether these loci had a significant interaction test. RESULTS: Using a Bonferroni-corrected significance threshold of CONCLUSIONS: These results highlight T2D as an important modifier of rare variant associations in CAD loci with CAC

DigitalCommons@The Texas Medical Center

Powerful, Scalable and Resource-Efficient Meta-Analysis of Rare Variant Associations in Large Whole Genome Sequencing Studies

Author: Arnett Donna K
Bielak Lawrence F
Bis Joshua C
Blangero John
Boerwinkle Eric
Bowden Donald W
Brody Jennifer A
Cade Brian E
Chen Han
Correa Adolfo
Cupples L Adrienne
Curran Joanne E
de Vries Paul S
Dey Rounak
Duggirala Ravindranath
Freedman Barry I
Gaynor Sheila M
Guo Xiuqing
Göring Harald H H
Haessler Jeffrey
Kalyani Rita R
Kooperberg Charles
Kral Brian G
Lange Leslie A
Li Xihao
Li Zilin
Lin Xihong
Liu Yaowu
Manichaikul Ani
Martin Lisa W
McGarvey Stephen T
Mitchell Braxton D
Montasser May E
Morrison Alanna C
Naseri Take
Natarajan Pradeep
NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium TOPMed Lipids Working Group
O'Connell Jeffrey R
Palmer Nicholette D
Peloso Gina M
Peyser Patricia A
Psaty Bruce M
Quick Corbin
Raffield Laura M
Redline Susan
Reiner Alexander P
Reupena Muagututi'a Sefuiva
Rice Kenneth M
Rich Stephen S
Rotter Jerome I
Selvaraj Margaret Sunitha
Sitlani Colleen M
Smith Jennifer A
Sun Ryan
Taylor Kent D
Vasan Ramachandran S
Willer Cristen J
Wilson James G
Yanek Lisa R
Zhao Wei
Zhou Hufeng
Publication venue: DigitalCommons@TMC
Publication date: 01/01/2023

Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples

DigitalCommons@The Texas Medical Center

A Framework for Detecting Noncoding Rare-Variant Associations of Large-Scale Whole-Genome Sequencing Studies

Author: Arapoglou Theodore
Arnett Donna K
Auer Paul L
Bielak Lawrence F
Bis Joshua C
Blackwell Thomas W
Blangero John
Boerwinkle Eric
Bowden Donald W
Brody Jennifer A
Cade Brian E
Chen Han
Conomos Matthew P
Correa Adolfo
Cupples L Adrienne
Curran Joanne E
de Vries Paul S
Dey Rounak
Duggirala Ravindranath
Franceschini Nora
Freedman Barry I
Gaynor Sheila M
Guo Xiuqing
Göring Harald H H
Kalyani Rita R
Kooperberg Charles
Kral Brian G
Lange Leslie A
Li Xihao
Li Zilin
Lin Bridget M
Lin Xihong
Liu Yaowu
Manichaikul Ani
Manning Alisa K
Martin Lisa W
Mathias Rasika A
Meigs James B
Mitchell Braxton D
Montasser May E
Morrison Alanna C
Naseri Take
Natarajan Pradeep
O'Connell Jeffrey R
Palmer Nicholette D
Peloso Gina M
Peyser Patricia A
Psaty Bruce M
Quick Corbin
Raffield Laura M
Redline Susan
Reiner Alexander P
Reupena Muagututi'a Sefuiva
Rice Kenneth M
Rich Stephen S
Rotter Jerome I
Selvaraj Margaret Sunitha
Smith Jennifer A
Sun Ryan
Taub Margaret A
Taylor Kent D
Vasan Ramachandran S
Weeks Daniel E
Willer Cristen J
Wilson James G
Yanek Lisa R
Zhao Wei
Zhou Hufeng
Publication venue: DigitalCommons@TMC
Publication date: 01/12/2022

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.

DigitalCommons@The Texas Medical Center

Rare Variants in Long Non-Coding RNAs Are Associated With Blood Lipid Levels in the TOPMed Whole-Genome Sequencing Study

Author: Arnett Donna K
Bis Joshua C
Blangero John
Boerwinkle Eric
Bowden Donald W
Cade Brian E
Carlson Jenna C
Carson April P
Chen Yii-Der Ida
Curran Joanne E
de Vries Paul S
Dutcher Susan K
Ellinor Patrick T
Floyd James S
Fornage Myriam
Freedman Barry I
Gabriel Stacey
Germer Soren
Gibbs Richard A
Guo Xiuqing
He Jiang
Heard-Costa Nancy
Hidalgo Bertha
Holdcraft Jacob A
Hou Lifang
Irvin Marguerite R
Joehanes Roby
Kaplan Robert C
Kardia Sharon L R
Kelly Tanika N
Kim Ryan
Kooperberg Charles
Kral Brian G
Levy Daniel
Li Changwei
Li Xihao
Li Zilin
Lin Xihong
Liu Chunyu
Lloyd-Jones Don
Loos Ruth Jf
Mahaney Michael C
Martin Lisa W
Mathias Rasika A
Minster Ryan L
Mitchell Braxton D
Montasser May E
Morrison Alanna C
Murabito Joanne M
Naseri Take
Natarajan Pradeep
NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium
O'Connell Jeffrey R
Palmer Nicholette D
Peloso Gina M
Preuss Michael H
Psaty Bruce M
Raffield Laura M
Rao Dabeeru C
Redline Susan
Reiner Alexander P
Rich Stephen S
Rotter Jerome I
Reupena Muagututi'a Sefuiva
Selvaraj Margaret Sunitha
Sheu Wayne H-H
Smith Albert
Smith Jennifer A
Tiwari Hemant K
Tsai Michael Y
Viaud-Martinez Karine A
Wang Yuxuan
Wang Zhe
Yanek Lisa R
Zhao Wei
Publication venue: DigitalCommons@TMC
Publication date: 05/10/2023

Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions in lipid metabolism. Large-scale whole-genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess more associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with measurement of blood lipids and lipoproteins (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare-variant aggregate association tests using the STAAR (variant-set test for association using annotation information) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare-coding variants in nearby protein-coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500-kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variation and rare protein-coding variation at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNAs

DigitalCommons@The Texas Medical Center