190 research outputs found

The EBI RDF platform: linked open data for the life sciences

Author: Birney Ewan
Bolleman Jerven
Brandizi Marco
Davies Mark
Garcia Leyla
Gaulton Anna
Gehant Sebastien
Jenkinson Andrew M.
Jupp Simon
Laibe Camille
Le Novère Nicolas
Malone James
Martin Maria
Parkinson Helen
Redaschi Nicole
Wimalaratne Sarala M.
Publication date: 02/08/2017

Motivation: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI. Availability: http://www.ebi.ac.uk/rdf Contact: [email protected]

RERO DOC Digital Library

Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes.

Author: Almeida Marcio
Bandinelli Stefania
Blangero John
Curran Joanne E
Duggirala Ravindranath
Ferrucci Luigi
Frayling Timothy M
Gaulton Kyle
Gibbs J Raphael
Göring Harald HH
Hernandez Dena
Johnson Matthew P
Jun Goo
Li Qibin
Lin Haoxiang
Mccarthy Mark I
Melzer David
Murray Anna
Nalls Mike
Pearson Richard
Perry John RB
Rivas Manny
Shen Juan
Singleton Andrew
Tanaka Toshiko
Tuke Marcus A
Weedon Michael N
Wood Andrew R
Xu Christopher S
Publication venue: eScholarship, University of California
Publication date: 06/11/2014

Initial results from sequencing studies suggest that there are relatively few low-frequency (<5%) variants associated with large effects on common phenotypes. We performed low-pass whole-genome sequencing in 680 individuals from the InCHIANTI study to test two primary hypotheses: (i) that sequencing would detect single low-frequency-large effect variants that explained similar amounts of phenotypic variance as single common variants, and (ii) that some common variant associations could be explained by low-frequency variants. We tested two sets of disease-related common phenotypes for which we had statistical power to detect large numbers of common variant-common phenotype associations: 11 132 cis-gene expression traits in 450 individuals and 93 circulating biomarkers in all 680 individuals. From a total of 11 657 229 high-quality variants, of which 6 129 221 and 5 528 008 were common and low frequency (<5%), respectively, low frequency-large effect associations comprised 7% of detectable cis-gene expression traits [89 of 1314 cis-eQTLs at P < 1 × 10^-6 (false discovery rate ∼5%)] and one of eight biomarker associations at P < 8 × 10^-10. Very few (30 of 1232; 2%) common variant associations were fully explained by low-frequency variants. Our data show that whole-genome sequencing can identify low-frequency variants undetected by genotyping based approaches when sample sizes are sufficiently large to detect substantial numbers of common variant associations, and that common variant associations are rarely explained by single low-frequency variants of large effect.
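The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up for this example, and the numbers are copied from the abstract rather than taken from the study data.

```python
# Figures copied from the abstract; variable names are illustrative only.
common_variants = 6_129_221
low_freq_variants = 5_528_008
total_variants = common_variants + low_freq_variants

# Share of cis-eQTLs that were low frequency-large effect associations.
low_freq_eqtl_share = 89 / 1314

# Share of common variant associations fully explained by low-frequency variants.
explained_share = 30 / 1232

print(f"total variants: {total_variants}")
print(f"low-frequency eQTL share: {low_freq_eqtl_share:.1%}")
print(f"explained common associations: {explained_share:.1%}")
```

The two shares come out slightly below 7% and slightly above 2%, consistent with the rounded percentages reported in the abstract.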

PubMed Central

eScholarship - University of California

Developing a network view of type 2 diabetes risk pathways through integration of genetic, genomic and functional data

Author: Fernández-Tajes J
Gaulton KJ
Gloyn Anna
Lage K
Mahajan A
McCarthy Mark
Thurner M
Torres Jason
Van De Bunt M
Publication venue: BioMed Central
Publication date: 26/03/2019

BACKGROUND: Genome-wide association studies (GWAS) have identified several hundred susceptibility loci for type 2 diabetes (T2D). One critical, but unresolved, issue concerns the extent to which the mechanisms through which these diverse signals influence T2D predisposition converge on a limited set of biological processes. However, the causal variants identified by GWAS mostly fall into non-coding sequence, complicating the task of defining the effector transcripts through which they operate. METHODS: Here, we describe implementation of an analytical pipeline to address this question. First, we integrate multiple sources of genetic, genomic and biological data to assign positional candidacy scores to the genes that map to T2D GWAS signals. Second, we introduce genes with high scores as seeds within a network optimization algorithm (the asymmetric prize-collecting Steiner tree approach) which uses external, experimentally confirmed protein-protein interaction (PPI) data to generate high-confidence sub-networks. Third, we use GWAS data to test the T2D association enrichment of the "non-seed" proteins introduced into the network, as a measure of the overall functional connectivity of the network. RESULTS: We find (a) non-seed proteins in the T2D protein-interaction network so generated (comprising 705 nodes) are enriched for association to T2D (p = 0.0014) but not control traits, (b) stronger T2D-enrichment for islets than other tissues when we use RNA expression data to generate tissue-specific PPI networks and (c) enhanced enrichment (p = 3.9 × 10^-5) when we combine the analysis of the islet-specific PPI network with a focus on the subset of T2D GWAS loci which act through defective insulin secretion. CONCLUSIONS: These analyses reveal a pattern of non-random functional connectivity between candidate causal genes at T2D GWAS loci and highlight the products of genes including YWHAG, SMAD4 or CDK2 as potential contributors to T2D-relevant islet dysfunction.
The approach we describe can be applied to other complex genetic and genomic datasets, facilitating integration of diverse data types into disease-associated networks.

Oxford University Research Archive

Identification and functional characterization of G6PC2 coding variants influencing glycemic traits define an effector transcript at the G6PC2-ABCB11 locus.

Author: Abecasis Goncalo
Altshuler David
Beer Nicola L.
Bell Graeme I.
Bergman Richard N.
Blancher Christine
Blangero John
Boehnke Michael
Bork-Jensen Jette
Brandslund Ivan
Buck David
Buck Gemma
Burtt Noël P.
Christensen Cramer
Cingolani Pablo
Collins Francis S.
Cox Nancy J.
Doney Alex S. F.
Duggirala Ravindranath
Dupuis Josee
Flannick Jason
Florez Jose C.
Fontanillas Pierre
Fuchsberger Christian
Gabriel Stacey
Gaulton Kyle J.
Gjesing Anette P.
Gloyn Anna L.
GoT2D consortium
Grarup Niels
Groop Leif
Groves Christopher J.
Hanis Craig L.
Hansen Torben
Highland Heather M.
Hollensted Mette
Huyghe Jeroen R.
Im Hae Kyung
Ingelsson Erik
Isomaa Bo
Jackson Anne U.
Jun Goo
Justesen Johanne Marie
Jørgensen Marit E.
Jørgensen Torben
Karpe Fredrik
Kuusisto Johanna
Laakso Markku
Ladenvall Claes
Lannfelt Lars
Lind Lars
Lindgren Cecilia M.
Linneberg Allan
Locke Adam E.
Mahajan Anubha
Mangino Massimo
Manning Alisa
McCarthy Mark I.
Meigs James B.
Mohlke Karen L.
Morris Andrew D.
Morris Andrew P.
Murphy Jacquelyn
Neville Matt
Ng Hui Jin
Onofrio Robert
Palmer Colin N. A.
Pedersen Oluf
Prokopenko Inga
Rauramaa Rainer
Rayner N. William
Rivas Manuel A.
Robertson Neil R.
Roden Michael
Rundle Jana K.
Salomaa Veikko
Seielstad Mark
Sim Xueling
Small Kerrin S.
Spector Timothy D.
Stringham Heather M.
Surdulescu Gabriela L.
Syvänen Ann-Christine
T2D-GENES consortium
Teslovich Tanya M.
Trakalo Joseph
Tuomi Tiinamaija
Tuomilehto Jaakko
Uusitupa Matti
Watanabe Richard M.
Wilson James G.
Publication venue: PLoS Genet
Publication date: 01/01/2015

Genome wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P < 5 × 10^-7) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF) = 1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights.

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

The Francis Crick Institute

Lund University Publications

Crossref

Harvard University - DASH

Publikationer från Uppsala Universitet

PubMed Central

eScholarship - University of California

Apollo (Cambridge)

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Swepub

The miRNA Profile of Human Pancreatic Islets and Beta-Cells and Relationship to Type 2 Diabetes Pathogenesis

Author: Ferrer Jorge
Gaulton Kyle J
Gloyn Anna L
Johnson Paul R
Lindgren Cecilia M
McCarthy Mark I
Moran Ignasi
Parts Leopold
van de Bunt Martijn
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Recent advances in the understanding of the genetics of type 2 diabetes (T2D) susceptibility have focused attention on the regulation of transcriptional activity within the pancreatic beta-cell. MicroRNAs (miRNAs) represent an important component of regulatory control, and have proven roles in the development of human disease and control of glucose homeostasis. We set out to establish the miRNA profile of human pancreatic islets and of enriched beta-cell populations, and to explore their potential involvement in T2D susceptibility. We used Illumina small RNA sequencing to profile the miRNA fraction in three preparations each of primary human islets and of enriched beta-cells generated by fluorescence-activated cell sorting. In total, 366 miRNAs were found to be expressed (i.e. >100 cumulative reads) in islets and 346 in beta-cells; of the total of 384 unique miRNAs, 328 were shared. A comparison of the islet-cell miRNA profile with those of 15 other human tissues identified 40 miRNAs predominantly expressed (i.e. >50% of all reads seen across the tissues) in islets. Several highly-expressed islet miRNAs, such as miR-375, have established roles in the regulation of islet function, but others (e.g. miR-27b-3p, miR-192-5p) have not previously been described in the context of islet biology. As a first step towards exploring the role of islet-expressed miRNAs and their predicted mRNA targets in T2D pathogenesis, we looked at published T2D association signals across these sites. We found evidence that predicted mRNA targets of islet-expressed miRNAs were globally enriched for signals of T2D association (p-values <0.01, q-values <0.1). At six loci with genome-wide evidence for T2D association (AP3S2, KCNK16, NOTCH2, SLC30A8, VPS26A, and WFS1) predicted mRNA target sites for islet-expressed miRNAs overlapped potentially causal variants.
In conclusion, we have described the miRNA profile of human islets and beta-cells and provide evidence linking islet miRNAs to T2D pathogenesis.
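The miRNA counts in this abstract follow the inclusion-exclusion rule for two overlapping sets. As a minimal sketch, using only the three counts quoted above and illustrative variable names, the total and the tissue-specific remainders can be recomputed like this:

```python
# Counts copied from the abstract; variable names are illustrative only.
expressed_in_islets = 366
expressed_in_beta_cells = 346
shared = 328

# Inclusion-exclusion: |A ∪ B| = |A| + |B| - |A ∩ B|
unique_total = expressed_in_islets + expressed_in_beta_cells - shared

# miRNAs passing the expression threshold in only one of the two sample types.
islet_only = expressed_in_islets - shared
beta_cell_only = expressed_in_beta_cells - shared

print(unique_total, islet_only, beta_cell_only)
```

The recomputed total matches the 384 unique miRNAs reported. The two remainders are derived here, not stated in the abstract.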

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Oxford University Research Archive

The Francis Crick Institute

A new strategy for enhancing imputation quality of rare variants from next-generation sequencing data via combining SNP and exome chip data

Author: Abecasis G.R. (Gonçalo)
Altshuler D. (David)
Asimit J.L. (Jennifer L.)
Atzmon G. (Gil)
Barber M. (Mathew)
Barzilai A. (Ari)
Beer N.L. (Nicola L.)
Bell G.I. (Graeme I.)
Below J. (Jennifer)
Blackwell T. (Tom)
Blangero J. (John)
Boehnke M. (Michael)
Bowden D.W. (Donald W.)
Burtt N.P. (Noël)
Chambers J.C. (John)
Chen H. (Han)
Chen P. (Ping)
Chines P.S. (Peter)
Cho Y.S. (Yoon Shin)
Choi S. (Sungkyoung)
Churchhouse C. (Claire)
Cingolani P. (Pablo)
Cornes B.K. (Belinda)
Cox N.J. (Nancy)
Day-Williams A.G. (Aaron)
De Almeida M.A.A. (Marcio)
Duggirala A. (Aparna)
Dupuis J. (Josée)
Dyer T. (Thomas)
Feng S. (Shuang)
Fernandez-Tajes J. (Juan)
Ferreira T. (Teresa)
Fingerlin T.E. (Tasha E.)
Flannick J. (Jason)
Florez J.C. (Jose)
Fontanillas P. (Pierre)
Frayling T.M. (Timothy)
Fuchsberger C. (Christian)
Gamazon E. (Eric)
Gaulton K. (Kyle)
Ghosh S. (Saurabh)
Glaser B. (Benjamin)
Gloyn A.L. (Anna)
Grossman R.L. (Robert L.)
Grundstad J. (Jason)
Hanis C. (Craig)
Heath A. (Allison)
Highland H. (Heather)
Horikoshi M. (Momoko)
Huh I.-S. (Ik-Soo)
Huyghe J.R. (Jeroen R.)
Ikram M.K. (Kamran)
Im H.K. (Hae Kyung)
Jablonski K.A. (Kathleen)
Jun Y. (Yang)
Kato N. (Norihiro)
Kim B.-J. (Bong-Jo)
Kim B.-J. (Bong-Jo)
Kim J. (Jayoun)
Kim Y.J. (Young Jin)
Kim Y.J. (Young Jin)
King C.R. (C. Ryan)
Kooner J.S. (Jaspal S.)
Kwon M.-S. (Min-Seok)
Laakso M. (Markku)
Lam K.K.-Y. (Kevin Koi-Yau)
Lee J. (Jaehoon)
Lee J. (Juyoung)
Lee J. (Juyoung)
Lee S. (Selyeong)
Lee S. (Sungyoung)
Lehman D.M. (Donna M.)
Li H. (Heng)
Lindgren C.M. (Cecilia)
Liu X. (Xuanyao)
Livne O.E. (Oren E.)
Locke A.E. (Adam E.)
Mahajan A. (Anubha)
Maller J.B. (Julian B.)
Manning A.K. (Alisa K.)
Maxwell T.J. (Taylor J.)
Mazoure A. (Alexander)
McCarthy M.I. (Mark)
Meigs J.B. (James B.)
Min B. (Byungju)
Mohlke K.L. (Karen)
Morris A.P. (Andrew)
Musani S. (Solomon)
Nagai Y. (Yoshihiko)
Ng M.C.Y. (Maggie C.Y.)
Nicolae D. (Dan)
Oh S. (Sohee)
Palmer N.D. (Nicholette)
Park T. (Taesung)
Park T. (Taesung)
Pollin T.I. (Toni I.)
Prokopenko I. (Inga)
Reich D. (David)
Rivas M.A. (Manuel)
Scott L.J. (Laura)
Seielstad M. (Mark)
Sim X. (Xueling)
Sladek R. (Rob)
Smith P. (Philip)
Tachmazidou I. (Ioanna)
Tai E.S. (Shyong)
Teo Y.Y. (Yik Ying)
Teslovich T.M. (Tanya M.)
Torres J. (Jason)
Trubetskoy V. (Vasily)
Willems S.M. (Sara)
Williams A.L. (Amy L.)
Wilson J.G. (James)
Wiltshire S. (Steven)
Won S. (Sungho)
Wood A.R. (Andrew)
Xu W. (Wang)
Yoon J. (Joon)
Zawistowski M. (Matthew)
Zeggini E. (Eleftheria)
Zhang W. (Weihua)
Zöllner S. (Sebastian)
Publication venue: Springer Science and Business Media LLC
Publication date: 29/12/2015

Background: Rare variants have gathered increasing attention as a possible alternative source of missing heritability. Since next generation sequencing technology is not yet cost-effective for large-scale genomic studies, a widely used alternative approach is imputation. However, the imputation approach may be limited by the low accuracy of the imputed rare variants. To improve imputation accuracy of rare variants, various approaches have been suggested, including increasing the sample size of the reference panel, using sequencing data from study-specific samples (i.e., specific populations), and using local reference panels by genotyping or sequencing a subset of study samples. While these approaches mainly utilize reference panels, imputation accuracy of rare variants can also be increased by using exome chips containing rare variants. The exome chip contains 250 K rare variants selected from the discovered variants of about 12,000 sequenced samples. If exome chip data are available for previously genotyped samples, the combined approach using a genotype panel of merged data, including exome chips and SNP chips, should increase the imputation accuracy of rare variants. Results: In this study, we describe a combined imputation which uses both exome chip and SNP chip data simultaneously as a genotype panel. The effectiveness and performance of the combined approach were demonstrated using a reference panel of 848 samples constructed using exome sequencing data from the T2D-GENES consortium and 5,349 sample genotype panels consisting of an exome chip and SNP chip. As a result, the combined approach increased imputation quality up to 11%, and genomic coverage for rare variants up to 117.7% (MAF < 1%), compared to imputation using the SNP chip alone. Also, we investigated the systematic effect of reference panels on imputation quality using five reference panels and three genotype panels.
The best performing approach was the combination of the study specific reference panel and the genotype panel of combined data. Conclusions: Our study demonstrates that combined datasets, including SNP chips and exome chips, enhance both the imputation quality and genomic coverage of rare variants.

Erasmus University Digital Repository

Improving the odds of drug development success through human genomics: modelling study.

Author: Casas Juan Pablo
Chopade Sandesh
Denaxas Spiros
Finan Chris
Gaulton Anna
Hemingway Harry
Hingorani Aroon D
Kruger Felix A
Kuan Valerie
MacAllister Raymond J
Overington John P
Prieto David
Sofat Reecha
Publication venue: Springer Science and Business Media LLC
Publication date: 11/12/2019

Lack of efficacy in the intended disease indication is the major cause of clinical phase drug development failure. Explanations could include the poor external validity of pre-clinical (cell, tissue, and animal) models of human disease and the high false discovery rate (FDR) in preclinical science. FDR is related to the proportion of true relationships available for discovery (γ), and the type 1 (false-positive) and type 2 (false-negative) error rates of the experiments designed to uncover them. We estimated the FDR in preclinical science, its effect on drug development success rates, and improvements expected from use of human genomics rather than preclinical studies as the primary source of evidence for drug target identification. Calculations were based on a sample space defined by all human diseases (the 'disease-ome'), represented as columns, and all protein-coding genes (the 'protein-coding genome'), represented as rows, producing a matrix of unique gene- (or protein-) disease pairings. We parameterised the space based on 10,000 diseases, 20,000 protein-coding genes, 100 causal genes per disease and 4000 genes encoding druggable targets, examining the effect of varying the parameters and a range of underlying assumptions on the inferences drawn. We estimated γ, defined mathematical relationships between preclinical FDR and drug development success rates, and estimated improvements in success rates based on human genomics (rather than orthodox preclinical studies). Around one in every 200 protein-disease pairings was estimated to be causal (γ = 0.005), giving an FDR in preclinical research of 92.6%, which likely makes a major contribution to the reported drug development failure rate of 96%. Observed success rate was only slightly greater than expected for a random pick from the sample space. Values for γ back-calculated from reported preclinical and clinical drug development success rates were also close to the a priori estimates.
Substituting genome wide (or druggable genome wide) association studies for preclinical studies as the major information source for drug target identification was estimated to reverse the probability of late stage failure because of the more stringent type 1 error rate employed and the ability to interrogate every potential druggable target in the same experiment. Genetic studies conducted at much larger scale, with greater resolution of disease end-points, e.g. by connecting genomics and electronic health record data within healthcare systems, have the potential to produce radical improvement in drug development success rate.
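The relationship between γ and the FDR described in this abstract can be illustrated with the standard false discovery rate formula, FDR = α(1 − γ) / [α(1 − γ) + (1 − β)γ]. The sketch below is not the authors' code. It assumes a type 1 error rate α = 0.05 and a power (1 − β) of 0.8, conventional values that the abstract does not state, and the function name is hypothetical. With those assumptions it reproduces the quoted figures.

```python
def false_discovery_rate(gamma, alpha=0.05, power=0.8):
    """Standard FDR formula for a set of hypothesis tests.

    gamma: proportion of tested relationships that are truly causal
    alpha: type 1 (false-positive) error rate (assumed value, not from the abstract)
    power: 1 - type 2 error rate (assumed value, not from the abstract)
    """
    false_positives = alpha * (1.0 - gamma)
    true_positives = power * gamma
    return false_positives / (false_positives + true_positives)


# Parameters quoted in the abstract.
causal_genes_per_disease = 100
protein_coding_genes = 20_000

# Proportion of gene-disease pairings that are causal (one in every 200).
gamma = causal_genes_per_disease / protein_coding_genes

fdr = false_discovery_rate(gamma)
print(f"gamma = {gamma:.3f}, FDR = {fdr:.1%}")
```

Lowering `alpha` toward the much stricter thresholds used in genome-wide association studies shrinks the false-positive term, which is the mechanism the abstract gives for the expected improvement.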

LSHTM Research Online

UCL Discovery