
64 research outputs found

Caipirini: using gene sets to rank literature

Author: A Barbosa-Silva
Adriano Barbosa-Silva
AM Cohen
Ana Carolina Wanderley-Nogueira
C Nobata
GB Martin
Georgios A Pavlopoulos
GL Poulter
H Kessman
H Kilicoglu
J Lewis
J-B Morel
JF Fontaine
KA Pattin
LJ Jensen
N Polavarapu
Nina Mota Soares-Cavalcanti
PK Shah
R Altman
R Rodriguez-Esteban
Reinhard Schneider
S Yu
Seán I O'Donoghue
T Etzold
T Goetz
T Soldatos
T Tuchler
Theodoros G Soldatos
Venkata P Satagopam
W Yu
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be 'interesting'. Some methods go further by allowing the user to provide a second input set of 'uninteresting' abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service 'Caipirini' (http://caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. Results: To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. Conclusions: To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UNSWorks

MDC Repository

Open Repository and Bibliography - Luxembourg

Bioinformatics challenges for genome-wide association studies

Author: Ahmed
Altshuler
Amundadottir
Askland
Bureau
Bush
Calle
Chang
Chanock
Cook
Culverhouse
Donnelly
Easton
Eiberg
Elbers
Emily
F. W. Asselbergs
Greene
Hahn
Hirschhorn
Holmans
Infante
J. H. Moore
Jakobsdottir
Kooperberg
Kraft
Lewontin
Lou
Lunetta
Manolio
Marchini
McKinney
Mei
Millstein
Moore
Motsinger
Namkung
Nelson
Pan
Pattin
Reich
Reif
Ripperger
Ritchie
S. M. Williams
Schork
Sinnott-Armstrong
Spencer
Thornton-Wells
Torkamani
Velez
Wang
Wilke
Williams
Wongseree
Yu
Zhang
Publication venue: Oxford University Press
Publication date: 15/02/2010

Motivation: The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype–phenotype relationship that is characterized by significant heterogeneity and gene–gene and gene–environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods

CiteSeerX

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

UCL Discovery

Dissertations of the University of Groningen

A New Methodology to Associate SNPs with Human Diseases According to Their Pathway Related Context

Genome-wide association studies (GWAS) with hundreds of thousands of single nucleotide polymorphisms (SNPs) are popular strategies to reveal the genetic basis of human complex diseases. Despite many successes of GWAS, it is well recognized that new analytical approaches have to be integrated to achieve their full potential. Starting with a list of SNPs found to be associated with disease in GWAS, here we propose a novel methodology to devise functionally important KEGG pathways through the identification of genes within these pathways, where these genes are obtained from SNP analysis. Our methodology is based on functionalization of important SNPs to identify affected genes and disease-related pathways. We have tested our methodology on the WTCCC Rheumatoid Arthritis (RA) dataset and identified: i) previously known RA-related KEGG pathways (e.g., Toll-like receptor signaling, Jak-STAT signaling, Antigen processing, Leukocyte transendothelial migration and MAPK signaling pathways); ii) additional KEGG pathways (e.g., Pathways in cancer, Neurotrophin signaling, Chemokine signaling pathways) as associated with RA. Furthermore, these newly found pathways included genes which are targets of RA-specific drugs. Whereas GWAS analysis identifies 14 out of 83 of those drug target genes, the newly found functionally important KEGG pathways led to the discovery of 25 out of 83 genes known to be used as drug targets for the treatment of RA. Among the previously known pathways, we identified additional genes associated with RA (e.g., Antigen processing and presentation, Tight junction). Importantly, within these pathways, the associations between RA and some of these additionally found genes, such as HLA-C, HLA-G, PRKCQ, PRKCZ, TAP1 and TAP2, were verified by either the OMIM database or by literature retrieved from the NCBI PubMed module. With whole-genome sequencing on the horizon, we show that the full potential of GWAS can be achieved by integrating pathway and network-oriented analysis and prior knowledge from functional properties of a SNP.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sabanci University Research Database

A General Framework for Formal Tests of Interaction after Exhaustive Search Methods with Applications to MDR and MDR-PDT

Author: A Templeton
AA Motsinger
CS Coffey
DB Hancock
DM Evans
DR Velez
DW Hosmer
Eden R. Martin
ER Martin
Eric S. Torstenson
J Marchini
J Millstein
JD Owens
JH Moore
KA Pattin
KD Siegmund
KY Liang
LW Hahn
M Schmidt
Marylyn D. Ritchie
MD Ritchie
MF Baksh
MP Bass
N Risch
R Culverhouse
RE Bellman
RL Milne
RS Michalsky
Schlicting
Scott M. Dudek
SL Zeger
SM Dudek
Stephen D. Turner
T Hastie
Thorkild I. A. Sorensen
TL Edwards
Todd L. Edwards
WS Bush
Z Feng
Publication venue: Public Library of Science
Publication date: 01/01/2010

The initial presentation of multifactor dimensionality reduction (MDR) featured cross-validation to mitigate over-fitting, computationally efficient searches of the epistatic model space, and variable construction with constructive induction to alleviate the curse of dimensionality. However, the method was unable to differentiate association signals arising from true interactions from those due to independent main effects at individual loci. This issue leads to problems in inference and interpretability for the results from MDR and its family-based complement, the MDR-pedigree disequilibrium test (MDR-PDT). A suggestion from previous work was to fit regression models post hoc to specifically evaluate the null hypothesis of no interaction for MDR or MDR-PDT models. We demonstrate with simulation that fitting a regression model on the same data as that analyzed by MDR or MDR-PDT is not a valid test of interaction. This is likely to be true for any other procedure that searches for models and then performs an uncorrected test for interaction. We also show with simulation that when strong main effects are present and the null hypothesis of no interaction is true, MDR and MDR-PDT reject at far greater than the nominal rate. We also provide a valid regression-based permutation test procedure that specifically tests the null hypothesis of no interaction, and does not reject the null when only main effects are present. The regression-based permutation test implemented here conducts a valid test of interaction after a search for multilocus models, and can be applied to any method that conducts a search to find a multilocus model representing an interaction.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami

A Network-Based Approach to Prioritize Results from Genome-Wide Association Studies

Author: A Langville
A Torkamani
AL Barabási
Ancha Baranova
Andrew Singleton
BM Neale
Bok-Ghee Han
C O'Dushlaine
CC Elbers
D Altshuler
D Ballard
D Hwang
D Vise
DF Gudbjartsson
DG Clayton
DM Evans
Donald Seto
DS Hardin
EE Schadt
Francis J. McMahon
G Biolo
G Dennis Jr
G Golub
G Lettre
G Lohmann
G Peng
G Strang
GJ Filion
H Eleftherohorinou
H Fu
H Zhong
HJ Cordell
HJ Schirra
I Feldman
I Iossifov
J Marchini
J Sun
JD Han
Jeffrey Solka
JK Wittke-Thompson
JL Morrison
JM Vink
Jong-Young Lee
JZ Liu
K Askland
K Avrachenkov
K Bryan
K Lage
K Wang
KA Pattin
L Chen
L Ferrucci
L Hosking
Luigi Ferrucci
M Emily
M Krauthammer
M Yamaguchi
MEJ Newman
MG Hong
Michael A. Nalls
MN Weedon
N Risch
NA Davis
Nirmala Akula
O Carlborg
P Holmans
P Jia
P Kraft
R Huang
R Majeti
S Brin
S Draghici
S Kohler
S Peri
S Purcell
S Raychaudhuri
S Suthram
SE Baranzini
SF Saccone
SM Purcell
SR Setlur
Stefania Bandinelli
SY Rhee
T Ideker
T Inada
TG Lesnick
Thomas Mailund
TK Gandhi
Toshiko Tanaka
W Huang da
Y Li
Yoon Shin Cho
Young Jin Kim
YS Cho
Z Tu
ZJ Ma
Publication venue: Public Library of Science
Publication date: 06/09/2011

Genome-wide association studies (GWAS) are a valuable approach to understanding the genetic basis of complex traits. One of the challenges of GWAS is the translation of genetic association results into biological hypotheses suitable for further investigation in the laboratory. To address this challenge, we introduce Network Interface Miner for Multigenic Interactions (NIMMI), a network-based method that combines GWAS data with human protein-protein interaction (PPI) data. NIMMI builds biological networks weighted by connectivity, which is estimated by use of a modification of the Google PageRank algorithm. These weights are then combined with genetic association p-values derived from GWAS, producing what we call ‘trait prioritized sub-networks.’ As a proof of principle, NIMMI was tested on three GWAS datasets previously analyzed for height, a classical polygenic trait. Despite differences in sample size and ancestry, NIMMI captured 95% of the known height-associated genes within the top 20% of ranked sub-networks, far better than what could be achieved by a single-locus approach. The top 2% of NIMMI height-prioritized sub-networks were significantly enriched for genes involved in transcription, signal transduction, transport, and gene expression, as well as nucleic acid, phosphate, protein, and zinc metabolism. All of these sub-networks were ranked near the top across all three height GWAS datasets we tested. We also tested NIMMI on a categorical phenotype, Crohn’s disease. NIMMI prioritized sub-networks involved in B- and T-cell receptor, chemokine, interleukin, and other pathways consistent with the known autoimmune nature of Crohn’s disease. NIMMI is a simple, user-friendly, open-source software tool that efficiently combines genetic association data with biological networks, translating GWAS findings into biological hypotheses.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

High-Order SNP Combinations Associated with Complex Diseases: Efficient Discovery, Statistical Power and Functional Interactions

Author: A Gao
A Motsinger-Reif
A Subramanian
B Maher
B Van Ness
Brian Van Ness
C Greene
C Herold
C Huttenhower
D Anastassiou
D Brinza
D Evans
D Goldstein
D Rabinowitz
D Stram
E Bey
E Eichler
E Schadt
G Dong
G Fang
G Grahne
G Thorisson
Gang Fang
H Cordell
H He
H Wang
Haoyu Yu
J Hirschhorn
J Huang
J Lehár
J Marchini
J Moore
J Storey
K Christensen
K Pattin
K Small
K Van Steen
K Wang
L Cardon
L Ma
L Tentori
M Ashburner
M Carrasquillo
M Costanzo
M Nelson
M Norris
M Ritchie
M Steinbach
M Van Der Deen
Majda Haznadar
Michael Steinbach
N Yosef
P Kraft
R Agrawal
R Bayardo
R Cantor
R Dowell
R Gupta
S Baranzini
S Bay
S Purcell
S Vicent
T Church
T Howard
T Kam-Thong
T Manolio
Timothy R. Church
V Varadan
Vipin Kumar
W Zhang
Wen Wang
William S. Oetting
X Hua
X Lou
X Wan
X Zhang
Y Oji
Y Zhang
Yu Zhang
Z Wang
Publication venue: Public Library of Science
Publication date: 19/04/2012

There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets. However, for high-order combinations, existing methods either adopt a brute-force search which only handles a small number of SNPs (up to a few hundred), or use heuristic search that may miss informative combinations. In addition, existing approaches lack statistical power because of the use of statistics with high degrees of freedom and the huge number of hypotheses tested during combinatorial search. Due to these challenges, functional interactions in high-order combinations have not been systematically explored. We leverage discriminative-pattern-mining algorithms from the data-mining community to search for high-order combinations in case-control datasets. The substantially improved efficiency and scalability demonstrated on synthetic and real datasets with several thousand SNPs allow the study of several important mathematical and statistical properties of SNP combinations with order as high as eleven. We further explore functional interactions in high-order combinations and reveal a general connection between the increase in discriminative power of a combination over its subsets and the functional coherence among the genes comprising the combination, supported by multiple datasets. Finally, we study in detail several significant high-order combinations discovered from a lung-cancer dataset and a kidney-transplant-rejection dataset to provide novel insights on the complex diseases. Interestingly, many of these associations involve combinations of common variations that occur in small fractions of the population. Thus, our approach is an alternative methodology for exploring the genetics of rare diseases for which the current focus is on individually rare variations.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

The Impact of Phenocopy on the Genetic Analysis of Complex Traits

Author: A Hinney
AA Motsinger-Reif
AD Skol
B Rannala
C Dong
C Kooperberg
C Li
C Wider
Claudio Franceschi
D Curtis
DC Rao
DG Clayton
E Zeggini
Francesco Lescai
G De Benedictis
GM Clarke
GS Zubenko
H Hakonarson
H Schwender
HJ Cordell
I Ionita-Laza
I Tomlinson
J Pritchard
J Xu
JA Todd
JB Wilk
JC Florez
JH Moore
JP Ioannidis
JV Raelson
KA Pattin
Klaus F. X. Mayer
LE Reich DE
LM Butcher
LW Hahn
M Choi
M Kayser
M Li
M Schmidt
MD Ritchie
MP Bass
PC Phillips
Q Li
R Culverhouse
S Macgregor
SB Guthery SL
SF Kingsmore
SM Dudek
SM Singh
TA Pearson
TL Edwards
W Wongseree
Publication venue: Public Library of Science
Publication date: 27/08/2009

There is an ongoing debate on genome-wide association studies (GWAS). A key point is the capability to identify low-penetrance variations across the human genome. Among the phenomena reducing the power of these analyses, phenocopy level (PE) seriously hampers the investigation of complex diseases, as is well known in neurological disorders and cancer, and is likely of primary importance in human ageing. PE seems to be the norm, rather than the exception, especially when considering the role of epigenetics and environmental factors in shaping the phenotype. Despite some attempts, no recognized solution has been proposed, particularly to estimate the effects of phenocopies on study planning or analysis design. We present a simulation in which we attempt to define more precisely how phenocopy affects different analytical methods under different scenarios. With our approach the critical role of phenocopy emerges: the more the PE level increases, the more the initial difficulty in detecting gene-gene interactions is amplified. In particular, our results show that the detection of strong main effects is not hampered by an increasing amount of phenocopy in the study sample, although the significance of the association is progressively reduced, provided the study is sufficiently powered. In contrast, when purely epistatic effects are simulated, the capability of identifying the association depends on several parameters, such as the strength of the interaction between the polymorphic variants, the penetrance of the polymorphism, and the alleles (minor or major) which produce the combined effect and their frequency in the population. We conclude that neglecting the possible presence of phenocopies in complex traits heavily affects the analysis of their genetic data.

Crossref

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

Directory of Open Access Journals

PubMed Central

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Three-Dimensional Geometric Analysis of Felid Limb Bone Allometry

Studies of bone allometry typically use simple measurements taken in a small number of locations per bone; often the midshaft diameter or joint surface area is compared to body mass or bone length. However, bones must fulfil multiple roles simultaneously with minimum cost to the animal while meeting the structural requirements imposed by behaviour and locomotion, and not exceeding its capacity for adaptation and repair. We use entire bone volumes from the forelimbs and hindlimbs of Felidae (cats) to investigate regional complexities in bone allometry. Computed tomographic (CT) images (16435 slices in 116 stacks) were made of 9 limb bones from each of 13 individuals of 9 feline species ranging in size from domestic cat (Felis catus) to tiger (Panthera tigris). Eleven geometric parameters were calculated for every CT slice and scaling exponents calculated at 5% increments along the entire length of each bone. Three-dimensional moments of inertia were calculated for each bone volume, and spherical radii were measured in the glenoid cavity, humeral head and femoral head. Allometry of the midshaft, moments of inertia and joint radii were determined. Allometry was highly variable and related to local bone function, with joint surfaces and muscle attachment sites generally showing stronger positive allometry than the midshaft. Examining whole bones revealed that bone allometry is strongly affected by regional variations in bone function, presumably through mechanical effects on bone modelling. Bone's phenotypic plasticity may be an advantage during rapid evolutionary divergence by allowing exploitation of the full size range that a morphotype can occupy. Felids show bone allometry rather than postural change across their size range, unlike similar-sized animals.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Genetic regulation of Nrxn1 expression: an integrative cross-species analysis of schizophrenia candidate genes

Author: AC Need
AG Petrenko
BA Taylor
C Johnson
C Kellendonk
C O'Dushlaine
C Vaillend
C Zhang
CE McOmish
CR Marshall
CS Coffey
D Rujescu
D Warde-Farley
DA Hosack
DC Airey
DE Adkins
DR Velez
EM Hur
ET Cirulli
G Dennis Jr
G Kattenstroth
G Novak
GR Uhl
HC Morse III
Hsin-Chou Yang
HY Koh
I Dudanova
I Israely
ISC
J Blundell
J McCaughran Jr
J McClellan
J Nussbaum
J Shi
JC Barrett
JC Crabbe
JH Moore
JL Peirce
K Bhalla
K Ichtchenko
K Mozhui
K Tabuchi
K Wang
KA Pattin
KS Pollard
L Rowen
LJ Bierut
LQ Zhu
LW Hahn
M Bucan
M Burmeister
M Fukasawa
M Gratacos
M Luciano
M Missler
MD Fallin
MD Ritchie
ML Hamshere
MR Etherton
NJ Bray
NM Viquez
P Jia
P Langfelder
R Tabares-Seisdedos
RJ Anney
RW Overall
RW Williams
S Domene
S Purcell
S Sugita
SG Potkin
SS Moy
T Biederer
T Kimura
T Sakurai
T Vrijenhoek
T Walsh
VM Philip
W Zhang
WS Liang
WT O'Brien
X Wang
Y Hata
YA Ushkaryov
Z Freyberg
Publication venue: Nature Publishing Group

Neurexin 1 (NRXN1) is a large presynaptic transmembrane protein that has complex and variable patterns of expression in the brain. Sequence variants in NRXN1 are associated with differences in cognition, and with schizophrenia and autism. The murine Nrxn1 gene is also highly polymorphic and is associated with significant variation in expression that is under strong genetic control. Here, we use co-expression analysis, high coverage genomic sequence, and expression quantitative trait locus (eQTL) mapping to study the regulation of this gene in the brain. We profiled a family of 72 isogenic progeny strains of a cross between C57BL/6J and DBA/2J (the BXD family) using exon arrays and massively parallel RNA sequencing. Expression of most Nrxn1 exons has high genetic correlation (r>0.6) because of the segregation of a common trans eQTL on chromosome (Chr) 8 and a common cis eQTL on Chr 17. These two loci are also linked to murine phenotypes relevant to schizophrenia and to a novel human schizophrenia candidate gene with high neuronal expression (Pleckstrin and Sec7 domain containing 3). In both humans and mice, NRXN1 is co-expressed with numerous synaptic and cell signaling genes, and known schizophrenia candidates. Cross-species co-expression and protein interaction network analyses identified glycogen synthase kinase 3 beta (GSK3B) as one of the most consistent and conserved covariates of NRXN1. By using the Molecular Genetics of Schizophrenia data set, we were able to test and confirm that markers in NRXN1 and GSK3B have epistatic interactions in human populations that can jointly modulate risk of schizophrenia.

Crossref

PubMed Central

Genetic variants and their interactions in disease risk prediction – machine learning and network perspectives

Author: 1000 Genomes Project
A Ashworth
A Burga
A Califano
A Galvan
A Gyenesei
A Statnikov
A Torkamani
AL Barabási
AL Hopkins
B Lehner
B Maher
B Rakitsch
BA McKinney
BS Srinivasan
C Ambroise
C Kooperberg
C Tian
C Winter
CG Lambert
CS Greene
D Merico
D Urbach
DJ Balding
DM Evans
DW Aha
DW Huang
E Lee
EA Ashley
EE Eichler
EE Schadt
ES Lander
F Barrenäs
G Bebek
G Gibson
G Hannum
G Peng
GK Chen
GM Clarke
H Eleftherohorinou
H Holm
H Zhong
HJ Cordell
HY Chuang
I Feldman
I Guyon
I König
I Surakka
J Corander
J Jakobsdottir
J Kruppa
J Tuikkala
J Yang
JD Iglehart
JH Moore
K Askland
K Wang
KA Pattin
KS Reynolds
L Luo
M Ladouceur
M Michaut
M Mooney
M Smoot
M Vidal
MA Heiskanen
MD Ritchie
MJ Sillanpää
NA Lavender
NF Marko
O Lavi
O Zuk
P Beltrao
P Donnelly
P Kraft
P Sebastiani
P Smialowski
PC Phillips
PJ Castaldi
Q He
R Braun
R Jelier
R Makowsky
R Simon
RO Lindén
S Lee
S Okser
S Ripatti
S Varma
SE Baranzini
Sebastian Okser
SJ Dixon
SW Hartley
T Hu
T Ideker
T Pahikkala
T Peltola
T Schupbach
TA Manolio
Tapio Pahikkala
Tero Aittokallio
TS Deisboeck
TT Wu
U Ober
V Bansal
VK Ramanan
W Huang
Wellcome Trust Case Control Consortium
WG Kaelin Jr
Y Saeys
Z Wang
Z Wei
Publication venue: Springer Science and Business Media LLC

Crossref