153 research outputs found

Heterogeneous network embedding enabling accurate disease association predictions.

Author: A Bordes
A Grover
A Ruepp
C Hailin
C Shi
C-H Chou
Chunlei Tang
D Botstein
D Wang
DA Peter
H Chen
J Piñero
J Tang
J Tang
L Bottou
Lu Ruan
M Campillos
M Lu
MA Van Driel
Mengjie Guo
O Vanunu
Q Jiang
Q Zou
Q Zou
S Chang
T Keshava Prasad
T Mikolov
U. M Singh-Blom
V McKusick
W Wang
Wei Wang
X Wu
X Zeng
X Zeng
Xiangnan Kong
Y Fan
Y LeCun
Y Shi
Y Sun
Yangyong Zhu
Yun Xiong
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

Background: Identifying the complex biological mechanisms of various diseases is an important goal of biomedical research. Recently, the growing volume of data generated in genomics, epigenomics, metagenomics, proteomics, metabolomics, nutriomics, etc., has led to the rise of systematic biological approaches to exploring complex diseases. However, the gap between the rate at which these data are produced and our capacity to analyze them has gradually widened. Many of these data can be represented as networks and, based on the vector representations learned by network embedding methods, entities that lie in close proximity but currently have no direct link are very likely to be related, which makes them promising candidates for biological investigation. Results: We integrate six public biological databases to construct a heterogeneous biological network containing three categories of entities (genes, diseases, and miRNAs) and multiple types of edges (the known relationships). To tackle the inherent heterogeneity, we develop a heterogeneous network embedding model that maps the network into a low-dimensional vector space in which the relationships between entities are well preserved. To assess the effectiveness of our method, we perform gene-disease and miRNA-disease association prediction; the results show that our method outperforms several state-of-the-art methods. Furthermore, many associations predicted by our method are verified in the latest real-world dataset. Conclusions: We propose a novel heterogeneous network embedding method that takes advantage of the abundant contextual information and structure of a heterogeneous network. Moreover, we illustrate the performance of the proposed method in guiding biological studies, where it can assist in identifying new hypotheses for investigation.
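The idea in the abstract above, that entities close together in the embedding space but not yet linked are candidate associations, can be illustrated with a minimal sketch. This is not the authors' model: the vectors, entity names (`gene_X`, `disease_Y`, `mirna_Z`), and the use of cosine similarity as the proximity measure are all assumptions made for illustration.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(vectors, known_links):
    """Rank entity pairs that have no known link by embedding similarity."""
    known = {frozenset(pair) for pair in known_links}
    scored = [
        (cosine(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
        if frozenset((a, b)) not in known
    ]
    return sorted(scored, reverse=True)

# Hypothetical 2-dimensional embeddings; real models use many more dimensions.
toy_vectors = {
    "gene_X": [1.0, 0.0],
    "disease_Y": [0.9, 0.1],
    "mirna_Z": [0.0, 1.0],
}
toy_known = [("gene_X", "mirna_Z")]

for score, a, b in rank_candidates(toy_vectors, toy_known):
    print(f"{a} - {b}: {score:.3f}")
```

In this toy setting the unlinked pair with the highest similarity is listed first; how such a ranking is thresholded or validated is left open here.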

Crossref

eScholarship - University of California

Bridging topological and functional information in protein interaction networks by short loops profiling

Author: A Annibale
A Brady
A Lancichinetti
A-L Barabási
CL Will
D Weidensdorfer
E Yeger-Lotem
F Cheng
FV Fuller-Pace
GD Bader
H Yu
H-C Lu
J-F Rual
K Tan
L Bonetta
L Hakes
L Royer
L Shi
L Yang
LM Carlin
M Ashburner
M Dreze
M Duran-Frigola
M Girvan
M Varjosalo
M Vidal
M Vidal
MJ Meyer
MR Muller
NH Tran
O Kuchaiev
P Shannon
P Uetz
PC Havugimana
R Milo
R Mosca
R Sharan
RC Gentleman
RM Ewing
S Alaimo
S Charbonnier
T Ideker
T Michoel
T Nepusz
TR Hartman
TS Keshava Prasad
TW Reichman
U Alon
U Stelzl
V Janjic
V Janjić
X-L Li
Y Pei
Z Liang
Publication venue: Springer Science and Business Media LLC
Publication date: 23/02/2015

Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection. This research was supported by the Biotechnology and Biological Sciences Research Council (BB/H018409/1 to AP, ACCC and FF, and BB/J016284/1 to NSBT) and by the Leukaemia & Lymphoma Research (to NSBT and FF). SSC is funded by a Leukaemia & Lymphoma Research Gordon Piller PhD Studentship.
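As a rough illustration of what "short loops" of 3 to 6 proteins means, the sketch below enumerates simple cycles of bounded length in a small undirected graph. It is a plain depth-first search written for clarity, not the authors' profiling pipeline, and the protein labels are placeholders.

```python
from collections import defaultdict

def short_loops(edges, min_len=3, max_len=6):
    """Enumerate simple cycles with min_len..max_len nodes in an undirected graph.

    Each cycle is reported once, as a tuple that starts at its smallest node
    and is oriented so that its second node is smaller than its last node.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    loops = set()

    def extend(path, visited):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= min_len:
                if path[1] < path[-1]:  # keep one of the two orientations
                    loops.add(tuple(path))
            elif nxt not in visited and nxt > start and len(path) < max_len:
                extend(path + [nxt], visited | {nxt})

    for node in sorted(adj):
        extend([node], {node})
    return loops

# Placeholder interactions: two triangles sharing the edge A-C.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "A")]
for loop in sorted(short_loops(toy_edges), key=len):
    print(len(loop), loop)
```

Exhaustive enumeration like this grows quickly with network size, so it is only meant for small examples.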

Crossref

PubMed Central

King's Research Portal

Brunel University Research Archive

Rare coding SNP in DZIP1 gene associated with late-onset sporadic Parkinson's disease

Author: A Lerner
A Ruepp
AA Merchant
AM Glazer
B Bakir-Gungor
B Dass
C Wolff
CB Do
CH Hawkes
D Subramaniam
DAD Monte
DI Chasman
E Eskin
FL Moore
H-C Fung
HR Kim
J Simón-Sánchez
K Lai
K Roeder
K Roeder
K Roeder
K Sekimizu
K Tsuboi
L Lum
LA Hindorff
LM Bekris
M Bak
M Plaisant
M Saad
M-X Li
N Miao
O Bragina
P Mill
P Whitton
PA Beachy
PW Ingham
RE Lamont
S Purcell
SM Chambers
SY Tay
TA Manolio
TH Hamza
TL Edwards
TS Keshava Prasad
V Palma
VF Rafuse
W Satake
Y Katoh
Publication date: 11/09/2011

We present the first application of the hypothesis-rich mathematical theory to genome-wide association data. The Hamza et al. late-onset sporadic Parkinson's disease genome-wide association study dataset was analyzed. We found a rare, coding, non-synonymous SNP variant in the gene DZIP1 that confers increased susceptibility to Parkinson's disease. The association of DZIP1 with Parkinson's disease is consistent with a Parkinson's disease stem-cell ageing theory.

arXiv.org e-Print Archive

Crossref

PubMed Central

Estudo Geral

Human Protein Reference Database—2009 update

Author: A. Marimuthu
A. Pandey
A. Sebastian
A. Venugopal
Amanchy
B. A. Rahiman
B. Shafreen
Berger
C. J. Harrys Kishore
D. S. Somanathan
D. Telikicherla
Gandhi
K. Kandasamy
L. Balakrishnan
M. Ahmed
M. K. Kashyap
Mishra
P. Ranganathan
Peri
Peri
R. Chaerkady
R. Goel
R. Mohmood
R. Raju
S. Banerjee
S. Kanth
S. Keerthikumar
S. Kumar
S. Mathivanan
S. Mohan
S. Ramabadran
S. Rani
S. Ray
Shtatland
T. S. Keshava Prasad
V. Krishna
Y. L. Ramachandra
Publication venue: Oxford University Press

Human Protein Reference Database (HPRD, http://www.hprd.org/), initially described in 2003, is a database of curated proteomic information pertaining to human proteins. We have recently added a number of new features in HPRD. These include PhosphoMotif Finder, which allows users to find the presence of over 320 experimentally verified phosphorylation motifs in proteins of interest. Another new feature is a protein distributed annotation system, Human Proteinpedia (http://www.humanproteinpedia.org/), through which laboratories can submit their data, which is mapped onto protein entries in HPRD. Over 75 laboratories involved in proteomics research have already participated in this effort by submitting data for over 15 000 human proteins. The submitted data include mass spectrometry and protein microarray-derived data, among other data types. Finally, HPRD is also linked to a compendium of human signaling pathways developed by our group, NetPath (http://www.netpath.org/), which currently contains annotations for several cancer and immune signaling pathways. Since the last update, more than 5500 new protein sequences have been added, making HPRD a comprehensive resource for studying the human proteome.

Crossref

PubMed Central

Domain Altering SNPs in the Human Proteome and Their Impact on Signaling Pathways

Author: A Bateman
A Hamosh
A Kelemen
AA Motsinger
AJ Brookes
Aydin Tozeren
B Raghavachari
BS Shastry
CM Gould
DE Arking
E de Castro
E Krieger
F Kronenberg
FS Collins
H Ogata
J McClellan
J Reumers
KA Frazer
M Wjst
MA Harris
MD Ritchie
N Hulo
P Evans
P Yue
PE Bourne
RM Eglen
RM Hubley
RP Bhattacharyya
S Cichon
ST Sherry
TG Clark
Thomas Mailund
TS Keshava Prasad
V Ramensky
VN Maiorov
W Huang da
WE Evans
Y Zhang
Yichuan Liu
Publication venue: Public Library of Science
Publication date: 01/01/2010

Single nucleotide polymorphisms (SNPs) constitute an important mode of genetic variation observed in the human genome. A small fraction of SNPs, about four thousand out of the ten million, has been associated with genetic disorders and complex diseases. The present study focuses on SNPs that fall on protein domains, 3D structures that facilitate connectivity of proteins in cell signaling and metabolic pathways. We scanned the human proteome using the PROSITE web tool and identified proteins with SNP-containing domains. We showed that SNPs that fall on protein domains are highly statistically enriched among SNPs linked to hereditary disorders and complex diseases. Proteins whose domains are dramatically altered by the presence of an SNP are even more likely to be present among proteins linked to hereditary disorders. Proteins with domain-altering SNPs comprise highly connected nodes in cellular pathways such as the focal adhesion, axon guidance and autoimmune disease pathways. Statistical enrichment of domain/motif signatures in interacting protein pairs indicates extensive loss of connectivity of cell signaling pathways due to domain-altering SNPs, potentially leading to hereditary disorders.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A systematic analysis of host factors reveals a Med23-interferon-λ regulatory axis against herpes simplex virus type 1 replication

Author: A Casamassimi
A Chatr-aryamontri
A Gennerich
A Hayman
A Karlas
AL Brass
AL Brass
AL Cunningham
Antonio Volpi
AP de Oliveira
B de Chassey
B Langhans
B Snijder
BD Griffin
BJ Breitkreutz
BK Tischer
C Boutell
C Sommereyns
C Stegen
CA Yang
Caroline C. Friedel
CB Coyne
CC Friedel
Chris Boutell
Colin M. Crump
D Ge
E Porcellini
Even Fossum
F Pica
FD Bushman
Francesca Pica
G Joshi-Tope
G Mittler
G Sivan
GA Karasneh
GC Carter
Gerald Barry
H Gu
H Zhou
Helen L. Zenner
I Ioannidis
I Vastrik
J Li
J Lund
J Melchjorsen
J Mercer
JF Lau
JL Arthur
JL Stevens
John K. Fazakerley
JW Torseth
Jürgen Haas
K Dohner
K Kiiver
K Radtke
Kim Martin
L Salwinski
L Zerboni
L Zhang
L Zhou
Lakshmi N. Kaza
M Albers
M Ashburner
M Barrios-Rodiles
M Gustems
M Ito
M Kanehisa
M Kanehisa
MA Calderwood
Manfred Koegl
Marie H. Craigon
MD Dyer
MN Krishnan
MW Douglas
N Ank
OM Sessions
Orland Gonzalez
P Bandi
P Braun
P Hillyer
P Lamesch
P Sheppard
P Uetz
Peter Ghazal
PG Arduino
PY Bochud
Q Li
R Antrobus
R Konig
R Konig
R Kumar
Ralf Zimmer
RD Everett
RP Donnelly
Rui Chen
S Hashimoto
S Malik
S McCraith
S Pfefferle
Samantha J. Griffiths
SD Shapira
Shou-Jiang Gao
SM Bailer
Stacey Efstathiou
SV Kotenko
T Driscoll
TS Keshava Prasad
V Navratil
V Sancho-Shimizu
W Fu
W Huang da
WW Gierasch
Z Mikloska
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Herpes simplex virus type 1 (HSV-1) is a neurotropic virus causing vesicular oral or genital skin lesions, meningitis and other diseases particularly harmful in immunocompromised individuals. To comprehensively investigate the complex interaction between HSV-1 and its host we combined two genome-scale screens for host factors (HFs) involved in virus replication. A yeast two-hybrid screen for protein interactions and an RNA interference (RNAi) screen with a druggable genome small interfering RNA (siRNA) library confirmed existing and identified novel HFs which functionally influence HSV-1 infection. Bioinformatic analyses found the 358 HFs were enriched for several pathways and multi-protein complexes. Of particular interest was the identification of Med23 as a strongly anti-viral component of the largely pro-viral Mediator complex, which links specific transcription factors to RNA polymerase II. The anti-viral effect of Med23 on HSV-1 replication was confirmed in gain-of-function gene overexpression experiments, and this inhibitory effect was specific to HSV-1, as a range of other viruses including Vaccinia virus and Semliki Forest virus were unaffected by Med23 depletion. We found Med23 significantly upregulated expression of the type III interferon family (IFN-λ) at the mRNA and protein level by directly interacting with the transcription factor IRF7. The synergistic effect of Med23 and IRF7 on IFN-λ induction suggests this is the major transcription factor for IFN-λ expression. Genotypic analysis of patients suffering recurrent orofacial HSV-1 outbreaks, previously shown to be deficient in IFN-λ secretion, found a significant correlation with a single nucleotide polymorphism in the IFN-λ3 (IL28b) promoter strongly linked to Hepatitis C disease and treatment outcome.
This paper describes a link between Med23 and IFN-λ, provides evidence for the crucial role of IFN-λ in HSV-1 immune control, and highlights the power of integrative genome-scale approaches to identify HFs critical for disease progression and outcome.

Public Library of Science (PLOS)

Crossref

Online Research @ Cardiff

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Enlighten

ART

University of Melbourne Institutional Repository

FigShare

Algorithms for effective querying of compound graph-based pathway databases

Author: A Funahashi
AH Bild
Ahmet Cetintas
AL Barabasi
BioPAX
D Croes
DJ Wong
E Demir
EM Reingold
Emek Demir
G Bader
HPJ Bonarius
JA Bondy
JA Engelman
K Fukuda
K Wang
KY Yip
L Matthews
M Baitaluk
N Yeung
O Babur
Ozgun Babur
Pathway Commons
R Caspi
R Gting
R Hofestädt
R Sharan
S Brohe
S Okuda
SBGN
T Aittokallio
T Shlomi
TH Cormen
The Cancer Genome Atlas Research Network
TS Keshava Prasad
U Dogrusoz
U Dogrusoz
U Leser
Ugur Dogrusoz
V Danos
VN Reddy
Y Tian
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Graph-based pathway ontologies and databases are widely used to represent data about cellular processes. This representation makes it possible to programmatically integrate cellular networks and to investigate them using the well-understood concepts of graph theory in order to predict their structural and dynamic properties. An extension of this graph representation, namely hierarchically structured or compound graphs, in which a member of a biological network may recursively contain a sub-network of a somehow logically similar group of biological objects, provides many additional benefits for analysis of biological pathways, including reduction of complexity by decomposition into distinct components or modules. In this regard, it is essential to effectively query such integrated large compound networks to extract the sub-networks of interest with the help of efficient algorithms and software tools. Results: Towards this goal, we developed a querying framework, along with a number of graph-theoretic algorithms from simple neighborhood queries to shortest paths to feedback loops, that is applicable to all sorts of graph-based pathway databases, from PPIs (protein-protein interactions) to metabolic and signaling pathways. The framework is unique in that it can account for compound or nested structures and ubiquitous entities present in the pathway data. In addition, the queries may be related to each other through "AND" and "OR" operators, and can be recursively organized into a tree, in which the result of one query might be a source and/or target for another, to form more complex queries. The algorithms were implemented within the querying component of a new version of the software tool PATIKAweb (Pathway Analysis Tool for Integration and Knowledge Acquisition) and have proven useful for answering a number of biologically significant questions for large graph-based pathway databases.
Conclusion: The PATIKA Project Web site is http://www.patika.org. PATIKAweb version 2.1 is available at http://web.patika.org.
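The simplest query type named in the abstract, a neighborhood query, can be sketched as a breadth-first search limited to k hops. The version below works on a flat adjacency list only; it does not model the compound (nested) structures or ubiquitous entities that the framework handles, and the node names are placeholders.

```python
from collections import deque

def neighborhood(adj, sources, k):
    """Return {node: hop distance} for all nodes within k hops of any source."""
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

# Placeholder pathway fragment: a chain A - B - C - D.
toy_adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(neighborhood(toy_adj, ["A"], 2))
```

Because the result is a plain mapping of nodes, the output of one query could be passed as the `sources` of another, loosely mirroring the chaining of queries described in the abstract.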

Crossref

Bilkent University Institutional Repository

Directory of Open Access Journals

PubMed Central

A Systematic Analysis of Eluted Fraction of Plasma Post Immunoaffinity Depletion: Implications in Biomarker Discovery

Author: A Cabre
AI Nesvizhskii
AK Yadav
AL Krasnoselsky
Amit Kumar Yadav
Ashish Kumar Singh
B Muthusamy
BL Nicholas
C Tu
CF Taylor
D Kessner
Debasis Dash
Dhirendra Kumar
DJ States
E Bellei
Gourav Bhardwaj
GR Mishra
GS Omenn
J Hernandez-Toro
JK Eng
K Bjorhall
KR Coombes
L Huang
L Kall
LA Echan
LA Liotta
LG Costa
LY Geer
M Zhou
MPW Smith
N Gupta
Niyaz Ahmed
NL Anderson
NL Anderson
NL Anderson
P Juhasz
P Shannon
PJ Kersey
Q Zhang
R Craig
R Stempfer
RK Dagda
RL Gundry
RL Gundry
Ruby Priyadarshini
RX Sun
S Peri
S Peri
Shadab Ahmad
Shantanu Sengupta
T Liu
Trayambak Basak
TS Keshava Prasad
TS Prasad
V Faca
V Faca
W Derer
WY Lo
Y Gong
Publication venue: Public Library of Science
Publication date: 07/09/2011

Plasma is the most easily accessible source for biomarker discovery in clinical proteomics. However, identifying potential biomarkers from plasma is a challenge given the large dynamic range of proteins. The potential biomarkers in plasma are generally present at very low abundance, and hence identification of these low-abundance proteins necessitates the depletion of highly abundant proteins. Sample pre-fractionation by immunodepletion with a multi-affinity removal system (MARS) has been a popular method to deplete multiple high-abundance proteins. However, depletion of these abundant proteins can result in concomitant removal of low-abundance proteins. Although there are some reports suggesting the removal of non-targeted proteins, the predominant view is that the number of such proteins is small. In this study, we identified proteins that are removed along with the targeted high-abundance proteins. Three plasma samples were depleted using each of the three MARS (Hu-6, Hu-14 and Proteoprep 20) cartridges. The affinity bound fractions were subjected to gelC-MS using an LTQ-Orbitrap instrument. Using four database search algorithms including MassWiz (developed in house), we selected the peptides identified at <1% FDR. Peptides identified by at least two algorithms were selected for protein identification. After this rigorous bioinformatics analysis, we identified 101 proteins with high confidence. Thus, we believe that for biomarker discovery and proper quantitation of proteins, it might be better to study both bound and depleted fractions from any MARS depleted plasma sample.
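The consensus step described above (keep peptides reported by at least two of the four search algorithms) can be sketched as a simple vote count. The sketch assumes each engine's list has already been filtered at <1% FDR; apart from MassWiz, the engine names and all peptide strings are hypothetical placeholders.

```python
from collections import Counter

def consensus_peptides(hits_by_engine, min_engines=2):
    """Return peptides reported by at least `min_engines` search engines."""
    counts = Counter()
    for peptides in hits_by_engine.values():
        counts.update(set(peptides))  # count each engine at most once per peptide
    return {peptide for peptide, n in counts.items() if n >= min_engines}

# Placeholder identifications, assumed to be already FDR-filtered per engine.
toy_hits = {
    "MassWiz": ["PEPTIDEA", "PEPTIDEB"],
    "engine_2": ["PEPTIDEB", "PEPTIDEC"],
    "engine_3": ["PEPTIDEB"],
    "engine_4": ["PEPTIDEA"],
}
print(sorted(consensus_peptides(toy_hits)))
```

Raising `min_engines` makes the selection stricter at the cost of fewer identifications.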

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

HIV Protein Sequence Hotspots for Crosstalk with Host Hub Proteins

Author: A Greenway
A Henschel
AC Vendel
AK Dunker
Aydin Tozeren
B Ackerson
CL Ruegg
CM Gould
D Ekman
Denis Dupuy
DR Borger
EE Hill
EF Pettersen
F Cardarelli
F Diella
F Meggio
F Meggio
G Baier-Bitterlich
H Jian
H Li
H Wang
HM Berman
HM Craig
J Friborg
JE Dickerson
K Harada
K Kadaveru
K Saksela
KV Prasad
L Deng
M Hiipakka
M Matsubara
M Schindler
MA Dimattia
MA Perez
Mahdi Sarmady
MD Dyer
MR Schaefer
N Arhel
NE Davey
NE Davey
O Haffar
O Tastan
O Tastan
P Abada
P Bayer
P Beauparlant
P Evans
P Evans
RJ Edwards
S Balakrishnan
S Betzi
S Grzesiek
S Sei
SH Tan
SK Srinivas
SS Chen
T Ammosova
T Kino
TH Tahirov
TS Keshava Prasad
V Neduva
V Neduva
W Fu
W Radding
William Dampier
X Yang
X Yang
Y He
Y Liu
Z Nie
Publication venue: Public Library of Science
Publication date: 01/01/2011

HIV proteins target host hub proteins for transient binding interactions. The presence of viral proteins in the infected cell results in out-competition of host proteins in their interaction with hub proteins, drastically affecting cell physiology. Functional genomics and interactome datasets can be used to quantify the sequence hotspots on the HIV proteome mediating interactions with host hub proteins. In this study, we used the HIV and human interactome databases to identify HIV-targeted host hub proteins and their host binding partners (H2). We developed a high-throughput computational procedure utilizing motif discovery algorithms on sets of protein sequences, including sequences of HIV and H2 proteins. We identified as HIV sequence hotspots those linear motifs that are highly conserved on HIV sequences and at the same time have a statistically enriched presence on the sequences of H2 proteins. The HIV protein motifs discovered in this study are expressed by subsets of H2 host proteins potentially outcompeted by HIV proteins. A large subset of these motifs is involved in cleavage, nuclear localization, phosphorylation, and transcription factor binding events. Many such motifs are clustered on an HIV sequence in the form of hotspots. The sequential positions of these hotspots are consistent with the curated literature on phenotype-altering residue mutations, as well as with existing binding site data. The hotspot map produced in this study is the first global portrayal of HIV motifs involved in altering the host protein network at highly connected hub nodes.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Characterising and Predicting Haploinsufficiency in the Human Genome

Author: A Chatr-aryamontri
A Hamosh
AI Su
AM Deutschbauer
AOM Wilkie
AR Boyko
BE Madsen
C Lee
CD Bustamante
CM Smith
D-Q Nguyen
DF Conrad
Edward M. Marcotte
FA Kondrashov
GJ van der Heijden
GM Cooper
HV Firth
I Lee
I Lee
I Vastrik
Insuk Lee
J Korn
J-F Rual
JA Blake
JG Seidman
KE Lohmueller
KR Brown
Matthew E. Hurles
Mikkel H. Schierup
Ni Huang
P Baldi
PC Ng
PC Ng
PC Ng
R Blekhman
RA Veitia
S Assou
S Forbes
S Girirajan
S McCarroll
S Sunyaev
S van Dongen
SB Ng
T Fawcett
TJ Hubbard
TS Keshava Prasad
V Dang
W Ji
Y Xue
Publication venue: Public Library of Science
Publication date: 01/10/2010

Haploinsufficiency, wherein a single functional copy of a gene is insufficient to maintain normal function, is a major cause of dominant disease. Human disease studies have identified several hundred haploinsufficient (HI) genes. We have compiled a map of 1,079 haplosufficient (HS) genes by systematic identification of genes unambiguously and repeatedly compromised by copy number variation among 8,458 apparently healthy individuals and contrasted the genomic, evolutionary, functional, and network properties between these HS genes and known HI genes. We found that HI genes are typically longer and have more conserved coding sequences and promoters than HS genes. HI genes exhibit higher levels of expression during early development and greater tissue specificity. Moreover, within a probabilistic human functional interaction network HI genes have more interaction partners and greater network proximity to other known HI genes. We built a predictive model on the basis of these differences and annotated 12,443 genes with their predicted probability of being haploinsufficient. We validated these predictions of haploinsufficiency by demonstrating that genes with a high predicted probability of exhibiting haploinsufficiency are enriched among genes implicated in human dominant diseases and among genes causing abnormal phenotypes in heterozygous knockout mice. We have transformed these gene-based haploinsufficiency predictions into haploinsufficiency scores for genic deletions, which we demonstrate to better discriminate between pathogenic and benign deletions than consideration of the deletion size or numbers of genes deleted. These robust predictions of haploinsufficiency support clinical interpretation of novel loss-of-function variants and prioritization of variants and genes for follow-up studies.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Texas ScholarWorks