6,304 research outputs found

Metric learning pairwise kernel for graph inference

Author: Noble William Stafford
Qiu Jian
Vert Jean-Philippe
Publication date: 20/10/2006

Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression, etc.). Here, we distinguish between two modes of inference in this setting: direct inference based upon similarities between nodes joined by an edge, and indirect inference based upon similarities between one pair of nodes and another pair of nodes. We propose a supervised approach for the direct case by translating it into a distance metric learning problem. A relaxation of the resulting convex optimization problem leads to the support vector machine (SVM) algorithm with a particular kernel for pairs, which we call the metric learning pairwise kernel (MLPK). We demonstrate, using several real biological networks, that this direct approach often improves upon the state-of-the-art SVM for indirect inference with the tensor product pairwise kernel.
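
To make the contrast between the two pairwise kernels concrete, here is a minimal sketch in Python. The abstract does not state the kernel formulas, so the two expressions below are the forms commonly given for these kernels and should be checked against the paper itself. The function names `linear_kernel`, `tppk`, and `mlpk` are hypothetical, and the linear base kernel is only a stand-in for whatever node kernel is actually used.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(x: Vector, y: Vector) -> float:
    """Stand-in base kernel on single nodes (an assumption, not from the abstract)."""
    return sum(a * b for a, b in zip(x, y))


def tppk(k: Kernel, pair1: Pair, pair2: Pair) -> float:
    """Tensor product pairwise kernel (indirect inference), assumed form:
    K((a,b),(c,d)) = k(a,c)k(b,d) + k(a,d)k(b,c)."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)


def mlpk(k: Kernel, pair1: Pair, pair2: Pair) -> float:
    """Metric learning pairwise kernel (direct inference), assumed form:
    K((a,b),(c,d)) = (k(a,c) - k(a,d) - k(b,c) + k(b,d))^2."""
    (a, b), (c, d) = pair1, pair2
    return (k(a, c) - k(a, d) - k(b, c) + k(b, d)) ** 2


if __name__ == "__main__":
    a, b = (1.0, 0.0), (0.0, 1.0)
    c, d = (1.0, 0.0), (0.0, 1.0)
    print("TPPK:", tppk(linear_kernel, (a, b), (c, d)))
    print("MLPK:", mlpk(linear_kernel, (a, b), (c, d)))
```

With a linear base kernel the MLPK value reduces to the squared inner product of the difference vectors a - b and c - d, which is why it compares how similar the two nodes within each pair are rather than how similar one pair is to the other node by node. Both functions are symmetric under swapping the order of nodes within a pair.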

arXiv.org e-Print Archive

HAL-MINES ParisTech

Composite structural motifs of binding sites for delineating biological functions of proteins

Author: A Bairoch
A Fiorillo
A Rausell
A Stark
AC Joerger
AC Wallace
AG Murzin
Akira R. Kinjo
AM Schnoes
AR Kinjo
B Bollobás
B Dasgupta
B Louie
B Rost
BH Dessailly
C Branden
C Winter
CV Robinson
D Petrey
DJ Schuller
DM Chipman
E Krissinel
E Toyota
FP Davis
GM Santos
H Berman
H Kettenberger
Haruki Nakamura
I Friedberg
J Janin
J Shi
J Westbrook
JI Yeh
K Chen
K Henrick
K Kinoshita
K Okazaki
K Stenberg
L Xie
M Bashton
M Brylinski
M Kitayner
M Levitt
M Moertl
M Nardini
M Tyagi
M Yang
N Nagano
N Tuncbag
N Zhao
ND Gold
O Keskin
OC Redfern
Ozlem Keskin
P Cramer
P Shannon
PD Pawelek
R Koike
R Rentzsch
R Sinha
RR Thangudu
S Kadono
SF Altschul
T Amemiya
T Kawabata
TA Holland
TC Terwilliger
Y Loewenstein
Z Aung
ZX Xia
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs which represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures.

Comment: 34 pages, 7 figures

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Topological network alignment uncovers biological function and phylogeny

Author: Cook S.
Flannick J.
Kuchaiev O.
Memišević V.
Nataša Pržulj
Oleksii Kuchaiev
Pržulj N.
Singh R.
Snijders T. A.
Tijana Milenković
Vesna Memišević
Wayne Hayes
Wentz-Hunter K.
Zhang Y.
Publication date: 07/10/2009

Sequence comparison and alignment has had an enormous impact on our understanding of evolution, biology, and disease. Comparison and alignment of biological networks will likely have a similar impact. Existing network alignments use information external to the networks, such as sequence, because no good algorithm for purely topological alignment has yet been devised. In this paper, we present a novel algorithm based solely on network topology that can be used to align any two networks. We apply it to biological networks to produce by far the most complete topological alignments of biological networks to date. We demonstrate that both species phylogeny and detailed biological function of individual proteins can be extracted from our alignments. Topology-based alignments have the potential to provide a completely new, independent source of phylogenetic information. Our alignment of the protein-protein interaction networks of two very different species (yeast and human) indicates that even distant species share a surprising amount of network topology with each other, suggesting broad similarities in internal cellular wiring across all life on Earth.

Comment: Algorithm explained in more detail. Additional analysis added.

arXiv.org e-Print Archive

Crossref

PubMed Central

UCL Discovery

Triangle network motifs predict complexes by complementing high-error interactomes with structural information

Author: A Ben-Hur
A Beyer
A Cakmak
A Chinnasamy
A Clauset
A Edwards
A Gavin
A Patil
A Rives
A Schlicker
A Tong
A Valencia
A Vázquez
AA Lambert
B Andreopoulos
B Galletta
B Pina
B Schuster-Bockler
Bill Andreopoulos
BT Bettinger
C von Mering
C Winter
C Yanagihara
Christof Winter
CM Deane
D Emig
D Pruyne
D Rhodes
D Scholtens
Dirk Labudde
DJ Watts
DS Han
E Lecuyer
E Sprinzak
E Zakrzewska
F Alber
F Luo
F Ramirez
G Jin
GD Bader
Goh
GT Hart
H Li
H Lu
H Mewes
H Park
H Toi
H Yu
HN Chua
I Albert
I Cohen-Gihon
IA Olave
J Bader
J Espadaler
J Gagneur
J Nilsson
J Poyatos
J Wojcik
J Yu
JF Rual
JL Holden
JL Morrison
K Okada
K Venkatesan
K Xia
KS Guimaraes
KY Yip
L Zhang
LJ McGuffin
M Deng
M Evangelista
M Iqbal
M Singhal
ME Cusick
Michael Schroeder
MPH Stumpf
MS Cline
MS Scott
MY Galperin
N Kashtan
N Lin
NJ Krogan
NJ Mulder
O Resendis-Antonio
P Aloy
P Braun
P Chen
P D'haeseleer
P Pagel
PdL Lanerolle
PM Kim
R Aragues
R Hoffmann
R Jansen
R Jothi
R Milo
R Riley
R Singh
RM Long
RS Wang
S Mathivanan
S Ng
S Pitre
S Shen-Orr
S Wuchty
S Zhang
SK Ng
SL Lo
T Chiang
T Huang
T Pawson
T Pederson
T Stearns
TMW Nye
U Stelzl
W Kim
WA Hofmann
WJ Nelson
WK Kim
WW Franke
X Chen
X Wu
Y Guo
Y Liu
Y Qi
Y Zhang
Z Itzhaki
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Many high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second-level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles.

Results: We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from the Munich Information Center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that an SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes.

Conclusion: Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
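
The triangle construction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `find_ppi_sddi_ppi_triangles`, the edge-list input format, and the protein labels in the usage example are all hypothetical. It also assumes that two second-level neighbors may be reported even if they happen to share a direct PPI, since the abstract does not say whether such pairs are excluded.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def find_ppi_sddi_ppi_triangles(
    ppi_edges: Iterable[Edge], sddi_edges: Iterable[Edge]
) -> Set[Tuple[str, str, str]]:
    """Return (hub, a, b) triples where a and b are both PPI neighbors of hub
    (so they are second-level neighbors of each other) and a-b is an SDDI."""
    # Undirected adjacency for the PPI network.
    neighbors = defaultdict(set)
    for u, v in ppi_edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Store SDDIs as unordered pairs so lookup ignores edge direction.
    sddi = {frozenset(edge) for edge in sddi_edges}

    triangles = set()
    for hub, nbrs in neighbors.items():
        # Every pair of neighbors of the hub is separated by two PPIs.
        for a, b in combinations(sorted(nbrs), 2):
            if frozenset((a, b)) in sddi:
                triangles.add((hub, a, b))
    return triangles


if __name__ == "__main__":
    ppi = [("A", "C"), ("B", "C"), ("B", "D")]
    sddi = [("A", "B"), ("C", "D")]
    for triangle in sorted(find_ppi_sddi_ppi_triangles(ppi, sddi)):
        print(triangle)
```

Swapping the SDDI edge list for another complementary data type, such as literature co-occurrence pairs, leaves the code unchanged, which mirrors the abstract's remark that other data types can take the place of SDDIs.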

Crossref

Springer - Publisher Connector

PubMed Central

SJSU ScholarWorks