21,505 research outputs found
Metric learning pairwise kernel for graph inference
Much recent work in bioinformatics has focused on the inference of various
types of biological networks, representing gene regulation, metabolic
processes, protein-protein interactions, etc. A common setting involves
inferring network edges in a supervised fashion from a set of high-confidence
edges, possibly characterized by multiple, heterogeneous data sets (protein
sequence, gene expression, etc.). Here, we distinguish between two modes of
inference in this setting: direct inference based upon similarities between
nodes joined by an edge, and indirect inference based upon similarities between
one pair of nodes and another pair of nodes. We propose a supervised approach
for the direct case by translating it into a distance metric learning problem.
A relaxation of the resulting convex optimization problem leads to the support
vector machine (SVM) algorithm with a particular kernel for pairs, which we
call the metric learning pairwise kernel (MLPK). We demonstrate, using several
real biological networks, that this direct approach often improves upon the
state-of-the-art SVM for indirect inference with the tensor product pairwise
kernel
Network Model Selection for Task-Focused Attributed Network Inference
Networks are models representing relationships between entities. Often these
relationships are explicitly given, or we must learn a representation which
generalizes and predicts observed behavior in underlying individual data (e.g.
attributes or labels). Whether given or inferred, choosing the best
representation affects subsequent tasks and questions on the network. This work
focuses on model selection to evaluate network representations from data,
focusing on fundamental predictive tasks on networks. We present a modular
methodology using general, interpretable network models, task neighborhood
functions found across domains, and several criteria for robust model
selection. We demonstrate our methodology on three online user activity
datasets and show that network model selection for the appropriate network task
vs. an alternate task increases performance by an order of magnitude in our
experiments
- …