A Harmonic Extension Approach for Collaborative Ranking
We present a new perspective on graph-based methods for collaborative ranking
for recommender systems. Unlike user-based or item-based methods that compute a
weighted average of ratings given by the nearest neighbors, or low-rank
approximation methods using convex optimization and the nuclear norm, we
formulate matrix completion as a series of semi-supervised learning problems,
and propagate the known ratings to the missing ones on the user-user or
item-item graph globally. The semi-supervised learning problems are expressed
as Laplace-Beltrami equations on a manifold, that is, as harmonic extension, and
can be discretized by a point integral method. We show that our approach does
not impose a low-rank Euclidean subspace on the data points, but instead
minimizes the dimension of the underlying manifold. Our method, named LDM (low
dimensional manifold), turns out to be particularly effective in generating
rankings of items, showing decent computational efficiency and robust ranking
quality compared to state-of-the-art methods.
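The idea of propagating known ratings over a graph by harmonic extension can be illustrated with a minimal sketch. This is not the paper's point integral discretization: it assumes a plain weighted graph and uses the standard discrete characterization of a harmonic function (each unknown value equals the weighted average of its neighbors), solved by Gauss-Seidel sweeps. The function name `harmonic_extension`, the dictionary layout, and the toy path graph are all hypothetical choices made for illustration.

```python
def harmonic_extension(weights, known, sweeps=500):
    """Extend known values to every node of a weighted graph.

    weights: dict mapping node -> {neighbor: positive weight} (assumed symmetric).
    known:   dict mapping node -> known rating; these values are held fixed.
    Unknown nodes are repeatedly set to the weighted average of their
    neighbors, which converges to the discrete harmonic extension when every
    unknown node is connected to at least one known node.
    """
    values = {node: float(known.get(node, 0.0)) for node in weights}
    for _ in range(sweeps):
        for node, nbrs in weights.items():
            if node in known or not nbrs:
                continue  # keep known ratings fixed; skip isolated nodes
            total = sum(nbrs.values())
            values[node] = sum(w * values[nb] for nb, w in nbrs.items()) / total
    return values


# Toy example (assumed data): four users on a path 0-1-2-3,
# with ratings known only for the two end users.
path = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
ratings = harmonic_extension(path, {0: 1.0, 3: 4.0})
```

On a path graph the harmonic extension is linear interpolation between the two fixed end values, which makes the result easy to check by hand.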
Prioritized Metric Structures and Embedding
Metric data structures (distance oracles, distance labeling schemes, routing
schemes) and low-distortion embeddings provide a powerful algorithmic
methodology, which has been successfully applied for approximation algorithms
\cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms
\cite{KKMPT12} and for computing sparsifiers \cite{ST04}. However, this
methodology appears to have a limitation: the worst-case performance inherently
depends on the cardinality of the metric, and one could not specify in advance
which vertices/points should enjoy a better service (i.e., stretch/distortion,
label size/dimension) than that given by the worst-case guarantee.
In this paper we alleviate this limitation by devising a suite of {\em
prioritized} metric data structures and embeddings. We show that given a
priority ranking of the graph vertices (respectively,
metric points) one can devise a metric data structure (respectively, embedding)
in which the stretch (resp., distortion) incurred by any pair containing a
vertex will depend on the rank of that vertex. We also show that other
important parameters, such as the label size and (in some sense) the dimension,
may depend only on this rank. In some of our metric data structures (resp.,
embeddings) we achieve both prioritized stretch (resp., distortion) and label
size (resp., dimension) {\em simultaneously}. The worst-case performance of our
metric data structures and embeddings is typically asymptotically no worse than
that of their non-prioritized counterparts.
Comment: To appear at STOC 201
Sparse Fault-Tolerant BFS Trees
This paper addresses the problem of designing a sparse {\em fault-tolerant}
BFS tree, or {\em FT-BFS tree} for short, namely, a sparse subgraph of the
given network such that, subsequent to the failure of a single edge or
vertex, the surviving part of the subgraph still contains a BFS spanning tree
for the surviving part of the network. Our main results are as follows. We
present an algorithm that for every $n$-vertex graph and source node $s$
constructs a (single edge failure) FT-BFS tree rooted at $s$ with
$O(n \cdot \min\{\mathrm{Depth}(s), \sqrt{n}\})$ edges, where
$\mathrm{Depth}(s)$ is the depth of the BFS tree rooted at $s$. This result is
complemented by a matching lower bound, showing that there exist $n$-vertex
graphs with a source node $s$ for which any edge (or vertex) FT-BFS tree
rooted at $s$ has $\Omega(n^{3/2})$ edges. We then
consider {\em fault-tolerant multi-source BFS trees}, or {\em FT-MBFS trees}
for short, aiming to provide (following a failure) a BFS tree rooted at each
source for some subset of sources . Again, tight bounds
are provided, showing that there exists a poly-time algorithm that for every
$n$-vertex graph and source set of size constructs a
(single failure) FT-MBFS tree from each source , with
edges, and on the other hand there exist
$n$-vertex graphs with source sets of cardinality , on
which any FT-MBFS tree from has edges.
Finally, we propose an approximation algorithm for constructing
FT-BFS and FT-MBFS structures. The latter is complemented by a hardness result
stating that there exists no approximation algorithm for these
problems under standard complexity assumptions.
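To make the FT-BFS definition concrete, here is a minimal sketch of a naive construction for single edge failures. It is not the sparse algorithm of the paper and makes no attempt to meet its edge bounds: it simply takes a BFS tree from the source and, for each tree edge, adds a BFS tree of the graph with that edge removed. Failures of non-tree edges leave the original tree intact, so the union is a valid (but possibly dense) FT-BFS structure. The function names and the adjacency-list layout are hypothetical.

```python
from collections import deque


def bfs_tree_edges(adj, source, banned=None):
    """Return the edges (as frozensets) of a BFS tree from `source`.

    adj:    dict mapping node -> list of neighbors (undirected graph).
    banned: optional frozenset({u, v}) naming a failed edge to skip.
    """
    seen = {source}
    queue = deque([source])
    tree = set()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge == banned or v in seen:
                continue
            seen.add(v)
            tree.add(edge)
            queue.append(v)
    return tree


def naive_ft_bfs(adj, source):
    """Union of the BFS tree and one replacement BFS tree per failed tree edge."""
    base = bfs_tree_edges(adj, source)
    structure = set(base)
    for edge in base:
        structure |= bfs_tree_edges(adj, source, banned=edge)
    return structure


# Toy example (assumed data): a 4-cycle 0-1-2-3-0 with source 0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
structure = naive_ft_bfs(cycle, 0)
```

On the 4-cycle, the initial BFS tree omits the edge between nodes 2 and 3, and every replacement tree needs it, so the naive structure ends up containing all four edges.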
Forest Density Estimation
We study graph estimation and density estimation in high dimensions, using a
family of density estimators based on forest structured undirected graphical
models. For density estimation, we do not assume the true distribution
corresponds to a forest; rather, we form kernel density estimates of the
bivariate and univariate marginals, and apply Kruskal's algorithm to estimate
the optimal forest on held out data. We prove an oracle inequality on the
excess risk of the resulting estimator relative to the risk of the best forest.
For graph estimation, we consider the problem of estimating forests with
restricted tree sizes. We prove that finding a maximum weight spanning forest
with restricted tree size is NP-hard, and develop an approximation algorithm
for this problem. Viewing the tree size as a complexity parameter, we then
select a forest using data splitting, and prove bounds on excess risk and
structure selection consistency of the procedure. Experiments with simulated
data and microarray data indicate that the methods are a practical alternative
to Gaussian graphical models.
Comment: Extended version of earlier paper titled "Tree density estimation - …
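The forest-selection step mentioned in the abstract (Kruskal's algorithm applied to pairwise scores) can be sketched as follows. This is a minimal illustration, not the authors' estimator: the edge weights are assumed to be precomputed scores, for example mutual-information estimates evaluated on held-out data, and the rule of stopping at the first non-positive weight is an assumption made here so that the result can be a forest rather than a full spanning tree. The function name and the numbers in the example are hypothetical.

```python
def max_weight_forest(num_nodes, weighted_edges):
    """Greedy (Kruskal-style) maximum weight forest.

    weighted_edges: iterable of (weight, u, v) tuples with 0 <= u, v < num_nodes.
    Edges are scanned in decreasing weight order; an edge is kept if it joins
    two different components. Scanning stops at the first non-positive weight
    (an assumption of this sketch), so some nodes may remain isolated.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for weight, u, v in sorted(weighted_edges, reverse=True):
        if weight <= 0:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            forest.append((u, v))
    return forest


# Toy example (assumed scores): four variables, one pair with a negative score.
scores = [(0.9, 0, 1), (0.5, 1, 2), (0.4, 0, 2), (-0.1, 2, 3)]
forest = max_weight_forest(4, scores)
```

In the toy example the edge (0, 2) is skipped because it would close a cycle, and variable 3 stays isolated because its only candidate edge has a negative score.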