66,460 research outputs found

A Harmonic Extension Approach for Collaborative Ranking

Authors: Bertozzi Andrea; Kuang Da; Osher Stanley; Shi Zuoqiang
Publication date: 16/02/2016

We present a new perspective on graph-based methods for collaborative ranking for recommender systems. Unlike user-based or item-based methods that compute a weighted average of ratings given by the nearest neighbors, or low-rank approximation methods using convex optimization and the nuclear norm, we formulate matrix completion as a series of semi-supervised learning problems, and propagate the known ratings to the missing ones globally on the user-user or item-item graph. The semi-supervised learning problems are expressed as Laplace-Beltrami equations on a manifold, namely harmonic extension, and can be discretized by a point integral method. We show that our approach does not impose a low-rank Euclidean subspace on the data points, but instead minimizes the dimension of the underlying manifold. Our method, named LDM (low dimensional manifold), turns out to be particularly effective in generating rankings of items, showing decent computational efficiency and robust ranking quality compared to state-of-the-art methods.
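
To make the idea of harmonic extension concrete, here is a minimal sketch on a small weighted graph. It is not the LDM method or the point integral discretization from the abstract; it assumes the plain graph-Laplacian formulation, in which each unknown value equals the weighted average of its neighbors, and solves it by Gauss-Seidel sweeps. The function name `harmonic_extension` and the toy graph are hypothetical.

```python
def harmonic_extension(weights, known, iters=500, tol=1e-10):
    """Propagate known values to the remaining nodes of a weighted graph.

    weights: dict mapping node -> {neighbor: edge weight} (assumed symmetric)
    known:   dict mapping node -> fixed value (e.g. an observed rating)
    Returns a dict with a value for every node.
    """
    values = {v: known.get(v, 0.0) for v in weights}
    unknown = [v for v in weights if v not in known]
    for _ in range(iters):
        delta = 0.0
        for v in unknown:
            total = sum(weights[v].values())
            if total == 0:
                continue  # isolated node: nothing to propagate
            # Harmonic condition: value is the weighted mean of the neighbors.
            new = sum(w * values[u] for u, w in weights[v].items()) / total
            delta = max(delta, abs(new - values[v]))
            values[v] = new
        if delta < tol:
            break
    return values


# Toy path graph a - b - c - d with unit weights; a and d carry known ratings.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
ratings = harmonic_extension(graph, {"a": 1.0, "d": 4.0})
```

On this path graph the unknown values interpolate linearly between the two known endpoints, which is the discrete analogue of a harmonic function on an interval.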

arXiv.org e-Print Archive

eScholarship - University of California

Prioritized Metric Structures and Embedding

Authors: Abraham Ittai; Gavoille Cyril; Peleg David; Richard
Publication date: 07/04/2015

Metric data structures (distance oracles, distance labeling schemes, routing schemes) and low-distortion embeddings provide a powerful algorithmic methodology, which has been successfully applied for approximation algorithms \cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms \cite{KKMPT12} and for computing sparsifiers \cite{ST04}. However, this methodology appears to have a limitation: the worst-case performance inherently depends on the cardinality of the metric, and one cannot specify in advance which vertices/points should enjoy a better service (i.e., stretch/distortion, label size/dimension) than that given by the worst-case guarantee. In this paper we alleviate this limitation by devising a suite of prioritized metric data structures and embeddings. We show that given a priority ranking $(x_1, x_2, \ldots, x_n)$ of the graph vertices (respectively, metric points) one can devise a metric data structure (respectively, embedding) in which the stretch (resp., distortion) incurred by any pair containing a vertex $x_j$ will depend on the rank $j$ of the vertex. We also show that other important parameters, such as the label size and (in some sense) the dimension, may depend only on $j$. In some of our metric data structures (resp., embeddings) we achieve both prioritized stretch (resp., distortion) and label size (resp., dimension) simultaneously. The worst-case performance of our metric data structures and embeddings is typically asymptotically no worse than that of their non-prioritized counterparts. Comment: To appear at STOC 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sparse Fault-Tolerant BFS Trees

Authors: C. Demetrescu; D. Peleg; J. Hershberger; L. Roditty; M. Thorup; S. Baswana; S. Chechik; T. Lukovszki
Publication date: 01/01/2013

This paper addresses the problem of designing a sparse fault-tolerant BFS tree, or FT-BFS tree for short, namely, a sparse subgraph $T$ of the given network $G$ such that subsequent to the failure of a single edge or vertex, the surviving part $T'$ of $T$ still contains a BFS spanning tree for (the surviving part of) $G$. Our main results are as follows. We present an algorithm that for every $n$-vertex graph $G$ and source node $s$ constructs a (single edge failure) FT-BFS tree rooted at $s$ with $O(n \cdot \min\{\mathrm{Depth}(s), \sqrt{n}\})$ edges, where $\mathrm{Depth}(s)$ is the depth of the BFS tree rooted at $s$. This result is complemented by a matching lower bound, showing that there exist $n$-vertex graphs with a source node $s$ for which any edge (or vertex) FT-BFS tree rooted at $s$ has $\Omega(n^{3/2})$ edges. We then consider fault-tolerant multi-source BFS trees, or FT-MBFS trees for short, aiming to provide (following a failure) a BFS tree rooted at each source $s \in S$ for some subset of sources $S \subseteq V$. Again, tight bounds are provided, showing that there exists a poly-time algorithm that for every $n$-vertex graph and source set $S \subseteq V$ of size $\sigma$ constructs a (single failure) FT-MBFS tree $T^*(S)$ from each source $s_i \in S$, with $O(\sqrt{\sigma} \cdot n^{3/2})$ edges, and on the other hand there exist $n$-vertex graphs with source sets $S \subseteq V$ of cardinality $\sigma$, on which any FT-MBFS tree from $S$ has $\Omega(\sqrt{\sigma} \cdot n^{3/2})$ edges. Finally, we propose an $O(\log n)$ approximation algorithm for constructing FT-BFS and FT-MBFS structures. The latter is complemented by a hardness result stating that there exists no $\Omega(\log n)$ approximation algorithm for these problems under standard complexity assumptions.
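
The FT-BFS definition above can be checked by brute force on small graphs. The sketch below is not the algorithm from the abstract and carries none of its sparsity guarantees: it assumes a simple undirected graph given as adjacency lists, builds a naive structure as the union of the BFS tree with one replacement BFS tree per failed tree edge, and verifies the single-edge-failure property directly. All function names are hypothetical.

```python
from collections import deque


def bfs(adj, source, banned=None):
    """Return (dist, parent) from source, skipping the undirected edge `banned`."""
    dist, parent = {source: 0}, {}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if banned is not None and frozenset((u, v)) == banned:
                continue
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent


def tree_edges(parent):
    """Edges of a BFS tree, as a set of frozensets {child, parent}."""
    return {frozenset((v, p)) for v, p in parent.items()}


def naive_ft_bfs(adj, source):
    """Union of the BFS tree and a replacement BFS tree per failed tree edge."""
    base = tree_edges(bfs(adj, source)[1])
    structure = set(base)
    for edge in base:
        structure |= tree_edges(bfs(adj, source, banned=edge)[1])
    return structure


def to_adj(nodes, edges):
    """Adjacency lists over `nodes` restricted to the given edge set."""
    adj = {v: [] for v in nodes}
    for edge in edges:
        u, v = tuple(edge)
        adj[u].append(v)
        adj[v].append(u)
    return adj


def is_ft_bfs(adj, structure, source):
    """Check that distances from source agree in G and in the structure,
    both with no failure and after every single edge failure."""
    sub = to_adj(adj, structure)
    if bfs(adj, source)[0] != bfs(sub, source)[0]:
        return False
    all_edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return all(
        bfs(adj, source, e)[0] == bfs(sub, source, e)[0] for e in all_edges
    )


# Toy 4-cycle: 0-1, 0-2, 1-3, 2-3, with source 0.
cycle = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
plain_tree = tree_edges(bfs(cycle, 0)[1])
ft_structure = naive_ft_bfs(cycle, 0)
```

On the 4-cycle, the plain BFS tree fails the check (losing a tree edge disconnects a vertex that is still reachable in the graph), while the naive union needs all four edges to pass it.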

arXiv.org e-Print Archive

CiteSeerX

Crossref

Forest Density Estimation

Authors: Gu Haijie; Gupta Anupam; Lafferty John; Liu Han; Wasserman Larry; Xu Min
Publication date: 01/01/2010

We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models. Comment: Extended version of earlier paper titled "Tree density estimation
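
The Kruskal step mentioned in the abstract can be sketched as a maximum-weight spanning forest over weighted variable pairs. This is only an illustration under stated assumptions: the edge weights below are made-up numbers standing in for whatever pairwise score an estimator would supply, the `max_edges` cap is a hypothetical way to stop early, and the kernel density estimation and held-out selection described in the abstract are not implemented.

```python
def max_weight_forest(num_nodes, weighted_edges, max_edges=None):
    """Kruskal's algorithm for a maximum-weight spanning forest.

    weighted_edges: iterable of (u, v, weight) with nodes in range(num_nodes)
    max_edges:      optional cap on the number of edges kept (hypothetical knob)
    Returns the chosen edges as (u, v, weight), heaviest first.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    by_weight = sorted(((w, u, v) for u, v, w in weighted_edges), reverse=True)
    for w, u, v in by_weight:
        if max_edges is not None and len(forest) >= max_edges:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding the edge keeps the graph acyclic
            parent[root_u] = root_v
            forest.append((u, v, w))
    return forest


# Made-up pairwise scores for five variables.
scores = [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.7), (3, 4, 0.1)]
full_forest = max_weight_forest(5, scores)
small_forest = max_weight_forest(5, scores, max_edges=2)
```

The edge (0, 2) is skipped because it would close a cycle, so the full forest has two trees; capping the edge count yields a sparser forest built from the heaviest edges only.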

arXiv.org e-Print Archive

CiteSeerX