66,460 research outputs found

    A Harmonic Extension Approach for Collaborative Ranking

    Full text link
    We present a new perspective on graph-based methods for collaborative ranking for recommender systems. Unlike user-based or item-based methods that compute a weighted average of ratings given by the nearest neighbors, or low-rank approximation methods using convex optimization and the nuclear norm, we formulate matrix completion as a series of semi-supervised learning problems, and propagate the known ratings to the missing ones on the user-user or item-item graph globally. The semi-supervised learning problems are expressed as Laplace-Beltrami equations on a manifold, or namely, harmonic extension, and can be discretized by a point integral method. We show that our approach does not impose a low-rank Euclidean subspace on the data points, but instead minimizes the dimension of the underlying manifold. Our method, named LDM (low dimensional manifold), turns out to be particularly effective in generating rankings of items, showing decent computational efficiency and robust ranking quality compared to state-of-the-art methods

    Prioritized Metric Structures and Embedding

    Full text link
    Metric data structures (distance oracles, distance labeling schemes, routing schemes) and low-distortion embeddings provide a powerful algorithmic methodology, which has been successfully applied for approximation algorithms \cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms \cite{KKMPT12} and for computing sparsifiers \cite{ST04}. However, this methodology appears to have a limitation: the worst-case performance inherently depends on the cardinality of the metric, and one could not specify in advance which vertices/points should enjoy a better service (i.e., stretch/distortion, label size/dimension) than that given by the worst-case guarantee. In this paper we alleviate this limitation by devising a suit of {\em prioritized} metric data structures and embeddings. We show that given a priority ranking (x1,x2,…,xn)(x_1,x_2,\ldots,x_n) of the graph vertices (respectively, metric points) one can devise a metric data structure (respectively, embedding) in which the stretch (resp., distortion) incurred by any pair containing a vertex xjx_j will depend on the rank jj of the vertex. We also show that other important parameters, such as the label size and (in some sense) the dimension, may depend only on jj. In some of our metric data structures (resp., embeddings) we achieve both prioritized stretch (resp., distortion) and label size (resp., dimension) {\em simultaneously}. The worst-case performance of our metric data structures and embeddings is typically asymptotically no worse than of their non-prioritized counterparts.Comment: To appear at STOC 201

    Sparse Fault-Tolerant BFS Trees

    Full text link
    This paper addresses the problem of designing a sparse {\em fault-tolerant} BFS tree, or {\em FT-BFS tree} for short, namely, a sparse subgraph TT of the given network GG such that subsequent to the failure of a single edge or vertex, the surviving part Tβ€²T' of TT still contains a BFS spanning tree for (the surviving part of) GG. Our main results are as follows. We present an algorithm that for every nn-vertex graph GG and source node ss constructs a (single edge failure) FT-BFS tree rooted at ss with O(n \cdot \min\{\Depth(s), \sqrt{n}\}) edges, where \Depth(s) is the depth of the BFS tree rooted at ss. This result is complemented by a matching lower bound, showing that there exist nn-vertex graphs with a source node ss for which any edge (or vertex) FT-BFS tree rooted at ss has Ξ©(n3/2)\Omega(n^{3/2}) edges. We then consider {\em fault-tolerant multi-source BFS trees}, or {\em FT-MBFS trees} for short, aiming to provide (following a failure) a BFS tree rooted at each source s∈Ss\in S for some subset of sources SβŠ†VS\subseteq V. Again, tight bounds are provided, showing that there exists a poly-time algorithm that for every nn-vertex graph and source set SβŠ†VS \subseteq V of size Οƒ\sigma constructs a (single failure) FT-MBFS tree Tβˆ—(S)T^*(S) from each source si∈Ss_i \in S, with O(Οƒβ‹…n3/2)O(\sqrt{\sigma} \cdot n^{3/2}) edges, and on the other hand there exist nn-vertex graphs with source sets SβŠ†VS \subseteq V of cardinality Οƒ\sigma, on which any FT-MBFS tree from SS has Ξ©(Οƒβ‹…n3/2)\Omega(\sqrt{\sigma}\cdot n^{3/2}) edges. Finally, we propose an O(log⁑n)O(\log n) approximation algorithm for constructing FT-BFS and FT-MBFS structures. The latter is complemented by a hardness result stating that there exists no Ξ©(log⁑n)\Omega(\log n) approximation algorithm for these problems under standard complexity assumptions

    Forest Density Estimation

    Full text link
    We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.Comment: Extended version of earlier paper titled "Tree density estimation
    • …
    corecore