11,675 research outputs found

Fast Approximate Algorithms for k-NN Search and k-NN Graph Construction

Author: Youngki Park
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2015

Thesis (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2015. 이상구.

Finding k-nearest neighbors (k-NN) is an essential part of recommender systems, information retrieval, and many data mining and machine learning algorithms. However, there are two main problems in finding k-nearest neighbors: 1) Existing approaches require a huge amount of time when the number of objects or dimensions scales up. 2) The k-NN computation methods do not show consistent performance over different search tasks and types of data. In this dissertation, we present fast and versatile algorithms for finding k-nearest neighbors in order to cope with these problems. The main contributions are summarized as follows: first, we present an efficient and scalable algorithm for finding an approximate k-NN graph by filtering node pairs whose large value dimensions do not match at all. Second, a fast collaborative filtering algorithm that utilizes a k-NN graph is presented. The main idea of this approach is to reverse the process of finding k-nearest neighbors in item-based collaborative filtering. Last, we propose a fast approximate algorithm for k-NN search by selecting query-specific signatures from a signature pool to pick high-quality k-NN candidates. The experimental results show that the proposed algorithms guarantee a high level of accuracy while also being much faster than the other algorithms over different types of search tasks and datasets.

Contents:
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation and Challenges
    1.1.1 Fast Approximation
    1.1.2 Versatility
  1.2 Our Solutions
    1.2.1 Greedy Filtering
    1.2.2 Signature Selection LSH
    1.2.3 Reversed CF
  1.3 Contributions
  1.4 Outline
Chapter 2 Background and Related Work
  2.1 k-NN Search
    2.1.1 Locality Sensitive Hashing
    2.1.2 LSH-based k-NN Search
  2.2 k-NN Graph Construction
    2.2.1 LSH-based Approach
    2.2.2 Clustering-based Approach
    2.2.3 Heuristic-based Approach
    2.2.4 Similarity Join
  2.3 Summary
Chapter 3 Fast Approximate k-NN Graph Construction
  3.1 Introduction
  3.2 Problem Formulation
  3.3 Constructing a k-Nearest Neighbor Graph
    3.3.1 Greedy Filtering
    3.3.2 Prefix Selection Scheme
    3.3.3 Optimization
  3.4 Theoretical Analysis
    3.4.1 Preliminaries
    3.4.2 Graph Construction Time
    3.4.3 Graph Accuracy
  3.5 Experiments
    3.5.1 Experimental Setup
    3.5.2 Performance Comparison
  3.6 Summary
Chapter 4 Fast Collaborative Filtering
  4.1 Introduction
  4.2 Related Work
  4.3 Fast Collaborative Filtering
    4.3.1 Nearest Neighbor Graph Construction
    4.3.2 Fast Recommendation Algorithm
  4.4 Experiments
    4.4.1 Experimental Setup
    4.4.2 Overall Comparison
    4.4.3 Effects of Parameter Changes
  4.5 Summary
Chapter 5 Fast Approximate k-NN Search
  5.1 Introduction
  5.2 Signature Selection LSH
    5.2.1 Data-dependent LSH
    5.2.2 Signature Pool Generation
    5.2.3 Signature Selection
    5.2.4 Optimization Techniques
  5.3 S2LSH for Graph Construction
    5.3.1 Feature Selection
    5.3.2 Signature Selection
    5.3.3 Optimization Techniques
  5.4 Theoretical Analysis
  5.5 Experiments
    5.5.1 Experimental Setup
    5.5.2 Experimental Results
    5.5.3 Performance Analysis
  5.6 Summary
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)
Docto
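
The filtering idea in the abstract above (skip node pairs whose large-value dimensions do not match at all) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the fixed `prefix_size`, the function names, and the use of cosine similarity are assumptions made for illustration, and the thesis's own prefix selection scheme is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: value}."""
    dot = sum(val * v[d] for d, val in u.items() if d in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def approx_knn_graph(vectors, k, prefix_size):
    """Approximate k-NN graph over {node: {dimension: value}}.

    Assumption: every node keeps a fixed number (prefix_size) of its
    largest-valued dimensions as its "prefix"; this is a simplification.
    """
    # Index each node under its prefix_size largest-valued dimensions.
    buckets = defaultdict(set)
    for node, vec in vectors.items():
        prefix = sorted(vec, key=vec.get, reverse=True)[:prefix_size]
        for dim in prefix:
            buckets[dim].add(node)

    # Only pairs sharing at least one prefix dimension become candidates;
    # all other pairs are filtered out without computing a similarity.
    candidates = set()
    for nodes in buckets.values():
        candidates.update(combinations(sorted(nodes), 2))

    sims = defaultdict(list)
    for a, b in candidates:
        s = cosine(vectors[a], vectors[b])
        sims[a].append((s, b))
        sims[b].append((s, a))

    # Keep the k most similar candidates for each node.
    return {n: [m for _, m in sorted(sims[n], reverse=True)[:k]] for n in vectors}
```

With a small `prefix_size` the candidate set shrinks and the graph is built faster, at the cost of missing neighbors whose shared dimensions fall outside both prefixes.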

SNU Open Repository and Archive

Graph Convolutional Matrix Completion

Author: Berg Rianne van den
Kipf Thomas N.
Welling Max
Publication date: 25/10/2017

We consider matrix completion for recommender systems from the point of view of link prediction on graphs. Interaction data such as movie ratings can be represented by a bipartite user-item graph with labeled edges denoting observed ratings. Building on recent progress in deep learning on graph-structured data, we propose a graph auto-encoder framework based on differentiable message passing on the bipartite interaction graph. Our model shows competitive performance on standard collaborative filtering benchmarks. In settings where complementary feature information or structured data such as a social network is available, our framework outperforms recent state-of-the-art methods. Comment: 9 pages, 3 figures, updated with additional experimental evaluatio
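
As a small illustration of the data representation this abstract describes (a bipartite user-item graph whose edges carry the observed ratings), here is a hedged sketch. The function name and the `(user, item, rating)` triple format are assumptions; the sketch covers only the graph structure, not the paper's auto-encoder or message passing.

```python
from collections import defaultdict

def build_bipartite_graph(ratings):
    """Build both sides of a bipartite graph from (user, item, rating) triples.

    Returns two adjacency maps: user -> {item: rating} and item -> {user: rating}.
    Each observed rating becomes one labeled edge, stored once per side.
    """
    user_edges = defaultdict(dict)
    item_edges = defaultdict(dict)
    for user, item, rating in ratings:
        user_edges[user][item] = rating
        item_edges[item][user] = rating
    return dict(user_edges), dict(item_edges)
```

Link prediction in this setting means predicting the label of a user-item edge that is absent from both maps.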

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

A Harmonic Extension Approach for Collaborative Ranking

Author: Bertozzi Andrea
Kuang Da
Osher Stanley
Shi Zuoqiang
Publication date: 16/02/2016

We present a new perspective on graph-based methods for collaborative ranking for recommender systems. Unlike user-based or item-based methods that compute a weighted average of ratings given by the nearest neighbors, or low-rank approximation methods using convex optimization and the nuclear norm, we formulate matrix completion as a series of semi-supervised learning problems, and propagate the known ratings to the missing ones on the user-user or item-item graph globally. The semi-supervised learning problems are expressed as Laplace-Beltrami equations on a manifold, that is, as harmonic extension, and can be discretized by a point integral method. We show that our approach does not impose a low-rank Euclidean subspace on the data points, but instead minimizes the dimension of the underlying manifold. Our method, named LDM (low dimensional manifold), turns out to be particularly effective in generating rankings of items, showing decent computational efficiency and robust ranking quality compared to state-of-the-art methods
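
To make the idea of propagating known ratings over a graph concrete, the sketch below computes a discrete harmonic function: each unknown node is repeatedly set to the weighted average of its neighbors while known values stay fixed. This is the classical graph-Laplacian formulation solved by simple iteration, swapped in for illustration; it is not the point integral method or the LDM algorithm from the paper, and all names are assumptions.

```python
def harmonic_extension(weights, known, n_iter=200):
    """Extend known node values to the rest of a weighted graph.

    weights: {node: {neighbor: weight}}, symmetric, every neighbor is also a key.
    known:   {node: value} for the nodes whose values are fixed.
    """
    values = {node: known.get(node, 0.0) for node in weights}
    for _ in range(n_iter):
        for node, nbrs in weights.items():
            if node in known:
                continue  # known ratings act as boundary conditions
            total = sum(nbrs.values())
            if total > 0:
                # Harmonic condition: value equals the weighted neighbor mean.
                values[node] = sum(w * values[m] for m, w in nbrs.items()) / total
    return values
```

On a path graph with both endpoints fixed, the interior values converge to a linear interpolation between the endpoint values.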

arXiv.org e-Print Archive

eScholarship - University of California

Knowledge Graph semantic enhancement of input data for improving AI

Author: Bhatt Shreyansh
Shalin Valerie
Sheth Amit
Zhao Jinjin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2020

Intelligent systems designed using machine learning algorithms require a large amount of labeled data. Background knowledge provides complementary, real-world factual information that can augment the limited labeled data to train a machine learning algorithm. The term Knowledge Graph (KG) is in vogue because, for many practical applications, it is convenient and useful to organize this background knowledge in the form of a graph. Recent academic research and implemented industrial intelligent systems have shown promising performance for machine learning algorithms that combine training data with a knowledge graph. In this article, we discuss the use of relevant KGs to enhance input data for two applications that use machine learning -- recommendation and community detection. The KG improves both accuracy and explainability

arXiv.org e-Print Archive

CORE

Social Collaborative Retrieval

Author: Bedi Punam
Massa Paolo
Shah Chirag
Weston Jason
Zheng Vincent W
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/04/2014

Socially-based recommendation systems have recently attracted significant interest, and a number of studies have shown that social information can dramatically improve a system's predictions of user interests. Meanwhile, there are now many potential applications that involve aspects of both recommendation and information retrieval, and the task of collaborative retrieval---a combination of these two traditional problems---has recently been introduced. Successful collaborative retrieval requires overcoming severe data sparsity, making additional sources of information, such as social graphs, particularly valuable. In this paper we propose a new model for collaborative retrieval, and show that our algorithm outperforms current state-of-the-art approaches by incorporating information from social networks. We also provide empirical analyses of the ways in which cultural interests propagate along a social graph using a real-world music dataset. Comment: 10 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training

Author: Caetano Tiberio
Defazio Aaron
Publication date: 01/01/2012

Item neighbourhood methods for collaborative filtering learn a weighted graph over the set of items, where each item is connected to those it is most similar to. The prediction of a user's rating on an item is then given by that user's ratings of neighbouring items, weighted by their similarity. This paper presents a new neighbourhood approach which we call item fields, whereby an undirected graphical model is formed over the item graph. The resulting prediction rule is a simple generalization of the classical approaches, which takes into account non-local information in the graph, allowing its best results to be obtained when using drastically fewer edges than other neighbourhood approaches. A fast approximate maximum entropy training method based on the Bethe approximation is presented, which uses a simple gradient ascent procedure. When using precomputed sufficient statistics on the Movielens datasets, our method is faster than maximum likelihood approaches by two orders of magnitude. Comment: ICML201
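
The classical neighbourhood prediction rule mentioned at the start of this abstract (a similarity-weighted average of the user's ratings on neighbouring items) can be sketched as follows. This shows only the classical baseline, not the item fields model the paper proposes; the function name, argument layout, and normalisation by absolute similarity are assumptions.

```python
def predict_rating(user_ratings, neighbours):
    """Similarity-weighted average of a user's ratings on neighbouring items.

    user_ratings: {item: rating} for one user.
    neighbours:   {item: similarity} for the neighbours of the target item.
    Returns None when the user has rated none of the neighbours.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for item, sim in neighbours.items():
        if item in user_ratings:  # unrated neighbours contribute nothing
            weighted_sum += sim * user_ratings[item]
            weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None
```

For example, `predict_rating({"A": 4.0, "B": 2.0}, {"A": 0.75, "B": 0.25, "C": 0.9})` averages only over items A and B, since the user has not rated C.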

arXiv.org e-Print Archive

CiteSeerX