827 research outputs found

    Graph Convolutional Neural Networks for Web-Scale Recommender Systems

    Full text link
    Recent advancements in deep neural networks for graph-structured data have led to state-of-the-art performance on recommender system benchmarks. However, making these methods practical and scalable to web-scale recommendation tasks with billions of items and hundreds of millions of users remains a challenge. Here we describe a large-scale deep recommendation engine that we developed and deployed at Pinterest. We develop a data-efficient Graph Convolutional Network (GCN) algorithm PinSage, which combines efficient random walks and graph convolutions to generate embeddings of nodes (i.e., items) that incorporate both graph structure as well as node feature information. Compared to prior GCN approaches, we develop a novel method based on highly efficient random walks to structure the convolutions and design a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model. We also develop an efficient MapReduce model inference algorithm to generate embeddings using a trained model. We deploy PinSage at Pinterest and train it on 7.5 billion examples on a graph with 3 billion nodes representing pins and boards, and 18 billion edges. According to offline metrics, user studies and A/B tests, PinSage generates higher-quality recommendations than comparable deep learning and graph-based alternatives. To our knowledge, this is the largest application of deep graph embeddings to date and paves the way for a new generation of web-scale recommender systems based on graph convolutional architectures.Comment: KDD 201

    Active caching for recommender systems

    Get PDF
    Web users are often overwhelmed by the amount of information available while carrying out browsing and searching tasks. Recommender systems substantially reduce the information overload by suggesting a list of similar documents that users might find interesting. However, generating these ranked lists requires an enormous amount of resources that often results in access latency. Caching frequently accessed data has been a useful technique for reducing stress on limited resources and improving response time. Traditional passive caching techniques, where the focus is on answering queries based on temporal locality or popularity, achieve a very limited performance gain. In this dissertation, we are proposing an ‘active caching’ technique for recommender systems as an extension of the caching model. In this approach estimation is used to generate an answer for queries whose results are not explicitly cached, where the estimation makes use of the partial order lists cached for related queries. By answering non-cached queries along with cached queries, the active caching system acts as a form of query processor and offers substantial improvement over traditional caching methodologies. Test results for several data sets and recommendation techniques show substantial improvement in the cache hit rate, byte hit rate and CPU costs, while achieving reasonable recall rates. To ameliorate the performance of proposed active caching solution, a shared neighbor similarity measure is introduced which improves the recall rates by eliminating the dependence on monotinicity in the partial order lists. Finally, a greedy balancing cache selection policy is also proposed to select most appropriate data objects for the cache that help to improve the cache hit rate and recall further

    Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking

    Full text link
    This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (\textit{Latent Relational Metric Learning}) is a novel metric learning approach for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations that describe each user item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learing approaches. This enables not only better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. In order to do so, we employ a augmented memory module and learn to attend over these memory blocks to construct latent relations. The memory-based attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates the state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models by 6%−7.5%6\%-7.5\% in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also demonstrate evidence that our proposed model is able to infer and encode explicit sentiment, temporal and attribute information despite being only trained on implicit feedback. As such, this ascertains the ability of LRML to uncover hidden relational structure within implicit datasets.Comment: WWW 201

    Accelerating Linear Algebra and Machine Learning Kernels on a Massively Parallel Reconfigurable Architecture

    Get PDF
    abstract: This thesis presents efficient implementations of several linear algebra kernels, machine learning kernels and a neural network based recommender systems engine onto a massively parallel reconfigurable architecture, Transformer. The linear algebra kernels include Triangular Matrix Solver (TRSM), LU Decomposition (LUD), QR Decomposition (QRD), and Matrix Inversion. The machine learning kernels include an LSTM (Long Short Term Memory) cell, and a GRU (gated Recurrent Unit) cell used in recurrent neural networks. The neural network based recommender systems engine consists of multiple kernels including fully connected layers, embedding layer, 1-D batchnorm, Adam optimizer, etc. Transformer is a massively parallel reconfigurable multicore architecture designed at the University of Michigan. The Transformer configuration considered here is 4 tiles and 16 General Processing Elements (GPEs) per tile. It supports a two level cache hierarchy where the L1 and L2 caches can operate in shared (S) or private (P) modes. The architecture was modeled using Gem5 and cycle accurate simulations were done to evaluate the performance in terms of execution times, giga-operations per second per Watt (GOPS/W), and giga-floating-point-operations per second per Watt (GFLOPS/W). This thesis shows that for linear algebra kernels, each kernel achieves high performance for a certain cache mode and that this cache mode can change when the matrix size changes. For instance, for smaller matrix sizes, L1P, L2P cache mode is best for TRSM, while L1S, L2S is the best cache mode for LUD, and L1P, L2S is the best for QRD. For each kernel, the optimal cache mode changes when the matrix size is increased. For instance, for TRSM, the L1P, L2P cache mode is best for smaller matrix sizes (N=64,128,256,512N=64, 128, 256, 512) and it changes to L1S, L2P for larger matrix sizes (N=1024N=1024). For machine learning kernels, L1P, L2P is the best cache mode for all network parameter sizes. Gem5 simulations show that the peak performance for TRSM, LUD, QRD and Matrix Inverse in the 14nm node is 97.5, 59.4, 133.0 and 83.05 GFLOPS/W, respectively. For LSTM and GRU, the peak performance is 44.06 and 69.3 GFLOPS/W. The neural network based recommender system was implemented in L1S, L2S cache mode. It includes a forward pass and a backward pass and is significantly more complex in terms of both computational complexity and data movement. The most computationally intensive block is the fully connected layer followed by Adam optimizer. The overall performance of the recommender systems engine is 54.55 GFLOPS/W and 169.12 GOPS/W.Dissertation/ThesisMasters Thesis Electrical Engineering 201

    A scalable recommender system : using latent topics and alternating least squares techniques

    Get PDF
    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced AnalyticsA recommender system is one of the major techniques that handles information overload problem of Information Retrieval. Improves access and proactively recommends relevant information to each user, based on preferences and objectives. During the implementation and planning phases, designers have to cope with several issues and challenges that need proper attention. This thesis aims to show the issues and challenges in developing high-quality recommender systems. A paper solves a current research problem in the field of job recommendations using a distributed algorithmic framework built on top of Spark for parallel computation which allows the algorithm to scale linearly with the growing number of users. The final solution consists of two different recommenders which could be utilised for different purposes. The first method is mainly driven by latent topics among users, meanwhile the second technique utilises a latent factor algorithm that directly addresses the preference-confidence paradigm

    A Theoretical Analysis of Two-Stage Recommendation for Cold-Start Collaborative Filtering

    Full text link
    In this paper, we present a theoretical framework for tackling the cold-start collaborative filtering problem, where unknown targets (items or users) keep coming to the system, and there is a limited number of resources (users or items) that can be allocated and related to them. The solution requires a trade-off between exploitation and exploration as with the limited recommendation opportunities, we need to, on one hand, allocate the most relevant resources right away, but, on the other hand, it is also necessary to allocate resources that are useful for learning the target's properties in order to recommend more relevant ones in the future. In this paper, we study a simple two-stage recommendation combining a sequential and a batch solution together. We first model the problem with the partially observable Markov decision process (POMDP) and provide an exact solution. Then, through an in-depth analysis over the POMDP value iteration solution, we identify that an exact solution can be abstracted as selecting resources that are not only highly relevant to the target according to the initial-stage information, but also highly correlated, either positively or negatively, with other potential resources for the next stage. With this finding, we propose an approximate solution to ease the intractability of the exact solution. Our initial results on synthetic data and the Movie Lens 100K dataset confirm the performance gains of our theoretical development and analysis

    Sequence modelling for e-commerce

    Get PDF
    • …
    corecore