
    Recurrent Latent Variable Networks for Session-Based Recommendation

    In this work, we attempt to ameliorate the impact of data sparsity in the context of session-based recommendation. Specifically, we seek to devise a machine learning mechanism capable of extracting subtle and complex underlying temporal dynamics in the observed session data, so as to inform the recommendation algorithm. To this end, we improve upon systems that utilize deep learning techniques with recurrently connected units; we do so by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists in treating the network recurrent units as stochastic latent variables with a prior distribution imposed over them. On this basis, we proceed to infer corresponding posteriors; these can be used for prediction and recommendation generation, in a way that accounts for the uncertainty in the available sparse training data. To allow our approach to easily scale to large real-world datasets, we perform inference under an approximate amortized variational inference (AVI) setup, whereby the learned posteriors are parameterized via (conventional) neural networks. We perform an extensive experimental evaluation of our approach using challenging benchmark datasets, and illustrate its superiority over existing state-of-the-art techniques.
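    As a concrete illustration of this scheme, the sketch below (PyTorch, with made-up names and dimensions; not the authors' code) treats the recurrent state as a stochastic latent variable: an amortized inference network maps the deterministic GRU state to diagonal-Gaussian posterior parameters, and training minimizes the negative ELBO, i.e., next-item cross-entropy plus a KL term against a standard Gaussian prior.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalGRURecommender(nn.Module):
    """Sketch: recurrent states treated as stochastic latent variables."""

    def __init__(self, n_items: int, emb_dim: int = 64, latent_dim: int = 64):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, emb_dim)
        self.rnn = nn.GRUCell(emb_dim, latent_dim)
        # Amortized inference network: a conventional feed-forward map from
        # the deterministic recurrent state to diagonal-Gaussian posterior
        # parameters, as in standard amortized variational inference.
        self.post_mu = nn.Linear(latent_dim, latent_dim)
        self.post_logvar = nn.Linear(latent_dim, latent_dim)
        self.out = nn.Linear(latent_dim, n_items)

    def forward(self, session: torch.Tensor) -> torch.Tensor:
        # session: (batch, seq_len) tensor of item ids for one mini-batch.
        batch, seq_len = session.shape
        h = session.new_zeros(batch, self.rnn.hidden_size, dtype=torch.float)
        nll, kl = 0.0, 0.0
        for t in range(seq_len - 1):
            h = self.rnn(self.item_emb(session[:, t]), h)
            mu, logvar = self.post_mu(h), self.post_logvar(h)
            # Reparameterization trick: sample the stochastic latent state.
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            # KL divergence of the posterior from a standard Gaussian prior.
            kl = kl + 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
            # Next-item prediction from the sampled latent state.
            nll = nll + F.cross_entropy(self.out(z), session[:, t + 1])
        return nll + kl  # negative ELBO, to be minimized
```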

    Naive Feature Selection: Sparsity in Naive Bayes

    Due to its linear complexity, naive Bayes classification remains an attractive supervised learning method, especially in very large-scale settings. We propose a sparse version of naive Bayes, which can be used for feature selection. This leads to a combinatorial maximum-likelihood problem, for which we provide an exact solution in the case of binary data, or a bound in the multinomial case. We prove that our bound becomes tight as the marginal contribution of additional features decreases. Both binary and multinomial sparse models are solvable in time almost linear in problem size, representing a very small extra relative cost compared to classical naive Bayes. Numerical experiments on text data show that the naive Bayes feature selection method is as statistically effective as state-of-the-art feature selection methods such as recursive feature elimination, l1-penalized logistic regression and LASSO, while being orders of magnitude faster. For a large data set with more than 1.6 million training points and about 12 million features, and with a non-optimized CPU implementation, our sparse naive Bayes model can be trained in less than 15 seconds.
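    To see why the binary case admits an exact solution, note that the (smoothed) Bernoulli log-likelihood decomposes per feature, so the gain from modeling a feature class-conditionally rather than with pooled parameters can be scored independently, and keeping the top k scores solves the combinatorial problem. The sketch below illustrates this rule under assumed smoothing details; it is not the paper's exact formulation.

```python
import numpy as np

def sparse_nb_select(X: np.ndarray, y: np.ndarray, k: int) -> np.ndarray:
    """X: (n, d) binary feature matrix, y: (n,) binary labels.
    Returns the indices of the k selected features."""

    def bernoulli_ll(Xs: np.ndarray) -> np.ndarray:
        # Per-feature Bernoulli log-likelihood with Laplace smoothing.
        m, ones = len(Xs), Xs.sum(axis=0)
        p = (ones + 1.0) / (m + 2.0)
        return ones * np.log(p) + (m - ones) * np.log(1.0 - p)

    # Gain of modeling a feature class-conditionally instead of with a
    # single pooled parameter; the gain is separable across features, so
    # keeping the top-k scores solves the selection problem exactly.
    gain = bernoulli_ll(X[y == 0]) + bernoulli_ll(X[y == 1]) - bernoulli_ll(X)
    return np.argsort(gain)[-k:]
```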

    A Relational Tucker Decomposition for Multi-Relational Link Prediction

    We propose the Relational Tucker3 (RT) decomposition for multi-relational link prediction in knowledge graphs. We show that many existing knowledge graph embedding models are special cases of the RT decomposition with certain predefined sparsity patterns in its components. In contrast to these prior models, RT decouples the sizes of entity and relation embeddings, allows parameter sharing across relations, and does not make use of a predefined sparsity pattern. We use the RT decomposition as a tool to explore whether it is possible and beneficial to automatically learn sparsity patterns, and whether dense models can outperform sparse models (using the same number of parameters). Our experiments indicate that, depending on the dataset, both questions can be answered affirmatively.
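    At the heart of a Tucker-style model is a three-way contraction of a shared core tensor with subject, relation, and object embeddings, which is what decouples entity and relation embedding sizes and shares parameters across relations. The following sketch shows such a scoring function with made-up dimensions; it illustrates the general decomposition, not the paper's implementation.

```python
import torch

d_e, d_r = 32, 16                    # decoupled entity/relation sizes
n_entities, n_relations = 1000, 50   # illustrative graph sizes
core = torch.randn(d_e, d_r, d_e)    # core tensor shared across relations
E = torch.randn(n_entities, d_e)     # entity embeddings
R = torch.randn(n_relations, d_r)    # relation embeddings

def score(s: int, r: int, o: int) -> torch.Tensor:
    # Three-way contraction of the core tensor with subject, relation,
    # and object embeddings; every relation reuses the core's parameters.
    return torch.einsum('i,irj,r,j->', E[s], core, R[r], E[o])
```

    Imposing a predefined sparsity pattern on the core recovers familiar special cases (a superdiagonal core, for instance, reduces to DistMult-style scoring), while keeping it dense or learning the pattern is what the experiments above compare.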

    Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

    In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than learning with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we show empirically that incorporating symmetry or reciprocity properties can improve generalization performance.
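    One way to instantiate such a kernel framework is regularized least-squares over pairs with a Kronecker product kernel, which yields predictions even when the conditioning object is unseen. The naive sketch below materializes the full Kronecker kernel for clarity; the paper's efficient algorithms avoid exactly this cost, so treat it as an assumed illustration of the setup rather than of their method.

```python
import numpy as np

def rbf(A: np.ndarray, B: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    # Gaussian kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit(U: np.ndarray, V: np.ndarray, Y: np.ndarray, lam: float = 1.0):
    """U: (m, p) conditioning objects, V: (n, q) candidate objects,
    Y: (m, n) observed relation values. Returns dual coefficients."""
    # Kronecker pairwise kernel: k((u, v), (u', v')) = k(u, u') * k(v, v').
    K = np.kron(rbf(U, U), rbf(V, V))
    return np.linalg.solve(K + lam * np.eye(len(K)), Y.ravel())

def rank_for(u_new: np.ndarray, a, U, V, V_cand) -> np.ndarray:
    # Score all candidates conditioned on a possibly unseen object u_new,
    # then rank them by descending predicted relation value.
    scores = np.kron(rbf(u_new[None, :], U), rbf(V_cand, V)) @ a
    return np.argsort(-scores)
```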

    One-Pass Ranking Models for Low-Latency Product Recommendations

    Purchase logs collected in e-commerce platforms provide rich information about customer preferences. These logs can be leveraged to improve the quality of product recommendations by feeding them to machine-learned ranking models. However, a variety of deployment constraints limit the naïve applicability of machine learning to this problem. First, the amount and the dimensionality of the data make in-memory learning simply impossible. Second, the drift of customers’ preferences over time requires retraining the ranking model regularly with freshly collected data. This limits the time available for training to prohibitively short intervals. Third, ranking in real time is necessary whenever the query complexity prevents us from caching the predictions. This constraint requires minimizing prediction time (or equiva…
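    These constraints suggest a learner that touches each logged example once, keeps constant memory, and ranks with a cheap dot product at query time. The following sketch of a one-pass, SGD-trained pairwise ranker is an assumed illustration in that spirit, not the paper's model.

```python
import numpy as np

def one_pass_pairwise_ranker(pair_stream, dim: int, lr: float = 0.01,
                             lam: float = 1e-4) -> np.ndarray:
    """pair_stream yields (x_pos, x_neg) feature-vector pairs, each seen once."""
    w = np.zeros(dim)
    for x_pos, x_neg in pair_stream:
        diff = x_pos - x_neg
        if w @ diff < 1.0:          # pairwise hinge loss is active
            w += lr * diff          # push the purchased item above the other
        w *= 1.0 - lr * lam         # L2 shrinkage keeps the model stable
    return w  # at query time, rank products by the dot product w @ x
```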