
    Sum-Product Network structure learning by efficient product nodes discovery

    Sum-Product Networks (SPNs) are recently introduced deep probabilistic models providing exact and tractable inference. SPNs have been successfully employed in several application domains, from computer vision to natural language processing, as accurate density estimators. However, learning their structure and parameters from high-dimensional data poses a challenge in terms of time complexity. Classical SPN structure learning algorithms work by repeating two high-cost operations several times: determining independencies among random variables (RVs), which introduces product nodes, and finding sub-populations among samples, which introduces sum nodes. Even one of the simplest greedy structure learners, LearnSPN, scales quadratically in the number of variables when determining RV independencies. In this work, we investigate the trade-off between accuracy and efficiency when employing approximate but fast procedures to determine independencies among RVs. We introduce and evaluate sub-quadratic procedures based on a random subspace approach, leveraging entropy as a proxy criterion to split independent RVs. Experimental results on many benchmark datasets for density estimation show that LearnSPN-like structure learners, when equipped with our splitting procedures, provide reduced learning and/or inference times while generally containing the degradation of inference accuracy. Ultimately, we provide an empirical confirmation of a "no free lunch" when learning the structure of SPNs.
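    The entropy-as-proxy idea lends itself to a compact illustration. Below is a minimal sketch, not the paper's actual procedure: it estimates marginal entropies on a random row subspace of binary data and splits off near-deterministic RVs as singleton scopes, avoiding a quadratic pairwise independence test. `entropy_proxy_split`, the threshold, and the subsample size are all illustrative assumptions.

```python
import numpy as np

def entropy_proxy_split(data, rng, threshold=0.1, subsample=256):
    """Split column indices (RVs) into candidate independent scopes.

    Marginal entropy, estimated on a random row subspace, serves as a
    cheap proxy: near-deterministic RVs (entropy below `threshold`) are
    split off as singleton scopes without any quadratic pairwise test.
    """
    n, d = data.shape
    rows = rng.choice(n, size=min(subsample, n), replace=False)
    sample = data[rows]

    low, high = [], []
    for j in range(d):
        _, counts = np.unique(sample[:, j], return_counts=True)
        p = counts / counts.sum()
        h = -np.sum(p * np.log2(p))          # empirical marginal entropy
        (low if h < threshold else high).append(j)

    # Each low-entropy RV becomes its own scope; the rest stay together.
    return [[j] for j in low] + ([high] if high else [])

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 8))
X[:, 3] = 0                                  # a near-constant RV to split off
print(entropy_proxy_split(X, rng))           # e.g. [[3], [0, 1, 2, 4, 5, 6, 7]]
```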

    Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures

    Probabilistic graphical models are a central tool in AI; however, they are generally not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions in a tractable fashion, but still lack the expressive power of intractable models based on deep neural networks. Therefore, we introduce conditional SPNs (CSPNs), conditional density estimators for multivariate and potentially hybrid domains that allow harnessing the expressive power of neural networks while still maintaining tractability guarantees. One way to implement CSPNs is to use an existing SPN structure and condition its parameters on the input, e.g., via a deep neural network. This approach, however, might misrepresent the conditional independence structure present in the data. Consequently, we also develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. Our experimental evidence demonstrates that CSPNs are competitive with other probabilistic models and yield superior performance on multilabel image classification compared to mean field and mixture density networks. Furthermore, they can successfully be employed as building blocks for structured probabilistic models, such as autoregressive image models.
    Comment: 13 pages, 6 figures
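    To make the conditioning mechanism concrete, here is a toy sketch under simplifying assumptions: a single sum node over two product nodes of Gaussian leaves, with the mixture weights produced by a linear gating function of the input x (a stand-in for the deep network the abstract mentions). `TinyCSPN` and its untrained random parameters are hypothetical; the point is that the conditional log-density stays exactly computable.

```python
import numpy as np

def gaussian_logpdf(y, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2)

class TinyCSPN:
    """Toy conditional SPN: one sum node over two product nodes.

    A small gating function maps the conditioning input x to the sum-node
    mixture weights, while each product node factorizes p(y1, y2) into
    univariate Gaussian leaves. Parameters are random and untrained.
    """
    def __init__(self, x_dim, rng):
        self.W = rng.normal(size=(2, x_dim))   # gating weights (x -> logits)
        self.mu = rng.normal(size=(2, 2))      # leaf means per component
        self.sigma = np.ones((2, 2))           # leaf std devs per component

    def log_density(self, x, y):
        logits = self.W @ x
        logw = logits - np.logaddexp.reduce(logits)     # log mixture weights
        # Product nodes: sum the leaf log-densities over the two y-variables.
        comp = np.array([gaussian_logpdf(y, self.mu[k], self.sigma[k]).sum()
                         for k in range(2)])
        return np.logaddexp.reduce(logw + comp)         # exact log p(y | x)

rng = np.random.default_rng(0)
model = TinyCSPN(x_dim=3, rng=rng)
print(model.log_density(x=rng.normal(size=3), y=np.array([0.5, -1.0])))
```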

    Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

    In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
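    A minimal sketch of the regression-loss variant may help fix ideas. It fits kernel ridge regression over object pairs with a product (Kronecker) pairwise kernel, a standard construction in this setting, then ranks unseen candidates conditioned on a target object. The RBF kernels, `fit_conditional_ranker`, and the synthetic relation scores are illustrative assumptions; the ranking-loss variant the paper advocates is not shown.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_conditional_ranker(U, V, y, lam=0.1):
    """Kernel ridge regression on object pairs, using the product kernel
    k((u, v), (u', v')) = k_u(u, u') * k_v(v, v')."""
    K = rbf(U, U) * rbf(V, V)                    # pairwise (Kronecker) kernel
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def rank_conditioned_on(u, candidates, U, V, alpha):
    """Score candidates conditioned on target u; return best-first order."""
    Ku = rbf(u[None, :], U)                      # shape (1, n_train)
    scores = [(Ku * rbf(c[None, :], V))[0] @ alpha for c in candidates]
    return np.argsort(scores)[::-1]

rng = np.random.default_rng(0)
U, V = rng.normal(size=(50, 4)), rng.normal(size=(50, 4))
y = (U * V).sum(1)                               # synthetic relation scores
alpha = fit_conditional_ranker(U, V, y)
cands = rng.normal(size=(5, 4))
print(rank_conditioned_on(U[0], cands, U, V, alpha))
```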

    Learning Tree-based Deep Model for Recommender Systems

    Model-based methods for recommender systems have been studied extensively in recent years. In systems with a large corpus, however, the computational cost for the learnt model to predict all user-item preferences is tremendous, which makes full-corpus retrieval extremely difficult. To overcome the calculation barriers, models such as matrix factorization resort to an inner-product form (i.e., modeling user-item preference as the inner product of user and item latent factors) and indexes to facilitate efficient approximate k-nearest neighbor searches. However, it remains challenging to incorporate more expressive interaction forms between user and item features, e.g., interactions through deep neural networks, because of the calculation cost. In this paper, we focus on the problem of introducing arbitrary advanced models to recommender systems with a large corpus. We propose a novel tree-based method which can provide logarithmic complexity w.r.t. corpus size even with more expressive models such as deep neural networks. Our main idea is to predict user interests from coarse to fine by traversing tree nodes in a top-down fashion and making decisions for each user-node pair. We also show that the tree structure can be jointly learnt towards better compatibility with users' interest distribution and hence facilitate both training and prediction. Experimental evaluations with two large-scale real-world datasets show that the proposed method significantly outperforms traditional methods. Online A/B test results on the Taobao display advertising platform also demonstrate the effectiveness of the proposed method in production environments.
    Comment: Accepted by KDD 2018
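    The coarse-to-fine traversal is easy to sketch. The toy code below walks an item tree top-down, keeping a fixed-size beam of the highest-scoring nodes per level, so the number of preference-model evaluations grows logarithmically with corpus size rather than linearly. `retrieve_top_k`, the seven-node tree, and the random score table (standing in for a trained user-node network) are all hypothetical.

```python
import numpy as np

def retrieve_top_k(user, score, children, root, k=2, beam=4):
    """Coarse-to-fine retrieval over an item tree.

    Expands the current frontier level by level, keeping only the `beam`
    highest-scoring nodes, so model evaluations are O(beam * arity * depth)
    rather than O(|corpus|). `score(user, node)` can be any preference
    model (e.g., a deep network); leaves of `children` are items.
    """
    frontier = [root]
    while any(children.get(n) for n in frontier):
        expanded = [c for n in frontier for c in children.get(n, [n])]
        expanded.sort(key=lambda n: score(user, n), reverse=True)
        frontier = expanded[:beam]                  # prune to the beam
    return sorted(frontier, key=lambda n: score(user, n), reverse=True)[:k]

# Hypothetical 7-node binary tree over items 3..6; a random score table
# stands in for a trained user-node preference network.
children = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
rng = np.random.default_rng(0)
table = rng.random(7)
score = lambda user, node: table[node]
print(retrieve_top_k(user=None, score=score, children=children, root=0))
```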