1,443 research outputs found

    Inferring Networks of Substitutable and Complementary Products

    In a modern recommender system, it is important to understand how products relate to each other. For example, while a user is looking for mobile phones, it might make sense to recommend other phones, but once they buy a phone, we might instead want to recommend batteries, cases, or chargers. These two types of recommendations are referred to as substitutes and complements: substitutes are products that can be purchased instead of each other, while complements are products that can be purchased in addition to each other. Here we develop a method to infer networks of substitutable and complementary products. We formulate this as a supervised link prediction task, where we learn the semantics of substitutes and complements from data associated with products. The primary source of data we use is the text of product reviews, though our method also makes use of features such as ratings, specifications, prices, and brands. Methodologically, we build topic models that are trained to automatically discover topics from text that are successful at predicting and explaining such relationships. Experimentally, we evaluate our system on the Amazon product catalog, a large dataset consisting of 9 million products, 237 million links, and 144 million reviews. Comment: 12 pages, 6 figures

    Correction: A correlated topic model of Science

    Correction to Annals of Applied Statistics 1 (2007) 17--35 [doi:10.1214/07-AOAS114]. Comment: Published at http://dx.doi.org/10.1214/07-AOAS136 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Analysis of Computer Science Communities Based on DBLP

    It is popular nowadays to bring techniques from bibliometrics and scientometrics into the world of digital libraries to analyze collaboration patterns and explore the mechanisms that underlie community development. In this paper we use the DBLP data to investigate authors' scientific careers and provide an in-depth exploration of some of the computer science communities. We compare them in terms of productivity, population stability, and collaboration trends. We also use these features to compare the sets of top-ranked conferences with their lower-ranked counterparts. Comment: 9 pages, 7 figures, 6 tables

    Stochastic Variational Inference

    We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.