Search CORE

8 research outputs found

Learning Topic Models - Going beyond SVD

Author: Arora Sanjeev
Ge Rong
Moitra Ankur
Publication date: 01/01/2012

Topic Modeling is an approach used for automatic comprehension and classification of data in a variety of settings, and perhaps the canonical application is in uncovering thematic structure in a corpus of documents. A number of foundational works both in machine learning and in theory have suggested a probabilistic model for documents, whereby documents arise as a convex combination of (i.e. distribution on) a small number of topic vectors, each topic vector being a distribution on words (i.e. a vector of word-frequencies). Similar models have since been used in a variety of application areas; the Latent Dirichlet Allocation or LDA model of Blei et al. is especially popular. Theoretical studies of topic modeling focus on learning the model's parameters assuming the data is actually generated from it. Existing approaches for the most part rely on Singular Value Decomposition (SVD), and consequently have one of two limitations: these works need to either assume that each document contains only one topic, or else can only recover the span of the topic vectors instead of the topic vectors themselves. This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main tool in this context, which is an analog of SVD where all vectors are nonnegative. Using this tool we give the first polynomial-time algorithm for learning topic models without the above two limitations. The algorithm uses a fairly mild assumption about the underlying topic matrix called separability, which is usually found to hold in real-life data. A compelling feature of our algorithm is that it generalizes to models that incorporate topic-topic correlations, such as the Correlated Topic Model and the Pachinko Allocation Model. We hope that this paper will motivate further theoretical results that use NMF as a replacement for SVD, just as NMF has come to replace SVD in many applications.
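The abstract above does not define separability. As it is usually stated for topic models, a topic matrix is separable when every topic has at least one "anchor" word that has positive probability in that topic and zero probability in all others. The following is a minimal Python sketch under that assumed definition. It is not the paper's learning algorithm, and the function names and the toy matrix `A` are hypothetical illustrations. It also shows a document's word distribution arising as a convex combination of topic vectors.

```python
def anchor_words(topic_matrix, tol=1e-12):
    """Map each topic index to the words that occur only in that topic.

    topic_matrix is words-by-topics: topic_matrix[w][k] is the probability
    of word w under topic k (each column sums to 1).
    """
    anchors = {}
    for word, row in enumerate(topic_matrix):
        support = [k for k, p in enumerate(row) if p > tol]
        if len(support) == 1:  # word has mass in exactly one topic
            anchors.setdefault(support[0], []).append(word)
    return anchors


def is_separable(topic_matrix, tol=1e-12):
    """True if every topic has at least one anchor word (assumed definition)."""
    n_topics = len(topic_matrix[0])
    return len(anchor_words(topic_matrix, tol)) == n_topics


def document_word_distribution(topic_matrix, topic_weights):
    """Mix the topic columns with convex weights into one word distribution."""
    return [sum(p * w for p, w in zip(row, topic_weights)) for row in topic_matrix]


# Hypothetical toy topic matrix: 4 words (rows) by 2 topics (columns).
A = [
    [0.5, 0.0],    # word 0 occurs only in topic 0: an anchor for topic 0
    [0.0, 0.25],   # word 1 occurs only in topic 1: an anchor for topic 1
    [0.25, 0.25],  # word 2 is shared by both topics
    [0.25, 0.5],   # word 3 is shared by both topics
]

# A document that draws equally on both topics.
doc = document_word_distribution(A, [0.5, 0.5])
```

In this toy case `is_separable(A)` holds because words 0 and 1 each pin down a single topic. Removing either anchor row (and renormalizing the column) would break the assumption.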

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Crossref

Adaptive Matching for Expert Systems with Uncertain Task Types

Author: Gulikers Lennart
Massoulie Laurent
Shah Virag
Vojnovic Milan
Publication date: 03/10/2017

A matching in a two-sided market often incurs an externality: a matched resource may become unavailable to the other side of the market, at least for a while. This is especially an issue in online platforms involving human experts as the expert resources are often scarce. The efficient utilization of experts in these platforms is made challenging by the fact that the information available about the parties involved is usually limited. To address this challenge, we develop a model of a task-expert matching system where a task is matched to an expert using not only the prior information about the task but also the feedback obtained from the past matches. In our model the tasks arrive online while the experts are fixed and constrained by a finite service capacity. For this model, we characterize the maximum task resolution throughput a platform can achieve. We show that the natural greedy approach, where each expert is assigned a task most suitable to her skill, is suboptimal, as it does not internalize the above externality. We develop a throughput-optimal backpressure algorithm which does so by accounting for the 'congestion' among different task types. Finally, we validate our model and confirm our theoretical findings with data-driven simulations via logs of Math.StackExchange, a StackOverflow forum dedicated to mathematics. Comment: A part of it presented at Allerton Conference 2017, 18 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

A random walk method for alleviating the sparsity problem in collaborative filtering

Author: Hilmi Yıldırım
Mukkai S. Krishnamoorthy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

Collaborative Filtering is one of the most widely used approaches in recommendation systems which predicts user preferences by learning past user-item relationships. In recent years, item-oriented collaborative filtering methods came into prominence as they are more scalable compared to user-oriented methods. Item-oriented methods discover item-item relationships from the training data and use these relations to compute predictions. In this paper, we propose a novel item-oriented algorithm, RandomWalk Recommender, that first infers transition probabilities between items based on their similarities and models finite length random walks on the item space to compute predictions. This method is especially useful when training data is less than plentiful, namely when typical similarity measures fail to capture actual relationships between items. Aside from the proposed prediction algorithm, the final transition probability matrix computed in one of the intermediate steps can be used as an item similarity matrix in typical item-oriented approaches. Thus, this paper suggests a method to enhance similarity matrices under sparse data as well. Experiments on MovieLens data show that RandomWalk Recommender algorithm outperforms two other item-oriented methods in different sparsity levels while having the best performance difference in sparse datasets.
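The general idea described in this abstract (item similarities turned into transition probabilities, then finite-length walks over items) can be illustrated with a minimal Python sketch. This is not the RandomWalk Recommender algorithm itself, whose exact formulas are not given here. The use of cosine similarity, the `damping` weight, and the toy `ratings` matrix are all assumptions made for illustration.

```python
import math


def cosine_similarity(ratings):
    """Item-item cosine similarity from a users-by-items matrix (0 = unrated)."""
    n_items = len(ratings[0])
    cols = [[row[j] for row in ratings] for j in range(n_items)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i != j and norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


def transition_matrix(sim):
    """Row-normalize similarities so each nonzero row sums to 1."""
    P = []
    for row in sim:
        total = sum(row)
        P.append([v / total for v in row] if total > 0 else [0.0] * len(row))
    return P


def matmul(A, B):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def walk_scores(P, steps, damping=0.5):
    """Sum of damping**k * P**k for k = 1..steps (assumed weighting scheme)."""
    n = len(P)
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, steps + 1):
        power = matmul(power, P)
        for i in range(n):
            for j in range(n):
                total[i][j] += damping ** k * power[i][j]
    return total


# Hypothetical toy data: 3 users (rows) by 3 items (columns).
# No user rated both item 0 and item 2, so their direct similarity is zero.
ratings = [
    [5, 4, 0],
    [4, 5, 0],
    [0, 4, 5],
]
sim = cosine_similarity(ratings)
P = transition_matrix(sim)
one_step = walk_scores(P, steps=1)
two_step = walk_scores(P, steps=2)
```

In this toy case items 0 and 2 share no raters, so `sim[0][2]` and `one_step[0][2]` are zero. The two-step walk still links them through item 1, so `two_step[0][2]` is positive. This is the kind of indirect relationship that matters when ratings are sparse.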

CiteSeerX

Crossref

Adaptive Matching for Expert Systems with Uncertain Task Types

Author: Gulikers Lennart
Massoulié Laurent
Shah Virag
Vojnović Milan
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 01/09/2020

A matching in a two-sided market often incurs an externality: a matched resource may become unavailable to the other side of the market, at least for a while. This is especially an issue in online platforms involving human experts as the expert resources are often scarce. The efficient utilization of experts in these platforms is made challenging by the fact that the information available about the parties involved is usually limited. To address this challenge, we develop a model of a task-expert matching system where a task is matched to an expert using not only the prior information about the task but also the feedback obtained from the past matches. In our model the tasks arrive online while the experts are fixed and constrained by a finite service capacity. For this model, we characterize the maximum task resolution throughput a platform can achieve. We show that the natural greedy approach, where each expert is assigned a task most suitable to her skill, is suboptimal, as it does not internalize the above externality. We develop a throughput-optimal backpressure algorithm which does so by accounting for the ‘congestion’ among different task types. Finally, we validate our model and confirm our theoretical findings with data-driven simulations via logs of Math.StackExchange, a StackOverflow forum dedicated to mathematics.

LSE Research Online

INRIA a CCSD electronic archive server

Information Gathering in Resource Constrained Wireless Networks

Author: Colesanti Ugo Maria
Publication date: 15/06/2011

Archivio della ricerca- Università di Roma La Sapienza

Convergent Algorithms for Collaborative Filtering

Author: Jon Kleinberg
Mark Sandler
Publication date: 01/01/2003

A collaborative filtering system analyzes data on the past behavior of its users so as to make recommendations; a canonical example is the recommending of books based on prior purchases. The full potential of collaborative filtering implicitly rests on the premise that, as an increasing amount of data is collected, it should be possible to make increasingly high-quality recommendations. Despite the prevalence of this notion at an informal level, the theoretical study of such convergent algorithms has been quite limited.

CiteSeerX

Crossref