
1,819 research outputs found

Extracting information from informal communication

Author: Rennie Jason D. M. (Jason Daniel Malyutin), 1976-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2007

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (leaves 89-93). This thesis focuses on the problem of extracting information from informal communication. Textual informal communication, such as e-mail, bulletin boards and blogs, has become a vast information resource. However, such information is poorly organized and difficult for a computer to understand due to lack of editing and structure. Thus, techniques which work well for formal text, such as newspaper articles, may be considered insufficient on informal text. One focus of ours is to attempt to advance the state-of-the-art for sub-problems of the information extraction task. We make contributions to the problems of named entity extraction, co-reference resolution and context tracking. We channel our efforts toward methods which are particularly applicable to informal communication. We also consider a type of information which is somewhat unique to informal communication: preferences and opinions. Individuals often express their opinions on products and services in such communication. Others may read these "reviews" to try to predict their own experiences. However, humans do a poor job of aggregating and generalizing large sets of data. We develop techniques that can perform the job of predicting unobserved opinions. We address both the single-user case where information about the items is known, and the multi-user case where we can generalize opinions without external information. Experiments on large-scale rating data sets validate our approach. By Jason D.M. Rennie. Ph.D.

DSpace@MIT

A Collaborative Kalman Filter for Time-Evolving Dyadic Processes

Author: Gultekin San
Paisley John
Publication date: 22/01/2015

We present the collaborative Kalman filter (CKF), a dynamic model for collaborative filtering and related factorization models. Using the matrix factorization approach to collaborative filtering, the CKF accounts for time evolution by modeling each low-dimensional latent embedding as a multidimensional Brownian motion. Each observation is a random variable whose distribution is parameterized by the dot product of the relevant Brownian motions at that moment in time. This is naturally interpreted as a Kalman filter with multiple interacting state space vectors. We also present a method for learning a dynamically evolving drift parameter for each location by modeling it as a geometric Brownian motion. We handle posterior intractability via a mean-field variational approximation, which also preserves tractability for downstream calculations in a manner similar to the Kalman filter. We evaluate the model on several large datasets, providing quantitative evaluation on the 10 million Movielens and 100 million Netflix datasets and qualitative evaluation on a set of 39 million stock returns divided across roughly 6,500 companies from the years 1962-2014. Comment: Appeared at 2014 IEEE International Conference on Data Mining (ICDM)

arXiv.org e-Print Archive

CiteSeerX

Incremental Sparse Bayesian Ordinal Regression

Author: de Rijke Maarten
Li Chang
Publication venue: 'Elsevier BV'
Publication date: 18/06/2018

Ordinal Regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high-dimensional non-linear space. However, most of the basis function-based algorithms are time-consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid large matrix inversions, which are the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Preference Learning

Author: Fürnkranz Johannes
Hüllermeier Eyke
Rudin Cynthia
Sanner Scott
Słowiński Roman
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/2014

This report documents the program and the outcomes of Dagstuhl Seminar 14101 “Preference Learning”. Preferences have recently received considerable attention in disciplines such as machine learning, knowledge discovery, information retrieval, statistics, social choice theory, multiple criteria decision making, decision under risk and uncertainty, operations research, and others. The motivation for this seminar was to showcase recent progress in these different areas with the goal of working towards a common basis of understanding, which should help to facilitate future synergies.

Open Access LMU

A latent model for collaborative filtering

Author: Langseth Helge
Nielsen Thomas Dyhre
Publication venue: 'Elsevier BV'
Publication date: 01/06/2012

Recommender systems based on collaborative filtering have received a great deal of interest over the last two decades. In particular, recently proposed methods based on dimensionality reduction techniques and using a symmetrical representation of users and items have shown promising results. Following this line of research, we propose a probabilistic collaborative filtering model that explicitly represents all items and users simultaneously in the model. Experimental results show that the proposed system obtains significantly better results than other collaborative filtering systems (evaluated on the MovieLens data set). Furthermore, the explicit representation of all users and items allows the model to make, for example, group-based recommendations that balance the preferences of the individual users.

Elsevier - Publisher Connector

VBN