128 research outputs found
Learning Users’ Interests in a Market-Based Recommender System
Recommender systems are widely used to cope with the problem of information overload and, consequently, many recommendation methods have been developed. However, no one technique is best for all users in all situations. To combat this, we have previously developed a market-based recommender system that allows multiple agents (each representing a different recommendation method or system) to compete with one another to present their best recommendations to the user. Our marketplace thus coordinates multiple recommender agents and ensures that only the best recommendations are presented. To do this effectively, however, each agent needs to learn the users’ interests and adapt its recommending behaviour accordingly. To this end, in this paper we develop a strategy, based on reinforcement learning with Boltzmann exploration, that the recommender agents can use for these tasks. We then demonstrate that this strategy helps the agents to effectively obtain information about the users’ interests which, in turn, speeds up the market convergence and enables the system to rapidly highlight the best recommendations.
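The abstract names Boltzmann exploration but does not spell it out; a minimal sketch of softmax (Boltzmann) action selection over an agent’s learned value estimates, with hypothetical numbers and a numerically stable shift, might look like:

```python
import math
import random

def boltzmann_select(q_values, temperature=1.0):
    """Pick an index with probability proportional to exp(Q / T).

    Higher temperatures explore more uniformly; lower temperatures
    exploit the current value estimates. Subtracting the max keeps
    the exponentials from overflowing.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    r, cumulative = random.random() * total, 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            return i
    return len(weights) - 1

# Hypothetical value estimates one recommender agent might hold
# for three candidate recommendations.
choice = boltzmann_select([0.2, 0.9, 0.5], temperature=0.5)
```

As the temperature is annealed toward zero, selection concentrates on the highest-valued recommendation, which is one plausible way an agent could shift from exploring user interests to exploiting them.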
Comment on "Two Time Scales and Violation of the Fluctuation-Dissipation Theorem in a Finite Dimensional Model for Structural Glasses"
In cond-mat/0002074, Ricci-Tersenghi et al. find two linear regimes in the fluctuation-dissipation relation between density-density correlations and associated responses of the Frustrated Ising Lattice Gas. Here we show that this result does not seem to correspond to the equilibrium quantities of the model, by measuring the overlap distribution P(q) of the density and comparing the FDR expected on the basis of the P(q) with the one measured in the off-equilibrium experiments.
Comment: RevTeX, 1 page, 2 eps figures; Comment on F. Ricci-Tersenghi et al., Phys. Rev. Lett. 84, 4473 (2000).
Fermionic Molecular Dynamics for nuclear dynamics and thermodynamics
A new Fermionic Molecular Dynamics (FMD) model based on a Skyrme functional is proposed in this paper. After introducing the basic formalism, some first applications to nuclear structure and nuclear thermodynamics are presented.
Comment: 5 pages, Proceedings of the French-Japanese Symposium, September 2008. To be published in Int. J. Mod. Phys.
The Apriori Stochastic Dependency Detection (ASDD) Algorithm for Learning Stochastic Logic Rules
Apriori Stochastic Dependency Detection (ASDD) is an algorithm for fast induction of stochastic logic rules from a database of observations made by an agent situated in an environment. ASDD is based on features of the Apriori algorithm for mining association rules in large databases of sales transactions [1] and the MSDD algorithm for discovering stochastic dependencies in multiple streams of data [15]. Once these rules have been acquired, the Precedence algorithm assigns operator precedence when two or more rules matching the input data are applicable to the same output variable. These algorithms currently learn propositional rules, with future extensions aimed towards learning first-order models. We show that stochastic rules produced by this algorithm are capable of reproducing an accurate world model in a simple predator-prey environment.
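The abstract does not reproduce ASDD itself; as a rough illustration of the general idea it describes (Apriori-style enumeration of candidate conditions, with conditional probabilities estimated from observation counts), one might sketch:

```python
from collections import Counter
from itertools import combinations

def learn_stochastic_rules(observations, min_support=2):
    """Induce simple propositional stochastic rules of the form
    condition -> outcome with an estimated probability.

    Each observation is (features, outcome): a frozenset of
    propositions that held, and the proposition observed next.
    The representation and thresholds here are illustrative
    assumptions, not ASDD's actual machinery.
    """
    condition_counts = Counter()
    rule_counts = Counter()
    for features, outcome in observations:
        # Apriori-style: enumerate small feature subsets as candidates.
        for size in (1, 2):
            for cond in combinations(sorted(features), size):
                condition_counts[cond] += 1
                rule_counts[(cond, outcome)] += 1
    rules = {}
    for (cond, outcome), n in rule_counts.items():
        if n >= min_support:  # prune infrequent rules
            rules[(cond, outcome)] = n / condition_counts[cond]
    return rules

# Invented toy observations from a predator-prey-style environment.
obs = [
    (frozenset({"prey_near", "agent_moves"}), "prey_caught"),
    (frozenset({"prey_near", "agent_moves"}), "prey_caught"),
    (frozenset({"prey_near", "agent_moves"}), "prey_escapes"),
]
rules = learn_stochastic_rules(obs)
# e.g. P(prey_caught | prey_near, agent_moves) estimated as 2/3
```

The support threshold plays the role of Apriori's frequent-itemset pruning: conditions seen too rarely never become rules, which keeps the induced model small.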
Toward Automatic Verification of Multiagent Systems for Training Simulations
Advances in multiagent systems have led to their successful application in experiential training simulations, where students learn by interacting with agents who represent people, groups, structures, etc. These multiagent simulations must model the training scenario so that the students’ success is correlated with the degree to which they follow the intended pedagogy. As these simulations increase in size and richness, it becomes harder to guarantee that the agents accurately encode the pedagogy. Testing with human subjects provides the most accurate feedback, but it can explore only a limited subspace of simulation paths. In this paper, we present a mechanism for using human data to verify the degree to which the simulation encodes the intended pedagogy. Starting with an analysis of data from a deployed multiagent training simulation, we then present an automated mechanism for using the human data to generate a distribution appropriate for sampling simulation paths. By generalizing from a small set of human data, the automated approach can systematically explore a much larger space of possible training paths and verify the degree to which a multiagent training simulation adheres to its intended pedagogy.
SMART (Stochastic Model Acquisition with ReinforcemenT) learning agents: A preliminary report
We present a framework for building agents that learn using SMART, a system that combines stochastic model acquisition with reinforcement learning to enable an agent to model its environment through experience and subsequently form action selection policies using the acquired model. We extend an existing algorithm for automatic creation of stochastic STRIPS operators [9] as a preliminary method of environment modelling. We then define the process of generating future states using these operators and an initial state, and finally show the process by which the agent can use the generated states to form a policy with a standard reinforcement learning algorithm. The potential of SMART is exemplified using the well-known predator-prey scenario. Results of applying SMART to this environment and directions for future work are discussed.
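The abstract does not give SMART's planning step in detail; as one generic illustration of forming a policy from an acquired stochastic model, the sketch below runs value iteration over a hand-written toy predator-prey transition model. All states, actions, probabilities, and rewards here are invented for the example and are not taken from the paper:

```python
def plan_with_model(model, states, actions, gamma=0.9, sweeps=100):
    """Compute a greedy policy from a learned stochastic model.

    model[(s, a)] is a list of (probability, next_state, reward)
    tuples, such as might be generated by applying learned
    stochastic operators to each state.
    """
    def q(s, a, V):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in model[(s, a)])

    V = {s: 0.0 for s in states}
    for _ in range(sweeps):  # Bellman optimality backups
        V = {s: max(q(s, a, V) for a in actions) for s in states}
    return {s: max(actions, key=lambda a: q(s, a, V)) for s in states}

# Invented toy model: moving toward nearby prey catches it with
# probability 0.6 (reward 1); "caught" is absorbing.
model = {
    ("far", "move"): [(0.8, "near", 0.0), (0.2, "far", 0.0)],
    ("far", "wait"): [(1.0, "far", 0.0)],
    ("near", "move"): [(0.6, "caught", 1.0), (0.4, "far", 0.0)],
    ("near", "wait"): [(1.0, "near", 0.0)],
    ("caught", "move"): [(1.0, "caught", 0.0)],
    ("caught", "wait"): [(1.0, "caught", 0.0)],
}
policy = plan_with_model(model, ["far", "near", "caught"], ["move", "wait"])
```

Here planning is done offline over the learned model rather than by trial and error in the environment, which is the general benefit of model acquisition that the abstract points to.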
Immediate reward reinforcement learning for clustering and topology preserving mappings
We extend a reinforcement learning algorithm which has previously been shown to cluster data. Our extension involves creating an underlying latent space with some pre-defined structure, which enables us to create a topology preserving mapping. We investigate different forms of the reward function, all of which are created with the intent of merging local and global information, thus avoiding one of the major difficulties with, e.g., K-means, namely its convergence to local optima depending on the initial values of its parameters. We also show that the method is quite general and can be used with the recently developed method of stochastic weight reinforcement learning [14].
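The paper's reward functions are not reproduced in this abstract; the following is a rough sketch of the general pattern of immediate-reward reinforcement clustering on 1-D data — stochastically select a prototype, receive an immediate reward reflecting how well it matches the sample, and reinforce that selection — where the reward shape, selection rule, and parameters are all illustrative assumptions:

```python
import math
import random

def irl_cluster(data, k=2, epochs=50, lr=0.1, seed=0):
    """Cluster 1-D data with immediate-reward reinforcement learning.

    Each step stochastically picks a prototype (softmax over negative
    distances), takes an immediate reward that is high when the chosen
    prototype is close to the sample, and moves that prototype toward
    the sample in proportion to the reward.
    """
    rng = random.Random(seed)
    protos = [rng.choice(data) for _ in range(k)]
    for _ in range(epochs):
        for x in data:
            dists = [abs(x - p) for p in protos]
            m = min(dists)
            weights = [math.exp(-(d - m)) for d in dists]
            # Sample a prototype index from the softmax weights.
            r, cum = rng.random() * sum(weights), 0.0
            for i, w in enumerate(weights):
                cum += w
                if r < cum:
                    break
            reward = math.exp(-dists[i])  # immediate reward: closeness
            protos[i] += lr * reward * (x - protos[i])
    return sorted(protos)

# Two well-separated invented clusters around 0 and 5.
p1, p2 = irl_cluster([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
```

Because each update is a convex step toward a data point, the prototypes always stay inside the data's range; the stochastic selection is what distinguishes this from a plain winner-take-all K-means update.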
Evaluating the Effectiveness of Exploration and Accumulated Experience in Automatic Case Elicitation