128 research outputs found
Learning Users’ Interests in a Market-Based Recommender System
Recommender systems are widely used to cope with the problem of information overload and, consequently, many recommendation methods have been developed. However, no one technique is best for all users in all situations. To combat this, we have previously developed a market-based recommender system that allows multiple agents (each representing a different recommendation method or system) to compete with one another to present their best recommendations to the user. Our marketplace thus coordinates multiple recommender agents and ensures that only the best recommendations are presented. To do this effectively, however, each agent needs to learn the users’ interests and adapt its recommending behaviour accordingly. To this end, in this paper we develop a strategy, based on reinforcement learning with Boltzmann exploration, that the recommender agents can use for these tasks. We then demonstrate that this strategy helps the agents to effectively obtain information about the users’ interests which, in turn, speeds up the market convergence and enables the system to rapidly highlight the best recommendations.
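The abstract names Boltzmann exploration but does not spell it out; a minimal sketch of softmax (Boltzmann) action selection over an agent’s learned value estimates, with hypothetical numbers and a numerically stable shift, might look like:

```python
import math
import random

def boltzmann_select(q_values, temperature=1.0):
    """Pick an index with probability proportional to exp(Q / T).

    Higher temperatures explore more uniformly; lower temperatures
    exploit the current value estimates. Subtracting the max keeps
    the exponentials from overflowing.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    r, cumulative = random.random() * total, 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            return i
    return len(weights) - 1

# Hypothetical value estimates one recommender agent might hold
# for three candidate recommendations.
choice = boltzmann_select([0.2, 0.9, 0.5], temperature=0.5)
```

As the temperature is annealed toward zero, selection concentrates on the highest-valued recommendation, which is one plausible way an agent could shift from exploring user interests to exploiting them.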
Comment on "Two Time Scales and Violation of the Fluctuation-Dissipation Theorem in a Finite Dimensional Model for Structural Glasses"
In cond-mat/0002074, Ricci-Tersenghi et al. find two linear regimes in the fluctuation-dissipation relation between density-density correlations and associated responses of the Frustrated Ising Lattice Gas. Here we show that this result does not seem to correspond to the equilibrium quantities of the model, by measuring the overlap distribution P(q) of the density and comparing the FDR expected on the basis of the P(q) with the one measured in the off-equilibrium experiments.
Comment: RevTeX, 1 page, 2 eps figures; Comment on F. Ricci-Tersenghi et al., Phys. Rev. Lett. 84, 4473 (2000).
Fermionic Molecular Dynamics for nuclear dynamics and thermodynamics
A new Fermionic Molecular Dynamics (FMD) model based on a Skyrme functional is proposed in this paper. After introducing the basic formalism, some first applications to nuclear structure and nuclear thermodynamics are presented.
Comment: 5 pages, Proceedings of the French-Japanese Symposium, September 2008. To be published in Int. J. Mod. Phys.
The Apriori Stochastic Dependency Detection (ASDD) Algorithm for Learning Stochastic Logic Rules
Apriori Stochastic Dependency Detection (ASDD) is an algorithm for fast induction of stochastic logic rules from a database of observations made by an agent situated in an environment. ASDD is based on features of the Apriori algorithm for mining association rules in large databases of sales transactions [1] and the MSDD algorithm for discovering stochastic dependencies in multiple streams of data [15]. Once these rules have been acquired, the Precedence algorithm assigns operator precedence when two or more rules matching the input data are applicable to the same output variable. These algorithms currently learn propositional rules, with future extensions aimed towards learning first-order models. We show that stochastic rules produced by this algorithm are capable of reproducing an accurate world model in a simple predator-prey environment.
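The abstract does not reproduce ASDD itself; as a rough illustration of the general idea it describes (Apriori-style enumeration of candidate conditions, with conditional probabilities estimated from observation counts), one might sketch:

```python
from collections import Counter
from itertools import combinations

def learn_stochastic_rules(observations, min_support=2):
    """Induce simple propositional stochastic rules of the form
    condition -> outcome with an estimated probability.

    Each observation is (features, outcome): a frozenset of
    propositions that held, and the proposition observed next.
    The representation and thresholds here are illustrative
    assumptions, not ASDD's actual machinery.
    """
    condition_counts = Counter()
    rule_counts = Counter()
    for features, outcome in observations:
        # Apriori-style: enumerate small feature subsets as candidates.
        for size in (1, 2):
            for cond in combinations(sorted(features), size):
                condition_counts[cond] += 1
                rule_counts[(cond, outcome)] += 1
    rules = {}
    for (cond, outcome), n in rule_counts.items():
        if n >= min_support:  # prune infrequent rules
            rules[(cond, outcome)] = n / condition_counts[cond]
    return rules

# Invented toy observations from a predator-prey-style environment.
obs = [
    (frozenset({"prey_near", "agent_moves"}), "prey_caught"),
    (frozenset({"prey_near", "agent_moves"}), "prey_caught"),
    (frozenset({"prey_near", "agent_moves"}), "prey_escapes"),
]
rules = learn_stochastic_rules(obs)
# e.g. P(prey_caught | prey_near, agent_moves) estimated as 2/3
```

The support threshold plays the role of Apriori's frequent-itemset pruning: conditions seen too rarely never become rules, which keeps the induced model small.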
Toward Automatic Verification of Multiagent Systems for Training Simulations
Advances in multiagent systems have led to their successful application in experiential training simulations, where students learn by interacting with agents who represent people, groups, structures, etc. These multiagent simulations must model the training scenario so that the students’ success is correlated with the degree to which they follow the intended pedagogy. As these simulations increase in size and richness, it becomes harder to guarantee that the agents accurately encode the pedagogy. Testing with human subjects provides the most accurate feedback, but it can explore only a limited subspace of simulation paths. In this paper, we present a mechanism for using human data to verify the degree to which the simulation encodes the intended pedagogy. Starting with an analysis of data from a deployed multiagent training simulation, we then present an automated mechanism for using the human data to generate a distribution appropriate for sampling simulation paths. By generalizing from a small set of human data, the automated approach can systematically explore a much larger space of possible training paths and verify the degree to which a multiagent training simulation adheres to its intended pedagogy.
SMART (Stochastic Model Acquisition with ReinforcemenT) learning agents: A preliminary report
We present a framework for building agents that learn using SMART, a system that combines stochastic model acquisition with reinforcement learning to enable an agent to model its environment through experience and subsequently form action selection policies using the acquired model. We extend an existing algorithm for automatic creation of stochastic STRIPS operators [9] as a preliminary method of environment modelling. We then define the process of generating future states using these operators and an initial state, and finally show the process by which the agent can use the generated states to form a policy with a standard reinforcement learning algorithm. The potential of SMART is exemplified using the well-known predator-prey scenario. Results of applying SMART to this environment and directions for future work are discussed.
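The abstract does not give SMART's planning step in detail; as one generic illustration of forming a policy from an acquired stochastic model, the sketch below runs value iteration over a hand-written toy predator-prey transition model. All states, actions, probabilities, and rewards here are invented for the example and are not taken from the paper:

```python
def plan_with_model(model, states, actions, gamma=0.9, sweeps=100):
    """Compute a greedy policy from a learned stochastic model.

    model[(s, a)] is a list of (probability, next_state, reward)
    tuples, such as might be generated by applying learned
    stochastic operators to each state.
    """
    def q(s, a, V):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in model[(s, a)])

    V = {s: 0.0 for s in states}
    for _ in range(sweeps):  # Bellman optimality backups
        V = {s: max(q(s, a, V) for a in actions) for s in states}
    return {s: max(actions, key=lambda a: q(s, a, V)) for s in states}

# Invented toy model: moving toward nearby prey catches it with
# probability 0.6 (reward 1); "caught" is absorbing.
model = {
    ("far", "move"): [(0.8, "near", 0.0), (0.2, "far", 0.0)],
    ("far", "wait"): [(1.0, "far", 0.0)],
    ("near", "move"): [(0.6, "caught", 1.0), (0.4, "far", 0.0)],
    ("near", "wait"): [(1.0, "near", 0.0)],
    ("caught", "move"): [(1.0, "caught", 0.0)],
    ("caught", "wait"): [(1.0, "caught", 0.0)],
}
policy = plan_with_model(model, ["far", "near", "caught"], ["move", "wait"])
```

Here planning is done offline over the learned model rather than by trial and error in the environment, which is the general benefit of model acquisition that the abstract points to.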
Immediate reward reinforcement learning for clustering and topology preserving mappings
We extend a reinforcement learning algorithm which has previously been shown to cluster data. Our extension involves creating an underlying latent space with some pre-defined structure, which enables us to create a topology preserving mapping. We investigate different forms of the reward function, all of which are created with the intent of merging local and global information, thus avoiding one of the major difficulties with, e.g., K-means, namely its convergence to local optima depending on the initial values of its parameters. We also show that the method is quite general and can be used with the recently developed method of stochastic weight reinforcement learning [14].
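The paper's reward functions are not reproduced in this abstract; the following is a rough sketch of the general pattern of immediate-reward reinforcement clustering on 1-D data — stochastically select a prototype, receive an immediate reward reflecting how well it matches the sample, and reinforce that selection — where the reward shape, selection rule, and parameters are all illustrative assumptions:

```python
import math
import random

def irl_cluster(data, k=2, epochs=50, lr=0.1, seed=0):
    """Cluster 1-D data with immediate-reward reinforcement learning.

    Each step stochastically picks a prototype (softmax over negative
    distances), takes an immediate reward that is high when the chosen
    prototype is close to the sample, and moves that prototype toward
    the sample in proportion to the reward.
    """
    rng = random.Random(seed)
    protos = [rng.choice(data) for _ in range(k)]
    for _ in range(epochs):
        for x in data:
            dists = [abs(x - p) for p in protos]
            m = min(dists)
            weights = [math.exp(-(d - m)) for d in dists]
            # Sample a prototype index from the softmax weights.
            r, cum = rng.random() * sum(weights), 0.0
            for i, w in enumerate(weights):
                cum += w
                if r < cum:
                    break
            reward = math.exp(-dists[i])  # immediate reward: closeness
            protos[i] += lr * reward * (x - protos[i])
    return sorted(protos)

# Two well-separated invented clusters around 0 and 5.
p1, p2 = irl_cluster([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
```

Because each update is a convex step toward a data point, the prototypes always stay inside the data's range; the stochastic selection is what distinguishes this from a plain winner-take-all K-means update.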
Evaluating the Effectiveness of Exploration and Accumulated Experience in Automatic Case Elicitation