Multi-criteria average reward reinforcement learning
Reinforcement learning (RL) is the study of systems that learn from interaction with their environment. The standard RL framework assumes a scalar reward signal that the agent aims to maximize. In many real-world situations, however, tradeoffs must be made among multiple objectives. This calls for vector-valued rewards and values, with a weight vector representing the relative importance of the objectives.
In this thesis, we consider the problem of learning in the presence of time-varying preferences among multiple objectives. Learning a new policy for every possible weight vector is wasteful. Instead, we propose a method that stores a finite number of policies, chooses an appropriate stored policy for any given weight vector, and improves upon it. The idea is that although there are infinitely many weight vectors, many of them share the same optimal policy. We demonstrate this empirically in two domains: a version of the Buridan's ass problem and network routing. We show that while learning is required for the first few weight vectors, the agent later settles on an already-learnt policy and thus converges very quickly.
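The policy-reuse idea above can be sketched as a cache of learned policies, each paired with its per-objective value vector: for a new weight vector, scalarize each stored policy's values and reuse the best one rather than relearning. This is a minimal illustration, not the thesis's algorithm; the names (`PolicyCache`, `scalarize`) and the toy value vectors are hypothetical.

```python
def scalarize(value_vec, weights):
    """Combine a vector of per-objective values into a scalar
    via preference weights (a weighted sum)."""
    return sum(v * w for v, w in zip(value_vec, weights))

class PolicyCache:
    """Stores (policy, per-objective value vector) pairs and picks
    the stored policy best suited to a given weight vector."""
    def __init__(self):
        self.policies = []

    def add(self, policy, value_vec):
        self.policies.append((policy, value_vec))

    def best_for(self, weights):
        if not self.policies:
            return None
        return max(self.policies,
                   key=lambda p: scalarize(p[1], weights))

cache = PolicyCache()
cache.add("policy_A", [1.0, 0.2])   # strong on objective 0
cache.add("policy_B", [0.1, 0.9])   # strong on objective 1

# A weight vector favouring objective 1 selects policy_B
# without any relearning.
chosen = cache.best_for([0.2, 0.8])
```

In the thesis's setting, the chosen policy would then be improved by further learning only if its scalarized value falls short, which is why later weight vectors converge quickly.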
Effective decision-theoretic assistance through relational hierarchical models
Building intelligent computer assistants has been a long-cherished goal of AI. Many intelligent assistant systems have been built and fine-tuned for specific application domains. In this work, we develop a general model of assistance that combines three powerful ideas: decision theory, hierarchical task models, and probabilistic relational languages. We use the principles of decision theory to model the general problem of intelligent assistance, and a combination of hierarchical task models and probabilistic relational languages to specify the computer assistant's prior knowledge. The assistant exploits this prior knowledge to infer the user's goals and takes actions to assist the user. We evaluate the decision-theoretic assistance model in three different domains, including a real-world domain, to demonstrate its generality. We show through experiments that both the hierarchical structure of the goals and the parameter sharing facilitated by relational models significantly improve the learning speed of the agent. Finally, we present the results of deploying our relational hierarchical model in a real-world activity recognition task.
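The core inference step described above — the assistant inferring the user's goal from observed actions before acting — can be sketched as a Bayesian posterior update over goals. This is a hedged, minimal illustration only; the goal names and action likelihoods are invented, and the full model additionally uses hierarchical task structure and relational parameter sharing.

```python
def update_posterior(prior, likelihoods):
    """Bayes update over goals given one observed user action:
    P(goal | action) is proportional to P(action | goal) * P(goal)."""
    unnorm = {g: prior[g] * likelihoods[g] for g in prior}
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

# Uniform prior over two hypothetical goals; the observed action
# is far more likely under "print_document".
prior = {"print_document": 0.5, "send_email": 0.5}
likelihoods = {"print_document": 0.8, "send_email": 0.2}

posterior = update_posterior(prior, likelihoods)

# The assistant would then choose assistive actions for the
# most probable goal (here, weighted by the posterior in full
# decision-theoretic treatment).
assist_goal = max(posterior, key=posterior.get)
```

Repeating this update after each observed action is what lets the assistant's help improve as its goal predictions sharpen.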
Exploiting prior knowledge in Intelligent Assistants - Combining relational models with hierarchies
Statistical relational models have been successfully used to model static probabilistic relationships between the entities of a domain. In this talk, we illustrate their use in a dynamic decision-theoretic setting where the task is to assist a user by inferring his intentional structure and taking appropriate assistive actions. We show that statistical relational models can succinctly express the system's prior knowledge about the user's goal-subgoal structure and tune it with experience. As the system becomes better able to predict the user's goals, it improves the effectiveness of its assistance. We show through experiments that both the hierarchical structure of the goals and the parameter sharing facilitated by relational models significantly improve the learning speed.
Relational Boosted Bandits
Contextual bandit algorithms have become essential in real-world user interaction problems in recent years. However, these algorithms rely on an attribute-value representation of context, which makes them unsuitable for real-world domains like social networks that are inherently relational. We propose Relational Boosted Bandits (RB2), a contextual bandit algorithm for relational domains based on (relational) boosted trees. RB2 enables us to learn interpretable and explainable models due to the more descriptive nature of the relational representation. We empirically demonstrate the effectiveness and interpretability of RB2 on tasks such as link prediction, relational classification, and recommendation.
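The contextual-bandit loop underlying an approach like RB2 can be sketched with pluggable per-arm reward models: RB2 fits relational boosted trees as the models, but a simple running-mean estimator stands in below so the sketch stays self-contained. This is an illustration of the generic bandit loop, not the RB2 algorithm itself; all names here are hypothetical.

```python
import random

class MeanEstimator:
    """Stand-in reward model for one arm; RB2 would use a
    relational boosted tree over the (relational) context."""
    def __init__(self):
        self.n, self.total = 0, 0.0

    def predict(self, context):
        return self.total / self.n if self.n else 0.0

    def update(self, context, reward):
        self.n += 1
        self.total += reward

def run_bandit(arms, contexts, reward_fn, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit: try each arm once,
    then mostly exploit the models' reward estimates."""
    rng = random.Random(seed)
    models = {a: MeanEstimator() for a in arms}
    pulls = []
    for ctx in contexts:
        untried = [a for a in arms if models[a].n == 0]
        if untried:
            arm = untried[0]                 # initial exploration
        elif rng.random() < epsilon:
            arm = rng.choice(arms)           # random exploration
        else:                                # exploit estimates
            arm = max(arms, key=lambda a: models[a].predict(ctx))
        r = reward_fn(arm, ctx)
        models[arm].update(ctx, r)
        pulls.append(arm)
    return models, pulls

# Arm "b" always pays more, so exploitation should favour it.
models, pulls = run_bandit(
    arms=["a", "b"],
    contexts=range(500),
    reward_fn=lambda arm, ctx: 1.0 if arm == "b" else 0.2,
)
```

Swapping `MeanEstimator` for a learned tree model over relational features is what would make the chosen arm's estimate both context-sensitive and, as the abstract argues, interpretable.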