Tree-Independent Dual-Tree Algorithms
Dual-tree algorithms are a widely used class of branch-and-bound algorithms.
Unfortunately, developing dual-tree algorithms for use with different trees and
problems is often complex and burdensome. We introduce a four-part logical
split: the tree, the traversal, the point-to-point base case, and the pruning
rule. We provide a meta-algorithm which allows development of dual-tree
algorithms in a tree-independent manner and easy extension to entirely new
types of trees. Representations are provided for five common algorithms; for
k-nearest neighbor search, this leads to a novel, tighter pruning bound. The
meta-algorithm also allows straightforward extensions to massively parallel
settings. Comment: accepted in ICML 201
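The four-part split above can be illustrated with a minimal sketch: a generic dual depth-first traversal parameterized by a point-to-point base case and a pruning rule, so that the same traversal serves any tree type. All names here (`Node`, `dual_tree`, the 1-D bounding intervals) are illustrative assumptions, not the paper's API; the example problem is a simple range count.

```python
class Node:
    """A generic tree node: points live in the node, children in 'children'."""
    def __init__(self, points=(), children=()):
        self.points = list(points)
        self.children = list(children)
        pts = self.points or [p for c in self.children for p in c.all_points()]
        self.lo, self.hi = min(pts), max(pts)   # 1-D bounding interval

    def all_points(self):
        return self.points + [p for c in self.children for p in c.all_points()]

def dual_tree(q, r, base_case, score):
    """Tree-independent dual traversal: prune, run base cases, recurse."""
    if score(q, r) is None:                     # pruning rule: skip this pair
        return
    for pq in q.points:                         # point-to-point base case
        for pr in r.points:
            base_case(pq, pr)
    if q.children or r.children:                # the traversal itself
        for cq in (q.children or [q]):
            for cr in (r.children or [r]):
                dual_tree(cq, cr, base_case, score)

# Example problem: count ordered point pairs within distance 1.0.
count = 0
def base_case(pq, pr):
    global count
    if abs(pq - pr) <= 1.0:
        count += 1

def score(q, r):
    # Prune when the node intervals are more than 1.0 apart.
    gap = max(q.lo - r.hi, r.lo - q.hi, 0.0)
    return None if gap > 1.0 else gap

left = Node(points=[0.0, 0.5])
right = Node(points=[3.0, 3.4])
root = Node(children=[left, right])
dual_tree(root, root, base_case, score)
print(count)  # → 8 (the cross pairs between the two leaves are pruned)
```

Swapping in a different `score` (e.g. one maintaining per-point nearest-neighbor bounds) changes the problem without touching the traversal, which is the point of the split.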
The Parallel Problems Server
We describe a novel architecture for a "linear algebra server" that operates in parallel on extremely large matrices. Matrices are created by the server and distributed across many machines. All operations therefore take place automatically in parallel. The server is extensible and includes a general application interface to clients. This project is motivated by three observations. First, many widely-used algorithms in Computer Science can be realized as operations on matrices. Common techniques in smart text retrieval, object recognition and machine learning can all be described in this framework. Second, it is of utmost importance to be able to test new ideas quickly in an interactive setting. Finally, in order to understand the computational issues that arise with many algorithms it is necessary to test them on very large problems. There are commonly available solutions that address this general problem, but there are several difficulties. Interactive and easi
On the Difficulty of Modular Reinforcement Learning for Real-World Partial Programming
In recent years there has been a great deal of interest in "modular reinforcement learning" (MRL). Typically, problems are decomposed into concurrent subgoals, allowing increased scalability and state abstraction. An arbitrator combines the subagents' preferences to select an action. In this work, we contrast treating an MRL agent as a set of subagents with the same goal with treating an MRL agent as a set of subagents who may have different, possibly conflicting goals. We argue that the latter is a more realistic description of real-world problems, especially when building partial programs. We address a range of algorithms for single-goal MRL, and leveraging social choice theory, we present an impossibility result for applications of such algorithms to multigoal MRL. We suggest an alternative formulation of arbitration as scheduling that avoids the assumptions of comparability of preference that are implicit in single-goal MRL. A notable feature of this formulation is the explicit codification of the tradeoffs between the subproblems. Finally, we introduce ABL, a language that encapsulates many of these ideas.
Object Focused Q-Learning for Autonomous Agents
© ACM 2013. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in AAMAS '13 Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems. We present Object Focused Q-learning (OF-Q), a novel reinforcement learning algorithm that can offer exponential speed-ups over classic Q-learning on domains composed of independent objects. An OF-Q agent treats the state space as a collection of objects organized into different object classes. Our key contribution is a control policy that uses non-optimal Q-functions to estimate the risk of ignoring parts of the state space. We compare our algorithm to traditional Q-learning and previous arbitration algorithms in two domains, including a version of Space Invaders.
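The per-object-class decomposition described above can be sketched as one Q-table per object class, each updated independently, with a simple arbitration that sums Q-values across the objects currently present. This is a hypothetical illustration only; OF-Q's actual risk-aware control policy is richer than summation, and all names here are assumptions.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9
ACTIONS = ("left", "right")

# One Q-table per object class: class -> (local_state, action) -> Q-value.
q_tables = defaultdict(lambda: defaultdict(float))

def update(obj_class, s, a, reward, s_next):
    """Standard Q-learning update, but scoped to a single object class."""
    q = q_tables[obj_class]
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += ALPHA * (reward + GAMMA * best_next - q[(s, a)])

def act(objects):
    """objects: list of (obj_class, local_state); pick the action whose
    summed Q-value across all visible objects is highest."""
    def total(a):
        return sum(q_tables[c][(s, a)] for c, s in objects)
    return max(ACTIONS, key=total)

# Tiny example: for a hypothetical 'coin' class, moving right once paid off.
update("coin", 0, "right", 1.0, 1)
print(act([("coin", 0)]))  # → 'right'
```

Because each Q-table sees only one object's local state, the tables stay small even when the full joint state space is exponential in the number of objects, which is where the claimed speed-up comes from.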
Automatic Task Decomposition and State Abstraction from Demonstration
Presented at the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012) 4-8 June 2012, Valencia, Spain. © 2012, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). Both Learning from Demonstration (LfD) and Reinforcement Learning (RL) are popular approaches for building decision-making agents. LfD applies supervised learning to a set of human demonstrations to infer and imitate the human policy, while RL uses only a reward signal and exploration to find an optimal policy. For complex tasks both of these techniques may be ineffective. LfD may require many more demonstrations than it is feasible to obtain, and RL can take an inordinate amount of time to converge. We present Automatic Decomposition and Abstraction from demonstration (ADA), an algorithm that uses mutual information measures over a set of human demonstrations to decompose a sequential decision process into several subtasks, finding state abstractions for each one of these subtasks. ADA then projects the human demonstrations into the abstracted state space to build a policy. This policy can later be improved using RL algorithms to surpass the performance of the human teacher. We find empirically that ADA can find satisficing policies for problems that are too complex to be solved with traditional LfD and RL algorithms. In particular, we show that we can use mutual information across state features to leverage human demonstrations to reduce the effects of the curse of dimensionality by finding subtasks and abstractions in sequential decision processes.
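The mutual-information idea above can be sketched as follows: score each state feature by its mutual information with the demonstrated action, and keep only the informative features as the abstraction. This is a hedged illustration under simplifying assumptions (discrete features, a single subtask); ADA's actual decomposition procedure is more involved, and the names here are invented for the example.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical I(X;Y) in nats from paired samples of discrete variables."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def abstract_features(demos, threshold=0.1):
    """demos: list of (state_tuple, action) pairs from demonstrations.
    Return the indices of features informative about the chosen action."""
    states = [s for s, _ in demos]
    actions = [a for _, a in demos]
    n_feats = len(states[0])
    return [i for i in range(n_feats)
            if mutual_information([s[i] for s in states], actions) > threshold]

# Feature 0 determines the demonstrated action; feature 1 is pure noise
# and gets abstracted away.
demos = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]
print(abstract_features(demos))  # → [0]
```

Projecting the demonstrations onto the kept features then yields a much smaller state space in which a policy can be fit and later refined with RL.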
Autonomous Nondeterministic Tour Guides: Improving Quality of Experience with TTD-MDPs
In this paper, we address the problem of building a system of autonomous tour
guides for a complex environment, such as a museum with many visitors. Visitors
may have varying preferences for types of art or may wish to visit different areas
across multiple visits. Often, these goals conflict. For example, many visitors may
wish to see the museum's most popular work, but that could cause congestion,
ruining the experience. Thus, our task is to build a set of agents that can satisfy
their visitors' goals while simultaneously providing quality experiences for all.
We use targeted trajectory distribution MDPs (TTD-MDPs), a technology developed
to guide players in an interactive entertainment setting. The solution to a
TTD-MDP is a probabilistic policy that results in a specific distribution of trajectories
through a state space. We motivate TTD-MDPs for the museum tour problem,
then describe the development of a number of models of museum visitors.
Additionally, we propose a museum model and simulate tours using personalized
TTD-MDP tour guides for each kind of visitor. We explain how the use of probabilistic
policies reduces the congestion experienced by visitors while preserving
their ability to pursue and realize goals.
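The core TTD-MDP idea, a probabilistic policy that induces a target distribution over trajectories, can be illustrated under strong simplifying assumptions (deterministic transitions, trajectories forming a tree): the policy at each prefix weights each next step by the total target mass of the complete trajectories extending it. This is a minimal sketch with invented names, not the paper's solver.

```python
from collections import defaultdict

def ttd_policy(target):
    """target: {trajectory_tuple: probability}. Return a policy mapping a
    trajectory prefix and candidate next steps to action probabilities."""
    mass = defaultdict(float)               # prefix -> target mass through it
    for traj, p in target.items():
        for i in range(len(traj) + 1):
            mass[traj[:i]] += p

    def policy(prefix, options):
        weights = [mass[prefix + (o,)] for o in options]
        total = sum(weights)
        return [w / total for w in weights]

    return policy

# Target: tour gallery A then B with prob 0.75, B then A with prob 0.25.
policy = ttd_policy({("A", "B"): 0.75, ("B", "A"): 0.25})
print(policy((), ["A", "B"]))  # → [0.75, 0.25]
```

Sampling from such a policy spreads visitors across trajectories in the target proportions, which is how congestion is reduced without forbidding any visitor's preferred route.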
From devices to tasks: . . .
One of the driving applications of ubiquitous computing is universal appliance interaction: the ability to use arbitrary mobile devices to interact with arbitrary appliances, such as TVs, printers, and lights. Because of limited screen real estate and the plethora of devices and commands available to the user, a central problem in achieving this vision is predicting which appliances and devices the user wishes to use next in order to make interfaces for those devices available. We believe that universal appliance interaction is best supported through the deployment of appliance user interfaces (UIs) that are personalized to a user's habits and information needs. In this paper, we suggest that, in a truly ubiquitous computing environment, the user will not necessarily think of devices as separate entities; therefore, rather than focus on which device the user may want to use next, we present a method for automatically discovering the user's common tasks (e.g., watching a movie, or surfing TV channels), predicting the task that the user wishes to engage in, and generating an appropriate interface that spans multiple devices. We have several results. We show that it is possible to discover and cluster collections of commands that represent tasks and to use history to predict the next task reliably. In fact, we show that moving from devices to tasks is not only a useful way of representing our core problem, but that it is, in fact, an easier problem to solve. Finally, we show that tasks can vary from user to user
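History-based next-task prediction as described above can be sketched with a first-order Markov model over discovered tasks: count observed task transitions and predict the most frequent successor of the current task. The task labels and the `TaskPredictor` API are illustrative assumptions, not the paper's method.

```python
from collections import Counter, defaultdict

class TaskPredictor:
    """Predict the next task from a first-order model of task transitions."""
    def __init__(self):
        self.counts = defaultdict(Counter)   # prev task -> next-task counts

    def observe(self, history):
        """history: chronological list of task labels for one user."""
        for prev, nxt in zip(history, history[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, current_task):
        nxt = self.counts[current_task]
        return nxt.most_common(1)[0][0] if nxt else None

p = TaskPredictor()
p.observe(["watch_movie", "surf_channels", "watch_movie",
           "surf_channels", "lights_off"])
print(p.predict("watch_movie"))  # → 'surf_channels'
```

Since the model is trained per user, it naturally captures the paper's observation that tasks, and their sequencing, vary from user to user.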
Reinforcement Learning for Declarative Optimization-Based Drama Management
A long-standing challenge in interactive entertainment is the creation of story-based games with dynamically responsive story-lines. Such games are populated by multiple objects and autonomous characters, and must provide a coherent story experience while giving the player freedom of action. To maintain coherence, the game author must provide for modifying the world in reaction to the player's actions, directing agents to act in particular ways (overriding or modulating their autonomy), or causing inanimate objects to reconfigure themselves "behind the player's back". Declarativ