185 research outputs found
Learning Representations in Model-Free Hierarchical Reinforcement Learning
Common approaches to Reinforcement Learning (RL) are seriously challenged by
large-scale applications involving huge state spaces and sparse delayed reward
feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address
this scalability issue by learning action selection policies at multiple levels
of temporal abstraction. Abstraction can be achieved by identifying a relatively
small set of states that are likely to be useful as subgoals, in concert with
the learning of corresponding skill policies to achieve those subgoals. Many
approaches to subgoal discovery in HRL depend on the analysis of a model of the
environment, but the need to learn such a model introduces its own problems of
scale. Once subgoals are identified, skills may be learned through intrinsic
motivation, introducing an internal reward signal marking subgoal attainment.
In this paper, we present a novel model-free method for subgoal discovery using
incremental unsupervised learning over a small memory of the most recent
experiences (trajectories) of the agent. When combined with an intrinsic
motivation learning mechanism, this method learns both subgoals and skills,
based on experiences in the environment. Thus, we offer an original approach to
HRL that does not require the acquisition of a model of the environment,
suitable for large-scale applications. We demonstrate the efficiency of our
method on two RL problems with sparse delayed feedback: a variant of the rooms
environment and the first screen of the ATARI 2600 Montezuma's Revenge game
Massively parallel support for a case-based planning system
Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases
Planning And Scheduling For Large-scale Distributed Systems
Many applications require computing resources well beyond those available on any single system. Simulations of atomic and subatomic systems with application to material science, computations related to the study of natural sciences, and computer-aided design are examples of applications that can benefit from the resource-rich environment provided by a large collection of autonomous systems interconnected by high-speed networks. To transform such a collection of systems into a user's virtual machine, we have to develop new algorithms for coordination, planning, scheduling, resource discovery, and other functions that can be automated. Then we can develop societal services based upon these algorithms, which hide the complexity of the computing system for users. In this dissertation, we address the problem of planning and scheduling for large-scale distributed systems. We discuss a model of the system, analyze the need for planning, scheduling, and plan switching to cope with a dynamically changing environment, present algorithms for the three functions, report the simulation results to study the performance of the algorithms, and introduce an architecture for an intelligent large-scale distributed system
An adaptive deductive planning system
A generic planning system is introduced which allows for custom building of planners able to generate plans for different plan consumers in the context of intelligent support systems. All planners are adapted to the peculiarities of different plan consumers, to their domain knowledge, their typical behavior, their preferences, and their utilization of plans. The necessary knowledge sources of the generic planner are fixed in order to enable it to produce plans of a certain specificity. Its control strategy is described in a formal specification language containing constructs which allow for the configuration of characteristic parts of the control strategy. The customized planners are defined by executable specifications. An application of the approach to deductive planning based on a modal temporal logic is shown. It is demonstrated in an example how needs of different plan consumers in an intelligent help system can be met by a deductive planner
Multi-agent planning using an abductive event calculus
Temporal reasoning within distributed Artificial Intelligence Systems is faced with the problem of concurrent streams of action. Well-known logic-based systems using the SITUATION CALCULUS solve the frame problem in a purely linear manner. Recent research, however, has revealed that the EVENT CALCULUS under the abduction principle is capable of nonlinear planning. In this report, we present a planning service module which incorporates this approach into a constraint logic framework and even allows a notion of strong nonlinearity. The work includes the axiomatisation of appropriate versions of the EVENT CALCULUS, the development of a suitably sound and complete proof procedure that supports abduction, and the implementation of both of these layers on the constraint platform OZ. We demonstrate prototypically how this module, EVE, can be integrated into an existing multi-agent architecture and evaluate the behaviour of such agents within an application domain, the loading dock scenario
Storing and Indexing Plan Derivations through Explanation-based Analysis of Retrieval Failures
Case-Based Planning (CBP) provides a way of scaling up domain-independent
planning to solve large problems in complex domains. It replaces the detailed
and lengthy search for a solution with the retrieval and adaptation of previous
planning experiences. In general, CBP has been demonstrated to improve
performance over generative (from-scratch) planning. However, the performance
improvements it provides are dependent on adequate judgements as to problem
similarity. In particular, although CBP may substantially reduce planning
effort overall, it is subject to a mis-retrieval problem. The success of CBP
depends on these retrieval errors being relatively rare. This paper describes
the design and implementation of a replay framework for the case-based planner
DERSNLP+EBL. DERSNLP+EBL extends current CBP methodology by incorporating
explanation-based learning techniques that allow it to explain and learn from
the retrieval failures it encounters. These techniques are used to refine
judgements about case similarity in response to feedback when a wrong decision
has been made. The same failure analysis is used in building the case library,
through the addition of repairing cases. Large problems are split and stored as
single goal subproblems. Multi-goal problems are stored only when these smaller
cases fail to be merged into a full solution. An empirical evaluation of this
approach demonstrates the advantage of learning from experienced retrieval
failure. Comment: See http://www.jair.org/ for any accompanying file
Decision Representation Language (DRL) and Its Support Environment
In this report, I describe a language, called Decision Representation Language (DRL), for representing the qualitative aspects of decision making processes such as the alternatives being evaluated, goals to satisfy, and the arguments evaluating the alternatives. Once a decision process is represented in this language, the system can provide a set of services that support people making the decision. These services, together with the interface such as the object and the different presentation formats, form the support environment for using the language. I describe the services that have so far been identified as useful: the management of dependency, plausibility, viewpoints, and precedents. I also discuss how this work on DRL is related to other studies on decision making. MIT Artificial Intelligence Laboratory
A foundation for machine learning in design
This paper presents a formalism for considering the issues of learning in design. A foundation for machine learning in design (MLinD) is defined so as to provide answers to basic questions on learning in design, such as, "What types of knowledge can be learnt?", "How does learning occur?", and "When does learning occur?". Five main elements of MLinD are presented as the input knowledge, knowledge transformers, output knowledge, goals/reasons for learning, and learning triggers. Using this foundation, published systems in MLinD were reviewed. The systematic review presents a basis for validating the presented foundation. The paper concludes that there is considerable work to be carried out in order to fully formalize the foundation of MLinD
Landmark-based approaches for goal recognition as planning
This article is a revised and extended version of two papers published at AAAI 2017 (Pereira et al., 2017b) and ECAI 2016 (Pereira and Meneguzzi, 2016). We thank the anonymous reviewers that helped improve the research in this article. The authors thank Shirin Sohrabi for discussing the way in which the algorithms of Sohrabi et al. (2016) should be configured, and Yolanda Escudero-Martín for providing code for the approach of E-Martín et al. (2015) and engaging with us. We also thank Miquel Ramírez and Mor Vered for various discussions, and Andre Grahl Pereira for a discussion of properties of our algorithm. Felipe thanks CNPq for partial financial support under its PQ fellowship, grant number 305969/2016-1. Peer reviewed. Postprint
Cognitive Modeling for Computer Animation: A Comparative Review
Cognitive modeling is a provocative new paradigm that paves the way towards intelligent graphical characters by providing them with logic and reasoning skills. Cognitively empowered self-animating characters will see widespread use in the near future in the interactive game, multimedia, virtual reality and production animation industries. This review covers three recently published papers from the field of cognitive modeling for computer animation. The approaches and techniques employed are very different. The cognition model in the first paper is built on top of Soar, which is intended as a general cognitive architecture for developing systems that exhibit intelligent behaviors. The second paper uses an active plan tree and a plan library to achieve fast and robust reactivity to changes in the environment. The third paper, based on an AI formalism known as the situation calculus, develops a cognitive modeling language called CML and uses it to specify a behavior outline or sketch plan to direct the characters in terms of goals. Instead of presenting each paper in isolation and then analyzing them comparatively, we take a top-down approach by first classifying the field into three different categories and then attempting to put each paper into a proper category. We hope that this provides a more cohesive, systematic view of the cognitive modeling approaches employed in computer animation