3,295 research outputs found

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning. Comment: See http://www.jair.org/ for any accompanying file

    The Effective Design of Managerial Incentive Systems: Combining Theoretical Principles and Practical Trade-offs

    On the use of theoretical developments in agency economics for the practical design of incentive- and performance-based compensation systems. Keywords: managerial compensation; principal-agent theory; incentive and performance based compensation systems

    Collected notes from the Benchmarks and Metrics Workshop

    In recent years there has been a proliferation of proposals in the artificial intelligence (AI) literature for integrated agent architectures. Each architecture offers an approach to the general problem of constructing an integrated agent. Unfortunately, the ways in which one architecture might be considered better than another are not always clear. There has been a growing realization that many of the positive and negative aspects of an architecture become apparent only when experimental evaluation is performed and that to progress as a discipline, we must develop rigorous experimental methods. In addition to the intrinsic intellectual interest of experimentation, rigorous performance evaluation of systems is also a crucial practical concern to our research sponsors. DARPA, NASA, and AFOSR (among others) are actively searching for better ways of experimentally evaluating alternative approaches to building intelligent agents. One tool for experimental evaluation involves testing systems on benchmark tasks in order to assess their relative performance. As part of a joint DARPA and NASA funded project, NASA-Ames and Teleos Research are carrying out a research effort to establish a set of benchmark tasks and evaluation metrics by which the performance of agent architectures may be determined. As part of this project, we held a workshop on Benchmarks and Metrics at the NASA Ames Research Center on June 25, 1990. The objective of the workshop was to foster early discussion on this important topic. We did not achieve a consensus, nor did we expect to. Collected here is some of the information that was exchanged at the workshop. Given here is an outline of the workshop, a list of the participants, notes taken on the white-board during open discussions, position papers/notes from some participants, and copies of slides used in the presentations

    The optimal management of research portfolios

    Risky research projects are, other things being equal, intrinsically harder to monitor than projects that are less risky. It is shown using agency theory that a standard cost-benefit analysis, which ignores the agency problem, will introduce a bias towards excessively risky projects, and it will underestimate the benefits from complementary investments in libraries, scientific equipment and other expenditures that increase the productivity of scientists. Research managers should be risk-averse in their choice of projects, and they should aim to hold a balanced portfolio of projects. The nature of this portfolio problem is, however, quite different from the portfolio management problem in financial markets. Keywords: Research and Development/Tech Change/Emerging Technologies

    The agent architecture InteRRaP: concept and application

    One of the basic questions of research in Distributed Artificial Intelligence (DAI) is how agents have to be structured and organized, and what functionalities they need in order to be able to act and to interact in a dynamic environment. Models and architectures for autonomous and intelligent agents are meant to address this question. In the first part of this report, InteRRaP, an agent architecture for multi-agent systems, is presented. The basic idea is to combine the use of patterns of behaviour with planning facilities in order to exploit the advantages of both the reactive, behaviour-based paradigm and the deliberate, plan-based paradigm. Patterns of behaviour allow an agent to react flexibly to changes in its environment. More sophisticated tasks are considered to require, in addition, the ability to devise plans deliberately. A further important feature of the model is that it explicitly represents knowledge and strategies for cooperation. This makes it suitable for describing high-level interaction among autonomous agents. In the second part of the report, the loading-dock domain is presented, which is the first application in which the InteRRaP agent model has been tested. An automated loading dock is described where the agent society consists of forklifts which have to load and unload trucks in a shared, dynamic environment

    Towards responsive Sensitive Artificial Listeners

    This paper describes work in the recently started project SEMAINE, which aims to build a set of Sensitive Artificial Listeners – conversational agents designed to sustain an interaction with a human user despite limited verbal skills, through robust recognition and generation of non-verbal behaviour in real-time, both when the agent is speaking and listening. We report on data collection and on the design of a system architecture in view of real-time responsiveness

    Robot graphic simulation testbed

    The objective of this research was twofold. First, the basic capabilities of ROBOSIM (graphical simulation system) were improved and extended by taking advantage of advanced graphic workstation technology and artificial intelligence programming techniques. Second, the scope of the graphic simulation testbed was extended to include general problems of Space Station automation. Hardware support for 3-D graphics and high processing performance make high resolution solid modeling, collision detection, and simulation of structural dynamics computationally feasible. The Space Station is a complex system with many interacting subsystems. Design and testing of automation concepts demand modeling of the affected processes, their interactions, and that of the proposed control systems. The automation testbed was designed to facilitate studies in Space Station automation concepts

    Complexity and Organizational Architecture

    This paper revisits the literature on modelling organizations by means of networks of agents. Individual agents are engaged in screening projects, and architectural features of organizations, that is, how each agent's decision combines with those of others, affect the organization's screening performance. It emphasizes how an organization of several agents may improve upon individual performance by a suitable arrangement of the flow of decisions. The paper is motivated, in part, by a theorem due to Von Neumann, Moore and Shannon on how to build reliable networks using unreliable components and extends previous contributions by Sah and Stiglitz by recasting their original model in standard firm-theoretic terms and endogenizing its features. For an organization's screening performance to improve over that of an individual, it must be sigmoid in individual performance, as measured by the probability that a good (bad) project be accepted (rejected). This is indeed the case for organizations with mixed Sah-Stiglitz architectures: hierarchies made up of components that are polyarchies, and polyarchies made up of components that are hierarchies, give rise to such functions. This property is in turn critical for determining the optimal number of levels of a hierarchy, and for endogenizing individual screening performance. The models are extended to allow for individuals' own screening to be influenced by the opinions of superiors and subordinates. The paper examines the implications of such interactions for the limits to organizational performance. Keywords: organizations, architecture, complexity, composition
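
    The sigmoid property mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook Sah-Stiglitz setup, which the abstract itself does not spell out: screeners act independently and each accepts a project with the same probability p, a polyarchy accepts when at least one member accepts, and a hierarchy accepts only when every level accepts. The function names are hypothetical and chosen for illustration only.

```python
def polyarchy(p: float, n: int) -> float:
    """Acceptance probability when any one of n independent screeners suffices.

    Assumption: each screener accepts with the same probability p.
    """
    return 1.0 - (1.0 - p) ** n


def hierarchy(p: float, n: int) -> float:
    """Acceptance probability when all n independent levels must accept."""
    return p ** n


def hierarchy_of_polyarchies(p: float, levels: int, width: int) -> float:
    """Mixed architecture: a hierarchy whose components are polyarchies."""
    return hierarchy(polyarchy(p, width), levels)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        mixed = hierarchy_of_polyarchies(p, levels=2, width=2)
        print(f"p={p:.1f}  hierarchy={hierarchy(p, 2):.4f}  "
              f"polyarchy={polyarchy(p, 2):.4f}  mixed={mixed:.4f}")
```

    Under these assumptions, a pure hierarchy always lowers the acceptance probability and a pure polyarchy always raises it, whereas the mixed form pushes low values of p down and high values up. That S-shape is what lets an organization accept good projects and reject bad ones more reliably than a single screener.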

    Human Management of the Hierarchical System for the Control of Multiple Mobile Robots

    In order to take advantage of autonomous robotic systems, and yet ensure successful completion of all feasible tasks, we propose a mediation hierarchy in which an operator can interact at all system levels. Robotic systems are not robust in handling un-modeled events. Reactive behaviors may be able to guide the robot back into a modeled state and to continue. Reasoning systems may simply fail. Once a system has failed it is difficult to re-start the task from the failed state. Rather, the rule base is revised, programs altered, and the task re-tried from the beginning