31 research outputs found

    The Murray Ledger and Times, September 25, 1987

    The Optimal Reward Problem: Designing Effective Reward for Bounded Agents.

    In the field of reinforcement learning, agent designers build agents which seek to maximize reward. In standard practice, one reward function serves two purposes. It is used to evaluate the agent and is used to directly guide agent behavior in the agent's learning algorithm. This dissertation makes four main contributions to the theory and practice of reward function design. The first is a demonstration that if an agent is bounded (that is, limited in its ability to maximize expected reward), the designer may benefit by considering two reward functions. A designer reward function is used to evaluate the agent, while a separate agent reward function is used to guide agent behavior. The designer can then solve the Optimal Reward Problem (ORP): choose the agent reward function which leads to the greatest expected reward for the designer. The second contribution is the demonstration through examples that good reward functions are chosen by assessing an agent's limitations and how they interact with the environment. An agent which maintains knowledge of the environment in the form of a Bayesian posterior distribution, but lacks adequate planning resources, can be given a reward proportional to the variance of the posterior, resulting in provably efficient exploration. An agent with poor modeling assumptions can be punished for visiting the areas of the state space it has trouble modeling, resulting in better performance. The third contribution is the Policy Gradient for Reward Design (PGRD) algorithm, a convergent gradient ascent algorithm for learning good reward functions. Experiments in multiple environments demonstrate that using PGRD for reward optimization yields better agents than using the designer's reward directly as the agent's reward. It also outperforms the use of an evaluation function at the leaf-states of the planning tree. Finally, this dissertation shows that the ORP differs from the popular work on potential-based reward shaping.
Shaping rewards are constrained by properties of the environment and the designer's reward function, but they generally are defined irrespective of properties of the agent. The best shaping reward functions are suboptimal for some agents and environments. Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89705/1/jdsorg_1.pd

    Learning plan networks in conversational video games

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2007. Includes bibliographical references (p. 121-123). We look forward to a future where robots collaborate with humans in the home and workplace, and virtual agents collaborate with humans in games and training simulations. A representation of common ground for everyday scenarios is essential for these agents if they are to be effective collaborators and communicators. Effective collaborators can infer a partner's goals and predict future actions. Effective communicators can infer the meaning of utterances based on semantic context. This thesis introduces a computational cognitive model of common ground called a Plan Network. A Plan Network is a statistical model that provides representations of social roles, object affordances, and expected patterns of behavior and language. I describe a methodology for unsupervised learning of a Plan Network using a multiplayer video game, visualization of this network, and evaluation of the learned model with respect to human judgment of typical behavior. Specifically, I describe learning the Restaurant Plan Network from data collected from over 5,000 players of an online game called The Restaurant Game. By Jeffrey David Orkin. S.M.

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items that are to be performed for every step. The discussion also covers language and tool support and challenges arising from the transformation.

    State of New Hampshire. Reports, 1907-1908, volume IV.- Biennial

    Sometimes issued both annually and biennially; Each vol. contains the reports of various departments of the government of the state of New Hampshire; Includes attorneys general's opinion

    Hands-on Science. Advancing Science. Improving Education

    This book aims to contribute to the advancement of Science, to the improvement of Science Education, and to an effective implementation of sound, widespread scientific literacy at all levels of society. Its chapters bring together a variety of diverse and valuable works presented in this line of thought at the 15th International Conference on Hands-on Science “Advancing Science. Improving Education”

    Security Analysis of System Behaviour - From "Security by Design" to "Security at Runtime" -

    The Internet today provides the environment for novel applications and processes which may evolve way beyond pre-planned scope and purpose. Security analysis is growing in complexity with the increase in functionality, connectivity, and dynamics of current electronic business processes. Technical processes within critical infrastructures also have to cope with these developments. To tackle the complexity of the security analysis, the application of models is becoming standard practice. However, model-based support for security analysis is not only needed in pre-operational phases but also during process execution, in order to provide situational security awareness at runtime. This cumulative thesis provides three major contributions to modelling methodology. Firstly, this thesis provides an approach for model-based analysis and verification of security and safety properties in order to support fault prevention and fault removal in system design or redesign. Furthermore, some construction principles for the design of well-behaved scalable systems are given. The second topic is the analysis of the exposition of vulnerabilities in the software components of networked systems to exploitation by internal or external threats. This kind of fault forecasting allows the security assessment of alternative system configurations and security policies. Validation and deployment of security policies that minimise the attack surface can now improve fault tolerance and mitigate the impact of successful attacks. Thirdly, the approach is extended to runtime applicability. An observing system monitors an event stream from the observed system with the aim to detect faults - deviations from the specified behaviour or security compliance violations - at runtime. Furthermore, knowledge about the expected behaviour given by an operational model is used to predict faults in the near future. Building on this, a holistic security management strategy is proposed. 
The architecture of the observing system is described and the applicability of model-based security analysis at runtime is demonstrated utilising processes from several industrial scenarios. The results of this cumulative thesis are provided by 19 selected peer-reviewed papers.

    Introductory Computer Forensics

    INTERPOL (International Police) built cybercrime programs to keep up with emerging cyber threats, and aims to coordinate and assist international operations for fighting crimes involving computers. Although significant international efforts are being made in dealing with cybercrime and cyber-terrorism, finding effective, cooperative, and collaborative ways to deal with complicated cases that span multiple jurisdictions has proven difficult in practice.

    Annual Report
