
    Evaluation Report: 2010-2015 Social Development Strategy

    In 2010, Centraide of Greater Montreal adopted a strategy to fight poverty and social exclusion that was firmly grounded in a territorial approach. The hypothesis: Centraide could generate better outcomes by prioritizing investment targets and applying strategic and proactive investment approaches across a neighbourhood instead of considering agencies in isolation. Five years later, an evaluation was conducted in six communities to observe changes and draw lessons from this approach. Instead of just increasing resources so that agencies can do more, the territorial approach helped Centraide create a plan for each neighbourhood and use available vectors to support the desired improvements. These vectors include:
    * Support for solid and dynamic agencies that provide leadership in their communities.
    * Support for multi-network and intersectoral coordination so that communities can implement solutions that have the greatest chance of reducing and mitigating the impact of poverty and social exclusion.
    * Reinforcement of agency skills and leadership.
    * Ongoing relationships between organizations, mobilization initiatives and Centraid

    Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning

    Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback is the time needed to satisfactorily learn a task, especially in large state-action spaces. To address this issue, interactive reinforcement learning proposes the use of externally sourced information to speed up the learning process. Up to now, different information sources have been used to give advice to the learner agent, among them human-sourced advice. When interacting with a learner agent, humans may provide either evaluative or informative advice. From the agent's perspective, these styles of interaction are commonly referred to as reward-shaping and policy-shaping respectively. Evaluative advice requires the human to provide feedback on the prior action performed, while informative advice requires the human to advise on the best action to select for a given situation. Prior research has focused on the effect of human-sourced advice on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent while reducing the engagement with the human. This work presents an experimental setup for a human trial designed to compare the methods people use to deliver advice in terms of human engagement. The results show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach has indicated that the agent's ability to follow the advice is higher, and therefore they feel their own advice to be of higher accuracy when compared to people providing evaluative advice. Comment: 33 pages, 15 figures

    Regret Bounds for Reinforcement Learning with Policy Advice

    In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with policy advice (RLPA) algorithm which leverages this input set and learns to use the best policy in the set for the reinforcement learning task at hand. We prove that RLPA has a sub-linear regret of \tilde O(\sqrt{T}) relative to the best input policy, and that both this regret and its computational complexity are independent of the size of the state and action space. Our empirical simulations support our theoretical analysis. This suggests RLPA may offer significant advantages in large domains where some prior good policies are provided

    California four cities program, 1971 - 1973

    A pilot project in aerospace-to-urban technology application is reported. Companies assigned senior engineering professionals to serve as Science and Technology Advisors to participating city governments. Technical support was provided by the companies and JPL. The cities, Anaheim, Fresno, Pasadena, and San Jose, California, provided the working environment and general service support. Each city/company team developed and carried out one or more technical or management pilot projects together with a number of less formalized technology efforts and studies. An account and evaluation are provided of the initial two-year phase of the program.

    Affinity-Based Reinforcement Learning : A New Paradigm for Agent Interpretability

    The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities. Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.

    Chapter 6 - Empowerment Programming: Case Study of How Intentionality and Consideration Create Breakthrough Elevating Graduate Programs

    Administrators in the upper echelons of higher education face an array of dilemmas that impact and inform institutional priorities around how to serve various student populations best. Chief among those considerations is how to empower historically disenfranchised students toward a deeply substantive experience that inspires them intellectually and involves them in areas of social justice. This chapter provides an explanatory case study of a successful program launched by two vice presidents of a small, Predominately White Institution (PWI) in rural Kansas. It shows how deeply impactful outcomes for black male students can be achieved through intentional Elevating Educational Intentional Practice Programs. The case study explores the “how” and “why” and offers insights for sustained future programming

    USDLA: An Instructional Media Selection Guide For Distance Learning

    Purpose and Use of the Media Selection Guide. Increasingly, educators and trainers are challenged within their respective organizations to provide for the efficient distribution of instructional content using instructional media. The appropriate selection of instructional media to support distance learning is not intuitive and does not occur as a matter of personal preference. On the contrary, instructional media selection is a systematic sequence of qualitative processes based on sound instructional design principles. Although media selection is often mentioned when studying the discipline of instructional technology or Instructional Systems Design (ISD), it is sometimes overlooked when applying the selection process in a distance-learning environment. It is our intent, therefore, for this guide to highlight the essentials of good media selection. We hope to present an instructionally sound and systematic approach to selecting the most appropriate media for the delivery of content at a distance.