Task Transfer by Preference-Based Cost Learning

Author: Huang Wenbing
Jing Mingxuan
Liu Huaping
Ma Xiaojian
Sun Fuchun
Publication date: 18/02/2019

The goal of task transfer in reinforcement learning is to migrate the action policy of an agent from the source task to the target task. Given their successes on robotic action planning, current methods mostly rely on two requirements: exactly-relevant expert demonstrations or an explicitly-coded cost function on the target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework in which expert preference is applied as guidance. In particular, we alternate the following two steps. First, experts apply pre-defined preference rules to select related expert demonstrations for the target task. Second, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL, and generate more trajectories from the learned target distribution for the next preference selection. Theoretical analyses of the distribution learning and the convergence of the proposed algorithm are provided. Extensive simulations on several benchmarks have been conducted to further verify the effectiveness of the proposed method. Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed equally to this work.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Time-Varying Priority Queuing Models for Human Dynamics

Author: G. K. Zipf
G. S. Becker
Hang-Hyun Jo
I. Ajzen
J. W. Weibull
Kimmo Kaski
Raj Kumar Pan
Publication venue: 'American Physical Society (APS)'
Publication date: 02/05/2012

Queuing models provide insight into the temporal inhomogeneity of human dynamics, characterized by the broad distribution of waiting times of individuals performing tasks. We study the queuing model of an agent trying to execute a task of interest, the priority of which may vary with time due to the agent's "state of mind." However, its execution is disrupted by other tasks of random priorities. By considering the priority of the task of interest either decreasing or increasing algebraically in time, we analytically obtain and numerically confirm the bimodal and unimodal waiting time distributions with power-law decaying tails, respectively. These results are also compared to the updating time distribution of papers on arXiv.org and the processing time distribution of papers in Physical Review journals. Our analysis helps to understand human task execution in a more realistic scenario. Comment: 8 pages, 6 figures
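
The queuing setup described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: time is discrete, one competing task with a priority drawn uniformly from [0, 1) arrives per step, and the task of interest is executed at the first step where its priority exceeds the competitor's. The function names and the specific priority forms `(t + 1) ** -gamma` and `1 - (t + 1) ** -gamma` are hypothetical choices for "algebraically decreasing" and "algebraically increasing" priorities.

```python
import random


def waiting_time(priority, rng, max_steps=10_000):
    """Return the first step t >= 1 at which the task of interest runs.

    priority: function t -> priority of the task of interest at step t.
    Returns None if the task is still waiting after max_steps.
    """
    for t in range(1, max_steps + 1):
        competitor = rng.random()  # assumed: uniform priority in [0, 1)
        if priority(t) > competitor:
            return t
    return None


def decreasing(t, gamma=1.0):
    # Hypothetical algebraically decreasing priority.
    return (t + 1) ** (-gamma)


def increasing(t, gamma=1.0):
    # Hypothetical algebraically increasing priority.
    return 1.0 - (t + 1) ** (-gamma)


rng = random.Random(0)
dec_samples = [waiting_time(decreasing, rng) for _ in range(1000)]
inc_samples = [waiting_time(increasing, rng) for _ in range(1000)]
```

A histogram of `dec_samples` and `inc_samples` on logarithmic axes is one way to inspect the tails of the two waiting time distributions in this toy version.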

arXiv.org e-Print Archive

Crossref

Procrastination with variable present bias

Author: Gravin Nick
Immorlica Nicole
Lucier Brendan
Pountourakis Emmanouil
Publication date: 09/06/2016

Individuals working towards a goal often exhibit time-inconsistent behavior, making plans and then failing to follow through. One well-known model of such behavioral anomalies is present-bias discounting: individuals over-weight present costs by a bias factor. This model explains many time-inconsistent behaviors, but can make stark predictions in many settings: individuals either follow the most efficient plan for reaching their goal or procrastinate indefinitely. We propose a modification in which the present-bias parameter can vary over time, drawn independently each step from a fixed distribution. Following Kleinberg and Oren (2014), we use a weighted task graph to model task planning, and measure the cost of procrastination as the relative expected cost of the chosen path versus the optimal path. We use a novel connection to optimal pricing theory to describe the structure of the worst-case task graph for any present-bias distribution. We then leverage this structure to derive conditions on the bias distribution under which the worst-case ratio is exponential (in time) or constant. We also examine conditions on the task graph that lead to improved procrastination ratios: graphs with a uniformly bounded distance to the goal, and graphs in which the distance to the goal monotonically decreases on any path. Comment: 19 pages, 2 figures. To appear in the 17th ACM Conference on Economics and Computation (EC 2016)
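
The task-graph model in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes an acyclic graph stored as a dict of dicts, and an agent that at each node draws a bias `b` and picks the outgoing edge minimizing `b * edge_cost + shortest_distance_to_goal`, which is the usual reading of present-bias planning on a weighted task graph. The names `dist_to_goal`, `walk`, and the example graph are hypothetical.

```python
import heapq


def dist_to_goal(graph, goal):
    """Shortest distance from every node to goal (Dijkstra on reversed edges)."""
    reverse = {}
    for u, edges in graph.items():
        for v, cost in edges.items():
            reverse.setdefault(v, {})[u] = cost
    dist = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, cost in reverse.get(v, {}).items():
            nd = d + cost
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist


def walk(graph, start, goal, draw_bias):
    """Simulate a present-biased agent; assumes an acyclic graph.

    Returns (cost actually paid, ratio of that cost to the optimal cost).
    """
    dist = dist_to_goal(graph, goal)
    node, total = start, 0.0
    while node != goal:
        b = draw_bias()  # bias redrawn independently at every step
        edges = graph[node]
        nxt = min(edges, key=lambda w: b * edges[w] + dist.get(w, float("inf")))
        total += edges[nxt]
        node = nxt
    return total, total / dist[start]


# Toy graph: finish now for cost 4, or pay 1 to postpone and then pay 4.
graph = {"s": {"a": 1.0, "t": 4.0}, "a": {"t": 4.0}, "t": {}}
unbiased = walk(graph, "s", "t", lambda: 1.0)
biased = walk(graph, "s", "t", lambda: 2.0)
```

In this toy graph an unbiased agent takes the direct edge, while an agent with bias 2 postpones and pays 5 instead of 4. Passing a function such as `lambda: random.choice([1.0, 2.0])` as `draw_bias` gives the variable-bias variant described in the abstract.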

arXiv.org e-Print Archive

Crossref

Chemical industry supply chain optimisation using agent-based modelling

Author: Dyntar Jakub
Škvor Jan
Publication venue: 'Univerzita Pardubice'
Publication date: 01/01/2013

In this paper we present an application of the Supply Chain Spread Sheet Simulator (SCSS) to a task dealing with chemical industry supply chain redesign and optimisation. SCSS uses principles of Agent-Based Modelling, combining 4 types of agents with 3 algorithms to control their behaviour. A Location Algorithm is used to place the logistics objects satisfying the demand of customers, Clarke & Wright's Savings Algorithm is applied to plan the routes, and Past Stock Movement Simulation is used to control the stock levels. SCSS is developed in MS Excel using the programming language Visual Basic for Applications. Its basic functionality is discussed by simulating a real task dealing with the redesign of the distribution system for goods coming from the chemical industry in the Czech Republic. We test 6 different structures of the distribution system, differing in the number of located logistics objects, ranging from 1 to 6. Based on the outputs of SCSS recalculated to distribution costs, we suggest decreasing the number of located warehouses from 6 to 1, with estimated savings of almost 33% in distribution costs per year.
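
As an illustration of the route-planning step named in this abstract, the sketch below computes the pairwise savings that the Clarke & Wright heuristic ranks: for a depot 0 and customers i and j, the saving is d(0, i) + d(0, j) - d(i, j). It is written in Python rather than the VBA used by SCSS, assumes straight-line distances between made-up coordinates, and covers only the savings list, not the subsequent merging of routes. The function name `savings_list` is hypothetical.

```python
from itertools import combinations
from math import dist


def savings_list(depot, customers):
    """Return (saving, i, j) tuples sorted from largest to smallest saving.

    depot: (x, y) coordinates; customers: dict name -> (x, y).
    Euclidean distance is an assumption made for this sketch.
    """
    savings = []
    for (i, pi), (j, pj) in combinations(customers.items(), 2):
        s = dist(depot, pi) + dist(depot, pj) - dist(pi, pj)
        savings.append((s, i, j))
    return sorted(savings, reverse=True)


# Made-up coordinates, for illustration only.
depot = (0.0, 0.0)
customers = {"A": (3.0, 0.0), "B": (3.0, 4.0), "C": (-3.0, 0.0)}
ranked = savings_list(depot, customers)
```

Here the pair A and B comes first because serving both on one route saves the most travel, while A and C lie on opposite sides of the depot and yield no saving.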

UPCE Digital Library