Task Transfer by Preference-Based Cost Learning
The goal of task transfer in reinforcement learning is to migrate the action
policy of an agent from the source task to the target task. Given their
successes in robotic action planning, current methods mostly rely on two
requirements: exactly relevant expert demonstrations or an explicitly coded
cost function for the target task, both of which, however, are inconvenient to
obtain in practice. In this paper, we relax these two strong conditions by
developing a novel task transfer framework where the expert preference is
applied as guidance. In particular, we alternate between the following two steps.
First, experts apply predefined preference rules to select related
expert demonstrations for the target task. Second, based on the selection
result, we learn the target cost function and trajectory distribution
simultaneously via enhanced Adversarial MaxEnt IRL and generate more
trajectories from the learned target distribution for the next preference
selection. Theoretical analyses of the distribution learning and the
convergence of the proposed algorithm are provided. Extensive simulations on
several benchmarks further verify the effectiveness
of the proposed method.
Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed
equally to this work.
Time-Varying Priority Queuing Models for Human Dynamics
Queuing models provide insight into the temporal inhomogeneity of human
dynamics, characterized by the broad distribution of waiting times of
individuals performing tasks. We study the queuing model of an agent trying to
execute a task of interest, the priority of which may vary with time due to the
agent's "state of mind." However, its execution is disrupted by other tasks of
random priorities. By considering the priority of the task of interest either
decreasing or increasing algebraically in time, we analytically obtain and
numerically confirm the bimodal and unimodal waiting time distributions with
power-law decaying tails, respectively. These results are also compared to the
updating time distribution of papers on arXiv.org and the processing time
distribution of papers in Physical Review journals. Our analysis helps to
understand human task execution in a more realistic scenario.
Comment: 8 pages, 6 figures
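The queuing setup in this abstract can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not stated in the abstract: time is discrete, one competing task with priority drawn uniformly from [0, 1) arrives at each step, and the task of interest is executed at the first step where its priority exceeds the competitor's. The functional forms `decreasing` and `increasing` and all parameter values (`x0`, `gamma`, `max_steps`) are hypothetical choices for illustration, not the authors' model.

```python
import random


def waiting_time(priority, rng, max_steps=10_000):
    """Return the first step t >= 1 at which the task of interest is executed.

    priority: function mapping a step t to the task's priority at that step.
    Returns None if the task is still waiting after max_steps (assumed cutoff).
    """
    for t in range(1, max_steps + 1):
        competitor = rng.random()  # assumed: competing priority ~ Uniform[0, 1)
        if priority(t) > competitor:
            return t
    return None


def decreasing(t, x0=0.9, gamma=0.5):
    """Hypothetical priority that decays algebraically in time."""
    return x0 * t ** (-gamma)


def increasing(t, x0=0.1, gamma=0.5):
    """Hypothetical priority that grows algebraically in time, capped at 1."""
    return min(1.0, x0 * t ** gamma)


rng = random.Random(0)
samples = [waiting_time(increasing, rng) for _ in range(1000)]
```

A histogram of `samples` (and of the same experiment with `decreasing`) gives an empirical waiting time distribution under these assumptions. With a decreasing priority some runs may never finish, which is why the sketch returns `None` after the cutoff.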
Procrastination with variable present bias
Individuals working towards a goal often exhibit time-inconsistent behavior,
making plans and then failing to follow through. One well-known model of such
behavioral anomalies is present-bias discounting: individuals over-weight
present costs by a bias factor. This model explains many time-inconsistent
behaviors, but can make stark predictions in many settings: individuals either
follow the most efficient plan for reaching their goal or procrastinate
indefinitely.
We propose a modification in which the present-bias parameter can vary over
time, drawn independently each step from a fixed distribution. Following
Kleinberg and Oren (2014), we use a weighted task graph to model task planning,
and measure the cost of procrastination as the relative expected cost of the
chosen path versus the optimal path. We use a novel connection to optimal
pricing theory to describe the structure of the worst-case task graph for any
present-bias distribution. We then leverage this structure to derive conditions
on the bias distribution under which the worst-case ratio is exponential (in
time) or constant. We also examine conditions on the task graph that lead to
improved procrastination ratios: graphs with a uniformly bounded distance to
the goal, and graphs in which the distance to the goal monotonically decreases
on any path.
Comment: 19 pages, 2 figures. To appear in the 17th ACM Conference on
Economics and Computation (EC 2016)
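The task-graph model in this abstract can be sketched as follows. The sketch assumes the standard present-bias rule from the Kleinberg and Oren setting: at node v, an agent with bias b scores each outgoing edge (v, w) as b times the edge cost plus the shortest remaining cost from w to the goal, and takes the lowest-scoring edge. The graph encoding (a dict of dicts), the function names, and the example graph are hypothetical, and the graph is assumed to be acyclic with every node able to reach the goal.

```python
import random


def distances_to_goal(graph, goal):
    """Shortest-path cost from each node to the goal in a DAG (memoised)."""
    memo = {goal: 0.0}

    def d(v):
        if v not in memo:
            memo[v] = min(
                (cost + d(w) for w, cost in graph.get(v, {}).items()),
                default=float("inf"),
            )
        return memo[v]

    for v in graph:
        d(v)
    return memo


def walk(graph, start, goal, draw_bias):
    """Total cost paid by an agent whose bias is redrawn at every step."""
    dist = distances_to_goal(graph, goal)
    v, total = start, 0.0
    while v != goal:
        b = draw_bias()  # present-bias factor for this step
        # Present cost is inflated by b; future cost is evaluated without bias.
        w = min(graph[v], key=lambda u: b * graph[v][u] + dist[u])
        total += graph[v][w]
        v = w
    return total


# Hypothetical example: finish now (cost 3) or defer (cost 1 now, 4 later).
graph = {"s": {"t": 3, "m": 1}, "m": {"t": 4}}
unbiased = walk(graph, "s", "t", lambda: 1.0)  # takes s -> t
biased = walk(graph, "s", "t", lambda: 3.0)    # takes s -> m -> t

rng = random.Random(0)
mixed = [walk(graph, "s", "t", lambda: rng.choice([1.0, 3.0])) for _ in range(1000)]
```

Dividing the average of `mixed` by `unbiased` gives an empirical estimate of the procrastination ratio for this toy graph and this two-point bias distribution.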
Chemical industry supply chain optimisation using agent-based modelling
In this paper we present an application of the Supply Chain Spread Sheet Simulator (SCSS) to a task dealing with chemical industry supply chain redesign and optimisation. SCSS uses principles of Agent-Based Modelling, combining 4 types of agents with 3 algorithms to control their behaviour. The Location Algorithm is used to place the logistics objects satisfying the demand of customers, Clarke and Wright's Savings Algorithm is applied to plan the routes, and Past Stock Movement Simulation is used to control the stock levels. SCSS is developed in MS Excel using the Visual Basic for Applications programming language. Its basic functionality is discussed by simulating a real task dealing with the redesign of the distribution system for goods coming from the chemical industry in the Czech Republic. We test 6 different structures of the distribution system, differing in the number of located logistics objects, which ranges from 1 to 6. Based on the outputs of SCSS recalculated to distribution costs, we suggest decreasing the number of located warehouses from 6 to 1, with estimated savings of almost 33% of distribution costs per year.
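The route-planning step mentioned above relies on the savings value at the core of Clarke and Wright's algorithm: serving customers i and j on one route instead of two separate depot round trips saves d(depot, i) + d(depot, j) - d(i, j). The sketch below computes and ranks these savings only; it does not merge routes or check vehicle capacity, and it is written in Python rather than the VBA used by SCSS. The coordinates, the Euclidean distance, and the function name `savings_list` are assumptions for illustration.

```python
from itertools import combinations
from math import dist


def savings_list(depot, customers):
    """Return (saving, i, j) for every customer pair, largest saving first.

    depot: (x, y) coordinates of the depot.
    customers: dict mapping a customer name to its (x, y) coordinates.
    """
    pairs = []
    for (i, pi), (j, pj) in combinations(customers.items(), 2):
        # Saving from linking i and j directly instead of returning to the depot.
        saving = dist(depot, pi) + dist(depot, pj) - dist(pi, pj)
        pairs.append((saving, i, j))
    pairs.sort(key=lambda item: item[0], reverse=True)
    return pairs


# Hypothetical layout: A and B lie on the same side of the depot, C opposite.
depot = (0, 0)
customers = {"A": (3, 0), "B": (3, 4), "C": (-3, 0)}
ranked = savings_list(depot, customers)
```

In the full algorithm, routes are then merged in the order given by `ranked`, as long as each merge keeps the route feasible.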
