Policies for allocation of information in task-oriented groups: elitism and egalitarianism outperform welfarism
Communication or influence networks are probably the most controllable of all
factors that are known to impact on the problem-solving capability of
task-forces. When connections are costly, it is necessary to implement a
policy to allocate them to the individuals. Here we use an agent-based model to
study how distinct allocation policies affect the performance of a group of
agents whose task is to find the global maxima of NK fitness landscapes. Agents
cooperate by broadcasting messages reporting their fitness and use this
information to imitate the fittest agent in their influence neighborhoods. The
larger the influence neighborhood of an agent, the more links, and hence
information, the agent receives. We find that the elitist policy in which
agents with above-average fitness have their influence neighborhoods amplified,
whereas agents with below-average fitness have theirs deflated, is optimal for
smooth landscapes, provided the group size is not too small. For rugged
landscapes, however, the elitist policy can perform very poorly for certain
group sizes. In addition, we find that the egalitarian policy, in which the
size of the influence neighborhood is the same for all agents, is optimal for
both smooth and rugged landscapes in the case of small groups. The welfarist
policy, in which the actions of the elitist policy are reversed, is always
suboptimal, i.e., depending on the group size it is outperformed by either the
elitist or the egalitarian policy.
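The three allocation policies described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the names `neighborhood_sizes`, `imitation_target`, `base_size`, and `delta` are hypothetical, the fixed step `delta` is an assumed way to "amplify" or "deflate" a neighborhood, and letting an agent count itself among its imitation candidates is also an assumption.

```python
from statistics import mean

POLICIES = ("elitist", "egalitarian", "welfarist")


def neighborhood_sizes(fitness, base_size, delta, policy):
    """Return one influence-neighborhood size per agent.

    base_size and delta are hypothetical parameters of this sketch;
    the abstract does not say how much neighborhoods grow or shrink.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    avg = mean(fitness)
    sizes = []
    for f in fitness:
        if policy == "egalitarian" or f == avg:
            change = 0  # same size for everyone, or exactly average fitness
        else:
            above = f > avg
            # Elitist: above-average agents grow. Welfarist: the reverse.
            grow = above if policy == "elitist" else not above
            change = delta if grow else -delta
        sizes.append(max(0, base_size + change))  # a size cannot be negative
    return sizes


def imitation_target(agent, neighborhood, fitness):
    """Index of the fittest agent among the agent and its neighborhood.

    Including the agent itself is an assumption of this sketch.
    """
    return max([agent, *neighborhood], key=lambda i: fitness[i])
```

With fitness values `[0.2, 0.4, 0.9]`, `base_size=4`, and `delta=2`, the elitist policy gives sizes `[2, 2, 6]`, the welfarist policy gives `[6, 6, 2]`, and the egalitarian policy gives `[4, 4, 4]`.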
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
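One common way to set up a pairwise ranking problem from expert demonstrations is sketched below in Python. It is an assumption-laden illustration, not necessarily the formulation used in the paper: the function name `pairwise_examples`, the feature vectors, and the use of feature differences as classifier inputs are all hypothetical choices. Each time the expert picks one candidate over the others, the pick is compared against every alternative, yielding one positive and one mirrored negative training example per alternative.

```python
def pairwise_examples(features, chosen):
    """Build labeled pairwise examples from one expert decision.

    features: list of feature vectors, one per candidate action
              (hypothetical features for this sketch).
    chosen:   index of the candidate the expert actually selected.
    Returns a list of (difference_vector, label) pairs, where label 1
    means "first item was preferred" and label 0 means the opposite.
    """
    x_star = features[chosen]
    examples = []
    for j, x in enumerate(features):
        if j == chosen:
            continue
        diff = [a - b for a, b in zip(x_star, x)]
        examples.append((diff, 1))                # chosen ranked above j
        examples.append(([-d for d in diff], 0))  # j ranked below chosen
    return examples


# Example: the expert chose candidate 1 out of three candidates.
demo = [[1.0, 0.0], [3.0, 2.0], [2.0, 5.0]]
training_pairs = pairwise_examples(demo, chosen=1)
```

The resulting pairs could be fed to any binary classifier. Because examples are built only from observed decisions, nothing in this setup requires enumerating a state space.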
Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation
Recent successes in machine learning have led to a shift in the design of
autonomous systems, improving performance on existing tasks and rendering new
applications possible. Data-focused approaches gain relevance across diverse,
intricate applications when developing data collection and curation pipelines
becomes more effective than manual behaviour design. The following work aims at
increasing the efficiency of this pipeline in two principal ways: by utilising
more powerful sources of informative data and by extracting additional
information from existing data. In particular, we target three orthogonal
fronts: imitation learning, domain adaptation, and transfer from simulation.
Comment: Dissertation Summary
Peer effects in risk taking: Envy or conformity?
We examine two explanations for peer effects in risk taking: relative payoff concerns and preferences that depend on peer choices. We vary experimentally whether individuals can condition a simple lottery choice on the lottery choice or the lottery allocation of a peer. We find that peer effects increase significantly, almost doubling, when peers make choices, relative to when they are allocated a lottery. In both situations, imitation is the most frequent form of peer effect. Hence, peer effects in our environment are explained by a combination of relative payoff concerns and preferences that depend on peer choices. Comparative statics analyses and structural estimation results suggest that a norm to conform to the peer may explain why peer choices matter. Our results suggest that peer choices are important in generating peer effects and hence have important implications for modeling as well as for policy.