Inverse Reinforcement Learning in Swarm Systems
Inverse reinforcement learning (IRL) has become a useful tool for learning
behavioral models from demonstration data. However, IRL remains mostly
unexplored for multi-agent systems. In this paper, we show how the principle of
IRL can be extended to homogeneous large-scale problems, inspired by the
collective swarming behavior of natural systems. In particular, we make the
following contributions to the field: 1) We introduce the swarMDP framework, a
sub-class of decentralized partially observable Markov decision processes
endowed with a swarm characterization. 2) Exploiting the inherent homogeneity
of this framework, we reduce the resulting multi-agent IRL problem to a
single-agent one by proving that the agent-specific value functions in this
model coincide. 3) To solve the corresponding control problem, we propose a
novel heterogeneous learning scheme that is particularly tailored to the swarm
setting. Results on two example systems demonstrate that our framework is able
to produce meaningful local reward models from which we can replicate the
observed global system dynamics. Comment: 9 pages, 8 figures; version 2, accepted at AAMAS 201
A framework for proving the self-organization of dynamic systems
This paper aims at providing a rigorous definition of self-organization, one
of the most desired properties for dynamic systems (e.g., peer-to-peer systems,
sensor networks, cooperative robotics, or ad-hoc networks). We characterize
different classes of self-organization through liveness and safety properties
that both capture information regarding the system entropy. We illustrate
these classes through case studies. The first are two representative P2P
overlays (CAN and Pastry), and the others are specific implementations of
\Omega (the leader oracle) and one-shot query abstractions for dynamic
settings. Our study aims at understanding the limits and respective power of
existing self-organized protocols and lays the basis for designing robust
algorithms for dynamic systems
Coalition Formation and Combinatorial Auctions; Applications to Self-organization and Self-management in Utility Computing
In this paper we propose a two-stage protocol for resource management in a
hierarchically organized cloud. The first stage exploits spatial locality for
the formation of coalitions of supply agents; the second stage, a combinatorial
auction, is based on a modified proxy-based clock algorithm and has two phases,
a clock phase and a proxy phase. The clock phase supports price discovery; in
the second phase a proxy conducts multiple rounds of a combinatorial auction
for the package of services requested by each client. The protocol strikes a
balance between low-cost services for cloud clients and a decent profit for the
service providers. We also report the results of an empirical investigation of
the combinatorial auction stage of the protocol. Comment: 14 page
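The price-discovery role of a clock phase can be illustrated with a generic ascending clock auction. This is a minimal sketch and not the modified proxy-based clock algorithm of the paper: the function name `clock_phase`, the fixed `increment`, and the representation of bidders as callables that map current prices to demanded quantities are all assumptions made for illustration.

```python
def clock_phase(supply, bidders, start_price=1.0, increment=1.0, max_rounds=1000):
    """Generic ascending clock auction (illustrative, not the paper's protocol).

    supply  : dict mapping item name -> available quantity
    bidders : list of callables; each takes the current price dict and
              returns a dict of item -> quantity demanded at those prices
    """
    prices = {item: start_price for item in supply}
    for _ in range(max_rounds):
        # Aggregate the demand reported by all bidders at the current prices.
        demand = {item: 0 for item in supply}
        for bid in bidders:
            for item, qty in bid(prices).items():
                demand[item] += qty
        # Items whose demand exceeds supply get a higher price next round.
        over_demanded = [item for item in supply if demand[item] > supply[item]]
        if not over_demanded:
            return prices  # no excess demand: the clock stops
        for item in over_demanded:
            prices[item] += increment
    return prices  # round limit reached without clearing


def unit_bidder(value):
    """Hypothetical bidder: wants one unit of 'cpu' while its price is <= value."""
    return lambda prices: {"cpu": 1} if prices["cpu"] <= value else {}


final_prices = clock_phase({"cpu": 1}, [unit_bidder(3), unit_bidder(5)])
```

In this toy run the clock rises until only one bidder still demands the single unit, so the final price carries information about how the bidders value the item.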
Graph Transformation Model of a Triangulated Network of Mobile Units
A triangulated network of mobile units is modelled by means of a graph transformation system in which graph nodes are labelled with geometric coordinates and edges are labelled with distances. Nodes represent mobile units and edges represent wireless radio communication links between them. Under concurrency the model can describe interesting practical scenarios, for example swarms of taxis in an urban environment. The contribution features the enhancement of a graph transformation system by trigonometric calculations. In passing, it is also shown that the classical negative edge condition has only limited applicability if a strict locality principle is assumed and, conversely, that there are reasonable modeling cases in which this locality principle itself fails to suffice
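The labelling described above (coordinates on nodes, distances on edges) can be sketched as a small data structure with one rewriting step. This is an illustrative sketch only: the function names `edge_label` and `move_unit`, the dictionary representation, and the single "move a unit" rule are assumptions, not the rule set of the paper.

```python
import math


def edge_label(nodes, a, b):
    """Distance label for the edge between units a and b."""
    (xa, ya), (xb, yb) = nodes[a], nodes[b]
    return math.hypot(xb - xa, yb - ya)


def move_unit(nodes, edges, unit, new_xy):
    """Hypothetical rewrite step: relabel one node and its incident edges.

    nodes : dict mapping unit name -> (x, y) coordinates
    edges : dict mapping (unit, unit) pairs -> distance label
    Returns new dictionaries; the inputs are left unchanged.
    """
    new_nodes = dict(nodes)
    new_nodes[unit] = new_xy
    new_edges = {}
    for (a, b), dist in edges.items():
        # Only edges touching the moved unit need a recomputed label.
        if unit in (a, b):
            dist = edge_label(new_nodes, a, b)
        new_edges[(a, b)] = dist
    return new_nodes, new_edges


nodes = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 0.0)}
edges = {pair: edge_label(nodes, *pair) for pair in [("A", "B"), ("B", "C"), ("A", "C")]}
moved_nodes, moved_edges = move_unit(nodes, edges, "B", (0.0, 4.0))
```

The step touches only the moved node and its incident edges, which mirrors the locality idea mentioned in the abstract.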
Supervised cross-modal factor analysis for multiple modal data classification
In this paper we study the problem of learning from multi-modal data for
the purpose of document classification. In this problem, each document is composed
of two different modalities of data, i.e., an image and a text. Cross-modal factor
analysis (CFA) has been proposed to project the two modalities to
a shared data space, so that the classification of an image or a text can be
performed directly in this space. A disadvantage of CFA is that it ignores
the supervision information. In this paper, we improve CFA by incorporating the
supervision information to represent and classify both the image and text modalities of
documents. We project both image and text data to a shared data space by factor
analysis, and then train a class label predictor in the shared space to use the
class label information. The factor analysis parameter and the predictor
parameter are learned jointly by solving a single objective function. With
this objective function, we minimize the distance between the projections of
the image and text of the same document, and the classification error of the
projections as measured by the hinge loss function. The objective function is optimized
by an alternating optimization strategy in an iterative algorithm. Experiments on
two different multi-modal document data sets show the advantage of the
proposed algorithm over other CFA methods
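The structure of such a joint objective can be sketched as follows. This is a hedged illustration, not the authors' formulation: the binary labels in {+1, -1}, the linear predictor (`w`, `b`), the trade-off weight `c`, and all function names are assumptions, and the sketch only evaluates the objective rather than performing the alternating optimization.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def objective(U, V, w, b, images, texts, labels, c=1.0):
    """Projection-distance term plus hinge loss, summed over documents.

    U, V   : projection matrices for image and text features (assumed linear)
    w, b   : weights and bias of a linear predictor in the shared space
    labels : class labels, assumed to be +1 or -1
    c      : assumed trade-off weight on the classification term
    """
    total = 0.0
    for image, text, label in zip(images, texts, labels):
        proj_image, proj_text = matvec(U, image), matvec(V, text)
        # Squared distance between the two projections of the same document.
        total += sum((p - q) ** 2 for p, q in zip(proj_image, proj_text))
        # Hinge loss of the shared-space predictor on each projection.
        for proj in (proj_image, proj_text):
            score = sum(wi * pi for wi, pi in zip(w, proj)) + b
            total += c * max(0.0, 1.0 - label * score)
    return total


identity = [[1.0, 0.0], [0.0, 1.0]]
loss_good = objective(identity, identity, [2.0, 0.0], 0.0, [[1.0, 0.0]], [[1.0, 0.0]], [1])
loss_bad = objective(identity, identity, [0.0, 0.0], 0.0, [[1.0, 0.0]], [[1.0, 0.0]], [1])
```

An alternating scheme would hold the predictor fixed while updating the projections and then swap roles, re-evaluating a function of this kind at each iteration.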
Coupling Memory and Computation for Locality Management
We articulate the need for managing (data) locality automatically rather than leaving it to the programmer, especially in parallel programming systems. To this end, we propose techniques for tightly coupling the computation (including the thread scheduler) and the memory manager so that data and computation can be positioned closely in hardware. Such tight coupling of computation and memory management is in sharp contrast with the prevailing practice of considering each in isolation. For example, memory-management techniques usually abstract the computation as an unknown "mutator", which is treated as a "black box". As an example of the approach, in this paper we consider a specific class of parallel computations, nested-parallel computations. Such computations dynamically create a nesting of parallel tasks. We propose a method for organizing memory as a tree of heaps reflecting the structure of the nesting. More specifically, our approach creates a heap for a task if it is separately scheduled on a processor. This allows us to couple garbage collection with the structure of the computation and the way in which it is dynamically scheduled on the processors. This coupling enables taking advantage of locality in the program by mapping it to the locality of the hardware. For example, for improved locality, a heap can be garbage collected immediately after its task finishes, when the heap contents are likely in cache
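The tree-of-heaps idea can be sketched in a few lines. This is a toy model under stated assumptions, not the authors' memory manager: the `Heap` class, the `run_task` helper, and in particular the step that copies the task's result into the parent heap before discarding the child heap are hypothetical choices made for illustration.

```python
class Heap:
    """One node in a tree of heaps; each heap holds the objects of one task."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.objects = []
        if parent is not None:
            parent.children.append(self)

    def allocate(self, obj):
        self.objects.append(obj)
        return obj


def run_task(parent_heap, work, separately_scheduled=True):
    """Run `work(heap)`; give it its own heap only if separately scheduled."""
    heap = Heap(parent_heap) if separately_scheduled else parent_heap
    result = work(heap)
    if heap is not parent_heap:
        # Assumed policy: the result survives in the parent heap, and the
        # child heap is collected as soon as its task finishes.
        parent_heap.allocate(result)
        parent_heap.children.remove(heap)
        heap.objects.clear()
    return result


def work(heap):
    heap.allocate("temporary")        # garbage once the task is done
    return heap.allocate("result")    # value handed back to the parent


root = Heap()
value = run_task(root, work)
```

After the call, only the result remains reachable from the root heap, and the child heap has been discarded right when its task completed.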