Approaches for Future Internet architecture design and Quality of Experience (QoE) Control
Researching a Future Internet capable of overcoming the current Internet's limitations is a strategic investment. This paper presents some concepts that can help provide guidelines for overcoming the above-mentioned limitations. In the authors' vision, a key Future Internet target is to allow applications to exploit the available network resources transparently, efficiently, and flexibly, so as to match users' expectations. Such expectations could be expressed in terms of a properly defined Quality of Experience (QoE). To this end, the paper provides some approaches for coping with the QoE provision problem.
Multi-agent quality of experience control
In the framework of the Future Internet, the aim of the Quality of Experience (QoE) Control functionalities is to track the personalized desired QoE level of each application. The paper proposes to perform this task by dynamically selecting the most appropriate Class of Service (among those supported by the network), with the selection driven by a novel heuristic Multi-Agent Reinforcement Learning (MARL) algorithm. The paper shows that this approach copes with some practical implementation problems: in particular, it addresses the so-called "curse of dimensionality" of MARL algorithms, achieving satisfactory performance even in the presence of several hundred agents.
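The class-of-service selection described above can be illustrated with a minimal sketch, assuming each agent is an independent epsilon-greedy learner whose reward is higher the closer the measured QoE is to its target. The network model, reward shape, and all parameters below are invented for the example; this is not the paper's heuristic MARL algorithm.

```python
import random

NUM_COS = 3          # Classes of Service supported by the network (assumed)
TARGET_QOE = 0.7     # personalized desired QoE level (assumed)
ALPHA, EPSILON = 0.1, 0.2

def measured_qoe(cos, rng):
    """Stand-in for network feedback: a better CoS yields a higher, noisy QoE."""
    base = (cos + 1) / NUM_COS
    return max(0.0, min(1.0, base + rng.uniform(-0.05, 0.05)))

class QoEAgent:
    """One application: an epsilon-greedy learner over Classes of Service."""
    def __init__(self, rng):
        self.q = [0.0] * NUM_COS   # running value estimate per CoS
        self.rng = rng

    def select_cos(self):
        if self.rng.random() < EPSILON:                     # explore
            return self.rng.randrange(NUM_COS)
        return max(range(NUM_COS), key=self.q.__getitem__)  # exploit

    def update(self, cos, qoe):
        # Reward *tracking* the target QoE, not maximizing QoE outright.
        reward = -abs(qoe - TARGET_QOE)
        self.q[cos] += ALPHA * (reward - self.q[cos])

rng = random.Random(0)
agents = [QoEAgent(rng) for _ in range(100)]   # many independent agents
for _ in range(500):
    for a in agents:
        cos = a.select_cos()
        a.update(cos, measured_qoe(cos, rng))

choices = [max(range(NUM_COS), key=a.q.__getitem__) for a in agents]
print(choices.count(1))  # most agents settle on the CoS closest to the target
```

Note the design choice the reward encodes: agents converge to the Class of Service whose QoE best matches the personalized target, rather than always grabbing the highest (and presumably most expensive) class.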
Gradient-free Policy Architecture Search and Adaptation
We develop a method for policy architecture search and adaptation via
gradient-free optimization which can learn to perform autonomous driving tasks.
By learning from both demonstration and environmental reward we develop a model
that can learn with relatively few early catastrophic failures. We first learn
an architecture of appropriate complexity to perceive aspects of world state
relevant to the expert demonstration, and then mitigate the effect of
domain-shift during deployment by adapting a policy demonstrated in a source
domain to rewards obtained in a target environment. We show that our approach
allows safer learning than baseline methods, offering a reduced cumulative
crash metric over the agent's lifetime as it learns to drive in a realistic
simulated environment.
Comment: Accepted in Conference on Robot Learning, 201
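The gradient-free optimization at the core of the abstract can be sketched in miniature as random-perturbation hill climbing on a toy 1-D stabilization task. The dynamics, linear policy, and hyperparameters below are illustrative assumptions; the paper additionally searches over policy architectures and learns from demonstrations, which this sketch omits.

```python
import random

def rollout(w, steps=50, rng=None):
    """Total reward of the linear policy a = -w*s on toy noisy dynamics."""
    rng = rng or random.Random(0)
    s, total = 1.0, 0.0
    for _ in range(steps):
        a = -w * s
        s = s + 0.1 * a + rng.gauss(0, 0.01)  # simple noisy dynamics
        total += -s * s                        # reward: stay near s = 0
    return total

search_rng = random.Random(1)
w = 0.0
best = rollout(w, rng=random.Random(0))
for _ in range(200):
    cand = w + search_rng.gauss(0, 0.5)          # gradient-free perturbation
    score = rollout(cand, rng=random.Random(0))  # fixed seed: fair comparison
    if score > best:                             # hill climb: keep improvements
        w, best = cand, score

print(w, best)  # the searched gain stabilizes the system
```

Because no gradient of the reward is ever computed, the same loop applies unchanged when the search space is discrete (e.g., architecture choices), which is the property the abstract exploits.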