Search CORE

53 research outputs found

Deep Reinforcement Learning for Intraday Power Trading

Author: Ballesteros Castilla D.
Publication date: 16/01/2020

Introducing the new paradigm of Social Dispersed Computing: Applications, Technologies and Challenges

Author: Marisol García-Valls
Abhishek Dubey
Vicent Botti
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

If the last decade viewed computational services as a utility, then surely this decade has transformed computation into a commodity. Computation is now progressively and seamlessly integrated into physical networks, which enables cyber-physical systems (CPS) and the Internet of Things (IoT) to meet their latency requirements. Similar to the concepts of "platform as a service" and "software as a service", both cloudlets and fog computing have found their own use cases. Edge devices (which we call end or user devices for disambiguation) play the role of personal computers, dedicated to a user and to a set of correlated applications. In this new scenario, the boundaries between the network node, the sensor, and the actuator are blurring, driven primarily by the computation power of IoT nodes such as single-board computers and smartphones. The larger volumes of data generated in this type of network need clever, scalable, and possibly decentralized computing solutions that can scale independently as required. Any node can be seen as part of a graph, with the capacity to serve as a computing node, a network router node, or both. Complex applications can be distributed over this graph or network of nodes to improve overall performance, such as the amount of data processed over time. In this paper, we identify this new computing paradigm, which we call Social Dispersed Computing, and analyze its key themes, including a new outlook on its relation to agent-based applications. We architect this new paradigm by providing supporting application examples that include next-generation electrical energy distribution networks, next-generation mobility services for transportation, and applications for distributed analysis and identification of non-recurring traffic congestion in cities.
The paper analyzes the existing computing paradigms (e.g., cloud, fog, edge, mobile edge, social), resolving the ambiguity of their definitions, and discusses the relevant foundational software technologies, the remaining challenges, and research opportunities.

Garcia Valls, MS.; Dubey, A.; Botti, V. (2018). Introducing the new paradigm of Social Dispersed Computing: Applications, Technologies and Challenges. Journal of Systems Architecture. 91:83-102. https://doi.org/10.1016/j.sysarc.2018.05.007

Crossref

RiuNet

Many-agent Reinforcement Learning

Author: Yang Yaodong
Publication venue: UCL (University College London)
Publication date: 28/03/2021

Multi-agent reinforcement learning (RL) solves the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGo series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents ($N \gg 2$), which I call many-agent reinforcement learning (MARL) problems. (I use the word "MARL" to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as "Multi-Agent RL" by default.) In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills a research gap: most of the existing work either fails to cover the recent advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm, $\alpha^\alpha$-Rank, for many-agent systems. The critical advantage of $\alpha^\alpha$-Rank is that it can compute the solution concept of $\alpha$-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as Nash equilibrium, which is known to be PPAD-hard even in two-player cases. $\alpha^\alpha$-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm, mean-field MARL, for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling the behavioural diversity in meta-games, and on developing algorithms that are guaranteed to enlarge diversity during training. The proposed metric based on determinantal point processes serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature, and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could have a major impact in the real physical world, beyond video games.

UCL Discovery

Agoric computation: trust and cyber-physical systems

Author: Bottone M.
Publication date: 01/01/2018

In the past two decades, advances in miniaturisation and economies of scale have led to the emergence of billions of connected components that have provided both a spur and a blueprint for the development of smart products acting in specialised environments which are uniquely identifiable, localisable, and capable of autonomy. Adopting the computational perspective of multi-agent systems (MAS) as a technological abstraction, married with the engineering perspective of cyber-physical systems (CPS), has provided fertile ground for designing, developing and deploying software applications in smart automated contexts such as manufacturing, power grids, avionics, healthcare and logistics, capable of being decentralised, intelligent, reconfigurable, modular, flexible, robust, adaptive and responsive. Current agent technologies are, however, ill-suited for information-based environments, making it difficult to formalise and implement multi-agent systems based on inherently dynamical functional concepts such as trust and reliability, which present special challenges when scaling from small to large systems of agents. To overcome such challenges, it is useful to adopt a unified approach which we term agoric computation, integrating logical, mathematical and programming concepts towards the development of agent-based solutions based on recursive, compositional principles, where smaller systems feed via directed information flows into larger hierarchical systems that define their global environment. Considering information as an integral part of the environment naturally defines a web of operations where the components of a system are wired together in some way and each set of inputs and outputs is allowed to carry some value.
These operations are stateless abstractions and procedures that act on stateful cells that accumulate partial information, and it is possible to compose such abstractions into higher-level ones, using a publish-and-subscribe interaction model that keeps track of update messages between abstractions and values in the data. In this thesis we review the logical and mathematical basis of such abstractions and take steps towards the software implementation of agoric modelling as a framework for simulation and verification of the reliability of increasingly complex systems. We report on experimental results related to a few select applications, such as stigmergic interaction in mobile robotics, integrating raw data into agent perceptions, trust and trustworthiness in orchestrated open systems, computing the epistemic cost of trust when reasoning in networks of agents seeded with contradictory information, and trust models for distributed ledgers in the Internet of Things (IoT), and we provide a roadmap for future developments of our research.

Middlesex University Research Repository