
91 research outputs found

High fidelity progressive reinforcement learning for agile maneuvering UAVs

Author: Abbeel P.
Faust A.
Kim H. J.
Lillicrap T. P.
Uzun S.
Yuksek B.
Zhang T.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 05/01/2020

In this work, we present a high-fidelity, model-based progressive reinforcement learning method for control system design for an agile maneuvering UAV. Our work relies on a simulation-based training and testing environment for software-in-the-loop (SIL), hardware-in-the-loop (HIL) and integrated flight testing within a photo-realistic virtual reality (VR) environment. Through progressive learning with the high-fidelity agent and environment models, the guidance and control policies build agile maneuvering based on fundamental control laws. First, we provide insight into the development of high-fidelity mathematical models using frequency-domain system identification. These models are later used to design reinforcement learning based adaptive flight control laws, allowing the vehicle to be controlled over a wide range of operating conditions, covering model changes due to payload, voltage and damage to actuators and electronic speed controllers (ESCs). We later design outer flight guidance and control laws. Our current work and progress are summarized in this paper

Crossref

Cranfield CERES

vrAIn: a deep learning approach tailoring computing and radio resources in virtualized RANs

Author: Bega D.
Goodfellow I.
Kawser M. T.
Li Y.
Lillicrap T. P.
Rost P.
Silver D.
Turner P.
U.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 05/09/2019

Proceedings of: 25th Annual International Conference on Mobile Computing and Networking (MobiCom'19), October 21-25, 2019, Los Cabos, Mexico. The virtualization of radio access networks (vRAN) is the last milestone in the NFV revolution. However, the complex dependencies between computing and radio resources make vRAN resource control particularly daunting. We present vrAIn, a dynamic resource controller for vRANs based on deep reinforcement learning. First, we use an autoencoder to project high-dimensional context data (traffic and signal quality patterns) into a latent representation. Then, we use a deep deterministic policy gradient (DDPG) algorithm based on an actor-critic neural network structure and a classifier to map (encoded) contexts into resource control decisions. We have implemented vrAIn using an open-source LTE stack over different platforms. Our results show that vrAIn successfully derives appropriate compute and radio control actions irrespective of the platform and context: (i) it provides savings in computational capacity of up to 30% over CPU-unaware methods; (ii) it improves the probability of meeting QoS targets by 25% over static allocation policies using similar CPU resources on average; (iii) upon CPU capacity shortage, it improves throughput performance by 25% over state-of-the-art schemes; and (iv) it performs close to optimal policies resulting from an offline oracle. To the best of our knowledge, this is the first work that thoroughly studies the computational behavior of vRANs, and the first approach to a model-free solution that does not need to assume any particular vRAN platform or system conditions. The work of University Carlos III of Madrid was supported by H2020 5GMoNArch project (grant agreement no. 761445) and H2020 5G-TOURS project (grant agreement no. 856950). The work of NEC Laboratories Europe was supported by H2020 5GTRANSFORMER project (grant agreement no. 761536) and 5GROWTH project (grant agreement no. 856709).
The work of University of Cartagena was supported by Grant AEI/FEDER TEC2016-76465-C2-1-R (AIM) and Grant FPU14/03701.

Crossref

Universidad Carlos III de Madrid e-Archivo

Vector-based navigation using grid-like representations in artificial agents

Author: Banino A
Barry CJ
Benigno U
Blundell C
Chadwick M
Hadsell R
Hassabis D
Kumaran D
Lillicrap T
Mirowski P
Pritzel A
Publication date: 08/05/2018

Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space and is critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments

UCL Discovery

Plasmin Generation Potential and Recanalization in Acute Ischaemic Stroke; an Observational Cohort Study of Stroke Biobank Samples.

Author: Attia J
Bivard A
Bustamante A
Cadenas IF
Chan J
Choi P
Cloud G
Draxler DF
Garcia-Esperon C
Gyawali P
Hamilton-Bruce MA
Harman S
Ho H
Holliday E
Keragala CB
Kleinig T
Koblar S
Levi CR
Lillicrap T
Lincz L
Maguire JM
Medcalf RL
Montaner J
Niego B
Parsons MW
Spratt N
Worrall BB
Publication venue: 'Frontiers Media SA'
Publication date: 09/12/2020

Rationale: More than half of patients who receive thrombolysis for acute ischaemic stroke fail to recanalize. Elucidating biological factors which predict recanalization could identify therapeutic targets for increasing thrombolysis success. Hypothesis: We hypothesize that individual patient plasmin potential, as measured by in vitro response to recombinant tissue-type plasminogen activator (rt-PA), is a biomarker of rt-PA response, and that patients with greater plasmin response are more likely to recanalize early. Methods: This study will use historical samples from the Barcelona Stroke Thrombolysis Biobank, comprised of 350 pre-thrombolysis plasma samples from ischaemic stroke patients who received serial transcranial-Doppler (TCD) measurements before and after thrombolysis. The plasmin potential of each patient will be measured using the level of plasmin-antiplasmin complex (PAP) generated after in-vitro addition of rt-PA. Levels of antiplasmin, plasminogen, t-PA activity, and PAI-1 activity will also be determined. Association between plasmin potential variables and time to recanalization [assessed on serial TCD using the thrombolysis in brain ischemia (TIBI) score] will be assessed using Cox proportional hazards models, adjusted for potential confounders. Outcomes: The primary outcome will be time to recanalization detected by TCD (defined as TIBI ≥4). Secondary outcomes will be recanalization within 6-h and recanalization and/or haemorrhagic transformation at 24-h. This analysis will utilize an expanded cohort including ~120 patients from the Targeting Optimal Thrombolysis Outcomes (TOTO) study. Discussion: If association between proteolytic response to rt-PA and recanalization is confirmed, future clinical treatment may customize thrombolytic therapy to maximize outcomes and minimize adverse effects for individual patients

OPUS - University of Technology Sydney

Learning to Communicate: A Machine Learning Framework for Heterogeneous Multi-Agent Robotic Systems

Author: Foerster J.
Hausknecht M.
Heess N.
Hinton G. E.
Ioffe S.
Juliani A.
Kaminer I.
Konda V. R.
Kushner H.
Lample G.
Lillicrap T. P.
Lowe R.
Mao H.
Shalev-Shwartz S.
Silver D.
Sutton R. S.
Tsitsiklis J. N.
Yoon H.-J.
Publication date: 12/12/2018

We present a machine learning framework for multi-agent systems to learn both the optimal policy for maximizing the rewards and the encoding of the high dimensional visual observation. The encoding is useful for sharing local visual observations with other agents under communication resource constraints. The actor-encoder encodes the raw images and chooses an action based on local observations and messages sent by the other agents. The machine learning agent generates not only an actuator command to the physical device, but also a communication message to the other agents. We formulate a reinforcement learning problem, which extends the action space to consider the communication action as well. The feasibility of the reinforcement learning framework is demonstrated using a 3D simulation environment with two collaborating agents. The environment provides realistic visual observations to be used and shared between the two agents. Comment: AIAA SciTech 201

arXiv.org e-Print Archive

Crossref

Design and evaluation of advanced intelligent flight controllers

Author: Abadi M.
Brockman G.
Dorf R. C.
Goupil P.
Kalman R. E.
Keijzer T.
Lillicrap T. P.
Lin S. H.
Looye G.
Mnih V.
Sutton R. S.
Weiser C.
Zribi A.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 05/01/2020

Reinforcement learning based methods could be feasible for solving adaptive optimal control problems for nonlinear dynamical systems. This work presents a proof of concept for applying reinforcement learning based methods to robust and adaptive flight control tasks. A framework for designing and examining these methods is introduced by means of the open research civil aircraft model (RCAM) and optimality criteria. A state-of-the-art robust flight controller - the incremental nonlinear dynamic inversion (INDI) controller - serves as a reference controller. Two intelligent control methods are introduced and examined. The deep deterministic policy gradient (DDPG) controller is selected as a promising actor-critic reinforcement learning method that is currently attracting much attention in the field of robotics. In addition, an adaptive version of a proportional-integral-derivative (PID) controller, the PID neural network (PIDNN) controller, is selected as the second method. The results show that all controllers are able to control the aircraft model. Moreover, the PIDNN controller exhibits improved reference tracking if a good initial guess of its weights is available. In turn, the DDPG algorithm is able to control the nonlinear aircraft model while minimizing a multi-objective value function. This work provides insight into the usability of selected intelligent controllers as flight control functions as well as a comparison to state-of-the-art flight control functions

Institute of Transport Research:Publications

Crossref

Investigations to extend viability of a rainbow trout primary gill cell culture

Author: A Lillicrap
AG Jimenez
AT El-Dakhly
Awadhesh N. Jha
B Srinivasan
BS Zhou
CM Wood
D Boyle
F Galvez
F Zakaria-Runkat
FI Iftikar
G Rathore
GB West
H Hashimoto
I Leguen
J Farkas
J Rosa
JM Wilson
KE Tollefsen
KM Gilmour
LC Stott
M Fletcher
M Fujiwara
M Minghetti
Matthew G. Baron
MG Baron
N Burden
NR Bury
P Pärt
PA Walker
Richard J. Maunder
S Chen
S Pawlowski
S Schnell
SF Perry
SJ Alper
SP Kelly
Stewart F. Owen
T Kocal
TL Weissgerber
V Matey
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2017

Crossref

Plymouth Electronic Archive and Research Library

Knowledge Hub on the Integrated Assessment of Chemical Contaminants and their Effects on the Marine Environment

Author: Bellas J. (Juan)
Brockmeyer B. (Berit)
Brooks S. (Steven)
Burgeot T. (Thierry)
Chacón E. (Esther)
De Witte B. (Bavo)
Deudero S. (Salud)
Giani M. (Michele)
Giorgi G. (Giordano)
Hanke G. (Georg)
Katsiadaki I. (Ioanna)
Lee Behrens H. (Hanna)
Lillicrap A. (Adam)
Maggi C. (Chiara)
Mauffret A. (Aourell)
Parmentier K. (Koen)
Parts L. (Laine)
Roose P. (Patrick)
Schulz-Bull D. (Detlef)
Tornero V. (Victoria)
Trujillo A. (Abraham)
Ureta J. (Jorge)
Vethaak D. (Dick)
Publication venue: Centro Oceanográfico de Vigo
Publication date: 25/06/2021

In a time of environmental awareness, spurred on by the possibility that our world is threatened by climate change, it is important to remember that there are other anthropogenic pressures that must also be addressed to protect the marine and coastal environment. Pollution is a global, complex issue that contributes to biodiversity loss and poor environmental health and comes from the production and release of many of the synthetic chemicals that we use in our daily lives. Chemical contaminants are often underrepresented as a major contributor to environmental deterioration. In 2018, the Joint Programming Initiative Healthy and Productive Seas and Oceans (JPI Oceans) established the JPI Oceans Knowledge Hub on the integrated assessment of chemical contaminants and their effects on the marine environment. The purpose of the Knowledge Hub was to provide recommendations on how to improve the methodological basis for marine chemical status assessment. The work has resulted in the following policy paper, which focuses on improving the efficiency and implementation of integrated assessment methodology for the effects of chemicals of emerging concern. Substantial additional knowledge of biological effects is needed to achieve Good Environmental Status (GES) of our oceans and coastal areas. The Knowledge Hub is represented by highly skilled scientists and policy makers, appointed by the JPI Oceans Management Board, to ensure that the recommendations provided are useful for policy making

Digital.CSIC

Repositorio Institucional Digital del IEO