Strategy and Benchmark for Converting Deep Q-Networks to Event-Driven Spiking Neural Networks
Spiking neural networks (SNNs) have great potential for energy-efficient
implementation of Deep Neural Networks (DNNs) on dedicated neuromorphic
hardware. Recent studies demonstrated competitive performance of SNNs compared
with DNNs on image classification tasks, including CIFAR-10 and ImageNet data.
The present work focuses on using SNNs in combination with deep reinforcement
learning in Atari games, which involves additional complexity as compared to
image classification. We review the theory of converting DNNs to SNNs and
extending the conversion to Deep Q-Networks (DQNs). We propose a robust
representation of the firing rate to reduce the error during the conversion
process. In addition, we introduce a new metric to evaluate the conversion
process by comparing the decisions made by the DQN and SNN, respectively. We
also analyze how the simulation time and parameter normalization influence the
performance of converted SNNs. We achieve competitive scores on 17
top-performing Atari games. To the best of our knowledge, our work is the first
to achieve state-of-the-art performance on multiple Atari games with SNNs. Our
work serves as a benchmark for the conversion of DQNs to SNNs and paves the way
for further research on solving reinforcement learning tasks with SNNs. Comment: Accepted by AAAI202
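The conversion idea in this abstract can be illustrated with a minimal sketch. It assumes a non-leaky integrate-and-fire neuron with reset by subtraction, whose firing rate under constant input approximates a ReLU clipped at one spike per step. The decision-agreement function is only one plausible reading of "comparing the decisions made by the DQN and SNN"; the abstract does not define the metric, so the names `if_neuron_rate` and `decision_agreement` are hypothetical.

```python
def if_neuron_rate(input_current, threshold=1.0, steps=1000):
    """Firing rate (spikes per step) of a non-leaky integrate-and-fire neuron.

    Assumption: constant input and reset by subtraction, a common choice in
    DNN-to-SNN conversion; the paper's exact neuron model may differ.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(steps):
        membrane += input_current
        if membrane >= threshold:
            membrane -= threshold  # keep the residual charge
            spikes += 1
    return spikes / steps


def decision_agreement(dqn_q_values, snn_q_values):
    """Fraction of states in which both networks choose the same greedy action."""
    same = 0
    for q_dqn, q_snn in zip(dqn_q_values, snn_q_values):
        action_dqn = max(range(len(q_dqn)), key=q_dqn.__getitem__)
        action_snn = max(range(len(q_snn)), key=q_snn.__getitem__)
        same += action_dqn == action_snn
    return same / len(dqn_q_values)
```

Negative inputs never reach the threshold, so the rate is zero, and inputs above the threshold saturate at one spike per step. This saturation is one reason parameter normalization matters in such conversions.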
QC-SANE: Robust Control in DRL using Quantile Critic with Spiking Actor and Normalized Ensemble
Recently introduced deep reinforcement learning (DRL) techniques in discrete time have resulted in significant advances in online games, robotics, and so on. Inspired by recent developments, we propose an approach referred to as Quantile Critic with Spiking Actor and Normalized Ensemble (QC-SANE) for continuous control problems, which uses quantile loss to train the critic and a spiking neural network (NN) to train an ensemble of actors. The NN performs an internal normalization using a scaled exponential linear unit (SELU) activation function and ensures robustness. The empirical study on Multi-Joint dynamics with Contact (MuJoCo)-based environments shows better training and test results than the state-of-the-art approach, Population-coded Spiking Actor Network (PopSAN).
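As a rough illustration of the quantile loss mentioned in this abstract, the sketch below implements the standard pinball loss and averages it over a set of predicted quantiles. The abstract does not specify the exact variant (for example, a Huber-smoothed form), so the midpoint quantile fractions and the scalar target are simplifying assumptions, and the function names are hypothetical.

```python
def pinball_loss(prediction, target, tau):
    """Pinball (quantile) loss for a single quantile fraction tau in (0, 1)."""
    error = target - prediction
    # Under-estimates are weighted by tau, over-estimates by (1 - tau).
    return tau * error if error >= 0 else (tau - 1.0) * error


def quantile_critic_loss(predicted_quantiles, target):
    """Mean pinball loss over evenly spaced midpoint quantile fractions.

    Assumption: the critic outputs N quantile estimates of the return and is
    compared here against one scalar target for simplicity.
    """
    n = len(predicted_quantiles)
    taus = [(i + 0.5) / n for i in range(n)]
    total = sum(pinball_loss(q, target, t) for q, t in zip(predicted_quantiles, taus))
    return total / n
```

With `tau = 0.9`, under-estimating the target costs nine times as much as over-estimating it by the same amount, which pushes that output toward the 90th percentile of the return distribution.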
Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network
Learning from interaction is the primary way biological agents know about
the environment and themselves. Modern deep reinforcement learning (DRL)
explores a computational approach to learning from interaction and has
significantly progressed in solving various tasks. However, the powerful DRL is
still far from biological agents in energy efficiency. Although the underlying
mechanisms are not fully understood, we believe that the integration of spiking
communication between neurons and biologically-plausible synaptic plasticity
plays a prominent role. Following this biological intuition, we optimize a
spiking policy network (SPN) by a genetic algorithm as an energy-efficient
alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects
and communicates through event-based spikes. Inspired by biological research
that the brain forms memories by forming new synaptic connections and rewires
these connections based on new experiences, we tune the synaptic connections
instead of weights in SPN to solve given tasks. Experimental results on several
robotic control tasks show that our method can achieve the performance level of
mainstream DRL methods and exhibit significantly higher energy efficiency.
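The idea of tuning connections rather than weights can be sketched as a genetic algorithm over binary connection masks applied to fixed weights. This is a toy illustration under stated assumptions, not the authors' method: the weights, the fitness function, the elitist selection, and the one-point crossover are all hypothetical choices.

```python
import random

# Hypothetical fixed weights; only the binary mask over them is evolved.
FIXED_WEIGHTS = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, -0.5, 1.0]


def toy_fitness(mask, target=3.0):
    """Higher is better: negative distance of the masked weight sum to a target."""
    output = sum(w * m for w, m in zip(FIXED_WEIGHTS, mask))
    return -abs(output - target)


def evolve_masks(fitness, genome_len, pop_size=20, generations=30,
                 mutation_rate=0.05, seed=0):
    """Evolve binary connection masks with elitism, crossover, and bit flips."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        elite = ranked[: pop_size // 4]  # survivors are copied unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Flip each connection bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness), history
```

Calling `evolve_masks(toy_fitness, len(FIXED_WEIGHTS))` returns the best mask found and the per-generation best fitness. Because the elite is carried over unchanged, that fitness never decreases.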
Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning
With Deep Neural Networks (DNNs) as powerful function approximators,
Deep Reinforcement Learning (DRL) has been successfully demonstrated on robotic
control tasks. Compared to DNNs with vanilla artificial neurons, the
biologically plausible Spiking Neural Network (SNN) contains a diverse
population of spiking neurons, making it naturally powerful on state
representation with spatial and temporal information. Based on a hybrid
learning framework, where a spike actor-network infers actions from states and
a deep critic network evaluates the actor, we propose a Population-coding and
Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state
representation from two different scales: input coding and neuronal coding. For
input coding, we apply population coding with dynamic receptive fields to
directly encode each input state component. For neuronal coding, we propose
different types of dynamic-neurons (containing 1st-order and 2nd-order neuronal
dynamics) to describe much more complex neuronal dynamics. Finally, the PDSAN
is trained in conjunction with deep critic networks using the Twin Delayed Deep
Deterministic policy gradient algorithm (TD3-PDSAN). Extensive experimental
results show that our TD3-PDSAN model achieves better performance than
state-of-the-art models on four OpenAI gym benchmark tasks. It is an important
attempt to improve RL with SNN towards the effective computation satisfying
biological plausibility. Comment: 27 pages, 11 figures, accepted by Journal of Neural Network
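Population coding of a single state component can be sketched with Gaussian receptive fields. The example below uses fixed, evenly spaced centers and a shared width purely for illustration; the abstract describes dynamic receptive fields, whose parameters would presumably be adapted during training. The function name and defaults are hypothetical.

```python
import math


def population_encode(value, low, high, num_neurons=5, sigma=None):
    """Encode a scalar as activations of neurons with Gaussian receptive fields.

    Assumption: centers are evenly spaced over [low, high] and the width
    defaults to the spacing between neighbouring centers.
    """
    step = (high - low) / (num_neurons - 1)
    centers = [low + i * step for i in range(num_neurons)]
    if sigma is None:
        sigma = step
    return [math.exp(-0.5 * ((value - c) / sigma) ** 2) for c in centers]
```

Each activation can then drive a spike generator for its neuron, so a single scalar is spread across a population whose most active member is the one centered closest to the input.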
Algebraic Neural Architecture Representation, Evolutionary Neural Architecture Search, and Novelty Search in Deep Reinforcement Learning
Evolutionary algorithms have recently re-emerged as powerful tools for machine learning and artificial intelligence, especially when combined with advances in deep learning developed over the last decade. In contrast to the use of fixed architectures and rigid learning algorithms, we leveraged the open-endedness of evolutionary algorithms to make both theoretical and methodological contributions to deep reinforcement learning. This thesis explores and develops two major areas at the intersection of evolutionary algorithms and deep reinforcement learning: generative network architectures and behaviour-based optimization. Over three distinct contributions, both theoretical and experimental methods were applied to deliver a novel mathematical framework and experimental method for generative, modular neural network architecture search for reinforcement learning, and a generalized formulation of a behaviour-based optimization framework for reinforcement learning called novelty search. Experimental results indicate that both alternative, behaviour-based optimization and neural architecture search can each be used to improve learning in the popular Atari 2600 benchmark compared to DQN, a popular gradient-based method. These results are in line with related work demonstrating that strictly gradient-free methods are competitive with gradient-based reinforcement learning. These contributions, together with other successful combinations of evolutionary algorithms and deep learning, demonstrate that alternative architectures and learning algorithms to those conventionally used in deep learning should be seriously investigated in an effort to drive progress in artificial intelligence.
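For readers unfamiliar with novelty search, the sketch below shows the commonly used novelty score: the mean distance from a behaviour descriptor to its k nearest neighbours in an archive of past behaviours. This is the textbook form, not the generalized formulation developed in the thesis, and the descriptor format (a tuple of floats) is an assumption.

```python
import math


def novelty(behaviour, archive, k=3):
    """Mean Euclidean distance to the k nearest behaviours in the archive."""
    if not archive:
        return float("inf")  # nothing to compare against: maximally novel
    distances = sorted(math.dist(behaviour, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)
```

Selection then favours individuals with high novelty rather than high reward, which rewards behavioural diversity directly.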
Simulation Intelligence: Towards a New Generation of Scientific Methods
The original "Seven Motifs" set forth a roadmap of essential methods for the
field of scientific computing, where a motif is an algorithmic method that
captures a pattern of computation and data movement. We present the "Nine
Motifs of Simulation Intelligence", a roadmap for the development and
integration of the essential algorithms necessary for a merger of scientific
computing, scientific simulation, and artificial intelligence. We call this
merger simulation intelligence (SI), for short. We argue the motifs of
simulation intelligence are interconnected and interdependent, much like the
components within the layers of an operating system. Using this metaphor, we
explore the nature of each layer of the simulation intelligence operating
system stack (SI-stack) and the motifs therein: (1) Multi-physics and
multi-scale modeling; (2) Surrogate modeling and emulation; (3)
Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based
modeling; (6) Probabilistic programming; (7) Differentiable programming; (8)
Open-ended optimization; (9) Machine programming. We believe coordinated
efforts between motifs offer immense opportunity to accelerate scientific
discovery, from solving inverse problems in synthetic biology and climate
science, to directing nuclear energy experiments and predicting emergent
behavior in socioeconomic settings. We elaborate on each layer of the SI-stack,
detailing the state-of-the-art methods, presenting examples to highlight challenges
and opportunities, and advocating for specific ways to advance the motifs and
the synergies from their combinations. Advancing and integrating these
technologies can enable a robust and efficient hypothesis-simulation-analysis
type of scientific method, which we introduce with several use-cases for
human-machine teaming and automated science
Special Topics in Information Technology
This open access book presents thirteen outstanding doctoral dissertations in Information Technology from the Department of Electronics, Information and Bioengineering, Politecnico di Milano, Italy. Information Technology has always been highly interdisciplinary, as many aspects have to be considered in IT systems. The doctoral studies program in IT at Politecnico di Milano emphasizes this interdisciplinary nature, which is becoming more and more important in recent technological advances, in collaborative projects, and in the education of young researchers. Accordingly, the focus of advanced research is on pursuing a rigorous approach to specific research topics starting from a broad background in various areas of Information Technology, especially Computer Science and Engineering, Electronics, Systems and Control, and Telecommunications. Each year, more than 50 PhDs graduate from the program. This book gathers the outcomes of the thirteen best theses defended in 2020-21 and selected for the IT PhD Award. Each of the authors provides a chapter summarizing his/her findings, including an introduction, description of methods, main achievements and future work on the topic. Hence, the book provides a cutting-edge overview of the latest research trends in Information Technology at Politecnico di Milano, presented in an easy-to-read format that will also appeal to non-specialists