Actor-critic reinforcement learning algorithms for yaw control of an Autonomous Underwater Vehicle
An Autonomous Underwater Vehicle (AUV) poses unique challenges that must be solved
in order to achieve persistent autonomy. The requirement of persistent autonomy entails
that a control solution must be capable of controlling a vehicle that is operating in an
environment with complex non-linear dynamics and adapt to changes in those dynamics.
In essence, artificial intelligence is required so that the vehicle can learn from its
experience operating in the domain.
In this thesis, reinforcement learning is the chosen machine learning mechanism. This
learning paradigm is investigated by applying multiple actor-critic temporal difference
learning algorithms to the yaw degree-of-freedom of a simulated model and the physical
hardware of the Nessie VII AUV in a closed-loop feedback control problem. Additionally,
results are presented for path planning and path optimisation problems. These control
problems are solved by modelling the AUV’s interaction with its environment as an
optimal decision-making problem using a Markov Decision Process (MDP).
Two novel actor-critic temporal difference learning algorithms called Linear True
Online Continuous Learning Automation (Linear TOCLA) and Non-linear True Online
Continuous Learning Automation (Non-linear TOCLA) are also presented and serve as
new contributions to the reinforcement learning research community. These algorithms
have been applied to the real Nessie vehicle and its simulated model. The proposed
algorithms hold theoretical and practical advantages over previous state-of-the-art
temporal difference learning algorithms. A new genetic algorithm is also presented
and developed specifically for the optimisation of the continuous-valued reinforcement
learning algorithms. This genetic algorithm is used to find the optimal hyperparameters
for four actor-critic algorithms in the well-known continuous-valued mountain car
reinforcement learning benchmark problem.
The results of this benchmark show that the Non-linear TOCLA algorithm achieves a
similar performance to the state-of-the-art forward actor-critic algorithm it extends while
significantly reducing the sensitivity to hyperparameter selection. This reduction in
hyperparameter sensitivity is shown using the distribution of optimal hyperparameters
from ten separate optimisation runs, where a wider spread of optimal values indicates
lower sensitivity. The actor learning rate of the forward actor-critic
algorithm had a standard deviation of 0.00088, while the Non-linear TOCLA algorithm
demonstrated a standard deviation of 0.00186. An even greater improvement is observed
in the multi-step target weight, λ, which increased from a standard deviation of 0.036 for
the forward actor-critic to 0.266 for the Non-linear TOCLA algorithm. All of the source
code used to generate the results in this thesis has been made available as open-source
software.
ARchaeological RObot systems for the World's Seas (ARROWS) EU FP7 project under grant agreement ID 30872
Hierarchical Propagation Networks for Fake News Detection: Investigation and Exploitation
Consuming news from social media is becoming increasingly popular. However,
social media also enables the wide spread of fake news. Because of the
detrimental effects of fake news, its detection has attracted
increasing attention. However, the performance of detecting fake news only from
news content is generally limited as fake news pieces are written to mimic true
news. In the real world, news pieces spread through propagation networks on
social media. These news propagation networks usually involve multiple levels. In
this paper, we study the challenging problem of investigating and exploiting
hierarchical news propagation networks on social media for fake news detection.
In an attempt to understand the correlations between news propagation
networks and fake news, first, we build hierarchical propagation networks at the
macro-level and micro-level for fake news and true news; second, we perform a
comparative analysis of the propagation network features from linguistic,
structural and temporal perspectives between fake and real news, which
demonstrates the potential of utilizing these features to detect fake news;
third, we show the effectiveness of these propagation network features for fake
news detection. We further validate the effectiveness of these features through
feature importance analysis. Altogether, this work presents a data-driven view
of hierarchical propagation networks and fake news and paves the way towards a
healthier online news ecosystem.
Comment: 10 page