    Actor-critic reinforcement learning algorithms for yaw control of an Autonomous Underwater Vehicle

    An Autonomous Underwater Vehicle (AUV) poses unique challenges that must be solved in order to achieve persistent autonomy. Persistent autonomy requires a control solution that can control a vehicle operating in an environment with complex non-linear dynamics and adapt to changes in those dynamics. In essence, artificial intelligence is required so that the vehicle can learn from its experience operating in the domain. In this thesis, reinforcement learning is the chosen machine learning mechanism. This learning paradigm is investigated by applying multiple actor-critic temporal difference learning algorithms to the yaw degree-of-freedom of a simulated model and the physical hardware of the Nessie VII AUV in a closed-loop feedback control problem. Results are also presented for path planning and path optimisation problems. These control problems are solved by modelling the AUV’s interaction with its environment as an optimal decision-making problem using a Markov Decision Process (MDP). Two novel actor-critic temporal difference learning algorithms called Linear True Online Continuous Learning Automation (Linear TOCLA) and Non-linear True Online Continuous Learning Automation (Non-linear TOCLA) are also presented and serve as new contributions to the reinforcement learning research community. These algorithms have been applied to the real Nessie vehicle and its simulated model. The proposed algorithms hold theoretical and practical advantages over previous state-of-the-art temporal difference learning algorithms. A new genetic algorithm is also presented, developed specifically for the optimisation of continuous-valued reinforcement learning algorithms. This genetic algorithm is used to find the optimal hyperparameters for four actor-critic algorithms in the well-known continuous-valued mountain car reinforcement learning benchmark problem. 
The results of this benchmark show that the Non-linear TOCLA algorithm achieves performance similar to the state-of-the-art forward actor-critic algorithm it extends while significantly reducing the sensitivity of the hyperparameter selection. This reduction in hyperparameter sensitivity is shown using the distribution of optimal hyperparameters from ten separate optimisation runs. The actor learning rate of the forward actor-critic algorithm had a standard deviation of 0.00088, while the Non-linear TOCLA algorithm demonstrated a standard deviation of 0.00186. An even greater improvement is observed in the multi-step target weight, λ, which increased from a standard deviation of 0.036 for the forward actor-critic to 0.266 for the Non-linear TOCLA algorithm. All of the source code used to generate the results in this thesis has been made available as open-source software. ARchaeological RObot systems for the Worlds Seas (ARROWS) EU FP7 project under grant agreement ID 30872

    Hierarchical Propagation Networks for Fake News Detection: Investigation and Exploitation

    Consuming news from social media is becoming increasingly popular. However, social media also enables the wide spread of fake news. Because of its detrimental effects, fake news detection has attracted increasing attention. However, the performance of detecting fake news from news content alone is generally limited, as fake news pieces are written to mimic true news. In the real world, news pieces spread through propagation networks on social media. These news propagation networks usually involve multiple levels. In this paper, we study the challenging problem of investigating and exploiting hierarchical news propagation networks on social media for fake news detection. In an attempt to understand the correlations between news propagation networks and fake news, first, we build hierarchical propagation networks at the macro level and micro level for fake news and true news; second, we perform a comparative analysis of the propagation network features from linguistic, structural and temporal perspectives between fake and real news, which demonstrates the potential of utilizing these features to detect fake news; third, we show the effectiveness of these propagation network features for fake news detection. We further validate the effectiveness of these features through feature importance analysis. Altogether, this work presents a data-driven view of hierarchical propagation networks and fake news and paves the way towards a healthier online news ecosystem. Comment: 10 pages