    Creating Multi-Level Skill Hierarchies in Reinforcement Learning

    What is a useful skill hierarchy for an autonomous agent? We propose an answer based on the graphical structure of an agent's interaction with its environment. Our approach uses hierarchical graph partitioning to expose the structure of the graph at varying timescales, producing a skill hierarchy with multiple levels of abstraction. At each level of the hierarchy, skills move the agent between regions of the state space that are well connected within themselves but weakly connected to each other. We illustrate the utility of the proposed skill hierarchy in a wide variety of domains in the context of reinforcement learning.
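
The phrase "well connected within themselves but weakly connected to each other" can be made concrete with a graph-quality score. The abstract does not name the measure it uses, so the sketch below assumes Newman modularity as one common choice; the toy graph, the `modularity` helper, and the region labels are hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph
    (assumes no self-loops and no duplicate edges)."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical state-transition graph: two "rooms" joined by a single doorway.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_rooms = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_region = {state: "A" for state in range(6)}

q_two = modularity(edges, two_rooms)   # partition that separates the rooms
q_one = modularity(edges, one_region)  # everything lumped into one region
```

Under this assumed measure, the partition that separates the two rooms scores higher than the single-region partition, which is the kind of structure a skill such as "go through the doorway" would exploit.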

    Recent Advances in Social Data and Artificial Intelligence 2019

    The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace.

    Developing a Model for Explaining Network Attributes and Relationships of Organised Crime Activities by Utilizing Network Science

    The main objective of this research is to provide an innovative exploratory model for investigating substantive organised crime activities. The study articulates 30 critical independent variables related to organised crime, network science and a comprehensive exploratory approach which converts measurements of the variables into meaningful crime-related inferences and conclusions. A case study was conducted to review the initial feasibility of the selected variables, exploratory approach and model, and the results suggest good effectiveness and usability.

    Digital supply chain surveillance using artificial intelligence: definitions, opportunities and risks

    Digital Supply Chain Surveillance (DSCS) is the proactive monitoring and analysis of digital data that allows firms to extract information related to a supply network, without the explicit consent of firms involved in the supply chain. AI has made DSCS easier and larger in scale, creating significant opportunities for automated detection of actors and dependencies involved in a supply chain, which in turn can help firms to detect risky, unethical and environmentally unsustainable practices. Here, we define DSCS and review priority areas using a survey conducted in the UK. Visibility, sustainability, and resilience are significant areas that DSCS can support through a number of machine-learning approaches and predictive algorithms. Despite anecdotal narrative on the importance of explainability of algorithmic results, practitioners often prefer accuracy over explainability; however, there are significant differences between industrial sectors and application areas. Using a case study, we highlight a number of concerns about the unchecked use of AI in DSCS, such as bias or misinterpretation resulting in erroneous conclusions, which may lead to suboptimal decisions or relationship damage. Building on this, we develop and discuss a number of illustrative cases to highlight risks that practitioners should be aware of, proposing key areas of further research.

    Policy space abstraction for a lifelong learning agent

    This thesis is concerned with policy space abstractions that concisely encode alternative ways of making decisions; dealing with discovery, learning, adaptation and use of these abstractions. This work is motivated by the problem faced by autonomous agents that operate within a domain for long periods of time, hence having to learn to solve many different task instances that share some structural attributes. An example of such a domain is an autonomous robot in a dynamic domestic environment. Such environments raise the need for transfer of knowledge, so as to eliminate the need for long learning trials after deployment. Typically, these tasks would be modelled as sequential decision making problems, including path optimisation for navigation tasks, or Markov Decision Process models for more general tasks. Learning within such models often takes the form of online learning or reinforcement learning. However, handling issues such as knowledge transfer and multiple task instances requires notions of structure and hierarchy, and that raises several questions that form the topic of this thesis – (a) can an agent acquire such hierarchies in policies in an online, incremental manner, (b) can we devise mathematically rigorous ways to abstract policies based on qualitative attributes, (c) when it is inconvenient to employ prolonged trial and error learning, can we devise alternate algorithmic methods for decision making in a lifelong setting? The first contribution of this thesis is an algorithmic method for incrementally acquiring hierarchical policies. Working with the framework of options - temporally extended actions - in reinforcement learning, we present a method for discovering persistent subtasks that define useful options for a particular domain. Our algorithm builds on a probabilistic mixture model in state space to define a generalised and persistent form of ‘bottlenecks’, and suggests suitable policy fragments to make options. 
In order to continuously update this hierarchy, we devise an incremental process which runs in the background and takes care of proposing and forgetting options. We evaluate this framework in simulated worlds, including the RoboCup 2D simulation league domain. The second contribution of this thesis is in defining abstractions in terms of equivalence classes of trajectories. Utilising recently developed techniques from computational topology, in particular the concept of persistent homology, we show that a library of feasible trajectories could be retracted to representative paths that may be sufficient for reasoning about plans at the abstract level. We present a complete framework, starting from a novel construction of a simplicial complex that describes higher-order connectivity properties of a spatial domain, to methods for computing the homology of this complex at varying resolutions. The resulting abstractions are motion primitives that may be used as topological options, contributing a novel criterion for option discovery. This is validated by experiments in simulated 2D robot navigation, and in manipulation using a physical robot platform. Finally, we develop techniques for solving a family of related, but different, problem instances through policy reuse of a finite policy library acquired over the agent’s lifetime. This represents an alternative approach when traditional methods such as hierarchical reinforcement learning are not computationally feasible. We abstract the policy space using a non-parametric model of performance of policies in multiple task instances, so that decision making is posed as a Bayesian choice regarding what to reuse. This is one approach to transfer learning that is motivated by the needs of practical long-lived systems. We show the merits of such Bayesian policy reuse in simulated real-time interactive systems, including online personalisation and surveillance.
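
The idea of posing policy reuse as a Bayesian choice can be sketched in a few lines. This is a deliberately simplified illustration, not the thesis's algorithm: it assumes a small discrete set of task types, a Gaussian model of each policy's return (the thesis describes a non-parametric performance model), and greedy selection by expected return. The names `select_policy`, `update_belief`, and `perf_mean` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used as an assumed return model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def select_policy(belief, perf_mean):
    """Pick the library policy with the highest expected return under the
    current belief over task types. perf_mean[t][p] is the assumed mean
    return of policy p on task type t."""
    n_policies = len(perf_mean[0])
    expected = [sum(b * perf_mean[t][p] for t, b in enumerate(belief))
                for p in range(n_policies)]
    return max(range(n_policies), key=lambda p: expected[p])

def update_belief(belief, perf_mean, policy, observed_return, std=1.0):
    """Bayes update of the belief over task types after running one policy
    and observing its return."""
    unnormalised = [b * gaussian_pdf(observed_return, perf_mean[t][policy], std)
                    for t, b in enumerate(belief)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical library: 2 task types x 2 policies.
perf_mean = [[10.0, 2.0],   # task type 0: policy 0 is good
             [1.0, 8.0]]    # task type 1: policy 1 is good
belief = [0.5, 0.5]
first_choice = select_policy(belief, perf_mean)
# Policy 0 performs poorly, which is evidence for task type 1.
belief = update_belief(belief, perf_mean, first_choice, observed_return=1.0)
second_choice = select_policy(belief, perf_mean)
```

After a single poor return from the first policy, the belief shifts towards the second task type and the selection switches, without any further learning on the new task instance.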

    Computer-Aided Choreography of Multi-Person Motion (컴퓨터를 활용한 여러 사람의 동작 연출)

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Lee Jehee (이제희). Choreographing motion is the process of converting written stories or messages into the real movement of actors. In performances or movies, directors spend considerable time and effort on it because it is the primary factor that audiences concentrate on. If multiple actors exist in the scene, choreography becomes more challenging. The fundamental difficulty is that the coordination between actors must be precisely adjusted. Spatio-temporal coordination is the first requirement that must be satisfied, and causality and mood are other important forms of coordination. Directors use several assistant tools, such as storyboards or roughly crafted 3D animations that visualize the flow of movements, to organize ideas or to explain them to actors. However, these tools are difficult to use because artistry and considerable training effort are required. They also cannot give suggestions or feedback. Finally, the amount of manual labor increases exponentially as the number of actors increases. In this thesis, we propose computational approaches to choreographing the motion of multiple actors. The ultimate goal is to enable novice users to easily generate motions of multiple actors without substantial effort. We first show an approach to generating motions for shadow theatre, where actors must carefully collaborate to achieve the same goal. The results are comparable to those made by professional actors. Next, we present an interactive animation system for pre-visualization, in which users exploit an intuitive graphical interface for scene description. Given a description, the system can generate motions for the characters in the scene that match the description. Finally, we propose two controller designs (combining regression with trajectory optimization, and evolutionary deep reinforcement learning) for physically simulated actors, which guarantee the physical validity of the resultant motions.

    Contents:
    Chapter 1 Introduction
    Chapter 2 Background
    2.1 Motion Generation Technique: 2.1.1 Motion Editing and Synthesis for Single-Character; 2.1.2 Motion Editing and Synthesis for Multi-Character; 2.1.3 Motion Planning; 2.1.4 Motion Control by Reinforcement Learning; 2.1.5 Pose/Motion Estimation from Incomplete Information; 2.1.6 Diversity on Resultant Motions
    2.2 Authoring System: 2.2.1 System using High-level Input; 2.2.2 User-interactive System
    2.3 Shadow Theatre: 2.3.1 Shadow Generation; 2.3.2 Shadow for Artistic Purpose; 2.3.3 Viewing Shadow Theatre as Collages/Mosaics of People
    2.4 Physics-based Controller Design: 2.4.1 Controllers for Various Characters; 2.4.2 Trajectory Optimization; 2.4.3 Sampling-based Optimization; 2.4.4 Model-Based Controller Design; 2.4.5 Direct Policy Learning; 2.4.6 Deep Reinforcement Learning for Control
    Chapter 3 Motion Generation for Shadow Theatre
    3.1 Overview
    3.2 Shadow Theatre Problem: 3.2.1 Problem Definition; 3.2.2 Approaches of Professional Actors
    3.3 Discovery of Principal Poses: 3.3.1 Optimization Formulation; 3.3.2 Optimization Algorithm
    3.4 Animating Principal Poses: 3.4.1 Initial Configuration; 3.4.2 Optimization for Motion Generation
    3.5 Experimental Results: 3.5.1 Implementation Details; 3.5.2 Animation; 3.5.3 3D Fabrication
    3.6 Discussion
    Chapter 4 Interactive Animation System for Pre-visualization
    4.1 Overview
    4.2 Graphical Scene Description
    4.3 Candidate Scene Generation: 4.3.1 Connecting Paths; 4.3.2 Motion Cascade; 4.3.3 Motion Selection For Each Cycle; 4.3.4 Cycle Ordering; 4.3.5 Generalized Paths and Cycles; 4.3.6 Motion Editing
    4.4 Scene Ranking: 4.4.1 Ranking Criteria; 4.4.2 Scene Ranking Measures
    4.5 Scene Refinement
    4.6 Experimental Results
    4.7 Discussion
    Chapter 5 Physics-based Design and Control
    5.1 Overview
    5.2 Combining Regression with Trajectory Optimization: 5.2.1 Simulation and Motor Skills; 5.2.2 Control Adaptation; 5.2.3 Control Parameterization; 5.2.4 Efficient Construction; 5.2.5 Experimental Results; 5.2.6 Discussion
    5.3 Example-Guided Control by Deep Reinforcement Learning: 5.3.1 System Overview; 5.3.2 Initial Policy Construction; 5.3.3 Evolutionary Deep Q-Learning; 5.3.4 Experimental Results; 5.3.5 Discussion
    Chapter 6 Conclusion
    6.1 Contribution
    6.2 Future Work
    Abstract (in Korean)

    Machine learning for the sustainable energy transition: a data-driven perspective along the value chain from manufacturing to energy conversion

    According to the special report Global Warming of 1.5 °C of the IPCC, climate action is not only necessary but more urgent than ever. The world is witnessing rising sea levels, heat waves, events of flooding, droughts, and desertification resulting in the loss of lives and damage to livelihoods, especially in countries of the Global South. To mitigate climate change and commit to the Paris Agreement, it is of the utmost importance to reduce greenhouse gas emissions coming from the most emitting sector, namely the energy sector. To this end, large-scale penetration of renewable energy systems into the energy market is crucial for the energy transition toward a sustainable future by replacing fossil fuels and improving access to energy with socio-economic benefits. With the advent of Industry 4.0, Internet of Things technologies have been increasingly applied to the energy sector, introducing the concept of the smart grid or, more generally, the Internet of Energy. These paradigms are steering the energy sector towards more efficient, reliable, flexible, resilient, safe, and sustainable solutions with huge potential environmental and social benefits. To realize these concepts, new information technologies are required, and among the most promising possibilities are Artificial Intelligence and Machine Learning, which in many countries have already revolutionized the energy industry. This thesis presents different Machine Learning algorithms and methods for the implementation of new strategies to make renewable energy systems more efficient and reliable. It presents various learning algorithms, highlighting their advantages and limits, and evaluating their application for different tasks in the energy context. In addition, different techniques are presented for the preprocessing and cleaning of time series, nowadays collected by sensor networks mounted on every renewable energy system.
With the possibility to install large numbers of sensors that collect vast amounts of time series, it is vital to detect and remove irrelevant, redundant, or noisy features, and alleviate the curse of dimensionality, thus improving the interpretability of predictive models, speeding up their learning process, and enhancing their generalization properties. Therefore, this thesis discusses the importance of dimensionality reduction in sensor networks mounted on renewable energy systems and, to this end, presents two novel unsupervised algorithms. The first approach maps time series in the network domain through visibility graphs and uses a community detection algorithm to identify clusters of similar time series and select representative parameters. This method can group both homogeneous and heterogeneous physical parameters, even when related to different functional areas of a system. The second approach proposes the Combined Predictive Power Score, a method for feature selection with a multivariate formulation that explores multiple sub-sets of expanding variables and identifies the combination of features with the highest predictive power over specified target variables. This method proposes a selection algorithm for the optimal combination of variables that converges to the smallest set of predictors with the highest predictive power. Once the combination of variables is identified, the most relevant parameters in a sensor network can be selected to perform dimensionality reduction. Data-driven methods open the possibility to support strategic decision-making, resulting in a reduction of Operation & Maintenance costs, machine faults, repair stops, and spare parts inventory size. Therefore, this thesis presents two approaches in the context of predictive maintenance to improve the lifetime and efficiency of the equipment, based on anomaly detection algorithms.
The first approach proposes an anomaly detection model based on Principal Component Analysis that is robust to false alarms, can isolate anomalous conditions, and can anticipate equipment failures. The second approach has at its core a neural architecture, namely a Graph Convolutional Autoencoder, which models the sensor network as a dynamical functional graph by simultaneously considering the information content of individual sensor measurements (graph node features) and the nonlinear correlations existing between all pairs of sensors (graph edges). The proposed neural architecture can capture hidden anomalies even when the turbine continues to deliver the power requested by the grid and can anticipate equipment failures. Since the model is unsupervised and completely data-driven, this approach can be applied to any wind turbine equipped with a SCADA system. When it comes to renewable energies, the unschedulable uncertainty due to their intermittent nature represents an obstacle to the reliability and stability of energy grids, especially when dealing with large-scale integration. Nevertheless, these challenges can be alleviated if the natural sources or the power output of renewable energy systems can be forecasted accurately, allowing power system operators to plan optimal power management strategies to balance the dispatch between intermittent power generations and the load demand. To this end, this thesis proposes a multi-modal spatio-temporal neural network for multi-horizon wind power forecasting. In particular, the model combines high-resolution Numerical Weather Prediction forecast maps with turbine-level SCADA data and explores how meteorological variables on different spatial scales together with the turbines' internal operating conditions impact wind power forecasts. The world is undergoing a third energy transition with the main goal to tackle global climate change through decarbonization of the energy supply and consumption patterns. 
This is not only possible thanks to global cooperation and agreements between parties, power generation systems advancements, and Internet of Things and Artificial Intelligence technologies but also necessary to prevent the severe and irreversible consequences of climate change that are threatening life on the planet as we know it. This thesis is intended as a reference for researchers who want to contribute to the sustainable energy transition and are approaching the field of Artificial Intelligence in the context of renewable energy systems.
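
The abstract above mentions mapping time series to the network domain through visibility graphs. It does not say which variant is used, so the following sketch assumes the standard natural visibility criterion: samples i and j are linked when every sample between them lies strictly below the straight line joining them. The function name `natural_visibility_graph` is hypothetical.

```python
def natural_visibility_graph(series):
    """Return the edge set of the natural visibility graph of a time series.

    Nodes are sample indices. Samples i < j are connected when every
    intermediate sample k lies strictly below the line through
    (i, series[i]) and (j, series[j]). Adjacent samples are always linked.
    """
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[j]
                + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges

valley = natural_visibility_graph([3, 1, 2])  # ends see each other over the dip
peak = natural_visibility_graph([1, 3, 1])    # the middle peak blocks the view
```

In a pipeline like the one described, a graph of this kind would be built per sensor signal and then passed to a community detection step. This direct construction is quadratic in the number of pairs with a linear check per pair, so it only suits short series.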