
    Applications of DEC-MDPs in multi-robot systems

    Optimizing the operation of cooperative multi-robot systems that act in large and complex environments has become an important focus of research. The issue is motivated by many applications in which a set of cooperative robots must decide, in a decentralized way, how to execute a large set of tasks in partially observable and uncertain environments. Such decision problems are encountered when developing exploration rovers, teams of patrolling robots, rescue-robot colonies, mine-clearance robots, and so on. In this chapter, we introduce the problems raised by the decentralized control of multi-robot systems. We first describe some application domains and review the main characteristics of the decision problems the robots must deal with. Then, we review some existing approaches to solving problems of multiagent decentralized control in stochastic environments. We present Decentralized Markov Decision Processes (DEC-MDPs) and discuss their applicability to real-world multi-robot applications. Finally, we introduce OC-DEC-MDPs and 2V-DEC-MDPs, which have been developed to increase the applicability of DEC-MDPs.
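    As a hedged illustration of the decision model behind these approaches (not taken from the chapter itself), the sketch below solves the centralized variant of a toy two-robot task-allocation problem by value iteration over joint actions. Such a centralized solution bounds what any decentralized policy can achieve; DEC-MDP methods like the OC-DEC-MDPs mentioned above then restrict each robot to act on its local observations only. All state, action, and reward names are invented for the example.

```python
import itertools

# Toy cooperative problem: two robots, two tasks, shared reward.
states = ["both_tasks_open", "task1_done", "task2_done", "all_done"]
actions = ["do_task1", "do_task2", "wait"]
joint_actions = list(itertools.product(actions, repeat=2))  # one action per robot

def step(s, a):
    """Deterministic toy transition: a task is done once any robot works on it."""
    done1 = s in ("task1_done", "all_done") or "do_task1" in a
    done2 = s in ("task2_done", "all_done") or "do_task2" in a
    if done1 and done2:
        return "all_done"
    if done1:
        return "task1_done"
    if done2:
        return "task2_done"
    return "both_tasks_open"

def reward(s, a, s2):
    """Shared team reward: paid once, when the last open task is completed."""
    return 1.0 if s2 == "all_done" and s != "all_done" else 0.0

# Value iteration over joint actions (discount 0.9) on the centralized MDP.
V = {s: 0.0 for s in states}
for _ in range(50):
    V = {s: max(reward(s, a, step(s, a)) + 0.9 * V[step(s, a)]
                for a in joint_actions)
         for s in states}
print(V)  # the optimal joint policy splits the two tasks between the robots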

    Engineering Emergence: A Survey on Control in the World of Complex Networks

    Complex networks are an enticing research topic that has increasingly attracted researchers from control systems and various other domains over the last two decades. The aim of this paper was to survey the interest in control within complex networks research since 2000 and to identify recent trends that may generate new research directions. The survey covered Web of Science, Scopus, and IEEE Xplore publications related to complex networks. Based on our findings, we raise several questions and highlight ongoing interest in the control of complex networks.
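    The survey itself is bibliometric and presents no algorithms; purely as a hedged pointer to the formal question its literature studies, the sketch below applies the classical Kalman rank test for controllability of linear network dynamics dx/dt = Ax + Bu, the standard starting point of the network-controllability literature. The three-node chain and the driver-node placement are illustrative assumptions.

```python
import numpy as np

def is_controllable(A: np.ndarray, B: np.ndarray) -> bool:
    """Kalman rank test: [B, AB, ..., A^(n-1)B] must have full rank n."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

# Toy 3-node directed chain 1 -> 2 -> 3, with a control input at node 1.
A = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 1., 0.]])
B = np.array([[1.], [0.], [0.]])
print(is_controllable(A, B))  # True: one driver node suffices for a chain
```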

    Mixed-initiative mission planning considering human operator state estimation based on physiological sensors

    Missions involving humans working with automated systems are becoming increasingly common and are at risk of failing due to human factors. In particular, mission workload may generate stress or mental fatigue, increasing the risk of accidents. The idea of our project is to refine human-robot supervision by using data from physiological sensors (eye-tracking and heart-rate monitoring devices) that provide information about the operator's state. The proof-of-concept mission involves a ground robot, either autonomous or controlled by a human operator, which has to fight fires that break out randomly. We propose to use the planning framework of Partially Observable Markov Decision Processes (POMDPs), together with machine learning techniques, to improve human-machine interaction by optimizing the choice of mode (autonomous or operator-controlled robot) and the display of alarms in the form of visual stimuli. A dataset of demonstrations produced by remote volunteers through an online video game simulating the mission allows us to learn a POMDP that infers the human state and to optimize the associated strategy. Cognitive availability, current task, type of behavior, situation awareness, and involvement in the mission are examples of the human operator states studied. Finally, mission scores, consisting of the number of extinguished fires, will quantify the improvement obtained by using physiological data.
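    As a hedged sketch of the machinery such a planner rests on (the states, sensors, and probabilities below are invented stand-ins, not the project's actual model), a POMDP maintains a belief over the hidden operator state and updates it by Bayes' rule after each physiological observation:

```python
# Hidden operator states and illustrative probabilities (all invented).
states = ["available", "overloaded"]

# P(s'|s) for a single "continue mission" action.
T = {("available", "available"): 0.9, ("available", "overloaded"): 0.1,
     ("overloaded", "available"): 0.2, ("overloaded", "overloaded"): 0.8}

# P(o|s'): probability of a high heart-rate feature in each state.
O = {("hr_high", "available"): 0.2, ("hr_high", "overloaded"): 0.7,
     ("hr_low", "available"): 0.8, ("hr_low", "overloaded"): 0.3}

def belief_update(b, obs):
    """Bayes rule: b'(s') ∝ P(o|s') * sum_s P(s'|s) b(s)."""
    b2 = {s2: O[(obs, s2)] * sum(T[(s, s2)] * b[s] for s in states)
          for s2 in states}
    z = sum(b2.values())
    return {s: p / z for s, p in b2.items()}

b = {"available": 0.5, "overloaded": 0.5}
print(belief_update(b, "hr_high"))  # belief shifts toward "overloaded"
```

    The optimized strategy (e.g., switching to autonomous mode or raising an alarm) would then be a function of this belief rather than of the unobservable operator state itself.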

    Physics-Informed Machine Learning for Data Anomaly Detection, Classification, Localization, and Mitigation: A Review, Challenges, and Path Forward

    Advancements in digital automation for smart grids have led to the installation of measurement devices such as phasor measurement units (PMUs), micro-PMUs (μ-PMUs), and smart meters. However, the large amount of data collected by these devices brings several challenges, as control room operators need to combine this data with models to make confident decisions for the reliable and resilient operation of cyber-power systems. Machine-learning (ML) based tools can provide a reliable interpretation of the deluge of data obtained from the field. For decision-makers to ensure reliable network operation under all operating conditions, these tools need to identify solutions that are feasible and satisfy the system constraints while being efficient, trustworthy, and interpretable. This has driven the increasing popularity of physics-informed machine learning (PIML) approaches, as these methods overcome the challenges that purely model-based or purely data-driven ML methods face in isolation. This work aims at the following: a) review existing strategies and techniques for incorporating the underlying physical principles of the power grid into different types of ML approaches (supervised/semi-supervised learning, unsupervised learning, and reinforcement learning (RL)); b) explore existing work on PIML methods for anomaly detection, classification, localization, and mitigation in power transmission and distribution systems; and c) discuss improvements to existing methods, considering potential challenges and addressing the limitations that keep them from real-world applications.
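    As a hedged sketch of the core PIML idea this review surveys, the snippet below augments an ordinary data-fitting loss with a penalty on violations of a known physical law. A single-line Ohm's-law residual stands in for the full power-flow equations actually used in grid applications, and the weight lam is an arbitrary assumption:

```python
import numpy as np

def piml_loss(pred_current, meas_current, pred_voltage_drop, resistance,
              lam=1.0):
    """Data loss plus a physics-consistency penalty (illustrative only)."""
    data_loss = np.mean((pred_current - meas_current) ** 2)
    # Physics residual: V = I * R should hold for the model's own outputs.
    physics_residual = pred_voltage_drop - pred_current * resistance
    physics_loss = np.mean(physics_residual ** 2)
    return data_loss + lam * physics_loss

# Toy usage: the physics term pulls the model toward physically consistent
# joint predictions even where measurements are noisy or missing.
I_pred = np.array([1.0, 2.0]); I_meas = np.array([1.1, 1.9])
V_pred = np.array([5.0, 10.5]); R = 5.0
print(piml_loss(I_pred, I_meas, V_pred, R, lam=0.1))
```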

    Efficient Deep Reinforcement Learning via Planning, Generalization, and Improved Exploration

    Reinforcement learning (RL) is a general-purpose machine learning framework, which considers an agent that makes sequential decisions in an environment to maximize its reward. Deep reinforcement learning (DRL) approaches use deep neural networks as non-linear function approximators that parameterize policies or value functions directly from raw observations in RL. Although DRL approaches have been shown to be successful on many challenging RL benchmarks, much of the prior work has focused mainly on learning a single task in a model-free setting, which is often sample-inefficient. Humans, on the other hand, can acquire knowledge by learning a model of the world in an unsupervised fashion, use such knowledge to plan ahead for decision making, transfer knowledge between many tasks, and generalize to previously unseen circumstances from pre-learned knowledge. Developing such abilities is one of the fundamental challenges in building RL agents that can learn as efficiently as humans. As a step towards developing these capabilities, this thesis develops new DRL techniques to address three important challenges in RL: 1) planning via prediction, 2) rapidly generalizing to new environments and tasks, and 3) efficient exploration in complex environments. The first part of the thesis discusses how to learn a dynamics model of the environment using deep neural networks and how to use such a model for planning in complex domains where observations are high-dimensional. Specifically, we present neural network architectures for action-conditional video prediction and demonstrate improved exploration in RL. In addition, we present a neural network architecture that performs lookahead planning by predicting the future only in terms of rewards and values, without predicting observations, and we discuss why this approach is beneficial compared to conventional model-based planning approaches. The second part of the thesis considers generalization to unseen environments and tasks. We first introduce a set of cognitive tasks in a 3D environment and present memory-based DRL architectures that generalize to previously unseen 3D environments better than existing baselines. In addition, we introduce a new multi-task RL problem where the agent should learn to execute different tasks depending on given instructions and generalize to new instructions in a zero-shot fashion. We present a new hierarchical DRL architecture that learns to generalize over previously unseen task descriptions with minimal prior knowledge. The third part of the thesis discusses how exploiting past experiences can indirectly drive deep exploration and improve sample-efficiency. In particular, we propose a new off-policy learning algorithm, called self-imitation learning, which learns a policy to reproduce past good experiences. We show empirically that self-imitation learning indirectly encourages the agent to explore reasonably good state spaces and thus significantly improves sample-efficiency on RL domains where exploration is challenging. Overall, the main contributions of this thesis are to explore several fundamental challenges of RL in the context of DRL and to develop new DRL architectures and algorithms to address such challenges. This allows us to understand how deep learning can be used to improve sample efficiency, and thus come closer to human-like learning abilities.
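    As a hedged sketch of the self-imitation learning objective the abstract describes, the snippet below gates the imitation of a stored transition by the clipped advantage (R − V(s))₊, so that only past experiences whose return beat the current value estimate contribute. Function and variable names are illustrative, not taken from the thesis code:

```python
import numpy as np

def sil_losses(log_pi_a, value, ret):
    """Per-transition self-imitation policy and value losses (sketch).

    In the gradient-based algorithm the clipped advantage is treated as a
    constant (stop-gradient); here we just compute the scalar losses.
    """
    adv = np.maximum(ret - value, 0.0)   # (R - V)_+ : clipped advantage
    policy_loss = -log_pi_a * adv        # imitate only good past actions
    value_loss = 0.5 * adv ** 2          # push V up toward the good return
    return policy_loss, value_loss

# A transition whose return beat the value estimate contributes...
print(sil_losses(log_pi_a=np.log(0.25), value=1.0, ret=3.0))
# ...while one that underperformed is zeroed out by the clipping.
print(sil_losses(log_pi_a=np.log(0.25), value=1.0, ret=0.5))
```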