135 research outputs found

    Developing Intelligent Routing Algorithm over SDN: Reusable Reinforcement Learning Approach

    Get PDF
    Traffic routing is vital for the proper functioning of the Internet. As users and network traffic increase, researchers try to develop adaptive and intelligent routing algorithms that can fulfill various QoS requirements. Reinforcement Learning (RL) based routing algorithms have shown better performance than traditional approaches. We developed a QoS-aware, reusable RL routing algorithm, RLSR-Routing over SDN. During the learning process, our algorithm ensures loop-free path exploration. While finding the path for one traffic demand (a source destination pair with certain amount of traffic), RLSR-Routing learns the overall network QoS status, which can be used to speed up algorithm convergence when finding the path for other traffic demands. By adapting Segment Routing, our algorithm can achieve flow-based, source packet routing, and reduce communications required between SDN controller and network plane. Our algorithm shows better performance in terms of load balancing than the traditional approaches. It also has faster convergence than the non-reusable RL approach when finding paths for multiple traffic demands

    Reinforcement Learning

    Get PDF
    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data or directly putting forward and performing actions. Learning is a very important aspect. This book is on reinforcement learning which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the tasks of human operators remain to specify goals on increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications and it shall stimulate and encourage new research in this field

    Distributed reinforcement learning for self-reconfiguring modular robots

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 101-106).In this thesis, we study distributed reinforcement learning in the context of automating the design of decentralized control for groups of cooperating, coupled robots. Specifically, we develop a framework and algorithms for automatically generating distributed controllers for self-reconfiguring modular robots using reinforcement learning. The promise of self-reconfiguring modular robots is that of robustness, adaptability and versatility. Yet most state-of-the-art distributed controllers are laboriously handcrafted and task-specific, due to the inherent complexities of distributed, local-only control. In this thesis, we propose and develop a framework for using reinforcement learning for automatic generation of such controllers. The approach is profitable because reinforcement learning methods search for good behaviors during the lifetime of the learning agent, and are therefore applicable to online adaptation as well as automatic controller design. However, we must overcome the challenges due to the fundamental partial observability inherent in a distributed system such as a self reconfiguring modular robot. We use a family of policy search methods that we adapt to our distributed problem. The outcome of a local search is always influenced by the search space dimensionality, its starting point, and the amount and quality of available exploration through experience.(cont) We undertake a systematic study of the effects that certain robot and task parameters, such as the number of modules, presence of exploration constraints, availability of nearest-neighbor communications, and partial behavioral knowledge from previous experience, have on the speed and reliability of learning through policy search in self-reconfiguring modular robots. In the process, we develop novel algorithmic variations and compact search space representations for learning in our domain, which we test experimentally on a number of tasks. This thesis is an empirical study of reinforcement learning in a simulated lattice based self-reconfiguring modular robot domain. However, our results contribute to the broader understanding of automatic generation of group control and design of distributed reinforcement learning algorithms.by Paulina Varshavskaya.Ph.D

    Reinforcement Learning for Building Heating via Mixing Loops

    Get PDF

    Outdoor operations of multiple quadrotors in windy environment

    Get PDF
    Coordinated multiple small unmanned aerial vehicles (sUAVs) offer several advantages over a single sUAV platform. These advantages include improved task efficiency, reduced task completion time, improved fault tolerance, and higher task flexibility. However, their deployment in an outdoor environment is challenging due to the presence of wind gusts. The coordinated motion of a multi-sUAV system in the presence of wind disturbances is a challenging problem when considering collision avoidance (safety), scalability, and communication connectivity. Performing wind-agnostic motion planning for sUAVs may produce a sizeable cross-track error if the wind on the planned route leads to actuator saturation. In a multi-sUAV system, each sUAV has to locally counter the wind disturbance while maintaining the safety of the system. Such continuous manipulation of the control effort for multiple sUAVs under uncertain environmental conditions is computationally taxing and can lead to reduced efficiency and safety concerns. Additionally, modern day sUAV systems are susceptible to cyberattacks due to their use of commercial wireless communication infrastructure. This dissertation aims to address these multi-faceted challenges related to the operation of outdoor rotor-based multi-sUAV systems. A comprehensive review of four representative techniques to measure and estimate wind speed and direction using rotor-based sUAVs is discussed. After developing a clear understanding of the role wind gusts play in quadrotor motion, two decentralized motion planners for a multi-quadrotor system are implemented and experimentally evaluated in the presence of wind disturbances. The first planner is rooted in the reinforcement learning (RL) technique of state-action-reward-state-action (SARSA) to provide generalized path plans in the presence of wind disturbances. While this planner provides feasible trajectories for the quadrotors, it does not provide guarantees of collision avoidance. The second planner implements a receding horizon (RH) mixed-integer nonlinear programming (MINLP) model that is integrated with control barrier functions (CBFs) to guarantee collision-free transit of the multiple quadrotors in the presence of wind disturbances. Finally, a novel communication protocol using Ethereum blockchain-based smart contracts is presented to address the challenge of secure wireless communication. The U.S. sUAV market is expected to be worth $92 Billion by 2030. The Association for Unmanned Vehicle Systems International (AUVSI) noted in its seminal economic report that UAVs would be responsible for creating 100,000 jobs by 2025 in the U.S. The rapid proliferation of drone technology in various applications has led to an increasing need for professionals skilled in sUAV piloting, designing, fabricating, repairing, and programming. Engineering educators have recognized this demand for certified sUAV professionals. This dissertation aims to address this growing sUAV-market need by evaluating two active learning-based instructional approaches designed for undergraduate sUAV education. The two approaches leverages the interactive-constructive-active-passive (ICAP) framework of engagement and explores the use of Competition based Learning (CBL) and Project based Learning (PBL). The CBL approach is implemented through a drone building and piloting competition that featured 97 students from undergraduate and graduate programs at NJIT. The competition focused on 1) drone assembly, testing, and validation using commercial off-the-shelf (COTS) parts, 2) simulation of drone flight missions, and 3) manual and semi-autonomous drone piloting were implemented. The effective student learning experience from this competition served as the basis of a new undergraduate course on drone science fundamentals at NJIT. This undergraduate course focused on the three foundational pillars of drone careers: 1) drone programming using Python, 2) designing and fabricating drones using Computer-Aided Design (CAD) and rapid prototyping, and 3) the US Federal Aviation Administration (FAA) Part 107 Commercial small Unmanned Aerial Vehicles (sUAVs) pilot test. Multiple assessment methods are applied to examine the students’ gains in sUAV skills and knowledge and student attitudes towards an active learning-based approach for sUAV education. The use of active learning techniques to address these challenges lead to meaningful student engagement and positive gains in the learning outcomes as indicated by quantitative and qualitative assessments

    Modeling and Control Techniques in Smart Systems

    Get PDF
    Energy and food crisis are two major problems that our human society has to face in the 21st century. With the world’s population reaching 7.62 billion as of May 2018, both electric power and agricultural industries turn to technological innovations for solutions to keep up the increasing demand. In the past and currently, utility companies rely on rule of thumb to estimate power consumption. However, inaccurate predictions often result in over production, and much energy is wasted. On the other hand, traditional periodic and threshold based irrigation practices have also been proven outdated. This problem is further compounded by recent years’ frequent droughts across the globe. New technologies are needed to manage irrigations more efficiently. Fortunately, with the unprecedented development of Artificial Intelligence (AI), wireless communication, and ubiquitous computing technologies, high degree of information integration and automation are steadily becoming reality. More smart metering devices are installed today than ever before, enabling fast and massive data collection. Patterns and trends can be more accurately predicted using machine learning techniques. Based on the results, utility companies can schedule production more efficiently, not only enhancing their profitabilities, but also making our world’s energy supply more sustainable. In addition, predictions can serve as references to detect anomalous activities like power theft and cyber attacks. On the other hand, with wireless communication, real-time soil moisture sensor readings and weather forecasts can be collected for precision irrigation. Smaller but more powerful controllers provide perfect platforms for complicated control algorithms. We designed and built a fully automated irrigation system at Bushland, Texas. It is designed to operate without any human intervention. Workers can program, move, and monitor multiple irrigation systems remotely. The algorithm that runs on the controls deserves more attention. AI and other state of art controlling techniques are implemented, making it much more powerful than any existing systems. By integrating professional crop yield simulation models like DSSAT, computers can run tens of thousand simulations on all kinds of weather and soil conditions, and more importantly, learn from the experience. In reality, such process would take thousands of years to obtain. Yet, the computers can find an optimum solution in minutes. The experience is then summarized as a policy and stored inside the controller as a lookup table. Furthermore, after each crop season, users can calibrate and update current policy with real harvest data. Crop yield models like DSSAT and AquaCrop play very important roles in agricultural research. They represent our best knowledge in plant biology and can be very accurate when well calibrated. However, the calibration process itself is often time consuming, thus limiting the scale and speed of using these models. We made efforts to combine different models to produce a single accurate prediction using machine learning techniques. The process does not require manual calibration, but only soil, historical weather, and harvest data. 20 models were built, and their results were evaluated and compared. With high accuracy, machine learning techniques have shown a promising direction to best utilize professional models, and demonstrated great potential for use in future agricultural research
    • …
    corecore