75 research outputs found

    Practical reinforcement learning using representation learning and safe exploration for large scale Markov decision processes

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 157-168).
    While creating intelligent agents that can solve stochastic sequential decision-making problems by interacting with the environment is the promise of Reinforcement Learning (RL), scaling existing RL methods to realistic domains such as planning for multiple unmanned aerial vehicles (UAVs) has remained a challenge due to three main factors: 1) RL methods often require large amounts of data to find reasonable policies, 2) the agent has limited computation time between interactions, and 3) while exploration is necessary to avoid convergence to local optima, in sensitive domains visiting all parts of the planning space may lead to catastrophic outcomes. To address the first two challenges, this thesis introduces incremental Feature Dependency Discovery (iFDD), a representation expansion method with cheap per-timestep computational complexity that can be combined with any online, value-based reinforcement learning method that uses binary features. In addition to convergence and computational complexity guarantees, when coupled with SARSA, iFDD achieves much faster learning (i.e., requires far fewer data samples) in planning domains including two multi-UAV mission planning scenarios with hundreds of millions of state-action pairs. In particular, in a UAV mission planning domain, iFDD performed more than 12 times better than the best competitor given the same number of samples. The third challenge is addressed through a constructive relationship between a planner and a learner that mitigates the learning risk while boosting the asymptotic performance and safety of the agent's behavior. The framework is an instance of the intelligent cooperative control architecture, in which a learner initially follows a safe policy generated by a planner. The learner incrementally improves this baseline policy through interaction, while avoiding behaviors believed to be risky. The new approach is demonstrated to be superior in two multi-UAV task assignment scenarios. For example, in one case the proposed method reduced the risk by 8% while improving the performance of the planner by up to 30%.
    by Alborz Geramifard. Ph.D.
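
    The abstract above does not spell out how iFDD expands the representation. The sketch below assumes the rule as it is commonly described for iFDD: every pair of currently active binary features is a candidate conjunction, each candidate accumulates the absolute temporal-difference error observed while its parts are active, and a candidate becomes a real feature once that total exceeds a threshold. The class name `FeatureDiscovery`, the method `update`, and the threshold value are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

# A feature is a conjunction of base binary feature indices (assumed encoding).
Feature = FrozenSet[int]


class FeatureDiscovery:
    """Hypothetical helper that tracks candidate conjunctions and promotes relevant ones."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold                  # assumed relevance cutoff
        self.relevance: Dict[Feature, float] = {}   # candidate -> accumulated |TD error|
        self.discovered: Set[Feature] = set()       # conjunctions already promoted

    def update(self, active: List[Feature], td_error: float) -> List[Feature]:
        """Credit each pair of active features with |td_error|; return newly added features."""
        added: List[Feature] = []
        for g, h in combinations(active, 2):
            candidate = g | h  # conjunction of g and h
            if candidate in self.discovered or candidate in active:
                continue
            score = self.relevance.get(candidate, 0.0) + abs(td_error)
            self.relevance[candidate] = score
            if score > self.threshold:
                self.discovered.add(candidate)
                del self.relevance[candidate]
                added.append(candidate)
        return added


tracker = FeatureDiscovery(threshold=1.0)
a, b = frozenset({0}), frozenset({1})
first = tracker.update([a, b], td_error=0.6)    # relevance 0.6, still below the cutoff
second = tracker.update([a, b], td_error=-0.6)  # relevance 1.2, the conjunction is promoted
```

    The work per call grows with the number of pairs of active features rather than with the size of the state space, which is one plausible reading of the "cheap per-timestep" claim. A learner such as SARSA would then give each promoted feature its own weight.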

    Embedded electronic systems driven by run-time reconfigurable hardware

    Abstract: This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology, available through SRAM-based FPGA/SoC devices, with the aim of improving people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration. The goal is to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for industrial exploitation.

    Complexity of Atrial Fibrillation Electrograms Through Nonlinear Signal Analysis: In Silico Approach

    Identification of atrial fibrillation (AF) mechanisms could improve the rate of ablation success. However, the incomplete understanding of those mechanisms makes it difficult to decide which sites to target for ablation. This work focuses on the importance of electrogram (EGM) analysis for detecting and modulating rotors to guide ablation procedures and improve their outcomes. Virtual atrial models are used to show how nonlinear measures can be used to generate electroanatomical maps that detect critical sites in AF. The work describes the mathematical models of atrial cells and the procedure for coupling them within two-dimensional and three-dimensional virtual atrial models in order to simulate arrhythmogenic mechanisms. Mathematical modeling of unipolar and bipolar EGMs is introduced, followed by a discussion of EGM signal processing. Nonlinear descriptors, such as approximate entropy and multifractal analysis, are used to study the dynamical behavior of EGM signals, which are not well described by a linear law. Our results show that nonlinear analysis of EGMs can provide information about the dynamics of rotors and other mechanisms of AF. Furthermore, these fibrillatory patterns can be simulated using virtual models. Combining features with machine learning tools can help identify arrhythmogenic sources of AF.
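
    The abstract names approximate entropy as one of its nonlinear descriptors. As a rough illustration, the sketch below follows the standard textbook definition of approximate entropy (template length `m`, tolerance `r`, self-matches counted); it is not the authors' implementation. The function name, the default parameters, and the two synthetic test signals are assumptions made for this example, and the signals are not electrogram data.

```python
import math
from typing import List, Sequence


def approximate_entropy(series: Sequence[float], m: int = 2, r: float = 0.2) -> float:
    """Approximate entropy ApEn(m, r) of a series, using the Chebyshev distance."""
    n = len(series)

    def phi(dim: int) -> float:
        # All overlapping templates of length `dim`.
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        count = len(templates)
        total = 0.0
        for a in templates:
            # Number of templates within tolerance r of `a` (includes `a` itself).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


def logistic_map(length: int, x0: float = 0.3) -> List[float]:
    """Chaotic test signal x[k+1] = 4 x[k] (1 - x[k]), used only as an irregular example."""
    values = [x0]
    for _ in range(length - 1):
        values.append(4.0 * values[-1] * (1.0 - values[-1]))
    return values


regular = approximate_entropy([0.0, 1.0] * 50)        # strictly periodic signal
irregular = approximate_entropy(logistic_map(100))    # chaotic signal
```

    A periodic signal yields a value near zero, while the chaotic signal yields a clearly larger one. Computing such a value per recording site is one way a scalar complexity map could be built, under the assumptions stated above.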

    Nonlinear Systems

    The editors of this book have incorporated contributions from a diverse group of leading researchers in the field of nonlinear systems. To enrich the scope of the content, this book contains a valuable selection of works on fractional differential equations. The book aims to provide an overview of the current knowledge on nonlinear systems and some aspects of fractional calculus. The main subject areas are divided into two sections, one theoretical and one applied. The book is useful for researchers in mathematics, applied mathematics, and physics, as well as for graduate students who are studying these systems with reference to their theory and application. It is also an ideal complement to the specific literature on engineering, biology, health science, and other applied science areas. The opportunity given by IntechOpen to offer this book under the open access system helps disseminate the field of nonlinear systems to a wide range of researchers.

    Review and Perspectives: Shape Memory Alloy Composite Systems

    Following their discovery in the early 1960s, there has been a continuous quest for ways to take advantage of the extraordinary properties of shape memory alloys (SMAs). These intermetallic alloys can be extremely compliant while retaining the strength of metals and can convert thermal energy to mechanical work. The unique properties of SMAs result from a reversible diffusionless solid-to-solid phase transformation from austenite to martensite. The integration of SMAs into composite structures has resulted in many benefits, which include actuation, vibration control, damping, sensing, and self-healing. However, despite substantial research in this area, a comparable adoption of SMA composites by industry has not yet been realized. This discrepancy between academic research and commercial interest is largely associated with the material complexity that includes strong thermomechanical coupling, large inelastic deformations, and variable thermoelastic properties. Nonetheless, as SMAs are becoming increasingly accepted in engineering applications, a similar trend for SMA composites is expected in aerospace, automotive, and energy conversion and storage related applications. In an effort to aid in this endeavor, a comprehensive overview of advances with regard to SMA composites and devices utilizing them is pursued in this paper. Emphasis is placed on identifying the characteristic responses and properties of these material systems as well as on comparing the various modeling methodologies for describing their response. Furthermore, the paper concludes with a discussion of future research efforts that may have the greatest impact on promoting the development of SMA composites and their implementation in multifunctional structures.

    Analytical study of the Least Squares Quasi-Newton method for interaction problems

    Different systems often interact in nature: fluids and structures, heat and electricity, populations of species, etc. Our aim in this thesis is to find, describe and analyze solution methods for the equations that result from the mathematical models of those interacting systems. Even if powerful solvers often already exist for problems in a single physical domain (e.g. structural or fluid problems), the development of similar tools for multi-physics problems is still ongoing. When the interaction (or coupling) between the two systems is strong, many methods still fail or are computationally very expensive. Approaches for solving these multi-physics problems can be broadly put in two categories: monolithic or partitioned. While we do not claim that the partitioned approach is a panacea for all coupled problems, this thesis focuses only on methods that solve (strongly) coupled problems with a partitioned approach, in which each of the physical problems is solved with a specialized code that we consider to be a black box solver and of which the Jacobian is unknown. We also assume that calling these black boxes is the most expensive part of any algorithm, so that performance is judged by the number of times they are called. In 2005 Vierendeels presented a new coupling procedure for this partitioned approach in a fluid-structure interaction context, based on sensitivity analysis of the important displacement and pressure modes which are detected during the iteration process. This approach only uses input-output pairs of the solvers (one for the fluid problem and one for the structural problem). In this thesis we focus on establishing the properties of this method and show that it can be interpreted as a block quasi-Newton method with approximate Jacobians based on a least squares formulation. We also establish and investigate other algorithms that exploit the original idea but use a single approximate Jacobian. The main focus of this thesis lies on establishing the algebraic properties of the methods under investigation and not so much on the best implementation form.
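
    To make the idea of building an approximate Jacobian from input-output pairs concrete, the following sketch treats the coupled problem as a fixed point x = H(x) of one black box `solver` and applies a least squares quasi-Newton update built from differences of consecutive residuals and outputs. It is an illustration in the spirit of the single-Jacobian variants mentioned above, not necessarily the formulation analyzed in the thesis. The function names, the relaxation factor `omega`, and the 2x2 linear test problem are assumptions, and the normal equations are used only for brevity (a QR factorization would be more robust).

```python
from typing import Callable, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def solve_linear(a: List[Vector], b: Vector) -> Vector:
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(cols: List[Vector], target: Vector) -> Vector:
    """Coefficients c minimising ||sum_j c[j] * cols[j] - target|| (normal equations)."""
    gram = [[dot(u, v) for v in cols] for u in cols]
    rhs = [dot(u, target) for u in cols]
    return solve_linear(gram, rhs)


def quasi_newton_least_squares(solver: Callable[[Vector], Vector], x0: Vector,
                               omega: float = 0.5, tol: float = 1e-10,
                               max_iter: int = 50) -> Vector:
    """Find x with solver(x) = x using only input-output pairs of the black box."""
    n = len(x0)
    x = list(x0)
    out = solver(x)
    res = [o - xi for o, xi in zip(out, x)]      # residual r = H(x) - x
    d_res: List[Vector] = []                     # differences of residuals
    d_out: List[Vector] = []                     # differences of solver outputs
    for _ in range(max_iter):
        if dot(res, res) ** 0.5 < tol:
            break
        if not d_res:
            # No history yet: plain relaxation step.
            x_new = [xi + omega * ri for xi, ri in zip(x, res)]
        else:
            # Least squares fit of the current residual, then the quasi-Newton step.
            c = least_squares(d_res, [-ri for ri in res])
            x_new = [o + sum(cj * col[i] for cj, col in zip(c, d_out))
                     for i, o in enumerate(out)]
        out_new = solver(x_new)
        res_new = [o - xi for o, xi in zip(out_new, x_new)]
        d_res.append([p - q for p, q in zip(res_new, res)])
        d_out.append([p - q for p, q in zip(out_new, out)])
        d_res, d_out = d_res[-n:], d_out[-n:]    # keep at most n columns
        x, out, res = x_new, out_new, res_new
    return x


def toy_solver(x: Vector) -> Vector:
    """Hypothetical linear black box whose fixed point is (1, 2)."""
    return [0.5 * x[0] + 0.8 * x[1] - 1.1, -0.3 * x[0] + 0.2 * x[1] + 1.9]


solution = quasi_newton_least_squares(toy_solver, [0.0, 0.0])
```

    Each loop iteration calls the black box once, which matches the cost model stated in the abstract, where performance is judged by the number of solver calls.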

    Explain what you see: argumentation-based learning and robotic vision

    In this thesis, we have introduced new techniques for the problems of open-ended learning, online incremental learning, and explainable learning. These methods have applications in the classification of tabular data, 3D object category recognition, and 3D object parts segmentation. We have utilized argumentation theory and probability theory to develop these methods. The first proposed open-ended online incremental learning approach is Argumentation-Based online incremental Learning (ABL). ABL works with tabular data and can learn with a small number of learning instances using an abstract argumentation framework and bipolar argumentation framework. It has a higher learning speed than state-of-the-art online incremental techniques. However, it has high computational complexity. We have addressed this problem by introducing Accelerated Argumentation-Based Learning (AABL). AABL uses only an abstract argumentation framework and uses two strategies to accelerate the learning process and reduce the complexity. The second proposed open-ended online incremental learning approach is the Local Hierarchical Dirichlet Process (Local-HDP). Local-HDP aims at addressing two problems of open-ended category recognition of 3D objects and segmenting 3D object parts. We have utilized Local-HDP for the task of object part segmentation in combination with AABL to achieve an interpretable model to explain why a certain 3D object belongs to a certain category. The explanations of this model tell a user that a certain object has specific object parts that look like a set of the typical parts of certain categories. Moreover, integrating AABL and Local-HDP leads to a model that can handle a high degree of occlusion

    Ein Beitrag zur taktischen Verhaltensplanung für Fahrstreifenwechsel bei automatisierten Fahrzeugen (A Contribution to Tactical Behavior Planning for Lane Changes in Automated Vehicles)

    Automated driving within one lane is a fascinating experience. Yet it is even more interesting to go a step further: making automated lane changes without human driver interaction. This thesis presents a concept and implementation demonstrated in "Jack", the Audi A7 piloted driving concept vehicle. Given that automated driving is already in the media every other day, why is tactical behavior planning for automated vehicles still such a big issue? It is one of the core areas where it becomes obvious why humans are currently so much smarter than machines: tactical driving behavior planning is a social task that requires cooperation, intention recognition, and complex situation assessment. Since today's automated vehicles lack complex cognitive capabilities, the core of this thesis is to find simple algorithms that give the impression of intelligence in behavior planning. In fact, such behavior planning in automated driving is a constant trade-off between utility and risk: the vehicle has to balance value dimensions such as safety, legality, mobility, and additional aspects like user and third-party satisfaction. This thesis provides a framework to boil down such abstract dimensions into a working implementation. Several of the foundations for this thesis were developed as part of the Stadtpilot project at TU Braunschweig. While there has been plenty of research on concepts tested only in perfect, simulated worlds, the approaches in this thesis have been implemented and evaluated in real-world traffic with uncertain and imperfect sensor data. The implementation has been tested, tweaked, and used in "Jack" for more than 50,000 km of automated driving in everyday traffic.