
    The Hierarchical Discrete Pursuit Learning Automaton: A Novel Scheme With Fast Convergence and Epsilon-Optimality

    Author's accepted manuscript. © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Since the early 1960s, the paradigm of learning automata (LA) has experienced abundant interest. Arguably, it has also served as the foundation for the phenomenon and field of reinforcement learning (RL). Over the decades, new concepts and fundamental principles have been introduced to increase the LA’s speed and accuracy. These include using probability updating functions, discretizing the probability space, and using the “Pursuit” concept. Very recently, the concept of incorporating “structure” into the ordering of the LA’s actions has improved both the speed and accuracy of the corresponding hierarchical machines when the number of actions is large. This has led to the ϵ-optimal hierarchical continuous pursuit LA (HCPA). This article pioneers the inclusion of all the above-mentioned phenomena into a single new LA, leading to the novel hierarchical discretized pursuit LA (HDPA). Although the previously proposed HCPA is powerful, its speed suffers whenever any action probability is close to unity, because the updates to the components of the probability vector then become correspondingly smaller. We propose here the novel HDPA, which infuses the phenomenon of discretization into the action probability vector’s updating functionality and invokes it recursively at every stage of the machine’s hierarchical structure. This discretized functionality does not suffer from the same impediment, because discretization prohibits it.
We demonstrate the HDPA’s robustness and validity by formally proving its ϵ-optimality using the moderation property. We also invoke the submartingale characteristic at every level to prove that the action probability of the optimal action converges to unity as time goes to infinity. Apart from the new machine being ϵ-optimal, the numerical results demonstrate that the number of iterations required for convergence is significantly reduced.
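The pursuit and discretization ideas combined in the HDPA can be illustrated, for a single non-hierarchical automaton, by the following sketch; the step size, reward estimates, and action count are illustrative, and the HDPA's estimator updates and recursive hierarchy are omitted.

```python
def dpa_step(p, est, delta):
    """One discretized pursuit step: move fixed quanta of probability
    mass toward the action with the highest current reward estimate."""
    best = max(range(len(p)), key=lambda i: est[i])
    for i in range(len(p)):
        if i != best:
            moved = min(p[i], delta)  # fixed-size step, never below zero
            p[i] -= moved
            p[best] += moved
    return p

# Illustrative run: four actions, with (hypothetical) reward estimates
# that favour action 2.
p = [0.25, 0.25, 0.25, 0.25]
est = [0.2, 0.5, 0.9, 0.4]
for _ in range(200):
    p = dpa_step(p, est, delta=0.01)
```

Because the step is a fixed quantum rather than a fraction of the remaining mass, the updates do not shrink as the leading probability approaches unity, which is the impediment of the continuous scheme described above.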

    Reinforcement Learning

    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly putting forward and performing actions. Learning is a very important aspect. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals on increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it shall stimulate and encourage new research in this field.

    Learning algorithms for the control of routing in integrated service communication networks

    There is a high degree of uncertainty regarding the nature of traffic on future integrated service networks. This uncertainty motivates the use of adaptive resource allocation policies that can take advantage of the statistical fluctuations in the traffic demands. The adaptive control mechanisms must be 'lightweight', in terms of their overheads, and scale to potentially large networks with many traffic flows. Adaptive routing is one form of adaptive resource allocation, and this thesis considers the application of Stochastic Learning Automata (SLA) for distributed, lightweight adaptive routing in future integrated service communication networks. The thesis begins with a broad critical review of the use of Artificial Intelligence (AI) techniques applied to the control of communication networks. Detailed simulation models of integrated service networks are then constructed, and learning automata based routing is compared with traditional techniques on large scale networks. Learning automata are examined for the 'Quality-of-Service' (QoS) routing problem in realistic network topologies, where flows may be routed in the network subject to multiple QoS metrics, such as bandwidth and delay. It is found that learning automata based routing gives considerable blocking probability improvements over shortest path routing, despite only using local connectivity information and a simple probabilistic updating strategy. Furthermore, automata are considered for routing in more complex environments spanning issues such as multi-rate traffic, trunk reservation, routing over multiple domains, routing in high bandwidth-delay product networks and the use of learning automata as a background learning process. Automata are also examined for routing of both 'real-time' and 'non-real-time' traffics in an integrated traffic environment, where the non-real-time traffic has access to the bandwidth 'left over' by the real-time traffic. 
It is found that adopting learning automata for the routing of the real-time traffic may improve the performance of both real-time and non-real-time traffic under certain conditions. In addition, it is found that one set of learning automata may route both traffic types satisfactorily. Automata are considered for the routing of multicast connections in receiver-oriented, dynamic environments, where receivers may join and leave the multicast sessions dynamically. Automata are shown to be able to minimise the average delay or the total cost of the resulting trees using the appropriate feedback from the environment. Automata provide a distributed solution to the dynamic multicast problem, requiring purely local connectivity information and a simple updating strategy. Finally, automata are considered for the routing of multicast connections that require QoS guarantees, again in receiver-oriented dynamic environments. It is found that the distributed application of learning automata leads to considerably lower blocking probabilities than a shortest path tree approach, due to a combination of load balancing and minimum cost behaviour.
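The "simple probabilistic updating strategy" referred to above can be sketched with the classic linear reward-inaction (L_R-I) scheme commonly used for learning-automata routing; the two-route topology, blocking probabilities, and learning rate below are hypothetical, not taken from the thesis.

```python
import random

def lri_update(p, chosen, rewarded, a=0.01):
    """Linear reward-inaction: reinforce the chosen route when the call
    it carried was accepted; leave the vector unchanged on blocking."""
    if rewarded:
        for i in range(len(p)):
            p[i] = p[i] + a * (1.0 - p[i]) if i == chosen else p[i] * (1.0 - a)
    return p

# A node choosing between two outgoing routes with different (hypothetical)
# blocking probabilities, using only local feedback (call accepted or not).
rng = random.Random(0)
block_prob = [0.6, 0.1]
p = [0.5, 0.5]
for _ in range(5000):
    route = 0 if rng.random() < p[0] else 1
    accepted = rng.random() >= block_prob[route]
    p = lri_update(p, route, accepted)
```

Over time the probability mass migrates to the route with the lower blocking probability, using no global state at all, which is what makes the scheme lightweight.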

    Some results on a set of data driven stochastic wildfire models

    Across the globe, the frequency and size of wildfire events are increasing. Research focused on minimizing wildfire is critically needed to mitigate impending humanitarian and environmental crises. Real-time wildfire response is dependent on timely and accurate prediction of dynamic wildfire fronts. Current models used to inform decisions made by the U.S. Forest Service, such as Farsite, FlamMap and Behave, do not incorporate modern remotely sensed wildfire records and are typically deterministic, making uncertainty calculations difficult. In this research, we tested two methods that combine artificial intelligence with remote sensing data. First, a stochastic cellular automaton that learns algebraic expressions was fitted to the spread of synthetic wildfire through symbolic regression. The validity of the genetic program was tested against synthetic spreading behavior driven by a balanced logistic model. We also tested a deep learning approach to wildfire perimeter prediction. Trained on a time series of geolocated fire perimeters, atmospheric conditions, and satellite images, a deep convolutional neural network forecasts the evolution of the fire front in 24-hour intervals. The approach yielded several relevant high-level abstractions of input data, such as NDVI vegetation indices, and produced promising initial results. These novel data-driven methods leveraged abundant and accessible remote sensing data, which are largely unused in industry-level wildfire modeling. This work represents a step forward in wildfire modeling through a curated aggregation of satellite image spectral layers, historic wildfire perimeter maps, LiDAR, atmospheric conditions, and two novel simulation models. The results can be used to train and validate future wildfire models, and offer viable alternatives to current benchmark physics-based models used in industry.
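A stochastic cellular automaton for spread, of the general kind described above, can be sketched as follows; this toy ignition rule, grid size, and spread probability are illustrative stand-ins, not the symbolic-regression model fitted in the research.

```python
import random

def spread_step(grid, p_spread, rng):
    """One synchronous update: each fuel cell (0) ignites with
    probability p_spread per burning 4-neighbour (1)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0:
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == 1
                            and rng.random() < p_spread):
                        new[r][c] = 1
                        break
    return new

# A 9x9 fuel grid ignited at the centre; repeated stochastic updates
# grow an irregular fire front rather than a deterministic ellipse.
rng = random.Random(1)
grid = [[0] * 9 for _ in range(9)]
grid[4][4] = 1
for _ in range(10):
    grid = spread_step(grid, 0.5, rng)
```

Running the simulation many times with different seeds yields a distribution of fronts, which is what makes uncertainty quantification natural in the stochastic setting.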

    Hazard elimination using backwards reachability techniques in discrete and hybrid models

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, February 2002. Includes bibliographical references (leaves 173-181).
    One of the most important steps in hazard analysis is determining whether a particular design can reach a hazardous state and, if it could, how to change the design to ensure that it does not. In most cases, this is done through testing or simulation, or even less rigorous processes--none of which provide much confidence for complex systems. Because state spaces for software can be enormous (which is why testing is not an effective way to accomplish the goal), the innovative Hazard Automaton Reduction Algorithm (HARA) involves starting at a hypothetical unsafe state and using backwards reachability techniques to obtain enough information to determine how to design the system so that that state cannot be reached. State machine models are very powerful, but also present greater challenges in terms of reachability, including the backwards reachability needed to implement the Hazard Automaton Reduction Algorithm. The key to solving the backwards reachability problem lies in converting the state machine model into a controls state-space formulation and creating a state transition matrix. Each successive step backward from the hazardous state then involves only one n-by-n matrix manipulation. Therefore, only a finite number of matrix manipulations is necessary to determine whether or not a state is reachable from another state, thus providing the same information that could be obtained from a complete backwards reachability graph of the state machine model. Unlike model checking, the computational cost does not increase as greatly with the number of backward states that need to be visited to obtain the information necessary to ensure that the design is safe, or to redesign it to be safe. The functionality and optimality of this approach are proved in both the discrete and hybrid cases.
The new approach of the Hazard Automaton Reduction Algorithm combined with backwards reachability controls techniques was demonstrated on a black-box model of a real aircraft altitude switch. The algorithm is being implemented in a commercial specification language (SpecTRM-RL). SpecTRM-RL is formally extended to include continuous and hybrid models. An analysis of the safety of a medium-term conflict detection algorithm (MTCD) for aircraft, which is being developed and tested by Eurocontrol for use in European Air Traffic Control, is performed. Validating such conflict detection algorithms currently challenges researchers worldwide. Model checking is unsatisfactory in general for this problem because of the lack of a termination guarantee in backwards reachability using model checking. The new state-space controls approach does not encounter this problem.
    by Natasha Anita Neogi, Ph.D.
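The backwards-reachability computation at the core of HARA can be sketched as a fixed-point iteration over the transition relation (equivalently, repeated application of the transposed state transition matrix); the 4-state machine below is a hypothetical example, and the SpecTRM-RL machinery is omitted.

```python
def backwards_reachable(adj, hazard):
    """Return the set of states from which `hazard` is reachable,
    by stepping backwards until no new predecessors are found.
    adj[i][j] is True iff the machine has a transition i -> j."""
    reach = {hazard}
    changed = True
    while changed:
        changed = False
        for i in range(len(adj)):
            if i not in reach and any(adj[i][j] for j in reach):
                reach.add(i)
                changed = True
    return reach

# Hypothetical 4-state machine: 0 -> 1 -> 3 (hazard), 2 -> 0.
adj = [
    [False, True,  False, False],
    [False, False, False, True],
    [True,  False, False, False],
    [False, False, False, False],
]
```

Because the state set is finite, the iteration terminates after at most n sweeps, mirroring the finite number of matrix manipulations noted in the abstract.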

    Wide-Area Surveillance System using a UAV Helicopter Interceptor and Sensor Placement Planning Techniques

    This project proposes and describes the implementation of a wide-area surveillance system comprising a sensor/interceptor placement planning system and an interceptor unmanned aerial vehicle (UAV) helicopter. Given the 2-D layout of an area, the planning system optimally places perimeter cameras based on maximum coverage and minimal cost. Part of this planning system includes the MATLAB implementation of Erdem and Sclaroff’s Radial Sweep algorithm for visibility polygon generation. Additionally, 2-D camera modeling is proposed for both fixed and PTZ cases. Finally, the interceptor is also placed to minimize shortest-path flight time to any point on the perimeter during a detection event. Secondly, a basic flight control system for the UAV helicopter is designed and implemented. The flight control system’s primary goal is to hover the helicopter in place when a human operator holds an automatic-flight switch. This system represents the first step towards a complete waypoint-navigation flight control system. The flight control system is based on an inertial measurement unit (IMU) and a proportional-integral-derivative (PID) controller. This system is implemented using a general-purpose personal computer (GPPC) running Windows XP and other commercial off-the-shelf (COTS) hardware. This setup differs from other helicopter control systems, which typically use custom embedded solutions or microcontrollers. Experiments demonstrate the sensor placement planning achieving >90% coverage at optimized cost for several typical areas, given multiple camera types and parameters. Furthermore, the helicopter flight control system experiments achieve hovering success over short flight periods. However, the final conclusion is that the COTS IMU is insufficient for high-speed, high-frequency applications such as a helicopter control system.
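A textbook discrete PID loop of the kind used in the flight control system can be sketched as follows; the first-order plant, gains, and time step are illustrative, not the project's tuned values.

```python
class PID:
    """Discrete proportional-integral-derivative controller."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, setpoint, measured):
        err = setpoint - measured
        self.integral += err * self.dt            # accumulate error over time
        deriv = (err - self.prev_err) / self.dt   # rate of change of error
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Toy hover scenario: drive a crude damped "altitude" plant to a 10 m setpoint.
pid = PID(kp=1.2, ki=0.3, kd=0.05, dt=0.1)
altitude, velocity = 0.0, 0.0
for _ in range(1000):
    thrust = pid.update(10.0, altitude)
    velocity += (thrust - 0.5 * velocity) * 0.1   # damped toy dynamics
    altitude += velocity * 0.1
```

The derivative term in such a loop is exactly where a noisy, low-rate COTS IMU hurts, which is consistent with the conclusion drawn above.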

    Preliminaries for distributed natural computing inspired by the slime mold Physarum Polycephalum

    This doctoral thesis aims towards distributed natural computing inspired by the slime mold Physarum polycephalum. The vein networks formed by this organism presumably support efficient transport of protoplasmic fluid. Devising models which capture the natural efficiency of the organism and form a suitable basis for the development of natural computing algorithms is an interesting and challenging goal. We start working towards this goal by designing and executing wet-lab experiments geared towards producing a large number of images of the vein networks of P. polycephalum. Next, we turn the depicted vein networks into graphs using our own custom software called Nefi. This enables a detailed numerical study, yielding a catalogue of characterizing observables spanning a wide array of different graph properties. To share our results and data, i.e. raw experimental data, graphs and analysis results, we introduce a dedicated repository revolving around slime mold data, the Smgr. The purpose of this repository is to promote data reuse and to foster a practice of increased data sharing. Finally, we present a model based on interacting electronic circuits including current-controlled voltage sources, which mimics the emergent flow patterns observed in live P. polycephalum. The model is simple, distributed and robust to changes in the underlying network topology. Thus it constitutes a promising basis for the development of distributed natural computing algorithms.

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Frameworks for High Dimensional Convex Optimization

    We present novel, efficient algorithms for solving extremely large optimization problems. A significant bottleneck today is that, as datasets grow, researchers across disciplines desire to solve prohibitively massive optimization problems. In this thesis, we present methods to compress optimization problems. The general goal is to represent a huge problem as a smaller problem or set of smaller problems, while still retaining enough information to ensure provable guarantees on solution quality and run time. We apply this approach to the following three settings. First, we propose a framework for accelerating both linear program solvers and convex solvers for problems with linear constraints. Our focus is on a class of problems for which data is either very costly or hard to obtain. In these situations, the number of data points m available is much smaller than the number of variables, n. In a machine learning setting, this regime is increasingly prevalent, since it is often advantageous to consider larger and larger feature spaces while not necessarily obtaining proportionally more data. Analytically, we provide worst-case guarantees on both the runtime and the quality of the solution produced. Empirically, we show that our framework speeds up state-of-the-art commercial solvers by two orders of magnitude, while maintaining a near-optimal solution. Second, we propose a novel approach for distributed optimization which uses far fewer messages than existing methods. We consider a setting in which the problem data are distributed over the nodes. We provide worst-case guarantees on the performance with respect to the amount of communication the algorithm requires and the quality of the solution. The algorithm uses O(log(n+m)) messages with high probability. We note that this is an exponential reduction compared to the O(n) communication required during each round of traditional consensus-based approaches.
In terms of solution quality, our algorithm produces a feasible, near-optimal solution. Numerical results demonstrate that the approximation error matches that of ADMM in many cases, while using orders-of-magnitude less communication. Lastly, we propose and analyze a provably accurate long-step infeasible interior point method (IPM) for linear programming. The core computational bottleneck in IPMs is the need to solve a linear system of equations at each iteration. We employ sketching techniques to make the linear system computation lighter, by handling well-known ill-conditioning problems that occur when using iterative solvers in IPMs for LPs. In particular, we propose a preconditioned Conjugate Gradient iterative solver for the linear system. Our sketching strategy makes the condition number of the preconditioned system provably small. In practice, we demonstrate that our approach significantly reduces the condition number of the linear system, and thus allows for more efficient solving on a range of benchmark datasets.
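The Conjugate Gradient inner solver referred to above can be sketched in its plain, unpreconditioned form (the sketching-based preconditioner, which is the thesis's actual contribution, is omitted); the small SPD system is an illustrative example.

```python
def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Solve A x = b for symmetric positive definite A (lists of lists)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]              # residual b - A x, with x = 0 initially
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:  # squared residual small enough
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In exact arithmetic CG converges in at most n iterations; ill-conditioning destroys this behaviour in floating point, which is precisely the problem the sketched preconditioner described above addresses.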