12 research outputs found

    Action-Constrained Markov Decision Processes With Kullback-Leibler Cost

    This paper concerns computation of optimal policies in which the one-step reward function contains a cost term that models Kullback-Leibler divergence with respect to nominal dynamics. This technique was introduced by Todorov in 2007, where it was shown under general conditions that the solution to the average-reward optimality equations reduces to a simple eigenvector problem. Since then, many authors have sought to apply this technique to control problems and to models of bounded rationality in economics. A crucial assumption is that the input process is essentially unconstrained. For example, if the nominal dynamics include randomness from nature (e.g., the impact of wind on a moving vehicle), then the optimal control solution does not respect the exogenous nature of this disturbance. This paper introduces a technique to solve a more general class of action-constrained MDPs. The main idea is to solve an entire parameterized family of MDPs, in which the parameter is a scalar weighting the one-step reward function. The approach is new and practical even in the original unconstrained formulation.

    Two-Echelon Vehicle and UAV Routing for Post-Disaster Humanitarian Operations with Uncertain Demand

    Humanitarian logistics service providers have two major responsibilities immediately after a disaster: locating trapped people and routing aid to them. These difficult operations are further hindered by failures in the transportation and telecommunications networks, which are often rendered unusable by the disaster at hand. In this work, we propose two-echelon vehicle routing frameworks for performing these operations using aerial uncrewed autonomous vehicles (UAVs or drones) to address the issues associated with these failures. In our proposed frameworks, we assume that ground vehicles cannot reach the trapped population directly and can only transport drones from a depot to intermediate locations. The drones launched from these locations serve both to identify demand for medical and other aid (e.g., epi-pens, medical supplies, dry food, water) and to make deliveries to satisfy it. Specifically, we present two decision frameworks, in which the resulting optimization problem is formulated as a two-echelon vehicle routing problem. The first framework addresses the problem in two stages: providing telecommunications capabilities in the first stage and satisfying the resulting demands in the second. To that end, two types of drones are considered. Hotspot drones can provide cell phone and internet reception, and hence are used to capture demands. Delivery drones are subsequently employed to satisfy the observed demand. The second framework, on the other hand, addresses the problem as a stochastic emergency aid delivery problem, which uses a two-stage robust optimization model to handle demand uncertainty. To solve the resulting models, we propose efficient and novel solution approaches.

    Online Trajectory Optimization Using Inexact Gradient Feedback for Time-Varying Environments

    This paper considers the problem of online trajectory design under time-varying environments. We formulate the general trajectory optimization problem within the framework of time-varying constrained convex optimization and propose a novel version of the online gradient ascent algorithm for such problems. Moreover, the gradient feedback is allowed to be noisy, which lets us use the proposed algorithm in a range of practical applications where it is difficult to acquire the true gradient. In contrast to most of the available literature, we present the offline sublinear regret of the proposed algorithm up to the path length variations of the optimal offline solution, the cumulative gradient, and the error in the gradient variations. Furthermore, we establish a lower bound on the offline dynamic regret, which defines the optimality of any trajectory. To show the efficacy of the proposed algorithm, we consider two practical problems of interest. First, we consider a device-to-device (D2D) communications setting, where the goal is to design a user trajectory while maximizing its connectivity to the internet. The second problem is associated with the online planning of energy-efficient trajectories for unmanned surface vehicles (USVs) under strong disturbances in ocean environments with both static and dynamic goal locations. The detailed simulation results demonstrate the significance of the proposed algorithm on synthetic and real data sets. A video on the real-world datasets can be found at https://www.youtube.com/watch?v=FcRqqWtpf_0. Comment: arXiv admin note: text overlap with arXiv:1804.0486

    Wind-energy based path planning for Unmanned Aerial Vehicles using Markov Decision Processes

    Exploiting wind energy is one possible way to extend flight duration for Unmanned Aerial Vehicles. Wind energy can also be used to minimise energy consumption for a planned path. In this paper, we consider uncertain time-varying wind fields and plan a path through them. A Gaussian distribution is used to model uncertainty in the time-varying wind fields. We use a Markov Decision Process to plan a path based upon the uncertainty of the Gaussian distribution. We present simulation results that compare the direct line of flight between the start and target points with our planned path in terms of energy consumption and time of travel. The result is a robust path using the most visited cell while sampling the Gaussian distribution of the wind field in each cell.

    WindBots: A Concept for Persistent In-Situ Science Explorers for Gas Giants

    This report summarizes the study of a mission concept to Jupiter with one or multiple Wind Robots able to operate in the Jovian atmosphere, above and below the clouds (down to 10 bar), for long durations and using energy obtained from local sources. This concept would be a step towards persistent exploration of gas giants by robots performing in-situ atmospheric science, powered by locally harvested energy. The Wind Robots, referred to in this report as WindBots (WBs), would ride the planetary winds and transform aeolian energy into kinetic energy of flight and electrical energy for on-board equipment. Small shape adjustments modify the aerodynamic characteristics of their surfaces, allowing for changes in direction and high autonomy of movement. Specifically, we sought solutions to increase survivability in strong or turbulent winds, as well as mobility and autonomy, compared to passive balloons.

    A Novel and Inexpensive Solution to Build Autonomous Surface Vehicles Capable of Negotiating Highly Disturbed Environments

    This dissertation has four main contributions. The first contribution is the design and build of a fleet of long-range, medium-duration deployable autonomous surface vehicles (ASV). The second is the development, implementation, and testing of inexpensive sensors to accurately measure wind, current, and depth environmental variables. The third leverages the first two contributions, and is modeling the effects of environmental variables on an ASV, finally leading to the development of a dynamic controller enabling deployment in more uncertain conditions. The motivation for designing and building a new ASV comes from the lack of a flexible and modular platform capable of long-range deployment in the current state of the art. We present a design of an autonomous surface vehicle (ASV) with the power to cover large areas, the payload capacity to carry sufficient batteries to power components and sensor equipment, and enough fuel to remain on task for extended periods. An analysis of the design, lessons learned during build and deployments, and a comprehensive build tutorial are provided in this thesis. The contributions from developing an inexpensive environmental sensor suite are multi-faceted. The ability to monitor, collect, and build models of depth, wind, and current in environmental applications proves to be valuable and challenging, and we illustrate our capability to provide an efficient, accurate, and inexpensive data collection platform for the community’s use. More selfishly, in order to enable our end-state goal of deploying our ASV in adverse environments, we realize the requirement to measure the same environmental characteristics in real time and provide them as inputs to our effects model and dynamic controller. We present our methods for calibrating the sensors and the experimental results of measurement maps and prediction maps from a total of 70 field trials. Finally, we seek to incorporate our measured environmental variables along with previously available odometry information to increase the viability of the ASV to maneuver in highly dynamic wind and current environments. We present experimental results in differing conditions, augmenting the trajectory tracking performance of the original way-point navigation controller with our external forces feed-forward algorithm.