12 research outputs found
Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
This paper concerns computation of optimal policies in which the one-step
reward function contains a cost term that models Kullback-Leibler divergence
with respect to nominal dynamics. This technique was introduced by Todorov in
2007, where it was shown under general conditions that the solution to the
average-reward optimality equations reduces to a simple eigenvector problem.
Since then many authors have sought to apply this technique to control problems
and models of bounded rationality in economics.
A crucial assumption is that the input process is essentially unconstrained.
For example, if the nominal dynamics include randomness from nature (e.g., the
impact of wind on a moving vehicle), then the optimal control solution does not
respect the exogenous nature of this disturbance.
This paper introduces a technique to solve a more general class of
action-constrained MDPs. The main idea is to solve an entire parameterized
family of MDPs, in which the parameter is a scalar weighting the one-step
reward function. The approach is new and practical even in the original
unconstrained formulation.
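The eigenvector reduction mentioned in the abstract can be illustrated with a minimal sketch for the unconstrained case. It assumes a finite state space, a nominal transition matrix `P0` with strictly positive entries, and a one-step reward of the form w(x) minus the KL divergence from the nominal row; under those assumptions the average-reward optimality equation becomes lam * v = diag(exp(w)) P0 v with v = exp(h). The names `solve_kl_mdp`, `P0`, `w`, and the example numbers are hypothetical and are not taken from the paper, and the sketch does not cover the paper's action-constrained method.

```python
import numpy as np

def solve_kl_mdp(P0, w, max_iter=10_000, tol=1e-12):
    """Solve lam * v = diag(exp(w)) @ P0 @ v by power iteration.

    Returns (eta, v, P_opt): the average reward eta = log(lam), the positive
    eigenvector v (normalized to sum to 1), and the optimal transition matrix.
    Assumes P0 has strictly positive entries so power iteration converges.
    """
    P0 = np.asarray(P0, dtype=float)
    w = np.asarray(w, dtype=float)
    M = np.exp(w)[:, None] * P0          # row x of P0 scaled by exp(w(x))
    v = np.full(len(w), 1.0 / len(w))
    lam = 1.0
    for _ in range(max_iter):
        u = M @ v
        lam = u.sum()                    # v sums to 1, so this estimates lam
        u /= lam
        converged = np.max(np.abs(u - v)) < tol
        v = u
        if converged:
            break
    # Twisted kernel: P_opt(x, y) is proportional to P0(x, y) * v(y)
    P_opt = P0 * v[None, :] / (P0 @ v)[:, None]
    return np.log(lam), v, P_opt

# Hypothetical three-state example (numbers chosen only for illustration)
P0 = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.6, 0.2],
               [0.3, 0.3, 0.4]])
w = np.array([0.0, 1.0, -0.5])
eta, v, P_opt = solve_kl_mdp(P0, w)
```

With `w` set to zero the sketch returns the nominal dynamics unchanged, since deviating from `P0` then only incurs KL cost.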
Two-Echelon Vehicle and UAV Routing for Post-Disaster Humanitarian Operations with Uncertain Demand
Humanitarian logistics service providers have two major responsibilities
immediately after a disaster: locating trapped people and routing aid to them.
These difficult operations are further hindered by failures in the
transportation and telecommunications networks, which are often rendered
unusable by the disaster at hand. In this work, we propose two-echelon vehicle
routing frameworks for performing these operations using aerial uncrewed
autonomous vehicles (UAVs or drones) to address the issues associated with
these failures. In our proposed frameworks, we assume that ground vehicles
cannot reach the trapped population directly, but they can only transport
drones from a depot to some intermediate locations. The drones launched from
these locations serve to both identify demands for medical and other aids
(e.g., epi-pens, medical supplies, dry food, water) and make deliveries to
satisfy them. Specifically, we present two decision frameworks, in which the
resulting optimization problem is formulated as a two-echelon vehicle routing
problem. The first framework addresses the problem in two stages: providing
telecommunications capabilities in the first stage and satisfying the resulting
demands in the second. To that end, two types of drones are considered. Hotspot
drones have the capability of providing cell phone and internet reception, and
hence are used to capture demands. Delivery drones are subsequently employed to
satisfy the observed demand. The second framework, on the other hand, addresses
the problem as a stochastic emergency aid delivery problem, which uses a
two-stage robust optimization model to handle demand uncertainty. To solve the
resulting models, we propose efficient and novel solution approaches.
Online Trajectory Optimization Using Inexact Gradient Feedback for Time-Varying Environments
This paper considers the problem of online trajectory design under
time-varying environments. We formulate the general trajectory optimization
problem within the framework of time-varying constrained convex optimization
and propose a novel version of the online gradient ascent algorithm for such
problems. Moreover, the algorithm tolerates noisy gradient feedback, which makes
it applicable to a range of practical settings where the true gradient is
difficult to acquire. In contrast to most of the available literature, we
present the offline sublinear regret of the proposed algorithm up to the path
length variations of the optimal offline solution, the cumulative gradient, and
the error in the gradient variations. Furthermore, we establish a lower bound
on the offline dynamic regret, which defines the optimality of any trajectory.
To show the efficacy of the proposed algorithm, we consider two practical
problems of interest. First, we consider a device-to-device (D2D)
communications setting, in which the goal is to design a user trajectory that
maximizes its connectivity to the internet. The second problem is associated
with the online planning of energy-efficient trajectories for unmanned surface
vehicles (USV) under strong disturbances in ocean environments with both static
and dynamic goal locations. The detailed simulation results demonstrate the
significance of the proposed algorithm on synthetic and real data sets. A video
on the real-world data sets can be found at
https://www.youtube.com/watch?v=FcRqqWtpf_0
Comment: arXiv admin note: text overlap with arXiv:1804.0486
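As a rough illustration of online gradient ascent with inexact gradient feedback, the sketch below tracks a moving target inside a box constraint. It is not the paper's algorithm: the concave reward f_t(x) = -0.5 * ||x - c_t||^2, the box constraint, the additive Gaussian gradient noise, and all names (`online_gradient_ascent`, `project_box`, `targets`) are assumptions made for this example only.

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d."""
    return np.clip(x, lo, hi)

def reward(x, c):
    """Assumed time-varying concave reward f_t(x) = -0.5 * ||x - c_t||^2."""
    return -0.5 * float(np.sum((x - c) ** 2))

def online_gradient_ascent(targets, x0, step, noise_std, lo, hi, seed=0):
    """Projected online gradient ascent driven by a noisy gradient estimate.

    Returns the played trajectory and the dynamic regret measured against
    the per-step constrained maximizers.
    """
    rng = np.random.default_rng(seed)
    x = project_box(np.asarray(x0, dtype=float), lo, hi)
    path, regret = [], 0.0
    for c in targets:
        path.append(x.copy())
        x_star = project_box(c, lo, hi)      # maximizer of f_t over the box
        regret += reward(x_star, c) - reward(x, c)
        # Inexact feedback: true gradient (c - x) plus Gaussian noise
        g = (c - x) + noise_std * rng.standard_normal(x.shape)
        x = project_box(x + step * g, lo, hi)
    return np.array(path), regret

# Hypothetical target moving along a half circle that partly leaves the box
T = 50
angles = np.linspace(0.0, np.pi, T)
targets = 1.5 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
path, regret = online_gradient_ascent(targets, x0=[0.0, 0.0], step=0.5,
                                      noise_std=0.1, lo=-1.0, hi=1.0)
```

The learner only sees the reward at time t after committing to its point, so the regret accumulated here grows with how fast the target moves, which is the role the path-length term plays in dynamic regret bounds.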
Wind-energy based path planning for Unmanned Aerial Vehicles using Markov Decision Processes
Exploiting wind energy is one possible way to extend flight duration for Unmanned Aerial Vehicles. Wind energy can also be used to minimise energy consumption for a planned path. In this paper, we consider uncertain time-varying wind fields and plan a path through them. A Gaussian distribution is used to model the uncertainty in the time-varying wind fields. We use a Markov Decision Process to plan a path based upon the uncertainty of the Gaussian distribution. Simulation results are presented that compare the direct line of flight between the start and target points with our planned path in terms of energy consumption and time of travel. The result is a robust path using the most visited cell while sampling the Gaussian distribution of the wind field in each cell.
WindBots: A Concept for Persistent In-Situ Science Explorers for Gas Giants
This report summarizes the study of a mission concept to Jupiter with one or multiple Wind Robots able to operate in the Jovian atmosphere, above and below the clouds (down to 10 bar), for long durations and using energy obtained from local sources. This concept would be a step towards persistent exploration of gas giants by robots performing in-situ atmospheric science, powered by locally harvested energy. The Wind Robots, referred to in this report as WindBots (WBs), would ride the planetary winds and transform aeolian energy into kinetic energy of flight and electrical energy for on-board equipment. Small shape adjustments modify the aerodynamic characteristics of their surfaces, allowing for changes in direction and a high degree of movement autonomy. Specifically, we sought solutions that increase survivability in strong or turbulent winds, as well as mobility and autonomy, compared to passive balloons.
A Novel and Inexpensive Solution to Build Autonomous Surface Vehicles Capable of Negotiating Highly Disturbed Environments
This dissertation has four main contributions. The first contribution is the design and build of a fleet of long-range, medium-duration deployable autonomous surface vehicles (ASV). The second is the development, implementation, and testing of inexpensive sensors to accurately measure wind, current, and depth environmental variables. The third leverages the first two contributions, and is modeling the effects of environmental variables on an ASV, finally leading to the development of a dynamic controller enabling deployment in more uncertain conditions.
The motivation for designing and building a new ASV comes from the lack of a flexible and modular platform capable of long-range deployment in the current state of the art. We present a design of an autonomous surface vehicle (ASV) with the power to cover large areas, the payload capacity to carry sufficient batteries to power components and sensor equipment, and enough fuel to remain on task for extended periods. An analysis of the design, lessons learned during build and deployments, and a comprehensive build tutorial are provided in this thesis.
The contributions from developing an inexpensive environmental sensor suite are multi-faceted. The ability to monitor, collect, and build models of depth, wind, and current in environmental applications proves to be both valuable and challenging, and we illustrate our capability to provide an efficient, accurate, and inexpensive data collection platform for the community’s use. More selfishly, in order to enable our end-state goal of deploying our ASV in adverse environments, we realize the requirement to measure the same environmental characteristics in real time and provide them as inputs to our effects model and dynamic controller. We present our methods for calibrating the sensors and the experimental results of measurement maps and prediction maps from a total of 70 field trials.
Finally, we seek to incorporate our measured environmental variables along with previously available odometry information to increase the ability of the ASV to maneuver in highly dynamic wind and current environments. We present experimental results in differing conditions, augmenting the trajectory tracking performance of the original way-point navigation controller with our external forces feed-forward algorithm.