
    Allocation of UAV search efforts using dynamic programming and Bayesian updating

    As unmanned aerial vehicle (UAV) technology and availability improve, it becomes increasingly important to operate UAVs efficiently. Utilizing one UAV at a time is a relatively simple task, but when multiple UAVs must be coordinated, optimal search plans can be difficult to create in a timely manner. In this thesis, we create a decision aid that generates efficient routes for multiple UAVs using dynamic programming and a limited-lookahead heuristic. The goal is to give the user the best knowledge of the locations of an arbitrary number of targets operating on a specified graph of nodes and arcs. The decision aid incorporates information about detections and nondetections and determines the probabilities of target locations using Bayesian updating. Target movement is modeled by a Markov process. The decision aid has been tested in two multi-hour field experiments involving actual UAVs and moving targets on the ground.
    http://archive.org/details/allocationofuavs109454112
    Outstanding Thesis. US Navy (USN) author. Approved for public release; distribution is unlimited.
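
    For intuition, here is a minimal sketch of the two ingredients the abstract names: a Bayesian update of the target-location belief after a detection or nondetection, and a limited-lookahead choice of the next node to search. The transition matrix P, the glimpse probability, and the neighbor structure are illustrative assumptions, not details from the thesis.

```python
import numpy as np

def bayes_update(belief, node, detected, glimpse):
    """Update the target-location belief after searching `node`."""
    post = belief.copy()
    if detected:
        post[:] = 0.0
        post[node] = 1.0
        return post
    post[node] *= 1.0 - glimpse          # nondetection evidence
    return post / post.sum()

def lookahead_next_node(belief, P, glimpse, neighbors, node, depth):
    """Limited-lookahead heuristic: among nodes adjacent to the UAV,
    pick the one maximizing detection probability over `depth` steps."""
    def value(b, n, d):
        p_detect = b[n] * glimpse        # chance of detecting at n now
        if d == 1:
            return p_detect
        b_next = bayes_update(b, n, False, glimpse) @ P   # miss, target moves
        return p_detect + (1.0 - p_detect) * max(
            value(b_next, m, d - 1) for m in neighbors[n])
    moved = belief @ P                   # target moves before the next look
    return max(neighbors[node], key=lambda m: value(moved, m, depth))

# Illustrative 3-node line graph with a slowly mixing target
P = np.array([[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]])
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
belief = np.array([0.2, 0.3, 0.5])
print(lookahead_next_node(belief, P, glimpse=0.7,
                          neighbors=neighbors, node=0, depth=3))
```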

    Optimal search in discrete locations: extensions and new findings

    A hidden target needs to be found by a searcher in many real-life situations, some of which involve large costs and significant consequences with failure. Therefore, efficient search methods are paramount. In our search model, the target lies in one of several discrete locations according to some hiding distribution, and the searcher's goal is to discover the target in minimum expected time by making successive searches of individual locations. In Part I of the thesis, the searcher knows the hiding distribution. Here, if there is only one way to search each location, the solution to the search problem, discovered in the 1960s, is simple: search next any location with a maximal probability per unit time of detecting the target. An equivalent solution is derived by viewing the search problem as a multi-armed bandit and following a Gittins index policy. Motivated by modern search technology, we introduce two modes, fast and slow, to search each location. The fast mode takes less time, but the slow mode is more likely to find the target. An optimal policy is difficult to obtain in general, because it requires an optimal sequence of search modes for each location, in addition to a set of sequence-dependent Gittins indices for choosing between locations. For each mode, we identify a sufficient condition for a location to use only that search mode in an optimal policy. For locations meeting neither sufficient condition, an optimal choice of search mode is extremely complicated, depending both on the hiding distribution and the search parameters of the other locations. We propose several heuristic policies motivated by our analysis, and demonstrate their near-optimal performance in an extensive numerical study.

    In Part II of the thesis, the searcher has only one search mode per location, but does not know the hiding distribution, which is chosen by an intelligent hider who aims to maximise the expected time until the target is discovered. Such a search game, modelled via two-person, zero-sum game theory, is relevant if the target is a bomb, an intruder, or, of increasing importance due to advances in technology, a computer hacker. By Part I, if the hiding distribution is known, an optimal counter strategy for the searcher is any corresponding Gittins index policy. To develop an optimal search strategy in the search game, the searcher must account for the hider's motivation to choose an optimal hiding distribution, and consider the set of corresponding Gittins index policies. However, the searcher must choose carefully from this set of Gittins index policies to ensure the same expected time to discover the target regardless of where it is hidden. As a result, finding an optimal search strategy, or even proving one exists, is difficult. We extend several results for special cases from the literature to the fully general search game; in particular, we show an optimal search strategy exists and may take a simple form. Using a novel test, we investigate how often a particular hiding strategy, one that leaves the searcher with no preference between locations at the beginning of the search, is optimal.
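
    A minimal sketch of the classical single-mode index rule described above: repeatedly search a location with maximal probability of detection per unit time, p[i]*q[i]/t[i], and Bayes-update the hiding distribution after each failure. The parameters are illustrative; the thesis's two-mode setting and the search game are not captured here.

```python
import numpy as np

def index_search(p, q, t, rng, true_loc, max_steps=200):
    """Repeatedly search the location with maximal probability of
    detection per unit time, p[i]*q[i]/t[i], updating p by Bayes'
    rule after every unsuccessful search.

    p: prior hiding distribution over locations
    q: per-search detection probability at each location
    t: time one search of each location takes
    """
    p = np.asarray(p, dtype=float).copy()
    q, t = np.asarray(q, dtype=float), np.asarray(t, dtype=float)
    elapsed = 0.0
    for _ in range(max_steps):
        i = int(np.argmax(p * q / t))            # the index rule
        elapsed += t[i]
        if i == true_loc and rng.random() < q[i]:
            return i, elapsed                    # target found
        p[i] *= 1.0 - q[i]                       # nondetection at i
        p /= p.sum()
    return None, elapsed

rng = np.random.default_rng(7)
print(index_search([0.5, 0.3, 0.2], [0.8, 0.6, 0.9], [1.0, 2.0, 1.5],
                   rng, true_loc=1))
```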

    Utilizing Dual Information for Moving Target Search Trajectory Optimization

    Various recent events have shown the enormous importance of maritime search-and-rescue missions. By reducing the time to find floating victims at sea, the number of casualties can be reduced. A major improvement can be achieved by employing autonomous aerial systems for search missions, made possible by recent technological developments. In this context, the need for efficient search trajectory planning methods arises. The objective is to maximize the probability of detecting the target at a certain time k, which depends on the estimate of the target's position. For a stationary target, this is a function of the observation at time k; when the target moves, it is a function of all previous observations up to time k. This is the main difficulty in solving moving-target search problems as the duration of the search mission increases. We present an intermediate result for the single-searcher, single-target case towards an efficient algorithm for longer missions with multiple aerial vehicles. Our primary aim in the development of this algorithm is to decouple the networks of the target and the platform, which we achieve by applying Benders decomposition. Consequently, we solve two much smaller problems sequentially in iterations, exchanging primal and dual information between them. To the best of our knowledge, this is the first approach utilizing dual information within the category of moving-target search problems. We show the applicability in computational experiments and provide an analysis of the results. Furthermore, we propose well-founded improvements for further research towards solving real-life instances with multiple searchers.
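
    To see why the moving-target objective couples all time steps, the following sketch evaluates, for a fixed searcher trajectory, the probability that the target is first detected at each time k: the belief at time k is conditioned on every earlier nondetection. This illustrates the objective only, not the paper's Benders decomposition, and all names and parameters are assumptions.

```python
import numpy as np

def first_detection_profile(b0, P, g, trajectory):
    """For a fixed searcher trajectory (one cell index per time step),
    return the probability that the target is first detected at each k.

    b0: initial target-location distribution over cells
    P:  target motion model, P[i, j] = Pr(target moves from i to j)
    g:  g[c] = Pr(detection | target is in searched cell c)
    """
    b = np.asarray(b0, dtype=float).copy()   # belief given no detection yet
    survive = 1.0                            # Pr(no detection before time k)
    profile = []
    for c in trajectory:
        b = b @ P                            # target moves
        p_k = b[c] * g[c]                    # detection chance at time k
        profile.append(survive * p_k)        # unconditional first detection
        survive *= 1.0 - p_k
        b[c] *= 1.0 - g[c]                   # condition belief on the miss
        b /= b.sum()
    return np.array(profile)

P = np.array([[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]])
g = np.array([0.6, 0.6, 0.6])
print(first_detection_profile([1/3, 1/3, 1/3], P, g, trajectory=[0, 1, 2, 1]))
```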

    Coverage & cooperation: Completing complex tasks as quickly as possible using teams of robots

    As the robotics industry grows and robots enter our homes and public spaces, they are increasingly expected to work in cooperation with each other. My thesis focuses on multirobot planning, specifically in the context of coverage robots, such as robotic lawnmowers and vacuum cleaners. Two problems unique to multirobot teams are task allocation and search. I present a task allocation algorithm which balances the workload amongst all robots in the team with the objective of minimizing the overall mission time. I also present a search algorithm which robots can use to find lost teammates; it uses a probabilistic belief of a target robot's position to create a planning tree and then searches by following the best path in the tree. For robust multirobot coverage, I use both the task allocation and search algorithms. First, the coverage region is divided into a set of small coverage tasks which minimize the number of turns the robots will need to take. These tasks are then allocated to individual robots. During the mission, robots replan with nearby robots to rebalance the workload, and once a robot has finished its tasks, it searches for teammates to help them finish their tasks faster.
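
    As one concrete illustration of the workload-balancing objective (not necessarily the thesis's algorithm), the sketch below uses the classic longest-processing-time greedy rule: assign each coverage task, largest first, to the currently least-loaded robot, approximately minimizing the overall mission time.

```python
import heapq

def balance_tasks(task_costs, n_robots):
    """Longest-processing-time greedy: assign each task, largest first,
    to the currently least-loaded robot, approximately minimizing the
    makespan (the overall mission time)."""
    heap = [(0.0, r, []) for r in range(n_robots)]   # (load, robot, tasks)
    heapq.heapify(heap)
    for task, cost in sorted(enumerate(task_costs), key=lambda x: -x[1]):
        load, r, assigned = heapq.heappop(heap)      # least-loaded robot
        assigned.append(task)
        heapq.heappush(heap, (load + cost, r, assigned))
    return sorted(heap, key=lambda x: x[1])

for load, robot, tasks in balance_tasks([4.0, 7.0, 2.0, 5.0, 3.0, 6.0], 2):
    print(f"robot {robot}: tasks {tasks}, workload {load}")
```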

    Multi-Agent Search for a Moving and Camouflaging Target

    In multi-agent search planning for a randomly moving and camouflaging target, we examine heterogeneous searchers that differ in terms of their endurance level, travel speed, and detection ability. This leads to a convex mixed-integer nonlinear program, which we reformulate using three linearization techniques. We develop preprocessing steps, outer approximations via lazy constraints, and bundle-based cutting plane methods to address large-scale instances. Further specializations emerge when the target moves according to a Markov chain. We carry out an extensive numerical study to show the computational efficiency of our methods and to derive insights regarding which approach should be favored for which type of problem instance.
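
    A one-dimensional sketch of the cutting-plane/outer-approximation idea mentioned above, under an illustrative convex objective f(x) = exp(-x) + 0.1x (not the paper's model): tangent cuts underestimate a convex function, so minimizing the maximum of the accumulated cuts yields a lower bound that tightens with each iteration.

```python
import numpy as np

f  = lambda x: np.exp(-x) + 0.1 * x      # illustrative convex objective
df = lambda x: -np.exp(-x) + 0.1         # its derivative

grid = np.linspace(0.0, 10.0, 2001)      # master problem solved on a grid
cuts = []                                # cut at xk: f(xk) + f'(xk)*(x - xk)
x_k = 0.0
for it in range(8):
    cuts.append((f(x_k), df(x_k), x_k))
    # outer approximation: the max over tangent cuts underestimates f
    model = np.max([fv + dv * (grid - xv) for fv, dv, xv in cuts], axis=0)
    i = int(np.argmin(model))
    x_k = grid[i]
    print(f"iter {it}: lower bound {model[i]:.4f}, f(incumbent) {f(x_k):.4f}")
```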

    Bounding an Optimal Search Path with a Game of Cop and Robber on Graphs

    In search theory, the goal of the Optimal Search Path (OSP) problem is to find a finite-length path maximizing the probability that a searcher detects a lost wanderer on a graph. We propose to bound the probability of finding the wanderer in the remaining search time by relaxing the problem into a stochastic game of cop and robber from graph theory. We discuss the validity of this bound and demonstrate its effectiveness on a constraint programming model of the problem. Experimental results show how our novel bound compares favorably to the DMEAN bound from the literature, a state-of-the-art bound based on a relaxation of the OSP into a longest path problem.
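
    For intuition about the game used in the relaxation, here is a minimal sketch of the deterministic cop-and-robber game on a small illustrative graph (the paper's relaxation is a stochastic variant): fixed-point iteration over (cop, robber) positions yields the number of moves in which the cop can guarantee capture.

```python
import itertools

# Small illustrative graph; adjacency lists include staying in place
adj = {0: [0, 1], 1: [0, 1, 2, 3], 2: [1, 2, 3], 3: [1, 2, 3]}
INF = float("inf")

# V[c, r] = cop moves needed to guarantee capture; the cop (at c) moves first
V = {(c, r): 0 if c == r else INF for c in adj for r in adj}

changed = True
while changed:                           # iterate to the game's fixed point
    changed = False
    for c, r in itertools.product(adj, adj):
        if c == r:
            continue
        best = min(
            1 if c2 == r else
            1 + max(V[c2, r2] for r2 in adj[r] if r2 != c2)
            for c2 in adj[c]
        )
        if best < V[c, r]:
            V[c, r] = best
            changed = True

print(V[0, 3])   # capture time with cop starting at node 0, robber at node 3
```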

    Path Optimization for the Resource-Constrained Searcher

    Naval Research Logistics. We formulate and solve a discrete-time path-optimization problem where a single searcher, operating in a discretized 3-dimensional airspace, looks for a moving target in a finite set of cells. The searcher is constrained by maximum limits on the consumption of several resources such as time, fuel, and risk along any path. We develop a specialized branch-and-bound algorithm for this problem that utilizes several network reduction procedures as well as a new bounding technique based on Lagrangian relaxation and network expansion. The resulting algorithm outperforms a state-of-the-art algorithm for solving time-constrained problems and is also the first algorithm to solve multi-constrained problems.
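
    A toy sketch of the Lagrangian-relaxation bounding idea on a resource-constrained shortest-path instance (illustrative, not the paper's search formulation): dualize the resource budget with a multiplier lambda, solve the resulting unconstrained shortest-path problem, and update lambda by subgradient steps. Every lambda >= 0 yields a valid lower bound, even though the relaxed path may violate the budget; such bounds are what a branch-and-bound scheme would consume.

```python
# Toy DAG: edge -> (cost, resource); a path is feasible if its total
# resource consumption stays within the budget.
edges = {(0, 1): (1.0, 4.0), (0, 2): (3.0, 1.0), (1, 2): (0.5, 1.0),
         (1, 3): (1.0, 4.0), (2, 3): (1.0, 1.0)}
nodes, source, sink, budget = [0, 1, 2, 3], 0, 3, 5.0

def shortest_path(weight):
    """Bellman-Ford from source to sink, returning (value, path)."""
    dist = {n: (float("inf"), None) for n in nodes}
    dist[source] = (0.0, [source])
    for _ in nodes:
        for (u, v), w in weight.items():
            if dist[u][0] + w < dist[v][0]:
                dist[v] = (dist[u][0] + w, dist[u][1] + [v])
    return dist[sink]

lam, step = 0.0, 0.25
for it in range(25):
    w = {e: c + lam * r for e, (c, r) in edges.items()}  # dualized weights
    val, path = shortest_path(w)
    bound = val - lam * budget        # valid lower bound for every lam >= 0
    used = sum(edges[u, v][1] for u, v in zip(path, path[1:]))
    lam = max(0.0, lam + step * (used - budget))         # subgradient step
print(f"Lagrangian bound {bound:.3f} with path {path}, resource used {used}")
```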

    Convergence of a Reinforcement Learning Algorithm in Continuous Domains

    In the field of reinforcement learning, Markov decision processes with a finite number of states and actions have been well studied, and there exist algorithms capable of producing a sequence of policies which converge to an optimal policy with probability one. Convergence guarantees for problems with continuous states also exist. Until recently, however, no online algorithm for continuous states and continuous actions had been proven to produce optimal policies. This dissertation contains the results of research into reinforcement learning algorithms for problems in which both the state and action spaces are continuous. The problems to be solved are introduced formally as Markov decision processes, along with a value-function solution method known as Q-learning. The primary result of this dissertation is a Q-learning type algorithm adapted for continuous states and actions, and a proof that it asymptotically learns an optimal policy with probability one. While the algorithm is intended to advance the theory of continuous-domain reinforcement learning, an example is given to show that, with appropriate exploration policies, it can produce satisfactory solutions to non-trivial benchmark problems.

    Kernel regression based algorithms have excellent theoretical properties, but have high computational cost and do not adapt well to high-dimensional problems. A class of batch-mode regression-tree-based algorithms is therefore introduced. These algorithms are modular in the sense that different methods for partitioning, performing local regression, and choosing representative actions can be chosen. Experiments demonstrate superior performance over kernel methods.

    Batch algorithms possess superior computational efficiency, but pay the price of not being able to use past observations to inform exploration. A data structure useful for limited learning during the exploration phase is introduced, and it is demonstrated that this limited learning can outperform batch algorithms using totally random action exploration.
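
    For readers new to the setting, this is a minimal sketch of the standard tabular Q-learning update the dissertation builds on, applied to a toy continuous one-dimensional problem via naive discretization; it illustrates the baseline, not the dissertation's continuous-domain algorithm, and every parameter here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous problem: state s in [-1, 1]; an action a in [-0.2, 0.2]
# shifts the state; reward is highest near s = 0. Discretize both spaces.
S_BINS = np.linspace(-1.0, 1.0, 21)
A_BINS = np.linspace(-0.2, 0.2, 9)
Q = np.zeros((len(S_BINS), len(A_BINS)))
alpha, gamma, eps = 0.1, 0.95, 0.2

def nearest(grid, x):
    return int(np.argmin(np.abs(grid - x)))

s = rng.uniform(-1.0, 1.0)
for _ in range(20000):
    si = nearest(S_BINS, s)
    ai = (rng.integers(len(A_BINS)) if rng.random() < eps
          else int(np.argmax(Q[si])))                 # epsilon-greedy
    s2 = float(np.clip(s + A_BINS[ai] + rng.normal(0.0, 0.01), -1.0, 1.0))
    r = -abs(s2)                                      # reward: stay near 0
    # Q-learning update: bootstrap from the greedy value of the next state
    Q[si, ai] += alpha * (r + gamma * Q[nearest(S_BINS, s2)].max() - Q[si, ai])
    s = s2

print("greedy action at s = 0.5:",
      A_BINS[int(np.argmax(Q[nearest(S_BINS, 0.5)]))])
```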

    Essays in learning, optimization and game theory
