158,504 research outputs found

    Planning with imperfect information : interceptor assignment

    Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2006. Includes bibliographical references (p. 121-123). We consider the problem of assigning a scarce number of interceptors to a wave of incoming atmospheric re-entry vehicles (RVs). In this single wave, there is time to assign interceptors to the incoming RVs, gain information on the intercept status, and then, if necessary, assign interceptors once more. However, the status information on these RVs may not be reliable. This problem becomes challenging when considering the small inventory of interceptors, imperfect information from sensors, and the possibility of future waves of RVs. This work formulates the problem as a partially observable Markov decision process (POMDP) in order to account for the uncertainty in information. We use a POMDP solution algorithm to find an optimal policy for assigning interceptors to RVs in a single wave. From there, three cases are compared in a simulation of a single wave: perfect information from sensors; imperfect information from sensors, but acting as if it were perfect; and accounting for imperfect information from sensors using the POMDP formulation. Using a variety of parameter variation tests, we examine the performance of the POMDP formulation by comparing the probability of an incoming RV avoiding intercept and the interceptor inventory remaining. We vary the reliability of the sensors, the number of interceptors in inventory, and the number of incoming RVs in the wave. The POMDP formulation consistently provides a policy that conserves more interceptors and approaches the probability of intercept of the other cases. However, situations do exist where the POMDP formulation produces a policy that performs less effectively than a strategy assuming perfect information. by Daniel B. McAllister. S.M.

    Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4

    Unlike perfect information games, where all elements are known to every player, imperfect information games emulate the real-world complexities of decision-making under uncertain or incomplete information. GPT-4, the recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper delves into the applicability of GPT-4's learned knowledge for imperfect information games. To achieve this, we introduce Suspicion-Agent, an innovative agent that leverages GPT-4's capabilities for performing in imperfect information games. With proper prompt engineering to achieve different functions, Suspicion-Agent based on GPT-4 demonstrates remarkable adaptability across a range of imperfect information card games. Importantly, GPT-4 displays a strong high-order theory of mind (ToM) capacity, meaning it can understand others and intentionally impact others' behavior. Leveraging this, we design a planning strategy that enables GPT-4 to competently play against different opponents, adapting its gameplay style as needed, while requiring only the game rules and descriptions of observations as input. In the experiments, we qualitatively showcase the capabilities of Suspicion-Agent across three different imperfect information games and then quantitatively evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can potentially outperform traditional algorithms designed for imperfect information games, without any specialized training or examples. In order to encourage and foster deeper insights within the community, we make our game-related data publicly available.

    Sampling-based reactive motion planning with temporal logic constraints and imperfect state information

    © 2017, Springer International Publishing AG. This paper presents a method that allows mobile systems with uncertainty in motion and sensing to react to unknown environments while high-level specifications are satisfied. Although previous works have addressed the problem of synthesising controllers under uncertainty constraints and temporal logic specifications, reaction to dynamic environments has not been considered under this scenario. The method uses feedback-based information roadmaps (FIRMs) to break the curse of history associated with partially observable systems. A transition system is incrementally constructed based on the idea of FIRMs by adding nodes on the belief space. Then, a policy is found in the product Markov decision process created between the transition system and a Rabin automaton representing a linear temporal logic formula. The proposed solution allows the system to react to previously unknown elements in the environment. To achieve fast reaction time, a FIRM considering the probability of violating the specification in each transition is used to drive the system towards local targets or to avoid obstacles. The method is demonstrated with an illustrative example

    Safe Sequential Path Planning Under Disturbances and Imperfect Information

    Multi-UAV systems are safety-critical, and guarantees must be made to ensure no unsafe configurations occur. Hamilton-Jacobi (HJ) reachability is ideal for analyzing such safety-critical systems; however, its direct application is limited to small-scale systems of no more than two vehicles due to an exponentially-scaling computational complexity. Previously, the sequential path planning (SPP) method, which assigns strict priorities to vehicles, was proposed; SPP allows multi-vehicle path planning to be done with a linearly-scaling computational complexity. However, the previous formulation assumed that there are no disturbances, and that every vehicle has perfect knowledge of higher-priority vehicles' positions. In this paper, we make SPP more practical by providing three different methods to account for disturbances in dynamics and imperfect knowledge of higher-priority vehicles' states. Each method has different assumptions about information sharing. We demonstrate our proposed methods in simulations. Comment: American Control Conference, 201

    Provably Safe Robot Navigation with Obstacle Uncertainty

    As drones and autonomous cars become more widespread, it is becoming increasingly important that robots can operate safely under realistic conditions. The noisy information fed into real systems means that robots must use estimates of the environment to plan navigation. Efficiently guaranteeing that the resulting motion plans are safe under these circumstances has proved difficult. We examine how to guarantee that a trajectory or policy is safe with only imperfect observations of the environment. We examine the implications of various mathematical formalisms of safety and arrive at a mathematical notion of safety of a long-term execution, even when conditioned on observational information. We present efficient algorithms that can prove that trajectories or policies are safe with much tighter bounds than in previous work. Notably, the complexity of the environment does not affect our methods' ability to evaluate if a trajectory or policy is safe. We then use these safety checking methods to design a safe variant of the RRT planning algorithm. Comment: RSS 201

    The Update Equivalence Framework for Decision-Time Planning

    The process of revising (or constructing) a policy immediately prior to execution -- known as decision-time planning -- is key to achieving superhuman performance in perfect-information settings like chess and Go. A recent line of work has extended decision-time planning to more general imperfect-information settings, leading to superhuman performance in poker. However, these methods require considering subgames whose sizes grow quickly in the amount of non-public information, making them unhelpful when the amount of non-public information is large. Motivated by this issue, we introduce an alternative framework for decision-time planning that is not based on subgames but rather on the notion of update equivalence. In this framework, decision-time planning algorithms simulate updates of synchronous learning algorithms. This framework enables us to introduce a new family of principled decision-time planning algorithms that do not rely on public information, opening the door to sound and effective decision-time planning in settings with large amounts of non-public information. In experiments, members of this family produce comparable or superior results compared to state-of-the-art approaches in Hanabi and improve performance in 3x3 Abrupt Dark Hex and Phantom Tic-Tac-Toe.

    The Specification of the Planning Systems; Report of the 1993 Questionnaire Survey

    This report describes a survey carried out as part of a research project undertaken by the Institute for Transport Studies and the School of Computer Studies at the University of Leeds, funded by the Science and Engineering Research Council. The project was concerned with the specification of trip planning systems: systems that provide information to travellers and potential travellers about all aspects of their journey, in this case principally route and timetable information for public transport users and route information for car travellers before the journey is made. Previous evidence had suggested that travellers may make sub-optimal travel decisions, meaning that they may make longer, slower or more expensive journeys than necessary because of imperfect information. Other parts of the project addressed sub-optimality in the choice of mode and time of travel, but a main objective of the survey described in this report was to examine sub-optimality of route choice separately for journeys with which respondents were familiar and journeys with which they were unfamiliar. Other objectives were concerned with the travel information currently used, or desired by, the respondents, who were randomly-selected travellers from the West Yorkshire town of Mirfield. Maps were widely used by car drivers - about one in five used them for familiar trips and about three quarters used them for unfamiliar trips. For public transport trips, timetables were used by about half of the travellers making familiar trips and 95 per cent of those making unfamiliar trips. Information on delays would have been welcomed by both private and public transport travellers: travellers on nearly three-quarters of familiar and unfamiliar car trips would have liked congestion information, as would a significant minority of bus users. Most of all, public transport users would have welcomed information on service delays and cancellations.
    It would seem that real-time information on delays would be a key feature of a successful trip planning system. The sub-optimality for car journeys averaged 2.6 minutes per trip for familiar trips and 6 minutes per trip for unfamiliar trips. Sub-optimality was directly related to trip distance for familiar trips but not for unfamiliar trips. This indicates that a modest but significant reduction in car journey times could be brought about by trip planning systems. Public transport trip sample sizes were too small to permit reliable estimates to be made of their sub-optimality.