Spacecraft Autonomous Decision-Planning for Collision Avoidance: a Reinforcement Learning Approach
The space environment around the Earth is becoming increasingly populated by
both active spacecraft and space debris. To avoid potential collision events,
significant improvements in Space Situational Awareness (SSA) activities and
Collision Avoidance (CA) technologies are allowing the tracking and maneuvering
of spacecraft with increasing accuracy and reliability. However, these
procedures still largely involve a high level of human intervention to make the
necessary decisions. For an increasingly complex space environment, this
decision-making strategy is not likely to be sustainable. Therefore, it is
important to successfully introduce higher levels of automation for key Space
Traffic Management (STM) processes to ensure the level of reliability needed
for navigating a large number of spacecraft. These processes range from
collision risk detection to the identification of the appropriate action to
take and the execution of avoidance maneuvers. This work proposes an
implementation of autonomous CA decision-making capabilities on spacecraft
based on Reinforcement Learning (RL) techniques. A novel methodology based on a
Partially Observable Markov Decision Process (POMDP) framework is developed to
train the Artificial Intelligence (AI) system on board the spacecraft,
considering epistemic and aleatory uncertainties. The proposed framework
considers imperfect monitoring information about the status of the debris in
orbit and allows the AI system to effectively learn stochastic policies to
perform accurate Collision Avoidance Maneuvers (CAMs). The objective is to
successfully delegate the decision-making process for autonomously implementing
a CAM to the spacecraft without human intervention. This approach would allow
for a faster response in the decision-making process and for highly
decentralized operations.
Comment: Preprint accepted in the 74th International Astronautical Congress
(IAC) - Baku, Azerbaijan, 2-6 October 202
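The abstract above frames collision avoidance as a POMDP in which the spacecraft only receives imperfect information about the debris. As a rough illustration of what acting under partial observability involves, the sketch below performs a standard discrete Bayesian belief update. It is not the paper's method: the two hidden states, the two actions, and all probabilities in `T` and `O` are hypothetical values chosen only for the example.

```python
def belief_update(belief, action, observation, T, O):
    """Discrete POMDP belief update.

    b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s).
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n)) for s2 in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical toy model (all numbers are assumptions, not from the paper).
NO_THREAT, THREAT = 0, 1          # hidden states
WAIT, MANEUVER = 0, 1             # actions
QUIET, ALARM = 0, 1               # observations from imperfect tracking

T = {
    WAIT: [[1.0, 0.0], [0.0, 1.0]],      # waiting leaves the state unchanged
    MANEUVER: [[1.0, 0.0], [1.0, 0.0]],  # a maneuver always clears the threat
}
SENSOR = [[0.8, 0.2], [0.3, 0.7]]        # P(observation | state)
O = {WAIT: SENSOR, MANEUVER: SENSOR}

prior = [0.9, 0.1]
posterior = belief_update(prior, WAIT, ALARM, T, O)
print(posterior)  # approximately [0.72, 0.28]
```

A single alarm raises the believed threat probability from 0.10 to about 0.28 here; a learned policy would then map such beliefs (or observation histories) to maneuver decisions.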
Hazard Avoidance Alerting With Markov Decision Processes
This thesis describes an approach to designing hazard avoidance alerting systems based on a
Markov decision process (MDP) model of the alerting process, and shows its benefits over
standard design methods. One benefit of the MDP method is that it accounts for future decision
opportunities when choosing whether or not to alert, or in determining resolution guidance.
Another benefit is that it provides a means of modeling uncertain state information, such as
knowledge about unmeasurable mode variables, so that decisions are more informed.
A mode variable is an index for distinct types of behavior that a system exhibits at different
times. For example, in many situations normal system behavior is safe, but rare deviations from
the normal increase the likelihood of a harmful incident. Accurate modeling of mode
information is needed to minimize alerting system errors such as unnecessary or late alerts.
The benefits of the method are illustrated with two alerting scenarios where a pair of aircraft
must avoid collisions when passing one another. The first scenario has a fully observable state
and the second includes an uncertain mode describing whether an intruder aircraft levels off
safely above the evader or is in a hazardous blunder mode.
In MDP theory, outcome preferences are described in terms of utilities of different state
trajectories. In keeping with this, alerting system requirements are stated in the form of a reward
function. This is then used with probabilistic dynamic and sensor models to compute an alerting
logic (policy) that maximizes expected utility. Performance comparisons are made between the
MDP-based logics and alternate logics generated with current methods. It is found that, in terms
of traditional performance measures (incident rate and unnecessary alert rate), the performance
of the MDP-based logic can meet or exceed that of alternate logics.
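The pipeline described above (a reward function plus a probabilistic dynamic model, solved for a policy that maximizes expected utility) can be sketched with plain value iteration on a fully observable toy problem. This is only an illustration under stated assumptions, not the thesis's aircraft model: the four states, the transition probabilities, the alert cost of 5, and the incident cost of 100 are all hypothetical.

```python
NOMINAL, BLUNDER, INCIDENT, RESOLVED = range(4)  # INCIDENT/RESOLVED are terminal
NO_ALERT, ALERT = 0, 1

# MODEL[state][action] -> list of (probability, next_state, reward).
# All numbers are assumptions made up for this example.
MODEL = {
    NOMINAL: {
        NO_ALERT: [(0.99, NOMINAL, 0.0), (0.01, BLUNDER, 0.0)],
        ALERT: [(1.0, RESOLVED, -5.0)],            # unnecessary-alert cost
    },
    BLUNDER: {
        NO_ALERT: [(0.5, BLUNDER, 0.0), (0.5, INCIDENT, -100.0)],
        ALERT: [(0.9, RESOLVED, -5.0), (0.1, INCIDENT, -105.0)],
    },
}


def q_value(outcomes, V, gamma):
    """Expected discounted return of one action given value estimates V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)


def value_iteration(model, n_states, gamma=0.95, tol=1e-9):
    """Return (V, policy) maximizing expected discounted reward."""
    V = [0.0] * n_states  # terminal states keep value 0
    while True:
        delta = 0.0
        for s, actions in model.items():
            best = max(q_value(o, V, gamma) for o in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], V, gamma))
        for s, actions in model.items()
    }
    return V, policy


V, policy = value_iteration(MODEL, n_states=4)
print(policy)  # alerting logic: stay silent when nominal, alert on a blunder
```

Because the value of staying silent in the nominal state already includes the discounted chance of a later blunder and a later alert, the computed logic accounts for future decision opportunities rather than alerting at the first sign of risk.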
Learning obstacle avoidance with an operant behavioral model
Artificial intelligence researchers have been attracted by the idea of having robots learn how to accomplish a task, rather than being told explicitly. Reinforcement learning has been proposed as an appealing framework for controlling mobile agents. Robot learning research and research on biological systems face many similar problems in achieving high flexibility across a variety of tasks. In this work, the control of a vehicle in an avoidance task by a previously developed operant learning model (a form of animal learning) is studied. An environment is simulated in which a mobile robot with proximity sensors has to minimize the punishment for colliding with obstacles. The results were compared with the Q-Learning algorithm, and the proposed model performed better. In this way, a new artificial intelligence agent inspired by neurobiology, psychology, and ethology research is proposed.
Affiliation: Gutnisky, D. A. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Ingeniería Biomédica; Argentina
Affiliation: Zanutto, Bonifacio Silvano. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Biología y Medicina Experimental. Fundación de Instituto de Biología y Medicina Experimental. Instituto de Biología y Medicina Experimental; Argentina. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Ingeniería Biomédica; Argentina
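For readers unfamiliar with the Q-Learning baseline mentioned in this abstract, the following is a minimal tabular sketch on a toy avoidance task. It does not reproduce the paper's simulated robot or its operant model: the five-cell corridor, the punishment of -1 for hitting a wall, and every hyperparameter are assumptions chosen for illustration.

```python
import random

N_CELLS = 5            # hypothetical corridor with a wall beyond each end
LEFT, RIGHT = 0, 1


def step(state, action):
    """Move one cell; bumping into a wall is punished and leaves the robot in place."""
    target = state - 1 if action == LEFT else state + 1
    if target < 0 or target >= N_CELLS:
        return state, -1.0
    return target, 0.0


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_CELLS)]
    for _ in range(episodes):
        state = rng.randrange(N_CELLS)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(2)                       # explore
            else:
                action = max((LEFT, RIGHT), key=lambda a: Q[state][a])  # exploit
            nxt, reward = step(state, action)
            # Q-learning update: bootstrap from the best action in the next state.
            td_target = reward + gamma * max(Q[nxt])
            Q[state][action] += alpha * (td_target - Q[state][action])
            state = nxt
    return Q


Q = train()
for cell, (q_left, q_right) in enumerate(Q):
    print(cell, round(q_left, 3), round(q_right, 3))
```

After training, the action that drives into a wall carries a negative value at each end of the corridor, so the greedy policy steers away from both walls.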
Verification of Uncertain POMDPs Using Barrier Certificates
We consider a class of partially observable Markov decision processes
(POMDPs) with uncertain transition and/or observation probabilities. The
uncertainty takes the form of probability intervals. Such uncertain POMDPs can
be used, for example, to model autonomous agents with sensors with limited
accuracy, or agents undergoing a sudden component failure, or structural damage
[1]. Given an uncertain POMDP representation of the autonomous agent, our goal
is to propose a method for checking whether the system attains optimal
performance without violating a safety requirement (e.g., on fuel level,
velocity, etc.). To this end, we cast the POMDP problem into a switched
system scenario. We then take advantage of this switched system
characterization and propose a method based on barrier certificates for
optimality and/or safety verification. We then show that the verification task
can be carried out computationally by sum-of-squares programming. We illustrate
the efficacy of our method by applying it to a Mars rover exploration example.
Comment: 8 pages, 4 figures
Decentralized Motion Planning with Collision Avoidance for a Team of UAVs under High Level Goals
This paper addresses the motion planning problem for a team of aerial agents
under high level goals. We propose a hybrid control strategy that guarantees
the accomplishment of each agent's local goal specification, which is given as
a temporal logic formula, while guaranteeing inter-agent collision avoidance.
In particular, by defining 3-D spheres that bound the agents' volume, we extend
previous work on decentralized navigation functions and propose control laws
that navigate the agents among predefined regions of interest of the workspace
while avoiding collision with each other. This allows us to abstract the motion
of the agents as finite transition systems and, by employing standard formal
verification techniques, to derive a high-level control algorithm that
satisfies the agents' specifications. Simulation and experimental results with
quadrotors verify the validity of the proposed method.
Comment: Submitted to the IEEE International Conference on Robotics and
Automation (ICRA), Singapore, 201
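The abstract above bounds each agent's volume with a 3-D sphere. A minimal sketch of the geometric test this implies is shown below: two agents are in conflict when the distance between their centers does not exceed the sum of their bounding radii. This is only the collision check, not the paper's navigation functions or control laws, and the positions and radii are hypothetical.

```python
import math
from itertools import combinations


def spheres_collide(center_a, radius_a, center_b, radius_b):
    """True if two bounding spheres touch or overlap."""
    return math.dist(center_a, center_b) <= radius_a + radius_b


def colliding_pairs(agents):
    """agents: list of (center, radius); returns index pairs in conflict."""
    return [
        (i, j)
        for (i, (ca, ra)), (j, (cb, rb)) in combinations(enumerate(agents), 2)
        if spheres_collide(ca, ra, cb, rb)
    ]


# Hypothetical team of three UAVs: (center in meters, bounding radius in meters).
team = [
    ((0.0, 0.0, 0.0), 0.5),
    ((0.8, 0.0, 0.0), 0.5),
    ((5.0, 5.0, 5.0), 0.5),
]
print(colliding_pairs(team))  # [(0, 1)]
```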