170 research outputs found

    Neural Network Algorithm for Intercepting Targets Moving Along Known Trajectories by a Dubins' Car

    The task of intercepting a target moving along a rectilinear or circular trajectory by a Dubins' car is formulated as a time-optimal control problem with an arbitrary direction of the car's velocity at the interception moment. To solve this problem and to synthesize interception trajectories, neural network methods based on the Deep Deterministic Policy Gradient reinforcement learning algorithm are used. The resulting control laws and interception trajectories are analyzed and compared with analytical solutions of the interception problem. Mathematical modeling is carried out for target-motion parameters that the neural network did not see during training, and model experiments are conducted to test the stability of the neural solution. The effectiveness of using neural network methods to synthesize interception trajectories for the given classes of target motion is shown.
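
    As a rough illustration of the setup, the sketch below models the Dubins-car kinematics and a target moving along a known circular trajectory; the step size, speeds, capture radius, and turn-rate bound are assumptions of this sketch rather than values from the paper. A DDPG actor would map the relative target state to the bounded turn rate, with a per-step penalty so that minimising accumulated cost approximates time-optimal interception with a free terminal heading.

```python
import numpy as np

# Minimal sketch of the Dubins-car kinematics and a circular target trajectory.
# DT, V_CAR, U_MAX, and CAPTURE_R are illustrative assumptions, not the paper's values.
DT, V_CAR, U_MAX, CAPTURE_R = 0.05, 1.0, 1.0, 0.1

def step_car(state, u):
    """Advance the car one step; the bounded turn rate u is the only control."""
    x, y, theta = state
    u = float(np.clip(u, -U_MAX, U_MAX))
    return np.array([x + DT * V_CAR * np.cos(theta),
                     y + DT * V_CAR * np.sin(theta),
                     theta + DT * u])

def target_position(t, radius=2.0, omega=0.3):
    """Target moving along a known circular trajectory (the rectilinear case is analogous)."""
    return np.array([radius * np.cos(omega * t), radius * np.sin(omega * t)])

def intercepted(car_state, target_xy):
    """Capture: the car reaches the target's position regardless of its heading."""
    return np.hypot(car_state[0] - target_xy[0], car_state[1] - target_xy[1]) < CAPTURE_R
```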

    Two-stage pursuit strategy for incomplete-information impulsive space pursuit-evasion mission using reinforcement learning

    This paper presents a novel and robust two-stage pursuit strategy for incomplete-information impulsive space pursuit-evasion missions considering the J2 perturbation. The strategy first divides the impulsive pursuit-evasion game into a far-distance rendezvous stage and a close-distance game stage according to the perception range of the evader. The far-distance rendezvous stage is transformed into a rendezvous trajectory optimization problem, and a new objective function is proposed to obtain the pursuit trajectory with the optimal terminal pursuit capability. For the close-distance game stage, a closed-loop pursuit approach is proposed that uses a reinforcement learning algorithm, the deep deterministic policy gradient (DDPG) algorithm, to solve and update the pursuit trajectory under incomplete information. The feasibility of this strategy and its robustness to different initial states of the pursuer and evader and to different evasion strategies are demonstrated for sun-synchronous-orbit pursuit-evasion game scenarios. Monte Carlo tests show that the successful pursuit ratio of the proposed method is over 91% for all the given scenarios.
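
    A minimal sketch of how the close-distance game stage can be posed for a DDPG-style learner is given below: the pursuer's action is a bounded delta-v impulse, and the reward trades range closure against fuel. The double-integrator relative dynamics stand in for the J2-perturbed orbital dynamics used in the paper, and all constants are assumptions of this sketch.

```python
import numpy as np

# Sketch of the close-distance game stage as an RL environment. The simple
# double-integrator propagation below is a stand-in for the J2-perturbed relative
# orbital dynamics; DT, DV_MAX, and the capture radius are assumptions.
DT, DV_MAX = 10.0, 0.5          # step length [s], max impulse per axis [m/s]

class ImpulsivePursuitEnv:
    def reset(self):
        self.rel_pos = np.random.uniform(-1e3, 1e3, size=3)   # pursuer-evader offset [m]
        self.rel_vel = np.zeros(3)
        return np.concatenate([self.rel_pos, self.rel_vel])

    def step(self, delta_v):
        dv = np.clip(delta_v, -DV_MAX, DV_MAX)                 # impulsive manoeuvre
        self.rel_vel += dv
        self.rel_pos += self.rel_vel * DT                      # simplified propagation
        dist = np.linalg.norm(self.rel_pos)
        reward = -dist / 1e3 - 0.1 * np.linalg.norm(dv)        # closure vs. fuel trade-off
        done = dist < 50.0                                     # assumed capture radius [m]
        return np.concatenate([self.rel_pos, self.rel_vel]), reward, done
```

    A DDPG actor-critic pair would be trained on transitions from this interface, while the far-distance stage would instead be handled by the rendezvous trajectory optimization described above.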

    Towards Trust and Transparency in Deep Learning Systems through Behavior Introspection & Online Competency Prediction

    Deep neural networks are naturally “black boxes”, offering little insight into how or why they make decisions. These limitations diminish the likelihood that such systems will be adopted for important tasks or as trusted teammates. We employ introspective techniques to abstract machine activation patterns into human-interpretable strategies and to identify relationships between environmental conditions (why), strategies (how), and performance (result) on two applications: a deep reinforcement learning two-dimensional pursuit game and an image-based deep supervised learning obstacle recognition task. Pursuit-evasion games have been studied for decades under perfect information with analytically derived policies for static environments. We incorporate uncertainty in a target’s position via simulated measurements and demonstrate a novel continuous deep reinforcement learning approach against speed-advantaged targets. The resulting approach was tested under many scenarios, and its performance exceeded that of a baseline course-aligned strategy. We manually observed separation of learned pursuit behaviors into strategy groups and manually hypothesized environmental conditions that affected performance. These manual observations motivated automation and abstraction of the relationships between conditions, performance, and strategies. Next, we found that deep network activation patterns could be abstracted into human-interpretable strategies for two separate deep learning approaches. We characterized machine commitment through a novel measure and revealed significant correlations between machine commitment, strategies, environmental conditions, and task performance. As such, we motivated online exploitation of machine behavior estimation for competency-aware intelligent systems. Finally, we realized online prediction capabilities for conditions, strategies, and performance. Our competency-aware machine learning approach is easily portable to new applications due to its Bayesian nonparametric foundation, wherein all inputs are transformed into the same compact data representation. In particular, image data is transformed into a probability distribution over features extracted from the data. The resulting transformation forms a common representation for comparing two images, possibly from different types of sensors. By uncovering relationships between environmental conditions (why), machine strategies (how), and performance (result), and by giving rise to online estimation of machine competency, we increase transparency and trust in machine learning systems, contributing to the overarching explainable artificial intelligence initiative.
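
    The sketch below illustrates one way the activation-pattern abstraction could look in practice, assuming hidden-layer activations have been logged per decision: a Dirichlet-process-style mixture clusters them into strategy groups whose number is inferred from the data, matching the Bayesian nonparametric flavour described above. The file name and component cap are hypothetical, and this is not the authors' exact pipeline.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Cluster logged hidden-layer activations into discrete "strategies" using a
# Dirichlet-process-style mixture, so the number of strategy groups is inferred
# rather than fixed. The activation file and the cap of 10 components are
# illustrative assumptions.
activations = np.load("hidden_activations.npy")      # shape: (n_decisions, n_units), hypothetical log

mixture = BayesianGaussianMixture(
    n_components=10,                                  # upper bound; unused components get ~zero weight
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(activations)

strategy_labels = mixture.predict(activations)        # one strategy id per decision
# Correlating strategy_labels with logged environmental conditions and task outcomes
# yields the why/how/result relationships described above.
```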

    Intelligent Escape of Robotic Systems: A Survey of Methodologies, Applications, and Challenges

    Intelligent escape is an interdisciplinary field that employs artificial intelligence (AI) techniques to equip robots with the capacity to react intelligently to potential dangers in dynamic, intricate, and unpredictable scenarios. As the emphasis on safety becomes increasingly paramount and robotic technologies continue to advance, a wide range of intelligent escape methodologies has been developed in recent years. This paper presents a comprehensive survey of state-of-the-art research on intelligent escape of robotic systems. Four main classes of intelligent escape methods are reviewed: planning-based, partitioning-based, learning-based, and bio-inspired methodologies. The strengths and limitations of existing methods are summarized. In addition, potential applications of intelligent escape are discussed in various domains, such as search and rescue, evacuation, military security, and healthcare. In an effort to develop new approaches to intelligent escape, this survey identifies current research challenges and provides insights into future research trends. (Accepted by the Journal of Intelligent and Robotic Systems.)

    Reinforcement Learning Methods in Dynamic Games

    Master's thesis: 90 pages, 4 chapters, 23 tables, 14 figures, 12 sources. Object of study: reinforcement learning and differential games such as pursuit games. The purpose of the work is to demonstrate the feasibility of using reinforcement learning to solve differential games. Research methods: modeling various forms of pursuit games, for example with one pursuer and one evader and with one pursuer and several evaders, and solving these problems using both theoretical methods and reinforcement learning. Based on this research, graphs and tables were constructed to compare the algorithms and to analyze the training of the reinforcement learning algorithm. The methods proposed by the author can be applied to model and solve the described problems of game interaction.
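
    As a concrete, hedged example of the one-pursuer, one-evader setting, the sketch below solves a small grid pursuit game with tabular Q-learning against a greedily fleeing evader; the grid size, rewards, and evader rule are illustrative assumptions rather than the thesis's exact formulation.

```python
import numpy as np

# Tabular Q-learning for a one-pursuer / one-evader grid game. The evader flees to
# the neighbouring cell that maximises Manhattan distance. All constants are assumptions.
N, EPISODES, ALPHA, GAMMA, EPS = 7, 3000, 0.2, 0.95, 0.1
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
Q = np.zeros((N, N, N, N, len(ACTIONS)))          # state = (pursuer x, y, evader x, y)

def clip(pos):
    return tuple(int(v) for v in np.clip(pos, 0, N - 1))

for _ in range(EPISODES):
    p, e = (0, 0), (N - 1, N - 1)
    for _ in range(60):
        s = p + e
        a = np.random.randint(4) if np.random.rand() < EPS else int(Q[s].argmax())
        p = clip(np.add(p, ACTIONS[a]))           # pursuer moves first
        # evader flees away from the pursuer's new position
        e = clip(np.add(e, max(ACTIONS, key=lambda d: abs(e[0] + d[0] - p[0]) + abs(e[1] + d[1] - p[1]))))
        reward, done = (10.0, True) if p == e else (-1.0, False)
        target = reward + (0.0 if done else GAMMA * Q[p + e].max())
        Q[s + (a,)] += ALPHA * (target - Q[s + (a,)])
        if done:
            break
```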

    Parallel evolutionary programming techniques for strategy optimisation in air combat scenarios

    Air combat between fighter missiles and aircraft can be categorised as a pursuit-evasion problem: one vehicle acts as the pursuer and the other as the evader. Generally, the pursuer tries to capture the evader as quickly as possible, while the evader tries to evade capture for as long as possible. For an experienced human pilot this methodology is easy to describe, but simulating it involves complex mathematics that is difficult to implement in a computer environment. Classical methods, though very accurate in their analysis, are not suited to solving a complex 6DOF pursuit-evasion problem, and they have limitations in representing real-world features such as discontinuities; discrete, stochastic, or chaotic behaviour; and temporal or missing information. In this thesis, evolutionary programming (EP) is applied to determine the optimum manoeuvring strategy for an aircraft (evader) to avoid interception by an incoming missile (pursuer). EP belongs to the class of algorithms known as evolutionary algorithms (EAs), which can find optimal solutions to complex problems involving discontinuities, discrete or non-differentiable parameters, and noise. In addition, the methodology was implemented on a parallel computer architecture to reduce computing time and expand the search space. A sensitivity analysis was carried out to determine the best configuration and to understand the effect of parameters, such as the number of processors, population size, and number of generations, on the results. The effects of sensor and instrument errors were also considered. The method enabled feasible solutions to be found in a relatively short period of time; however, the ability to find feasible solutions depends on various parameters such as initial conditions, aircraft configurations, and aerodynamic constraints. It is concluded that, in general, EP is able to determine feasible manoeuvring strategies for an evader to avoid interception both with and without instrument errors. The methodology has the potential to be used as a training tool for pilots in air combat or as an intelligent engagement strategy for autonomous systems, such as Unmanned Air Combat Vehicles (UCAVs).
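
    A hedged sketch of an evolutionary-programming loop for this problem is shown below: each individual encodes the evader's control commands over a fixed number of time segments, offspring are produced by Gaussian mutation only, and the fittest half survive. The fitness function is a placeholder for the 6DOF engagement simulation, and the population size, mutation scale, and generation count are assumptions. Because fitness evaluations are independent, they can be distributed across processors, which is what the parallel implementation exploits.

```python
import numpy as np

# Evolutionary programming (mutation-only) for an evader's manoeuvre schedule.
# fitness() is a placeholder that should run the engagement simulation and return
# a survival score to maximise; all constants are illustrative assumptions.
POP, GENS, N_SEGMENTS, SIGMA = 40, 100, 10, 0.1

def fitness(strategy):
    """Placeholder: simulate the engagement and return time survived (to maximise)."""
    return -np.sum((strategy - 0.5) ** 2)              # stand-in objective

# each individual encodes normalised control commands over N_SEGMENTS time segments
population = np.random.rand(POP, N_SEGMENTS)

for _ in range(GENS):
    # EP uses mutation only: each parent spawns one Gaussian-perturbed offspring
    offspring = np.clip(population + SIGMA * np.random.randn(POP, N_SEGMENTS), 0.0, 1.0)
    combined = np.vstack([population, offspring])
    scores = np.array([fitness(ind) for ind in combined])   # embarrassingly parallel step
    population = combined[np.argsort(scores)[-POP:]]         # keep the fittest half

best = population[np.argmax([fitness(ind) for ind in population])]
```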

    The Effect of Malaysia General Election on Financial Network: An Evidence from Shariah-Compliant Stocks on Bursa Malaysia

    Rather than focusing only on market volatility, market participants should consider how a general election affects the correlations between stocks. The 14th Malaysian general election was held on 9 May 2018, and this event had a great impact on the stocks listed on Bursa Malaysia. This study therefore investigates the effect of the 14th general election on the correlations between stocks on Bursa Malaysia, specifically the shariah-compliant stocks. In addition, this paper examines the changes in network topology over the six months before and after the election. A minimum spanning tree was used to visualize the correlations between the stocks, and centrality measures, namely degree, closeness, and betweenness, were computed to identify any changes in which stocks play a crucial role in the network before and after the election.
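
    A sketch of the minimum-spanning-tree and centrality computation is given below, assuming a table of daily returns for the shariah-compliant stocks; the file name and the Mantegna distance d = sqrt(2(1 - rho)) are assumptions of this sketch. Running it separately on the six months before and after 9 May 2018 and comparing the centrality rankings reproduces the kind of before/after comparison described above.

```python
import numpy as np
import pandas as pd
import networkx as nx

# Correlations between stock returns are converted to distances, a minimum spanning
# tree is extracted, and degree, closeness, and betweenness centralities are computed.
returns = pd.read_csv("shariah_returns.csv", index_col=0)   # hypothetical daily returns table
corr = returns.corr()
dist = np.sqrt(2.0 * (1.0 - corr))                          # correlation -> metric distance

G = nx.from_pandas_adjacency(dist)                          # weighted graph over all stocks
G.remove_edges_from(nx.selfloop_edges(G))
mst = nx.minimum_spanning_tree(G, weight="weight")

centrality = pd.DataFrame({
    "degree": nx.degree_centrality(mst),
    "closeness": nx.closeness_centrality(mst, distance="weight"),
    "betweenness": nx.betweenness_centrality(mst, weight="weight"),
})
print(centrality.sort_values("betweenness", ascending=False).head())
```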

    Single- and multiobjective reinforcement learning in dynamic adversarial games

    This thesis uses reinforcement learning (RL) to address dynamic adversarial games in the context of air combat manoeuvring simulation. A sequential decision problem commonly encountered in the field of operations research, air combat manoeuvring simulation has conventionally relied on agent programming methods that require significant domain knowledge to be manually encoded into the simulation environment. These methods are appropriate for determining the effectiveness of existing tactics in different simulated scenarios. However, in order to maximise the advantages provided by new technologies (such as autonomous aircraft), new tactics will need to be discovered. As a proven technique for solving sequential decision problems, RL has the potential to discover these new tactics. This thesis explores four RL approaches (tabular, deep, discrete-to-deep, and multiobjective) as mechanisms for discovering new behaviours in simulations of air combat manoeuvring. It implements and tests several methods for each approach and compares those methods in terms of learning time, baseline and comparative performance, and implementation complexity. In addition to evaluating the utility of existing approaches for the specific task of air combat manoeuvring, this thesis proposes and investigates two novel methods, discrete-to-deep supervised policy learning (D2D-SPL) and discrete-to-deep supervised Q-value learning (D2D-SQL), which can be applied more generally. D2D-SPL and D2D-SQL offer the generalisability of deep RL at a cost closer to that of the tabular approach.
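
    Based on the description above, the discrete-to-deep idea can be read as distilling a cheaply learned tabular policy into a small network by supervised learning; the sketch below shows that reading only, with the discretisation, network size, and training details as assumptions rather than the thesis's actual D2D-SPL recipe.

```python
import numpy as np
import torch
import torch.nn as nn

# Hedged sketch: distil a greedy tabular policy into a small network via supervised
# learning, trading a cheap tabular stage for the generalisation of a deep model.
def distill_tabular_policy(q_table, state_grid, epochs=200):
    """q_table: (n_states, n_actions); state_grid: (n_states, state_dim) continuous cell centres."""
    states = torch.tensor(state_grid, dtype=torch.float32)
    actions = torch.tensor(q_table.argmax(axis=1), dtype=torch.long)   # greedy tabular labels

    net = nn.Sequential(nn.Linear(states.shape[1], 64), nn.ReLU(),
                        nn.Linear(64, q_table.shape[1]))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):                      # supervised policy learning on tabular labels
        opt.zero_grad()
        loss = loss_fn(net(states), actions)
        loss.backward()
        opt.step()
    return net                                   # maps continuous states to action logits
```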