
    Particle Filtering for Model-Based Anomaly Detection in Sensor Networks

    A novel technique has been developed for anomaly detection in rocket engine test stand (RETS) data. The objective was to develop a system that post-processes a CSV file containing the time-series sensor readings and activities from a rocket engine test, and detects any anomalies that occurred during the test. The output consists of the names of the sensors that show anomalous behavior and the start and end time of each anomaly. To significantly reduce the involvement of domain experts, several data-driven approaches have been proposed in which models are acquired automatically from the data, bypassing the cost and effort of building system models. Many supervised learning methods can efficiently learn operational and fault models given large amounts of both nominal and fault data. However, for domains such as RETS data, the amount of anomalous data actually available is relatively small, making most supervised learning methods rather ineffective; in general they have met with limited success in anomaly detection. The fundamental problem with existing approaches is the assumption that the data are i.i.d. (independent and identically distributed), which is violated in typical RETS data. None of these techniques naturally exploits the temporal information inherent in time-series data from sensor networks. There are correlations among the sensor readings, not only at the same time but also across time; however, these approaches have not explicitly identified and exploited such correlations. Given these limitations of model-free methods, there has been renewed interest in model-based methods, specifically graphical methods that explicitly reason temporally. The Gaussian Mixture Model (GMM) in a Linear Dynamic System approach assumes that the multi-dimensional test data is a mixture of multivariate Gaussians, and fits a given number of Gaussian clusters with the well-known Expectation Maximization (EM) algorithm. The parameters thus learned are used to calculate the joint distribution of the observations. However, the GMM assumption is essentially an approximation, and it signals the potential viability of non-parametric density estimators. This is the key idea underlying the new approach.
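    The GMM baseline described above is straightforward to prototype. The sketch below is a minimal illustration rather than the paper's system: it fits a mixture of multivariate Gaussians to windows of nominal sensor readings with EM (via scikit-learn's GaussianMixture) and flags test windows whose joint log-likelihood is unusually low. The window stacking, which folds in cross-time correlations, and the percentile threshold are both assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def stack_windows(readings, width):
    """Stack `width` consecutive multi-sensor readings into one feature row."""
    n = len(readings) - width + 1
    return np.stack([readings[i:i + width].ravel() for i in range(n)])

def fit_nominal_model(nominal, width=5, n_components=3, seed=0):
    X = stack_windows(nominal, width)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=seed)
    gmm.fit(X)                        # EM runs inside fit()
    return gmm

def detect_anomalies(gmm, test, width=5, pct=1.0):
    X = stack_windows(test, width)
    loglik = gmm.score_samples(X)     # joint log-density of each window
    threshold = np.percentile(loglik, pct)   # illustrative cutoff
    return np.where(loglik < threshold)[0]   # indices of anomalous windows

# Usage with synthetic data: 4 sensors, nominal training and a faulty test run.
rng = np.random.default_rng(0)
nominal = rng.normal(size=(1000, 4))
test = rng.normal(size=(200, 4))
test[120:130] += 6.0                  # injected fault
print(detect_anomalies(fit_nominal_model(nominal), test))
```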

    Exact and Heuristic Algorithms for Risk-Aware Stochastic Physical Search

    We consider an intelligent agent seeking to obtain an item from one of several physical locations, where the cost to obtain the item at each location is stochastic. We study risk-aware stochastic physical search (RA-SPS), in which the cost to travel and the cost to obtain the item are drawn from the same budget, and the objective is to maximize the probability of success while minimizing the required budget. This type of problem models many task-planning scenarios, such as space exploration, shopping, or surveillance. In such scenarios, the actual cost of completing an objective at a location may only be revealed when the agent physically arrives at that location, and the agent may need to use a single resource both to search for and to acquire the item of interest. We present exact and heuristic algorithms for solving RA-SPS problems on complete metric graphs. We first formulate the problem as a mixed integer linear programming (MILP) problem. We then develop custom branch-and-bound algorithms that dramatically reduce computation time. Using these algorithms, we generate empirical insights into the hardness landscape of the RA-SPS problem and compare the performance of several heuristics.
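    For a flavor of how branch and bound applies here, the sketch below solves a simplified RA-SPS instance exactly: a complete graph with symmetric travel costs, an independent discrete price distribution at each site, and a single budget shared by travel and purchase. The depth-first search prunes an ordering once the probability of still searching cannot lift it past the incumbent. All names and the specific bound are illustrative assumptions, not the paper's MILP or custom algorithms.

```python
def success_prob(prices, budget):
    """P(item price at a site <= remaining budget)."""
    return sum(p for cost, p in prices if cost <= budget)

def best_success(travel, site_prices, budget):
    """Maximize the probability of acquiring the item within `budget`.
    travel[i][j]: cost to move between locations (0 is the start);
    site_prices[i]: list of (price, probability) pairs for site i >= 1."""
    n = len(travel)
    best = [0.0]

    def dfs(loc, budget_left, reach_prob, acc_prob, unvisited):
        # Bound: even if every remaining site succeeded for free, we can
        # gain at most reach_prob more probability mass.
        if acc_prob + reach_prob <= best[0]:
            return
        best[0] = max(best[0], acc_prob)
        for site in unvisited:
            b = budget_left - travel[loc][site]
            if b < 0:
                continue  # cannot afford the travel leg
            p = success_prob(site_prices[site], b)
            dfs(site, b, reach_prob * (1 - p),
                acc_prob + reach_prob * p, unvisited - {site})

    dfs(0, budget, 1.0, 0.0, frozenset(range(1, n)))
    return best[0]

# Tiny usage example: start at node 0, two sites with uncertain prices.
travel = [[0, 2, 3], [2, 0, 2], [3, 2, 0]]
prices = {1: [(1, 0.5), (6, 0.5)], 2: [(2, 0.7), (9, 0.3)]}
print(best_success(travel, prices, budget=6))  # -> 0.85
```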

    Multi-Agent Reinforcement Learning as a Rehearsal for Decentralized Planning

    Decentralized partially observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Multi-agent reinforcement learning (MARL) approaches have recently been proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. In some practical scenarios this may not be the case. We propose a novel MARL approach in which agents are allowed to rehearse with information that will not be available during policy execution. The key is for the agents to learn policies that do not explicitly rely on these rehearsal features. We also establish a weak convergence result for our algorithm, RLaR, demonstrating that RLaR converges in probability when certain conditions are met. We show experimentally that incorporating rehearsal features can enhance the learning rate compared to non-rehearsal-based learners, and we demonstrate fast, (near-)optimal performance on many existing benchmark Dec-POMDP problems. We also compare RLaR against an existing approximate Dec-POMDP solver which, like RLaR, does not assume a priori knowledge of the model. While RLaR's policy representation is not as scalable, we show that RLaR produces higher-quality policies for most problems and horizons studied.
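    The core rehearsal idea can be conveyed with a single-agent sketch: during learning the agent bootstraps from a value table conditioned on the full state (a rehearsal feature), but the policy it keeps conditions only on its own observation, so execution never requires the hidden information. The toy environment and the two-table scheme below are assumptions for illustration, not RLaR itself.

```python
import random
from collections import defaultdict

class ToyEnv:
    """Illustrative partially observable task: the agent is rewarded for
    matching a hidden binary state it only observes through noise."""
    actions = (0, 1)

    def reset(self):
        self.state = random.choice(self.actions)
        self.t = 0
        return self.state, self._observe()

    def _observe(self):
        # 80%-accurate observation of the hidden state.
        return self.state if random.random() < 0.8 else 1 - self.state

    def step(self, a):
        r = 1.0 if a == self.state else -1.0
        self.t += 1
        self.state = random.choice(self.actions)
        return (self.state, self._observe()), r, self.t >= 10

def rehearsal_q_learning(env, episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
    q_full = defaultdict(float)  # Q(state, action): rehearsal-only values
    q_obs = defaultdict(float)   # Q(observation, action): executable policy
    for _ in range(episodes):
        (state, obs), done = env.reset(), False
        while not done:
            # Explore using the better-informed rehearsal values.
            a = (random.choice(env.actions) if random.random() < eps
                 else max(env.actions, key=lambda u: q_full[(state, u)]))
            (state2, obs2), r, done = env.step(a)
            target = r if done else r + gamma * max(
                q_full[(state2, u)] for u in env.actions)
            # Both tables learn from the same experience...
            q_full[(state, a)] += alpha * (target - q_full[(state, a)])
            # ...but the kept policy never conditions on the hidden state.
            q_obs[(obs, a)] += alpha * (target - q_obs[(obs, a)])
            state, obs = state2, obs2
    return lambda o: max(env.actions, key=lambda u: q_obs[(o, u)])

policy = rehearsal_q_learning(ToyEnv())
print(policy(0), policy(1))  # acts on observations alone; typically "0 1"
```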

    Autonomous Acquisition of Behavior Trees for Robot Control

    Behavior trees (BTs) are a popular control architecture in the computer game industry and have more recently been applied in robotics. One open question is how intelligent agents/robots can autonomously acquire behavior trees for task-level control. In contrast with existing approaches that either refine an initially given BT or build the BT directly from human feedback/demonstration, we leverage reinforcement learning (RL), which allows robots to autonomously learn control policies through repeated task interaction, though these policies are usually expressed in a form harder to interpret than BTs. The learned control policy is then converted to a behavior tree via our proposed decanonicalization algorithm. The feasibility of this idea rests on a proposed notion of canonical behavior trees (CBTs). In particular, we show (1) that CBTs are sufficiently expressive to capture RL control policies, and (2) that RL can be made independent of an optimal behavior permutation, despite the BT convention of left-to-right priority, thus obviating the need for a combinatorial search. Two evaluation domains illustrate our approach.
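    To convey the flavor of a policy-to-BT conversion (not the decanonicalization algorithm itself), the sketch below treats a learned tabular policy as an ordered list of (predicate, action) rules and assembles each rule into a Sequence node under a root Selector, following the left-to-right priority convention. The node classes and policy_to_bt are illustrative assumptions.

```python
class Sequence:
    """Succeeds only if all children succeed, evaluated left to right."""
    def __init__(self, children): self.children = children
    def tick(self, blackboard):
        return all(child.tick(blackboard) for child in self.children)

class Selector:
    """Succeeds on the first child that succeeds (left-to-right priority)."""
    def __init__(self, children): self.children = children
    def tick(self, blackboard):
        return any(child.tick(blackboard) for child in self.children)

class Condition:
    def __init__(self, pred): self.pred = pred
    def tick(self, blackboard): return self.pred(blackboard)

class Action:
    def __init__(self, act): self.act = act
    def tick(self, blackboard):
        self.act(blackboard)
        return True

def policy_to_bt(rules):
    """rules: list of (predicate, action) pairs from a learned policy,
    ordered by priority. Each rule becomes check-then-act."""
    return Selector([Sequence([Condition(p), Action(a)]) for p, a in rules])

# Usage: a two-rule policy for a toy robot.
bt = policy_to_bt([
    (lambda bb: bb["battery"] < 0.2, lambda bb: bb.update(goal="dock")),
    (lambda bb: True,                lambda bb: bb.update(goal="patrol")),
])
bb = {"battery": 0.1}
bt.tick(bb)
print(bb["goal"])  # -> "dock"
```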

    Concurrent Learning of Control in Multi-agent Sequential Decision Tasks

    The overall objective of this project was to develop multi-agent reinforcement learning (MARL) approaches that enable intelligent agents to autonomously learn distributed control policies in decentralized partially observable Markov decision processes (Dec-POMDPs), without prior knowledge of the model parameters.

    Pruning for Monte Carlo Distributed Reinforcement Learning In Decentralized POMDPs

    Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a powerful modeling technique for realistic multi-agent coordination problems under uncertainty. Prevalent solution techniques are centralized and assume prior knowledge of the model. Recently, a Monte Carlo based distributed reinforcement learning approach was proposed, in which agents take turns learning best responses to each other's policies. This promotes decentralization of the policy computation problem and relaxes reliance on full knowledge of the problem parameters. However, this Monte Carlo approach has a large sample complexity, which we address in this paper. In particular, we propose and analyze a modified version of the previous algorithm that adaptively eliminates parts of the experience tree from further exploration, thus requiring fewer samples while ensuring unchanged confidence in the learned value function. Experiments demonstrate a significant reduction in sample complexity, with maximum reductions ranging from 61% to 91% over different benchmark Dec-POMDP problems, and the final policies are often better due to more focused exploration. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
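    The adaptive elimination step can be illustrated with a standard interval-based pruning rule, assumed here for the example rather than taken from the paper: under Hoeffding confidence bounds on returns in [0, 1], a branch is dropped from further sampling once its upper confidence bound falls below the best sibling's lower bound, leaving confidence in the surviving estimates unchanged.

```python
import math

def hoeffding_radius(n, delta=0.05):
    """Half-width of a (1 - delta) confidence interval after n samples
    of a return bounded in [0, 1]."""
    return math.sqrt(math.log(2 / delta) / (2 * n))

def prune(stats, delta=0.05):
    """stats: {branch: (mean_return, n_samples)}.
    Returns the branches that survive interval elimination."""
    bounds = {a: (m - hoeffding_radius(n, delta),
                  m + hoeffding_radius(n, delta))
              for a, (m, n) in stats.items()}
    best_lower = max(lo for lo, _ in bounds.values())
    return {a for a, (_, hi) in bounds.items() if hi >= best_lower}

# Usage: three candidate branches; the clearly dominated one is cut.
print(prune({"a1": (0.80, 200), "a2": (0.75, 200), "a3": (0.40, 200)}))
# -> {'a1', 'a2'}
```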