13,338 research outputs found
DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
A major difficulty of solving continuous POMDPs is to infer the multi-modal
distribution of the unobserved true states and to make the planning algorithm
dependent on the perceived uncertainty. We cast POMDP filtering and planning
problems as two closely related Sequential Monte Carlo (SMC) processes, one
over the real states and the other over the future optimal trajectories, and
combine the merits of these two parts in a new model named the DualSMC network.
In particular, we first introduce an adversarial particle filter that leverages
the adversarial relationship between its internal components. Based on the
filtering results, we then propose a planning algorithm that extends the
previous SMC planning approach [Piche et al., 2018] to continuous POMDPs with
an uncertainty-dependent policy. Crucially, not only can DualSMC handle complex
observations such as image input but also it remains highly interpretable. It
is shown to be effective in three continuous POMDP domains: the floor
positioning domain, the 3D light-dark navigation domain, and a modified Reacher
domain.Comment: IJCAI 202
Monte Carlo optimization of decentralized estimation networks over directed acyclic graphs under communication constraints
Motivated by the vision of sensor networks, we consider decentralized estimation networks over bandwidth–limited communication links, and are particularly interested in the tradeoff between the estimation accuracy and the cost of communications due to, e.g., energy consumption. We employ a class of in–network processing strategies that admits directed acyclic graph representations and yields a tractable Bayesian risk that comprises the cost of communications and estimation error penalty. This perspective captures a broad range of possibilities for processing under network constraints and enables a rigorous design problem in the form of constrained optimization. A similar scheme and the structures exhibited by the solutions have been previously studied in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message passing fashion. We adopt
this framework for estimation, however, the corresponding optimization scheme involves integral operators that cannot be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain
particle representations and approximate computational schemes for both the in–network processing strategies and their optimization. The proposed Monte Carlo optimization procedure operates in a scalable and efficient fashion and,
owing to the non-parametric nature, can produce results for any distributions provided that samples can be produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically
as the communication becomes more costly, through a parameterized Bayesian risk
Monte Carlo optimization approach for decentralized estimation networks under communication constraints
We consider designing decentralized estimation schemes over bandwidth limited communication links with a particular interest in the tradeoff between the estimation accuracy and the cost of communications due to, e.g., energy
consumption. We take two classes of in–network processing strategies into account which yield graph representations through modeling the sensor platforms as the vertices and the communication links by edges as well as a tractable
Bayesian risk that comprises the cost of transmissions and penalty for the estimation errors. This approach captures a broad range of possibilities for “online” processing of observations as well as the constraints imposed and enables a rigorous design setting in the form of a constrained optimization problem. Similar schemes as well as the structures exhibited by the solutions to the design problem has been studied previously in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message passing fashion. We adopt this framework for estimation, however, the corresponding optimization schemes involve integral operators that cannot
be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain particle representations and approximate computational schemes for both classes of in–network processing strategies
and their optimization. The proposed Monte Carlo optimization procedures operate in a scalable and efficient fashion and, owing to the non-parametric nature, can produce results for any distributions provided that samples can be
produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically as the communication becomes more costly, through a parameterized Bayesian risk
Belief Tree Search for Active Object Recognition
Active Object Recognition (AOR) has been approached as an unsupervised
learning problem, in which optimal trajectories for object inspection are not
known and are to be discovered by reducing label uncertainty measures or
training with reinforcement learning. Such approaches have no guarantees of the
quality of their solution. In this paper, we treat AOR as a Partially
Observable Markov Decision Process (POMDP) and find near-optimal policies on
training data using Belief Tree Search (BTS) on the corresponding belief Markov
Decision Process (MDP). AOR then reduces to the problem of knowledge transfer
from near-optimal policies on training set to the test set. We train a Long
Short Term Memory (LSTM) network to predict the best next action on the
training set rollouts. We sho that the proposed AOR method generalizes well to
novel views of familiar objects and also to novel objects. We compare this
supervised scheme against guided policy search, and find that the LSTM network
reaches higher recognition accuracy compared to the guided policy method. We
further look into optimizing the observation function to increase the total
collected reward of optimal policy. In AOR, the observation function is known
only approximately. We propose a gradient-based method update to this
approximate observation function to increase the total reward of any policy. We
show that by optimizing the observation function and retraining the supervised
LSTM network, the AOR performance on the test set improves significantly.Comment: IROS 201
Monte Carlo optimization approach for decentralized estimation networks under communication constraints
We consider designing decentralized estimation schemes over bandwidth limited communication links with a particular interest in the tradeoff between the estimation accuracy and the cost of communications due to, e.g., energy
consumption. We take two classes of in–network processing strategies into account which yield graph representations through modeling the sensor platforms as the vertices and the communication links by edges as well as a tractable
Bayesian risk that comprises the cost of transmissions and penalty for the estimation errors. This approach captures a broad range of possibilities for “online” processing of observations as well as the constraints imposed and enables a rigorous design setting in the form of a constrained optimization problem. Similar schemes as well as the structures exhibited by the solutions to the design problem has been studied previously in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message passing fashion. We adopt this framework for estimation, however, the corresponding optimization schemes involve integral operators that cannot
be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain particle representations and approximate computational schemes for both classes of in–network processing strategies
and their optimization. The proposed Monte Carlo optimization procedures operate in a scalable and efficient fashion and, owing to the non-parametric nature, can produce results for any distributions provided that samples can be
produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically as the communication becomes more costly, through a parameterized Bayesian risk
- …