
12 research outputs found

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Authors: Dwaracherla Vikranth R.; Tan Tian; Xiong Zhihan
Publication date: 19/03/2020

It is well known that quantifying uncertainty in the action-value estimates is crucial for efficient exploration in reinforcement learning. Ensemble sampling offers a relatively computationally tractable way of doing this using randomized value functions. However, it still requires a huge amount of computational resources for complex problems. In this paper, we present an alternative, computationally efficient way to induce exploration using index sampling. We use an indexed value function to represent uncertainty in our action-value estimates. We first present an algorithm to learn a parameterized indexed value function through a distributional version of temporal difference in a tabular setting and prove its regret bound. Then, from a computational point of view, we propose a dual-network architecture, Parameterized Indexed Networks (PINs), comprising one mean network and one uncertainty network to learn the indexed value function. Finally, we show the efficacy of PINs through computational experiments.

Comment: 17 pages, 4 figures, Proceedings of the 34th AAAI Conference on Artificial Intelligence

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Reinforcement Learning, Bit by Bit

Authors: Dwaracherla Vikranth; Ibrahimi Morteza; Lu Xiuyuan; Osband Ian; Van Roy Benjamin; Wen Zheng
Publication date: 12/04/2021

Reinforcement learning agents have demonstrated remarkable achievements in simulated environments. Data efficiency poses an impediment to carrying this success over to real environments. The design of data-efficient agents calls for a deeper understanding of information acquisition and representation. We develop concepts and establish a regret bound that together offer principled guidance. The bound sheds light on questions of what information to seek, how to seek that information, and what information to retain. To illustrate concepts, we design simple agents that build on them and present computational results that demonstrate improvements in data efficiency.

arXiv.org e-Print Archive

Epistemic Neural Networks

Authors: Asghari Seyed Mohammad; Dwaracherla Vikranth; Ibrahimi Morteza; Lu Xiuyuan; Osband Ian; Van Roy Benjamin; Wen Zheng
Publication date: 06/07/2022

Intelligence relies on an agent's knowledge of what it does not know. This capability can be assessed based on the quality of joint predictions of labels across multiple inputs. Conventional neural networks lack this capability and, since most research has focused on marginal predictions, this shortcoming has been largely overlooked. We introduce the epistemic neural network (ENN) as an interface for models that represent uncertainty as required to generate useful joint predictions. While prior approaches to uncertainty modeling such as Bayesian neural networks can be expressed as ENNs, this new interface facilitates comparison of joint predictions and the design of novel architectures and algorithms. In particular, we introduce the epinet: an architecture that can supplement any conventional neural network, including large pretrained models, and can be trained with modest incremental computation to estimate uncertainty. With an epinet, conventional neural networks outperform very large ensembles, consisting of hundreds or more particles, with orders of magnitude less computation. We demonstrate this efficacy across synthetic data, ImageNet, and some reinforcement learning tasks. As part of this effort, we open-source experiment code.

arXiv.org e-Print Archive

The Neural Testbed: Evaluating Joint Predictions

Authors: Asghari Seyed Mohammad; Dwaracherla Vikranth; Hao Botao; Ibrahimi Morteza; Lawson Dieterich; Lu Xiuyuan; O'Donoghue Brendan; Osband Ian; Van Roy Benjamin; Wen Zheng
Publication date: 01/11/2022

Predictive distributions quantify uncertainties ignored by point estimates. This paper introduces The Neural Testbed: an open-source benchmark for controlled and principled evaluation of agents that generate such predictions. Crucially, the testbed assesses agents not only on the quality of their marginal predictions per input, but also on their joint predictions across many inputs. We evaluate a range of agents using a simple neural network data generating process. Our results indicate that some popular Bayesian deep learning agents do not fare well with joint predictions, even when they can produce accurate marginal predictions. We also show that the quality of joint predictions drives performance in downstream decision tasks. We find these results are robust across a wide range of generative models, and highlight the practical importance of joint predictions to the community.

arXiv.org e-Print Archive

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Authors: Dwaracherla Vikranth R.; Tan Tian; Xiong Zhihan
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 03/04/2020

Association for the Advancement of Artificial Intelligence: AAAI Publications

Probabilistic Approach for Visual Homing of a Mobile Robot in the Presence of Dynamic Obstacles

Authors: Anupa Sabnis; G. K. Arunkumar; Leena Vachhani; Vikranth Dwaracherla
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref