AID-RL: Active information-directed reinforcement learning for autonomous source seeking and estimation

Authors: Chen WH, Li Z, Yan Y, Yang J
Publication venue: 'Elsevier BV'
Publication date: 01/08/2023

This paper proposes an active information-directed reinforcement learning (AID-RL) framework for the autonomous source seeking and estimation problem. Source seeking requires the search agent to move towards the true source, and source estimation requires the agent to maintain and update its knowledge of source properties such as release rate and source position. These two objectives give rise to the newly developed framework, namely, dual control for exploration and exploitation. In this paper, the greedy RL forms an exploitation search strategy that navigates the agent to the source position, while the information-directed search commands the agent to explore the most informative positions to reduce belief uncertainty. Extensive results are presented using a high-fidelity dataset for autonomous search, which validate the effectiveness of the proposed AID-RL and highlight the importance of active exploration in improving sampling efficiency and search performance.

UCL Discovery

GAP Safe screening rules for sparse multi-task and multi-class models

Authors: Fercoq Olivier, Gramfort Alexandre, Ndiaye Eugene, Salmon Joseph
Publication date: 18/11/2015

High dimensional regression benefits from sparsity promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly, the rules are said to be \emph{safe}. In this paper we derive new safe rules for generalized linear models regularized with $\ell_1$ and $\ell_1/\ell_2$ norms. The rules are based on duality gap computations and spherical safe regions whose diameters converge to zero. This allows more variables to be discarded safely, in particular for low regularization parameters. The GAP Safe rule can cope with any iterative solver and we illustrate its performance on coordinate descent for multi-task Lasso, binary and multinomial logistic regression, demonstrating significant speed-ups on all tested datasets with respect to previous safe rules.

Comment: in Proceedings of the 29-th Conference on Neural Information Processing Systems (NIPS), 201

arXiv.org e-Print Archive
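
The GAP Safe abstract above describes screening based on duality gap computations and spherical safe regions. As a rough illustration only, the sketch below applies that idea to the plain Lasso, 0.5 * ||y - X b||^2 + lam * ||b||_1, which is a special case and not the multi-task or multinomial setting the abstract covers. The function name `gap_safe_screen_lasso`, its signature, and the toy data are assumptions made for this example, not the authors' code.

```python
import numpy as np


def gap_safe_screen_lasso(X, y, beta, lam):
    """Sphere test built from the duality gap at the current iterate beta.

    Hypothetical helper for illustration. Returns (screened, gap, radius);
    screened[j] is True when coefficient j is certified to be zero at the
    Lasso optimum, so the solver may drop feature j.
    """
    residual = y - X @ beta
    # Rescale the residual so that theta is dual feasible: ||X^T theta||_inf <= 1.
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(beta))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)  # clip tiny negative round-off
    # The dual optimum lies in a ball around theta with this radius.
    radius = np.sqrt(2.0 * gap) / lam
    # Feature j is safe to discard if |x_j^T theta'| < 1 for every theta' in the ball.
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0, gap, radius


# Toy usage with an identity design, where the Lasso solution is a soft-threshold of y.
X = np.eye(4)
y = np.array([3.0, 0.1, -2.0, 0.05])
lam = 1.0
beta_opt = np.array([2.0, 0.0, -1.0, 0.0])

screened, gap, radius = gap_safe_screen_lasso(X, y, beta_opt, lam)
print(screened, gap, radius)
```

Because the radius shrinks with the duality gap, the test discards nothing at a poor iterate such as `beta = 0` on this toy problem and discards the two inactive features once the iterate is optimal, which is the "diameters converge to zero" behaviour the abstract mentions.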