332,188 research outputs found
AID-RL: Active information-directed reinforcement learning for autonomous source seeking and estimation
This paper proposes an active information-directed reinforcement learning (AID-RL) framework for autonomous source seeking and estimation problem. Source seeking requires the search agent to move towards the true source, and source estimation demands the agent to maintain and update its knowledge regarding the source properties such as release rate and source position. These two objectives give rise to the newly developed framework, namely, dual control for exploration and exploitation. In this paper, the greedy RL forms an exploitation search strategy that navigates the agent to the source position, while the information-directed search commands the agent to explore most informative positions to reduce belief uncertainty. Extensive results are presented using a high-fidelity dataset for autonomous search, which validates the effectiveness of the proposed AID-RL and highlights the importance of active exploration in improving sampling efficiency and search performance
GAP Safe screening rules for sparse multi-task and multi-class models
High dimensional regression benefits from sparsity promoting regularizations.
Screening rules leverage the known sparsity of the solution by ignoring some
variables in the optimization, hence speeding up solvers. When the procedure is
proven not to discard features wrongly the rules are said to be \emph{safe}. In
this paper we derive new safe rules for generalized linear models regularized
with and norms. The rules are based on duality gap
computations and spherical safe regions whose diameters converge to zero. This
allows to discard safely more variables, in particular for low regularization
parameters. The GAP Safe rule can cope with any iterative solver and we
illustrate its performance on coordinate descent for multi-task Lasso, binary
and multinomial logistic regression, demonstrating significant speed ups on all
tested datasets with respect to previous safe rules.Comment: in Proceedings of the 29-th Conference on Neural Information
Processing Systems (NIPS), 201
- …