
Black-box Generalization of Machine Teaching

Author: Cao Xiaofeng
Guo Yaming
Kwok James T.
Tsang Ivor W.
Publication date: 20/09/2023

Hypothesis-pruning maximizes the hypothesis updates for active learning to find those desired unlabeled data. An inherent assumption is that this learning manner can derive those updates into the optimal hypothesis. However, its convergence may not be guaranteed well if those incremental updates are negative and disordered. In this paper, we introduce a black-box teaching hypothesis $h^\mathcal{T}$ employing a tighter slack term $\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$ to replace the typical $2\Delta_t$ for pruning. Theoretically, we prove that, under the guidance of this teaching hypothesis, the learner can converge to a tighter generalization error and label complexity bound than those non-educated learners who do not receive any guidance from a teacher: 1) the generalization error upper bound can be reduced from $R(h^*)+4\Delta_{T-1}$ to approximately $R(h^{\mathcal{T}})+2\Delta_{T-1}$, and 2) the label complexity upper bound can be decreased from $4\theta\left(TR(h^{*})+2O(\sqrt{T})\right)$ to approximately $2\theta\left(2TR(h^{\mathcal{T}})+3O(\sqrt{T})\right)$. To be strict with our assumption, self-improvement of teaching is first proposed when $h^\mathcal{T}$ loosely approximates $h^*$. Against learning, we further consider two teaching scenarios: teaching a white-box and black-box learner. Experiments verify this idea and show better generalization performance than the fundamental active learning strategies, such as IWAL, IWAL-D, etc.
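Expanding the two label complexity bounds quoted in the abstract makes the claimed gain easier to read. This is only arithmetic on the abstract's own expressions, treating $O(\sqrt{T})$ as a single symbolic term; it is not taken from the paper's proofs.

```latex
\begin{align*}
4\theta\left(T R(h^{*}) + 2\,O(\sqrt{T})\right)
  &= 4\theta\, T R(h^{*}) + 8\theta\, O(\sqrt{T}),\\
2\theta\left(2T R(h^{\mathcal{T}}) + 3\,O(\sqrt{T})\right)
  &= 4\theta\, T R(h^{\mathcal{T}}) + 6\theta\, O(\sqrt{T}).
\end{align*}
```

Read this way, the term linear in $T$ keeps the factor $4\theta$ but is measured against $R(h^{\mathcal{T}})$ instead of $R(h^{*})$, and the coefficient on the $O(\sqrt{T})$ term drops from $8\theta$ to $6\theta$. The comparison therefore depends on how closely $R(h^{\mathcal{T}})$ approximates $R(h^{*})$.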

arXiv.org e-Print Archive

Competitive function approximation for reinforcement learning

Author: Agostini Alejandro Gabriel
Celaya Llover Enric
Publication date: 01/01/2014

The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming, where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing repeated updates with similar values that fade out the occasional updates of infrequently sampled regions. We propose a competitive approach for function approximation where many different local approximators are available at a given input and the one expected to give the best approximation is selected by means of a relevance function. The local nature of the approximators allows their fast adaptation to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators updated and tried in parallel permits obtaining a good estimation much faster than would be possible with a single approximator. Experiments in different benchmark problems show that the competitive strategy provides faster and more stable learning than non-competitive approaches. Preprint
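The competitive idea in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' method: the class names are hypothetical, each approximator is a running mean over a 1-D interval, and the relevance function is replaced by a running estimate of squared prediction error. All approximators that cover an input are updated in parallel, and the one with the lowest error estimate answers the query.

```python
import random


class LocalApproximator:
    """Running-mean approximator over the interval [lo, hi) (hypothetical)."""

    def __init__(self, lo, hi, lr=0.2):
        self.lo, self.hi, self.lr = lo, hi, lr
        self.value = 0.0  # current estimate of the target on this interval
        self.error = 1.0  # running squared error, used as a stand-in relevance score

    def covers(self, x):
        return self.lo <= x < self.hi

    def predict(self, x):
        return self.value

    def update(self, x, target):
        err = target - self.value
        # Track the squared error first, then move the estimate toward the target.
        self.error += self.lr * (err * err - self.error)
        self.value += self.lr * err


class CompetitiveApproximator:
    """Keeps several approximators and answers with the most reliable one."""

    def __init__(self, approximators):
        self.approximators = approximators

    def _candidates(self, x):
        return [a for a in self.approximators if a.covers(x)]

    def predict(self, x):
        # Competition: the candidate with the lowest running error wins.
        best = min(self._candidates(x), key=lambda a: a.error)
        return best.predict(x)

    def update(self, x, target):
        # Every approximator covering x is updated in parallel.
        for a in self._candidates(x):
            a.update(x, target)


def step_target(x):
    """Toy target with a discontinuity that a single coarse model fits badly."""
    return 0.0 if x < 0.5 else 1.0


rng = random.Random(0)
model = CompetitiveApproximator([
    LocalApproximator(0.0, 1.0),  # one coarse approximator over the whole domain
    LocalApproximator(0.0, 0.5),  # two finer ones, one per half
    LocalApproximator(0.5, 1.0),
])
for _ in range(500):
    x = rng.random()
    model.update(x, step_target(x))
```

In this toy setting the coarse approximator keeps a high error estimate because its single value cannot fit both halves of the step, so queries are answered by the finer approximators.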

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

Temporal Model Adaptation for Person Re-Identification

Author: AJ Joshi
B Settles
D Tao
EP Xing
G Chechik
G Lisanti
H Xia
J Chen
J García
KQ Weinberger
M Hirzer
M Pavan
N Martinel
Peter M. Roth
R Johnson
R Vezzani
R Zhang
S Boyd
WS Zheng
Xiaochun Cao
Z Wang
Z Wu
ZC Guo
Publication date: 25/07/2016

Person re-identification is an open and challenging problem in computer vision. The majority of efforts have been spent either on designing the best feature representation or on learning the optimal matching metric. Most approaches have neglected the problem of adapting the selected features or the learned model over time. To address this problem, we propose a temporal model adaptation scheme with a human in the loop. We first introduce a similarity-dissimilarity learning method which can be trained in an incremental fashion by means of a stochastic alternating direction method of multipliers optimization procedure. Then, to achieve temporal adaptation with limited human effort, we exploit a graph-based approach to present the user with only the most informative probe-gallery matches that should be used to update the model. Results on three datasets show that our approach performs on par with or even better than state-of-the-art approaches while reducing the manual pairwise labeling effort by about 80%.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

Intrinsic Motivation Systems for Autonomous Mental Development

Author: Hafner Véréna
Kaplan Frédéric
Oudeyer Pierre-Yves
Publication date: 01/01/2007

Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology. Key words: Active learning, autonomy, behavior, complexity, curiosity, development, developmental trajectory, epigenetic robotics, intrinsic motivation, learning, reinforcement learning, values
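The notion of choosing situations that maximize learning progress, as described in the abstract above, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Intelligent Adaptive Curiosity algorithm itself: the function names, the fixed window size, the region labels, and the error histories are all hypothetical. Learning progress is taken here to be the drop in mean prediction error between the previous window and the most recent one.

```python
def learning_progress(errors, window=5):
    """Drop in mean prediction error from the previous window to the latest one."""
    if len(errors) < 2 * window:
        return 0.0  # not enough history to estimate progress
    older = errors[-2 * window:-window]
    recent = errors[-window:]
    return sum(older) / window - sum(recent) / window


def choose_region(error_history, window=5):
    """Pick the region whose prediction error is currently falling fastest."""
    return max(
        error_history,
        key=lambda region: learning_progress(error_history[region], window),
    )


# Hypothetical prediction-error histories for three kinds of situation.
error_history = {
    # Already predictable: error is low and flat, so nothing is left to learn.
    "mastered": [0.01] * 10,
    # Learnable: error is steadily decreasing.
    "learnable": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1],
    # Unpredictable: error stays high with no downward trend.
    "random": [0.5, 0.6, 0.4, 0.5, 0.5, 0.5, 0.4, 0.6, 0.5, 0.5],
}

chosen = choose_region(error_history)
```

With these made-up histories, both the already-predictable and the unpredictable situations show roughly zero progress, so the selection favors the situation whose error is still decreasing, matching the "neither too predictable nor too unpredictable" behavior the abstract describes.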

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

INRIA a CCSD electronic archive server

CogPrints Cognitive Sciences Eprint Archive