24,512 research outputs found

    Black-box Generalization of Machine Teaching

    Full text link
    Hypothesis-pruning maximizes the hypothesis updates for active learning to find those desired unlabeled data. An inherent assumption is that this learning manner can derive those updates into the optimal hypothesis. However, its convergence may not be guaranteed well if those incremental updates are negative and disordered. In this paper, we introduce a black-box teaching hypothesis hTh^\mathcal{T} employing a tighter slack term (1+FT(h^t))Δt\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t to replace the typical 2Δt2\Delta_t for pruning. Theoretically, we prove that, under the guidance of this teaching hypothesis, the learner can converge into a tighter generalization error and label complexity bound than those non-educated learners who do not receive any guidance from a teacher:1) the generalization error upper bound can be reduced from R(h∗)+4ΔT−1R(h^*)+4\Delta_{T-1} to approximately R(hT)+2ΔT−1R(h^{\mathcal{T}})+2\Delta_{T-1}, and 2) the label complexity upper bound can be decreased from 4ξ(TR(h∗)+2O(T))4 \theta\left(TR(h^{*})+2O(\sqrt{T})\right) to approximately 2ξ(2TR(hT)+3O(T))2\theta\left(2TR(h^{\mathcal{T}})+3 O(\sqrt{T})\right). To be strict with our assumption, self-improvement of teaching is firstly proposed when hTh^\mathcal{T} loosely approximates h∗h^*. Against learning, we further consider two teaching scenarios: teaching a white-box and black-box learner. Experiments verify this idea and show better generalization performance than the fundamental active learning strategies, such as IWAL, IWAL-D, etc

    Competitive function approximation for reinforcement learning

    Get PDF
    The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing a reiterated updating with similar values which fade out the occasional updates of infrequently sampled regions. We propose a competitive approach for function approximation where many different local approximators are available at a given input and the one with expectedly best approximation is selected by means of a relevance function. The local nature of the approximators allows their fast adaptation to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators updated and tried in parallel permits obtaining a good estimation much faster than would be possible with a single approximator. Experiments in different benchmark problems show that the competitive strategy provides a faster and more stable learning than non-competitive approaches.Preprin

    Temporal Model Adaptation for Person Re-Identification

    Full text link
    Person re-identification is an open and challenging problem in computer vision. Majority of the efforts have been spent either to design the best feature representation or to learn the optimal matching metric. Most approaches have neglected the problem of adapting the selected features or the learned model over time. To address such a problem, we propose a temporal model adaptation scheme with human in the loop. We first introduce a similarity-dissimilarity learning method which can be trained in an incremental fashion by means of a stochastic alternating directions methods of multipliers optimization procedure. Then, to achieve temporal adaptation with limited human effort, we exploit a graph-based approach to present the user only the most informative probe-gallery matches that should be used to update the model. Results on three datasets have shown that our approach performs on par or even better than state-of-the-art approaches while reducing the manual pairwise labeling effort by about 80%

    Intrinsic Motivation Systems for Autonomous Mental Development

    Get PDF
    Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development.The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology. Key words: Active learning, autonomy, behavior, complexity, curiosity, development, developmental trajectory, epigenetic robotics, intrinsic motivation, learning, reinforcement learning, values
    • 

    corecore