15,195 research outputs found

Learning feed-forward one-shot learners

Author: Bertinetto Luca
Henriques João F.
Torr Philip H. S.
Valmadre Jack
Vedaldi Andrea
Publication date: 01/01/2016

One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning to learn formulation. In order to make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar in the Visual Object Tracking benchmark. Comment: The first three authors contributed equally, and are listed in alphabetical order.

arXiv.org e-Print Archive

Oxford University Research Archive
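
The abstract above describes a learnet that predicts the parameters of a pupil network from one exemplar, made feasible by factorizing those parameters. The sketch below is only an illustration of that idea, not the authors' implementation: it assumes one possible factorization in which two projection matrices are independent of the exemplar and the learnet predicts just a diagonal scaling between them. The class name `FactorizedPupilLayer`, the single linear map standing in for the learnet, and all dimensions are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def make_matrix(rows, cols, rng):
    """Random matrix used as a stand-in for trained weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class FactorizedPupilLayer:
    """Pupil layer computing y = M_out * diag(w(z)) * M_in * x.

    Assumption for illustration: only the diagonal w(z) depends on the
    exemplar z, so the learnet outputs hidden_dim numbers rather than a
    full out_dim x in_dim weight matrix.
    """

    def __init__(self, in_dim, hidden_dim, out_dim, exemplar_dim, seed=0):
        rng = random.Random(seed)
        self.m_in = make_matrix(hidden_dim, in_dim, rng)    # exemplar-independent
        self.m_out = make_matrix(out_dim, hidden_dim, rng)  # exemplar-independent
        # Hypothetical learnet: one linear map from exemplar to diagonal.
        self.learnet = make_matrix(hidden_dim, exemplar_dim, rng)

    def predict_diagonal(self, exemplar):
        """Feed-forward prediction of the exemplar-specific parameters."""
        return matvec(self.learnet, exemplar)

    def forward(self, x, exemplar):
        """Apply the pupil layer whose diagonal was predicted from the exemplar."""
        diagonal = self.predict_diagonal(exemplar)
        projected = matvec(self.m_in, x)
        scaled = [d * p for d, p in zip(diagonal, projected)]
        return matvec(self.m_out, scaled)


layer = FactorizedPupilLayer(in_dim=4, hidden_dim=3, out_dim=2, exemplar_dim=5)
output = layer.forward([1.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4, 0.5])
```

In this sketch the number of predicted values grows with `hidden_dim` only, which is the kind of saving a parameter factorization is meant to provide; in a real system all three matrices would be trained end-to-end rather than drawn at random.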

Learning feed-forward one-shot learners

Author: Andrea Vedaldi
Jack Valmadre
João F Henriques
Luca Bertinetto
Philip H S Torr
Publication date: 24/04/2020

One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning to learn formulation. In order to make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar in the Visual Object Tracking benchmark.

CiteSeerX

Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference

Author: Goodman Noah
Mosse Milan
Piech Chris
Wu Mike
Publication date: 16/12/2018

In modern computer science education, massive open online courses (MOOCs) log thousands of hours of data about how students solve coding challenges. Being so rich in data, these platforms have garnered the interest of the machine learning community, with many new algorithms attempting to autonomously provide feedback to help future students learn. But what about those first hundred thousand students? In most educational contexts (i.e. classrooms), assignments do not have enough historical data for supervised learning. In this paper, we introduce a human-in-the-loop "rubric sampling" approach to tackle the "zero shot" feedback challenge. We are able to provide autonomous feedback for the first students working on an introductory programming assignment with accuracy that substantially outperforms data-hungry algorithms and approaches human level fidelity. Rubric sampling requires minimal teacher effort, can associate feedback with specific parts of a student's solution and can articulate a student's misconceptions in the language of the instructor. Deep learning inference enables rubric sampling to further improve as more assignment specific student data is acquired. We demonstrate our results on a novel dataset from Code.org, the world's largest programming education platform. Comment: To appear at AAAI 2019; 9 pages

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Sustaining K-12 Professional Development in Geology: Recurrent Participation in RockCamp

Author: Deb Hemler
Robert Behling
Publication venue: Journal of Geoscience Education (JGE), National Association of Geoscience Teachers (NAGT)
Publication date: 01/03/2004

Researchers surveyed repeat attendees in a geology professional development program known as RockCamp in order to determine the reasons for their sustained involvement in this program. This article describes their findings, which suggest that the teachers' sustained involvement in the RockCamp Program is stimulated by situated learning experiences stressing a compare, contrast, connect, and construct pedagogy within a supportive learning community. Most teachers cited such reasons as efficacy, fun, right time of life, and support, as well as content, friendship, and methodology as reasons for their continued participation in the program. Educational levels: Graduate or professional

Digital Library for Earth System Education

Making Good on LSTMs' Unfulfilled Promise

Author: Garcez A.
Philps D.
Weyde T.
Publication date: 01/12/2019

LSTMs promise much to financial time-series analysis, temporal and cross-sectional inference, but we find that they do not deliver in a real-world financial management task. We examine an alternative called Continual Learning (CL), a memory-augmented approach, which can provide transparent explanations, i.e. which memory did what and when. This work has implications for many financial applications including credit, time-varying fairness in decision making and more. We make three important new observations. Firstly, as well as being more explainable, time-series CL approaches outperform LSTMs as well as a simple sliding window learner using feed-forward neural networks (FFNN). Secondly, we show that CL based on a sliding window learner (FFNN) is more effective than CL based on a sequential learner (LSTM). Thirdly, we examine how real-world, time-series noise impacts several similarity approaches used in CL memory addressing. We provide these insights using an approach called Continual Learning Augmentation (CLA) tested on a complex real-world problem, emerging market equities investment decision making. CLA provides a test-bed as it can be based on different types of time-series learners, allowing testing of LSTM and FFNN learners side by side. CLA is also used to test several distance approaches used in a memory recall-gate: Euclidean distance (ED), dynamic time warping (DTW), auto-encoders (AE) and a novel hybrid approach, warp-AE. We find that ED under-performs DTW and AE but warp-AE shows the best overall performance in a real-world financial task.

arXiv.org e-Print Archive

City Research Online

Warwick Research Archives Portal Repository
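
The last abstract compares Euclidean distance (ED) and dynamic time warping (DTW) as similarity measures for recalling memories of time series. As a rough illustration of how the two differ, and not a reproduction of the CLA recall-gate, the sketch below implements textbook versions of both for one-dimensional series. The function names and the toy series are assumptions made for the example.

```python
import math


def euclidean_distance(a, b):
    """Point-wise Euclidean distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference step cost.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step
    may advance in a, in b, or in both, which lets the series stretch in time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Toy data: the same bump, delayed by one time step in the second series.
pattern = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0]

ed = euclidean_distance(pattern, shifted)  # penalizes the delay at every point
dtw = dtw_distance(pattern, shifted)       # absorbs most of the delay by warping
```

On this toy pair the point-wise measure charges for the delay at four of the five positions, while DTW only pays for the mismatched final sample. That tolerance to time shifts is a general property of DTW, and is separate from the empirical ranking the abstract reports for its financial task.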