Competence-based Curriculum Learning for Neural Machine Translation
Current state-of-the-art NMT systems use large neural networks that are not
only slow to train, but also often require many heuristics and optimization
tricks, such as specialized learning rate schedules and large batch sizes. This
is undesirable as it requires extensive hyperparameter tuning. In this paper,
we propose a curriculum learning framework for NMT that reduces training time,
reduces the need for specialized heuristics or large batch sizes, and results
in overall better performance. Our framework consists of a principled way of
deciding which training samples are shown to the model at different times
during training, based on the estimated difficulty of a sample and the current
competence of the model. Filtering training samples in this manner prevents the
model from getting stuck in bad local optima, making it converge faster and
reach a better solution than the common approach of uniformly sampling training
examples. Furthermore, the proposed method can be easily applied to existing
NMT models by simply modifying their input data pipelines. We show that our
framework can help improve the training time and the performance of both
recurrent neural network models and Transformers, achieving up to a 70%
decrease in training time, while at the same time obtaining accuracy
improvements of up to 2.2 BLEU.
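The sampling idea in this abstract (show the model only those examples whose estimated difficulty is within its current competence) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how difficulty or competence are computed, so using sentence length as the difficulty score, the square-root competence schedule, the initial competence `c0`, and all function names are assumptions made here.

```python
import math
import random

def difficulty_cdf(scores):
    """Map raw difficulty scores to (0, 1] by their rank (empirical CDF)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    cdf = [0.0] * n
    for rank, i in enumerate(order, start=1):
        cdf[i] = rank / n
    return cdf

def competence(step, total_steps, c0=0.1):
    """Assumed schedule: competence grows from c0 towards 1.0 along a square-root curve."""
    t = min(step / total_steps, 1.0)
    return min(1.0, math.sqrt(t * (1 - c0 ** 2) + c0 ** 2))

def sample_batch(examples, cdf, step, total_steps, batch_size, rng):
    """Sample uniformly among examples whose difficulty does not exceed the competence."""
    c = competence(step, total_steps)
    eligible = [ex for ex, d in zip(examples, cdf) if d <= c]
    if not eligible:
        # Fall back to the single easiest example so a batch can always be formed.
        eligible = [examples[cdf.index(min(cdf))]]
    return [rng.choice(eligible) for _ in range(batch_size)]

# Hypothetical data: sentence length (in tokens) stands in for difficulty.
sentences = ["w " * k for k in range(1, 11)]
cdf = difficulty_cdf([len(s.split()) for s in sentences])
batch = sample_batch(sentences, cdf, step=25, total_steps=100,
                     batch_size=4, rng=random.Random(0))
```

Because the filter sits entirely in the data pipeline, the model and optimizer are untouched, which matches the abstract's remark that the method only requires modifying the input pipeline.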
Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
Intrinsically motivated spontaneous exploration is a key enabler of
autonomous lifelong learning in human children. It enables the discovery and
acquisition of large repertoires of skills through self-generation,
self-selection, self-ordering and self-experimentation of learning goals. We
present an algorithmic approach called Intrinsically Motivated Goal Exploration
Processes (IMGEP) to enable similar properties of autonomous or self-supervised
learning in machines. The IMGEP algorithmic architecture relies on several
principles: 1) self-generation of goals, generalized as fitness functions; 2)
selection of goals based on intrinsic rewards; 3) exploration with incremental
goal-parameterized policy search and exploitation of the gathered data with a
batch learning algorithm; 4) systematic reuse of information acquired when
targeting a goal for improving towards other goals. We present a particularly
efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a
population-based policy and an object-centered modularity in goals and
mutations. We provide several implementations of this architecture and
demonstrate their ability to automatically generate a learning curriculum
within several experimental setups including a real humanoid robot that can
explore multiple spaces of goals with several hundred continuous dimensions.
While no particular target goal is provided to the system, this curriculum
allows the discovery of skills that act as stepping stones for learning more
complex skills, e.g. nested tool use. We show that learning diverse spaces of
goals with intrinsic motivations is more efficient for learning complex skills
than only trying to directly learn these complex skills.
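A heavily simplified goal-exploration loop in the spirit of principles 1, 3 and 4 above might look like the sketch below. It is not the IMGEP architecture from the paper: goals are sampled uniformly at random rather than selected by intrinsic reward (principle 2 is omitted), there is no modularity or batch learning, and the toy `environment`, the goal ranges, and all names are hypothetical choices made for illustration.

```python
import random

def environment(theta):
    """Hypothetical toy environment: maps 2-D policy parameters to a 2-D outcome."""
    a, b = theta
    return (a + b, a * b)

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def explore(n_iters=200, n_bootstrap=10, sigma=0.1, seed=0):
    """Random goal exploration with nearest-neighbour reuse of past experience."""
    rng = random.Random(seed)
    memory = []  # list of (parameters, outcome) pairs gathered so far
    for i in range(n_iters):
        if i < n_bootstrap:
            # Bootstrap phase: try random parameters to seed the memory.
            theta = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        else:
            # Self-generate a goal in outcome space (assumed ranges).
            goal = (rng.uniform(-2, 2), rng.uniform(-1, 1))
            # Reuse the past attempt whose outcome came closest to this goal,
            # even if it was originally produced while targeting another goal.
            nearest_theta, _ = min(memory, key=lambda m: sq_dist(m[1], goal))
            # Incremental search: perturb the retrieved parameters slightly.
            theta = tuple(x + rng.gauss(0, sigma) for x in nearest_theta)
        memory.append((theta, environment(theta)))
    return memory

memory = explore(n_iters=50, seed=1)
```

Every outcome is stored regardless of which goal it was aimed at, so later goals can build on earlier attempts; that reuse is the part of the loop corresponding to principle 4.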
Competence-based Multimodal Curriculum Learning for Medical Report Generation
The medical report generation task, which aims to produce long and coherent
descriptions of medical images, has attracted growing research interest
recently. Unlike general image captioning tasks, medical report
generation is more challenging for data-driven neural models. This is mainly
due to 1) the serious data bias and 2) the limited medical data. To alleviate
the data bias and make the best use of available data, we propose a
Competence-based Multimodal Curriculum Learning framework (CMCL). Specifically,
CMCL simulates the learning process of radiologists and optimizes the model in
a step-by-step manner. First, CMCL estimates the difficulty of each training
instance and evaluates the competence of the current model; second, CMCL selects
the most suitable batch of training instances given the current model
competence. By iterating the above two steps, CMCL can gradually improve the
model's performance. The experiments on the public IU-Xray and MIMIC-CXR
datasets show that CMCL can be incorporated into existing models to improve
their performance. Comment: Accepted by ACL 2021 (Oral)
Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach
A curriculum is a planned sequence of learning materials and an effective one
can make learning efficient and effective for both humans and machines. Recent
studies developed effective data-driven curriculum learning approaches for
training graph neural networks in language applications. However, existing
curriculum learning approaches often employ a single criterion of difficulty in
their training paradigms. In this paper, we propose a new perspective on
curriculum learning by introducing a novel approach that builds on graph
complexity formalisms (as difficulty criteria) and model competence during
training. The model consists of a scheduling scheme which derives effective
curricula by accounting for different views of sample difficulty and model
competence during training. The proposed solution advances existing research in
curriculum learning for graph neural networks with the ability to incorporate a
fine-grained spectrum of graph difficulty criteria in their training paradigms.
Experimental results on real-world link prediction and node classification
tasks illustrate the effectiveness of the proposed approach. Comment: ACL 202