23,758 research outputs found
On-the-Fly Adaptation of Source Code Models using Meta-Learning
The ability to adapt to unseen, local contexts is an important challenge that
successful models of source code must overcome. One of the most popular
approaches for the adaptation of such models is dynamic evaluation. With
dynamic evaluation, when running a model on an unseen file, the model is
updated immediately after having observed each token in that file. In this
work, we propose instead to frame the problem of context adaptation as a
meta-learning problem. We aim to train a base source code model that is best
able to learn from information in a file to deliver improved predictions of
missing tokens. Unlike dynamic evaluation, this formulation allows us to select
more targeted information (support tokens) for adaptation, that is both before
and after a target hole in a file. We consider an evaluation setting that we
call line-level maintenance, designed to reflect the downstream task of code
auto-completion in an IDE. Leveraging recent developments in meta-learning such
as first-order MAML and Reptile, we demonstrate improved performance in
experiments on a large scale Java GitHub corpus, compared to other adaptation
baselines including dynamic evaluation. Moreover, our analysis shows that,
compared to a non-adaptive baseline, our approach improves performance on
identifiers and literals by 44\% and 15\%, respectively.Comment: This paper has been withdrawn because we found a bug in the FOMAML
implementation that invalidates some of the key claims in the pape
DoShiCo Challenge: Domain Shift in Control Prediction
Training deep neural network policies end-to-end for real-world applications
so far requires big demonstration datasets in the real world or big sets
consisting of a large variety of realistic and closely related 3D CAD models.
These real or virtual data should, moreover, have very similar characteristics
to the conditions expected at test time. These stringent requirements and the
time consuming data collection processes that they entail, are currently the
most important impediment that keeps deep reinforcement learning from being
deployed in real-world applications. Therefore, in this work we advocate an
alternative approach, where instead of avoiding any domain shift by carefully
selecting the training data, the goal is to learn a policy that can cope with
it. To this end, we propose the DoShiCo challenge: to train a model in very
basic synthetic environments, far from realistic, in a way that it can be
applied in more realistic environments as well as take the control decisions on
real-world data. In particular, we focus on the task of collision avoidance for
drones. We created a set of simulated environments that can be used as
benchmark and implemented a baseline method, exploiting depth prediction as an
auxiliary task to help overcome the domain shift. Even though the policy is
trained in very basic environments, it can learn to fly without collisions in a
very different realistic simulated environment. Of course several benchmarks
for reinforcement learning already exist - but they never include a large
domain shift. On the other hand, several benchmarks in computer vision focus on
the domain shift, but they take the form of a static datasets instead of
simulated environments. In this work we claim that it is crucial to take the
two challenges together in one benchmark.Comment: Published at SIMPAR 2018. Please visit the paper webpage for more
information, a movie and code for reproducing results:
https://kkelchte.github.io/doshic
Obvious: a meta-toolkit to encapsulate information visualization toolkits. One toolkit to bind them all
This article describes “Obvious”: a meta-toolkit that abstracts and encapsulates information visualization toolkits implemented in the Java language. It intends to unify their use and postpone the choice of which concrete toolkit(s) to use later-on in the development of visual analytics applications. We also report on the lessons we have learned when wrapping popular toolkits with Obvious, namely Prefuse, the InfoVis Toolkit, partly Improvise, JUNG and other data management libraries. We show several examples on the uses of Obvious, how the different toolkits can be combined, for instance sharing their data models. We also show how Weka and RapidMiner, two popular machine-learning toolkits, have been wrapped with Obvious and can be used directly with all the other wrapped toolkits. We expect Obvious to start a co-evolution process: Obvious is meant to evolve when more components of Information Visualization systems will become consensual. It is also designed to help information visualization systems adhere to the best practices to provide a higher level of interoperability and leverage the domain of visual analytics
Generalizing Supervised Deep Learning MRI Reconstruction to Multiple and Unseen Contrasts using Meta-Learning Hypernetworks
Meta-learning has recently been an emerging data-efficient learning technique
for various medical imaging operations and has helped advance contemporary deep
learning models. Furthermore, meta-learning enhances the knowledge
generalization of the imaging tasks by learning both shared and discriminative
weights for various configurations of imaging tasks. However, existing
meta-learning models attempt to learn a single set of weight initializations of
a neural network that might be restrictive for multimodal data. This work aims
to develop a multimodal meta-learning model for image reconstruction, which
augments meta-learning with evolutionary capabilities to encompass diverse
acquisition settings of multimodal data. Our proposed model called KM-MAML
(Kernel Modulation-based Multimodal Meta-Learning), has hypernetworks that
evolve to generate mode-specific weights. These weights provide the
mode-specific inductive bias for multiple modes by re-calibrating each kernel
of the base network for image reconstruction via a low-rank kernel modulation
operation. We incorporate gradient-based meta-learning (GBML) in the contextual
space to update the weights of the hypernetworks for different modes. The
hypernetworks and the reconstruction network in the GBML setting provide
discriminative mode-specific features and low-level image features,
respectively. Experiments on multi-contrast MRI reconstruction show that our
model, (i) exhibits superior reconstruction performance over joint training,
other meta-learning methods, and context-specific MRI reconstruction methods,
and (ii) better adaptation capabilities with improvement margins of 0.5 dB in
PSNR and 0.01 in SSIM. Besides, a representation analysis with U-Net shows that
kernel modulation infuses 80% of mode-specific representation changes in the
high-resolution layers. Our source code is available at
https://github.com/sriprabhar/KM-MAML/.Comment: Accepted for publication in Elsevier Applied Soft Computing Journal,
36 pages, 18 figure
- …