Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning
Deep neural networks require a large amount of labeled training data during
supervised learning. However, collecting and labeling so much data might be
infeasible in many cases. In this paper, we introduce a source-target selective
joint fine-tuning scheme for improving the performance of deep learning tasks
with insufficient training data. In this scheme, a target learning task with
insufficient training data is carried out simultaneously with another source
learning task with abundant training data. However, the source learning task
does not use all existing training data. Our core idea is to identify and use a
subset of training images from the original source learning task whose
low-level characteristics are similar to those from the target learning task,
and jointly fine-tune shared convolutional layers for both tasks. Specifically,
we compute descriptors from linear or nonlinear filter bank responses on
training images from both tasks, and use such descriptors to search for a
desired subset of training samples for the source learning task.
Experiments demonstrate that our selective joint fine-tuning scheme achieves
state-of-the-art performance on multiple visual classification tasks with
insufficient training data for deep learning. Such tasks include Caltech 256,
MIT Indoor 67, Oxford Flowers 102 and Stanford Dogs 120. In comparison to
fine-tuning without a source domain, the proposed method can improve the
classification accuracy by 2%-10% using a single model.
Comment: To appear in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).
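The sample-selection step lends itself to a short sketch. The following is a minimal illustration, assuming a simple oriented filter bank, histogram descriptors, and a k-nearest-neighbor search; the function names, filter design, and histogram parameters are illustrative choices, not the paper's released implementation.

```python
# Minimal sketch of the source-sample selection step: describe each image by
# histograms of linear filter-bank responses, then keep the source images
# whose descriptors fall nearest to any target image. All names and
# parameters here are illustrative, not from the paper's code.
import numpy as np
from scipy.signal import convolve2d
from sklearn.neighbors import NearestNeighbors

def filter_bank(num_orientations=8, size=9):
    """Simple oriented cosine filters standing in for the paper's filter bank."""
    filters = []
    for i in range(num_orientations):
        theta = np.pi * i / num_orientations
        y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
        filters.append(np.cos(x * np.cos(theta) + y * np.sin(theta)))
    return filters

def descriptor(image, filters, bins=16):
    """Concatenate per-filter response histograms into one descriptor."""
    parts = []
    for f in filters:
        resp = convolve2d(image, f, mode="same") / f.size  # assumes image in [0, 1]
        hist, _ = np.histogram(resp, bins=bins, range=(-1.0, 1.0), density=True)
        parts.append(hist)
    return np.concatenate(parts)

def select_source_subset(source_imgs, target_imgs, k=50):
    """Return indices of source images near any target image in descriptor space."""
    filters = filter_bank()
    src = np.stack([descriptor(im, filters) for im in source_imgs])
    tgt = np.stack([descriptor(im, filters) for im in target_imgs])
    nn = NearestNeighbors(n_neighbors=min(k, len(src))).fit(src)
    _, idx = nn.kneighbors(tgt)    # k nearest source images per target image
    return np.unique(idx.ravel())  # union over all target images
```

The shared convolutional layers would then be fine-tuned jointly on the selected source subset and the full target set, with each task keeping its own classifier head.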
Knowledge Transfer with Jacobian Matching
Classical distillation methods transfer representations from a "teacher"
neural network to a "student" network by matching their output activations.
Recent methods also match the Jacobians, i.e., the gradients of the output activations with respect to the input. However, this involves making some ad hoc decisions, in particular the choice of the loss function.
In this paper, we first establish an equivalence between Jacobian matching
and distillation with input noise, from which we derive appropriate loss
functions for Jacobian matching. We then rely on this analysis to apply
Jacobian matching to transfer learning by establishing the equivalence of a recent transfer learning procedure to distillation.
We then show experimentally on standard image datasets that Jacobian-based
penalties improve distillation, robustness to noisy inputs, and transfer
learning.
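The noise-equivalence argument suggests a concrete squared-error penalty on input gradients: distilling with small input noise is, in expectation, ordinary distillation plus a squared-error match between the teacher's and student's input Jacobians. Below is a minimal PyTorch sketch of such a penalty; matching only the gradient of each network's largest logit, and the per-example normalization, are common simplifications assumed here rather than taken from the paper.

```python
# Minimal sketch of a Jacobian-matching penalty, assuming classifiers
# that map a batch of inputs to a (batch, classes) tensor of logits.
import torch
import torch.nn.functional as F

def jacobian_penalty(student, teacher, x):
    """Squared error between d(max logit)/dx for student and teacher."""
    x = x.clone().requires_grad_(True)

    s_out = student(x)
    t_out = teacher(x)

    # Gradient of each network's largest logit with respect to the input.
    # create_graph=True keeps the student gradient differentiable so the
    # penalty itself can be backpropagated into the student's weights.
    s_grad = torch.autograd.grad(s_out.max(dim=1).values.sum(), x,
                                 create_graph=True)[0]
    t_grad = torch.autograd.grad(t_out.max(dim=1).values.sum(), x)[0]

    # Normalize per example so the penalty compares Jacobian directions.
    s_grad = F.normalize(s_grad.flatten(1), dim=1)
    t_grad = F.normalize(t_grad.flatten(1), dim=1)
    return ((s_grad - t_grad.detach()) ** 2).sum(dim=1).mean()
```

In training, this term would simply be added to the usual distillation objective with a small weight, e.g. loss = distill_loss + lam * jacobian_penalty(student, teacher, x).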
The SAMPLE Experiment and Weak Nucleon Structure
One of the key elements to understanding the structure of the nucleon is the
role of its quark-antiquark sea in its ground state properties such as charge,
mass, magnetism and spin. In the last decade, parity-violating electron
scattering has emerged as an important tool in this area, because of its
ability to isolate the contribution of strange quark-antiquark pairs to the
nucleon's charge and magnetism. The SAMPLE experiment at the MIT-Bates
Laboratory, which has been focused on s-sbar contributions to the proton's
magnetic moment, was the first of such experiments and its program has recently
been completed. In this paper we give an overview of some of the experimental
aspects of parity-violating electron scattering, briefly review the theoretical
predictions for strange quark form factors, summarize the SAMPLE measurements,
and place them in context with the program of experiments being carried out at
other electron scattering facilities such as Jefferson Laboratory and the Mainz
Microtron.
Comment: 61 pages, review article.
Measuring information-transfer delays
In complex networks such as gene networks, traffic systems, or brain circuits, it is important to understand how long it takes for the different parts of the network to effectively influence one another. In the brain, for example, axonal delays between brain areas can amount to several tens of milliseconds, adding an intrinsic component to any timing-based processing of information. Inferring neural interaction delays is thus needed to interpret the information transfer revealed by any analysis of directed interactions across brain structures. However, a robust estimation of interaction delays from neural activity faces several challenges if modeling assumptions on interaction mechanisms are wrong or cannot be made. Here, we propose a robust estimator for neuronal interaction delays rooted in an information-theoretic framework, which allows a model-free exploration of interactions. In particular, we extend transfer entropy to account for delayed source-target interactions, while crucially retaining the conditioning on the embedded target state at the immediately previous time step. We prove that this particular extension is indeed guaranteed to identify interaction delays between two coupled systems and is the only relevant option in keeping with Wiener’s principle of causality. We demonstrate the performance of our approach in detecting interaction delays on finite data by numerical simulations of stochastic and deterministic processes, as well as on local field potential recordings. We also show the ability of the extended transfer entropy to detect the presence of multiple delays, as well as feedback loops. While evaluated on neuroscience data, we expect the estimator to be useful in other fields dealing with network dynamics.
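In equation form, the extension scans a candidate delay u into the source's past while keeping the target's own embedded past in the conditioning set, and the delay estimate is the maximizing u. The embedding notation below follows standard conventions and is an assumption of this sketch, not a quotation of the paper's exact symbols.

```latex
% Delayed transfer entropy from source X to target Y at candidate delay u.
% Bold symbols denote delay-embedded state vectors; conditioning on the
% target's embedded past at t-1 is the feature the abstract emphasizes.
\[
  TE_{X \to Y}(u) \;=\; I\!\left( Y_t \,;\, \mathbf{X}_{t-u} \,\middle|\, \mathbf{Y}_{t-1} \right),
  \qquad
  \hat{\delta}_{X \to Y} \;=\; \operatorname*{arg\,max}_{u}\; TE_{X \to Y}(u).
\]
```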
Self-Supervised Intrinsic Image Decomposition
Intrinsic decomposition from a single image is a highly challenging task, due
to its inherent ambiguity and the scarcity of training data. In contrast to
traditional fully supervised learning approaches, in this paper we propose
learning intrinsic image decomposition by explaining the input image. Our
model, the Rendered Intrinsics Network (RIN), joins together an image
decomposition pipeline, which predicts reflectance, shape, and lighting
conditions given a single image, with a recombination function, a learned
shading model used to recompose the original input based on the intrinsic image
predictions. Our network can then use unsupervised reconstruction error as an
additional signal to improve its intermediate representations. This allows
large-scale unlabeled data to be useful during training, and also enables
transferring learned knowledge to images of unseen object categories, lighting
conditions, and shapes. Extensive experiments demonstrate that our method
performs well on both intrinsic image decomposition and knowledge transfer.
Comment: NIPS 2017 camera-ready version, project page: http://rin.csail.mit.edu
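The self-supervised signal can be summarized in a few lines. The sketch below assumes PyTorch and user-supplied decomposer and shader sub-networks; the module names, the elementwise reflectance-times-shading recombination, and the loss weighting are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of the RIN-style training signal: a decomposer predicts
# reflectance, shape (normals), and lighting from one image; a learned
# shading network recombines them, and the reconstruction error against
# the input provides a loss even when no intrinsic ground truth exists.
import torch
import torch.nn as nn

class RINSketch(nn.Module):
    def __init__(self, decomposer: nn.Module, shader: nn.Module):
        super().__init__()
        self.decomposer = decomposer  # image -> (reflectance, normals, light)
        self.shader = shader          # (normals, light) -> shading map

    def forward(self, image):
        reflectance, normals, light = self.decomposer(image)
        shading = self.shader(normals, light)
        # Assumed recombination: image ~= reflectance * shading.
        return reflectance * shading, (reflectance, normals, light)

def rin_loss(model, image, targets=None, w_recon=1.0):
    """Reconstruction loss always applies; supervision only where labels exist."""
    recon, (reflectance, normals, _) = model(image)
    loss = w_recon * torch.mean((recon - image) ** 2)
    if targets is not None:  # labeled subset: add direct supervision
        loss = loss + torch.mean((reflectance - targets["reflectance"]) ** 2)
        loss = loss + torch.mean((normals - targets["normals"]) ** 2)
    return loss
```

Because the reconstruction term needs only the input image itself, unlabeled images can contribute gradients to all three intrinsic predictions through the learned shader.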