Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning
Deep neural networks require a large amount of labeled training data during
supervised learning. However, collecting and labeling so much data might be
infeasible in many cases. In this paper, we introduce a source-target selective
joint fine-tuning scheme for improving the performance of deep learning tasks
with insufficient training data. In this scheme, a target learning task with
insufficient training data is carried out simultaneously with another source
learning task with abundant training data. However, the source learning task
does not use all existing training data. Our core idea is to identify and use a
subset of training images from the original source learning task whose
low-level characteristics are similar to those from the target learning task,
and jointly fine-tune shared convolutional layers for both tasks. Specifically,
we compute descriptors from linear or nonlinear filter bank responses on
training images from both tasks, and use such descriptors to search for a
desired subset of training samples for the source learning task.
Experiments demonstrate that our selective joint fine-tuning scheme achieves
state-of-the-art performance on multiple visual classification tasks with
insufficient training data for deep learning. Such tasks include Caltech 256,
MIT Indoor 67, Oxford Flowers 102 and Stanford Dogs 120. In comparison to
fine-tuning without a source domain, the proposed method can improve the
classification accuracy by 2% - 10% using a single model.
Comment: To appear in the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).
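The selection-then-joint-training recipe lends itself to a compact sketch. The PyTorch code below is a hedged illustration under simplifying assumptions, not the authors' implementation: the fixed filter bank, the per-filter response histograms used as descriptors, the nearest-neighbour selection rule, and the tiny shared trunk with two heads are all placeholders standing in for the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def filterbank_descriptor(images, filters, bins=32):
    """Histogram descriptor of filter-bank responses (illustrative choice).

    images: (N, C, H, W) float tensor; filters: (K, C, h, w) fixed bank
    (e.g. Gabor filters). Returns an (N, K * bins) L2-normalised matrix.
    """
    resp = F.conv2d(images, filters)                      # (N, K, H', W')
    descs = []
    for r in resp:
        hists = [torch.histc(r[k], bins=bins) for k in range(r.shape[0])]
        descs.append(torch.cat(hists))
    return F.normalize(torch.stack(descs), dim=1)

def select_source_subset(src_desc, tgt_desc, per_target=50):
    """For every target image, keep its nearest source images by descriptor."""
    sims = tgt_desc @ src_desc.t()                        # cosine similarities
    idx = sims.topk(per_target, dim=1).indices.flatten()
    return torch.unique(idx)                              # indices into source set

class JointModel(nn.Module):
    """Shared convolutional trunk with one classification head per task."""
    def __init__(self, n_src_classes, n_tgt_classes):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.src_head = nn.Linear(32, n_src_classes)
        self.tgt_head = nn.Linear(32, n_tgt_classes)

    def forward(self, x, task):
        feats = self.trunk(x)                             # shared layers
        return self.src_head(feats) if task == "source" else self.tgt_head(feats)

# Descriptor-based selection on toy images with a random stand-in filter bank.
bank = torch.randn(8, 3, 7, 7)
src_imgs, tgt_imgs = torch.rand(100, 3, 64, 64), torch.rand(20, 3, 64, 64)
keep = select_source_subset(filterbank_descriptor(src_imgs, bank),
                            filterbank_descriptor(tgt_imgs, bank), per_target=5)

# One joint fine-tuning step on toy data: both losses update the shared trunk,
# so the data-rich source task regularises the data-poor target task.
model = JointModel(n_src_classes=1000, n_tgt_classes=102)
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
src_x, src_y = torch.randn(8, 3, 64, 64), torch.randint(0, 1000, (8,))
tgt_x, tgt_y = torch.randn(8, 3, 64, 64), torch.randint(0, 102, (8,))
loss = (F.cross_entropy(model(src_x, "source"), src_y)
        + F.cross_entropy(model(tgt_x, "target"), tgt_y))
opt.zero_grad(); loss.backward(); opt.step()
```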
Gabor frames and deep scattering networks in audio processing
This paper introduces Gabor scattering, a feature extractor based on Gabor
frames and Mallat's scattering transform. By using a simple signal model for
audio signals, specific properties of Gabor scattering are studied. It is shown
that for each layer, specific invariances to certain signal characteristics
occur. Furthermore, deformation stability of the coefficient vector generated
by the feature extractor is derived by using a decoupling technique which
exploits the contractivity of general scattering networks. Deformations are
introduced as changes in spectral shape and frequency modulation. The
theoretical results are illustrated by numerical examples and experiments.
Numerical evidence from evaluation on a synthetic and a "real" data set shows
that the invariances encoded by the Gabor scattering transform lead to higher
performance in comparison with using the Gabor transform alone, especially when
few training samples are available.
Comment: 26 pages, 8 figures, 4 tables. Repository for reproducibility:
https://gitlab.com/hararticles/gs-gt . Keywords: machine learning; scattering
transform; Gabor transform; deep learning; time-frequency analysis; CNN.
Accepted and published after peer revision.
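As a rough illustration of the construction, the sketch below builds a two-layer, scattering-style feature extractor from plain STFTs (a Gabor transform with a Hann window): layer one takes the magnitude of a Gabor transform, layer two applies a second, coarser transform along time within each layer-one channel, and both layers are time-averaged to obtain the (approximately shift-invariant) coefficients. Window sizes and the averaging step are illustrative assumptions, not the parameters used in the paper.

```python
import numpy as np
from scipy.signal import stft

def gabor_scattering(x, fs=16000, win1=512, win2=16):
    # Layer 1: magnitude of a Gabor (STFT) transform of the waveform.
    _, _, Z1 = stft(x, fs=fs, nperseg=win1)
    S1 = np.abs(Z1)                              # (freq1, frames1)
    # Layer 2: a second, coarser transform along time in each layer-1 channel.
    frame_rate = fs / (win1 // 2)                # hop of the first transform
    _, _, Z2 = stft(S1, fs=frame_rate, nperseg=win2, axis=-1)
    S2 = np.abs(Z2)                              # (freq1, freq2, frames2)
    # Time averaging (output low-pass) yields the invariant coefficients.
    return np.concatenate([S1.mean(axis=-1), S2.mean(axis=-1).reshape(-1)])

features = gabor_scattering(np.random.randn(16000))   # 1 s of toy audio
```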
Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?
When convolutional neural networks are used to tackle learning problems based
on music or, more generally, time series data, raw one-dimensional data are
commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients,
which are then used as input to the actual neural network. In this
contribution, we investigate, both theoretically and experimentally, the
influence of this pre-processing step on the network's performance, and pose
the question of whether replacing it with adaptive or learned filters applied
directly to the raw data can improve learning success. The theoretical results show
that approximately reproducing mel-spectrogram coefficients by applying
adaptive filters and subsequent time-averaging is in principle possible. We
also conducted extensive experimental work on the task of singing voice
detection in music. The results of these experiments show that for
classification based on Convolutional Neural Networks, the features obtained
from adaptive filter banks followed by time-averaging perform better than the
canonical Fourier-transform-based mel-spectrogram coefficients. Alternative
adaptive approaches with center frequencies or time-averaging lengths learned
from training data perform equally well.
Comment: Completely revised version; 21 pages, 4 figures.
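A minimal sketch of the adaptive front-end idea, assuming a learnable 1-D convolutional filter bank applied to the raw waveform with average pooling as the time-averaging step; the filter count, filter length, hop and pooling width are illustrative choices, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class AdaptiveFrontEnd(nn.Module):
    def __init__(self, n_filters=80, filter_len=512, hop=256, avg_len=8):
        super().__init__()
        # Learned filter bank applied directly to the raw waveform.
        self.filters = nn.Conv1d(1, n_filters, filter_len, stride=hop, bias=False)
        # Time averaging over avg_len consecutive frames.
        self.avg = nn.AvgPool1d(avg_len)

    def forward(self, wav):                      # wav: (batch, 1, samples)
        resp = torch.abs(self.filters(wav))      # magnitude of filter responses
        return torch.log1p(self.avg(resp))       # mel-spectrogram-like map

frontend = AdaptiveFrontEnd()
spec = frontend(torch.randn(4, 1, 16000))        # (4, 80, frames), CNN input
```

The resulting map plays the same role as mel-spectrogram coefficients at the input of the classification CNN, but the filter shapes (and, in the alternative variants, the center frequencies or averaging lengths) are learned from training data rather than fixed by design.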
Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks
Deep Convolutional Networks (DCNs) have been shown to be vulnerable to
adversarial examples: perturbed inputs specifically designed to produce
intentional errors in the learning algorithms at test time. Existing
input-agnostic adversarial perturbations exhibit interesting visual patterns
that are currently unexplained. In this paper, we introduce a structured
approach for generating Universal Adversarial Perturbations (UAPs) with
procedural noise functions. Our approach unveils the systemic vulnerability of
popular DCN models like Inception v3 and YOLO v3, with single noise patterns
able to fool a model on up to 90% of the dataset. Procedural noise allows us to
generate a distribution of UAPs with high universal evasion rates using only a
few parameters. Additionally, we propose Bayesian optimization to efficiently
learn procedural noise parameters to construct inexpensive untargeted black-box
attacks. We demonstrate that it can achieve an average of less than 10 queries
per successful attack, a 100-fold improvement on existing methods. We further
motivate the use of input-agnostic defences to increase the stability of models
to adversarial perturbations. The universality of our attacks suggests that DCN
models may be sensitive to aggregations of low-level class-agnostic features.
These findings give insight on the nature of some universal adversarial
perturbations and how they could be generated in other applications.
Comment: 16 pages, 10 figures. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS '19).
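The attack loop can be sketched as: draw a procedural noise pattern from a handful of parameters, scale it to an L-infinity budget, and measure the fraction of inputs whose prediction flips under that single perturbation (the universal evasion rate). In the hedged sketch below, an oriented sine grating stands in for the paper's Perlin/Gabor noise functions, plain random search stands in for the Bayesian optimization the authors use, and model_predict, images, and labels are hypothetical placeholders for a black-box classifier and its evaluation set.

```python
import numpy as np

def procedural_noise(freq, angle, phase, size=224):
    """Oriented sine grating in [-1, 1], controlled by three parameters."""
    ys, xs = np.mgrid[0:size, 0:size] / size
    proj = xs * np.cos(angle) + ys * np.sin(angle)
    return np.sin(2 * np.pi * freq * proj + phase)

def evasion_rate(model_predict, images, labels, noise, eps=8 / 255):
    """Fraction of (N, H, W) images in [0, 1] whose label flips under the UAP."""
    perturbed = np.clip(images + eps * noise, 0.0, 1.0)
    return float(np.mean(model_predict(perturbed) != labels))

def search_uap(model_predict, images, labels, n_trials=50, seed=0):
    """Random search over noise parameters (the paper optimizes them with
    Bayesian optimization, which needs far fewer queries)."""
    rng = np.random.default_rng(seed)
    best_rate, best_params = 0.0, None
    for _ in range(n_trials):
        params = (rng.uniform(1, 30),            # spatial frequency
                  rng.uniform(0, np.pi),         # orientation
                  rng.uniform(0, 2 * np.pi))     # phase
        rate = evasion_rate(model_predict, images, labels,
                            procedural_noise(*params))
        if rate > best_rate:
            best_rate, best_params = rate, params
    return best_rate, best_params
```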