Differentiable Time-Frequency Scattering on GPU
Joint time-frequency scattering (JTFS) is a convolutional operator in the
time-frequency domain which extracts spectrotemporal modulations at various
rates and scales. It offers an idealized model of spectrotemporal receptive
fields (STRF) in the primary auditory cortex, and thus may serve as a
biologically plausible surrogate for human perceptual judgments at the scale of
isolated audio events. Yet, prior implementations of JTFS and STRF have
remained outside of the standard toolkit of perceptual similarity measures and
evaluation methods for audio generation. We trace this issue down to three
limitations: differentiability, speed, and flexibility. In this paper, we
present an implementation of time-frequency scattering in Python. Unlike prior
implementations, ours accommodates NumPy, PyTorch, and TensorFlow as backends
and is thus portable on both CPU and GPU. We demonstrate the usefulness of JTFS
via three applications: unsupervised manifold learning of spectrotemporal
modulations, supervised classification of musical instruments, and texture
resynthesis of bioacoustic sounds.
Comment: 8 pages, 6 figures. Submitted to the International Conference on
Digital Audio Effects (DAFX) 202
Gabor frames and deep scattering networks in audio processing
This paper introduces Gabor scattering, a feature extractor based on Gabor
frames and Mallat's scattering transform. By using a simple signal model for
audio signals, specific properties of Gabor scattering are studied. It is shown
that for each layer, specific invariances to certain signal characteristics
occur. Furthermore, deformation stability of the coefficient vector generated
by the feature extractor is derived by using a decoupling technique which
exploits the contractivity of general scattering networks. Deformations are
introduced as changes in spectral shape and frequency modulation. The
theoretical results are illustrated by numerical examples and experiments.
Evaluation on a synthetic and a "real" data set gives numerical evidence that
the invariances encoded by the Gabor scattering transform lead to higher
performance than using the Gabor transform alone, especially when few
training samples are available.
Comment: 26 pages, 8 figures, 4 tables. Repository for reproducibility:
https://gitlab.com/hararticles/gs-gt . Keywords: machine learning; scattering
transform; Gabor transform; deep learning; time-frequency analysis; CNN.
Accepted and published after peer revisio
Geometric Wavelet Scattering Networks on Compact Riemannian Manifolds
The Euclidean scattering transform was introduced nearly a decade ago to
improve the mathematical understanding of convolutional neural networks.
Inspired by recent interest in geometric deep learning, which aims to
generalize convolutional neural networks to manifold and graph-structured
domains, we define a geometric scattering transform on manifolds. Similar to
the Euclidean scattering transform, the geometric scattering transform is based
on a cascade of wavelet filters and pointwise nonlinearities. It is invariant
to local isometries and stable to certain types of diffeomorphisms. Empirical
results demonstrate its utility on several geometric learning tasks. Our
results generalize the deformation stability and local translation invariance
of Euclidean scattering, and demonstrate the importance of linking the filter
structures used to the underlying geometry of the data.
Comment: 35 pages; 3 figures; 2 tables; v3: Revisions based on reviewer
comment
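The "cascade of wavelet filters and pointwise nonlinearities" mentioned in the abstract above can be illustrated for the Euclidean 1-D case. The following is a minimal sketch, not the construction of any of the listed papers and not the manifold version: the filter design (Gaussian band-pass filters at octave-spaced centre frequencies), the function names `gabor_bank`, `lowpass`, and `scattering`, and all parameter values are assumptions chosen for illustration. Convolutions are circular and computed with the FFT.

```python
import numpy as np

def gabor_bank(n, num_filters=6):
    """Hypothetical filter bank: Gaussian band-pass filters in the Fourier domain."""
    freqs = np.fft.fftfreq(n)              # cycles per sample, in [-0.5, 0.5)
    bank = []
    for j in range(num_filters):
        xi = 0.25 / 2 ** j                 # centre frequency halves at each scale
        sigma = xi / 2                     # bandwidth proportional to centre frequency
        bank.append(np.exp(-0.5 * ((freqs - xi) / sigma) ** 2))
    return np.array(bank)                  # shape (num_filters, n)

def lowpass(n, sigma=0.005):
    """Gaussian low-pass filter in the Fourier domain (assumed averaging window)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / sigma) ** 2)

def scattering(x, num_filters=6):
    """Zeroth-, first- and second-order scattering coefficients of a 1-D signal."""
    n = len(x)
    psi = gabor_bank(n, num_filters)
    phi = lowpass(n)

    def modulus_layer(u):
        # Band-pass filtering followed by the pointwise modulus nonlinearity.
        return np.abs(np.fft.ifft(np.fft.fft(u)[None, :] * psi, axis=-1))

    def average(u):
        # Low-pass averaging, which provides local translation invariance.
        return np.real(np.fft.ifft(np.fft.fft(u, axis=-1) * phi, axis=-1))

    s0 = average(x)
    u1 = modulus_layer(x)                  # shape (num_filters, n)
    s1 = average(u1)
    s2 = []
    for j1 in range(num_filters - 1):
        u2 = modulus_layer(u1[j1])
        # Keep only second-layer filters at lower frequencies than the first.
        s2.append(average(u2[j1 + 1:]))
    return s0, s1, np.concatenate(s2, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
s0, s1, s2 = scattering(x)
print(s0.shape, s1.shape, s2.shape)
```

Because every step is either a circular convolution or a pointwise operation, circularly shifting the input shifts the coefficients by the same amount; the final low-pass averaging is what makes them vary slowly under small shifts.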