748 research outputs found
Geometric Wavelet Scattering Networks on Compact Riemannian Manifolds
The Euclidean scattering transform was introduced nearly a decade ago to
improve the mathematical understanding of convolutional neural networks.
Inspired by recent interest in geometric deep learning, which aims to
generalize convolutional neural networks to manifold and graph-structured
domains, we define a geometric scattering transform on manifolds. Similar to
the Euclidean scattering transform, the geometric scattering transform is based
on a cascade of wavelet filters and pointwise nonlinearities. It is invariant
to local isometries and stable to certain types of diffeomorphisms. Empirical
results demonstrate its utility on several geometric learning tasks. Our
results generalize the deformation stability and local translation invariance
of Euclidean scattering, and demonstrate the importance of linking the used
filter structures to the underlying geometry of the data.Comment: 35 pages; 3 figures; 2 tables; v3: Revisions based on reviewer
comment
Applications of sub-audible speech recognition based upon electromyographic signals
Method and system for generating electromyographic or sub-audible signals (''SAWPs'') and for transmitting and recognizing the SAWPs that represent the original words and/or phrases. The SAWPs may be generated in an environment that interferes excessively with normal speech or that requires stealth communications, and may be transmitted using encoded, enciphered or otherwise transformed signals that are less subject to signal distortion or degradation in the ambient environment
Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified.Comment: 15 pages, 2 pdf figure
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field
Ensemble of convolutional neural networks to improve animal audio classification
Abstract In this work, we present an ensemble for automated audio classification that fuses different types of features extracted from audio files. These features are evaluated, compared, and fused with the goal of producing better classification accuracy than other state-of-the-art approaches without ad hoc parameter optimization. We present an ensemble of classifiers that performs competitively on different types of animal audio datasets using the same set of classifiers and parameter settings. To produce this general-purpose ensemble, we ran a large number of experiments that fine-tuned pretrained convolutional neural networks (CNNs) for different audio classification tasks (bird, bat, and whale audio datasets). Six different CNNs were tested, compared, and combined. Moreover, a further CNN, trained from scratch, was tested and combined with the fine-tuned CNNs. To the best of our knowledge, this is the largest study on CNNs in animal audio classification. Our results show that several CNNs can be fine-tuned and fused for robust and generalizable audio classification. Finally, the ensemble of CNNs is combined with handcrafted texture descriptors obtained from spectrograms for further improvement of performance. The MATLAB code used in our experiments will be provided to other researchers for future comparisons at https://github.com/LorisNanni
- …