New Trends in Biologically-Inspired Audio Coding
This book chapter deals with the generation of auditory-inspired spectro-temporal features aimed at audio coding. To do so, we first generate sparse audio representations, which we call spikegrams, using projections on gammatone or gammachirp kernels that generate neural spikes. Unlike Fourier-based representations, these representations are powerful at identifying auditory events, such as onsets, offsets, transients and harmonic structures. We show that adaptively selecting the gammachirp kernels improves the compression rate compared with non-adaptive kernels. We also integrate a masking model that helps reduce bitrate without perceptible loss of audio quality. We then quantize coding values using a genetic algorithm, which performs better than uniform quantization in this framework. We finally propose a method to extract frequent auditory objects (patterns) in the aforementioned sparse representations. The extracted frequency-domain patterns (auditory objects) help us address spikes (auditory events) collectively rather than individually. When audio compression is needed, the different patterns are stored in a small codebook that can be used to efficiently encode audio materials in a lossless way. The approach is applied to different audio signals and results are discussed and compared. This work is a first step towards the design of a high-quality auditory-inspired "object-based" audio coder.
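As a rough illustration of how projections on gammatone kernels can yield a sparse set of spikes, here is a minimal sketch. It assumes greedy matching pursuit as the projection method and the Glasberg-Moore ERB rule for kernel bandwidth; the abstract specifies neither, and the names `gammatone` and `matching_pursuit` are hypothetical. Adaptive gammachirp selection, masking, and quantization are not modeled.

```python
import math

def gammatone(fc, fs, length, order=4):
    """Unit-norm gammatone kernel: t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # assumed bandwidth rule (Glasberg-Moore ERB)
    b = 1.019 * erb
    kernel = [
        (n / fs) ** (order - 1)
        * math.exp(-2.0 * math.pi * b * n / fs)
        * math.cos(2.0 * math.pi * fc * n / fs)
        for n in range(length)
    ]
    norm = math.sqrt(sum(v * v for v in kernel))
    return [v / norm for v in kernel]

def matching_pursuit(signal, kernels, n_spikes):
    """Greedily pick (shift, kernel index, amplitude) triples, i.e. spikes."""
    residual = list(signal)
    spikes = []
    for _ in range(n_spikes):
        best_amp, best_k, best_shift = 0.0, 0, 0
        for k, kern in enumerate(kernels):
            for shift in range(len(residual) - len(kern) + 1):
                # Projection of the residual onto the shifted unit-norm kernel.
                amp = sum(residual[shift + n] * kern[n] for n in range(len(kern)))
                if abs(amp) > abs(best_amp):
                    best_amp, best_k, best_shift = amp, k, shift
        # Remove the selected component from the residual.
        for n, v in enumerate(kernels[best_k]):
            residual[best_shift + n] -= best_amp * v
        spikes.append((best_shift, best_k, best_amp))
    return spikes, residual

# Toy signal built from two known, non-overlapping kernels (illustrative values only).
fs = 8000
kernels = [gammatone(fc, fs, 64) for fc in (500.0, 1000.0, 2000.0)]
signal = [0.0] * 256
for n, v in enumerate(kernels[0]):
    signal[10 + n] += 2.0 * v
for n, v in enumerate(kernels[1]):
    signal[150 + n] -= 1.0 * v

spikes, residual = matching_pursuit(signal, kernels, n_spikes=2)
```

On this toy signal the two recovered spikes match the two components used to build it, so the residual energy is close to zero. Real audio would need many more kernels and spikes, and a faster correlation routine.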
Decoding the Encoding of Functional Brain Networks: an fMRI Classification Comparison of Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Sparse Coding Algorithms
Brain networks in fMRI are typically identified using spatial independent
component analysis (ICA), yet mathematical constraints such as sparse coding
and positivity both provide alternate biologically-plausible frameworks for
generating brain networks. Non-negative Matrix Factorization (NMF) would
suppress negative BOLD signal by enforcing positivity. Spatial sparse coding
algorithms (L1 Regularized Learning and K-SVD) would impose local
specialization and a discouragement of multitasking, where the total observed
activity in a single voxel originates from a restricted number of possible
brain networks.
The assumptions of independence, positivity, and sparsity to encode
task-related brain networks are compared; the resulting brain networks for
different constraints are used as basis functions to encode the observed
functional activity at a given time point. These encodings are decoded using
machine learning to compare both the algorithms and their assumptions, using
the time series weights to predict whether a subject is viewing a video,
listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects.
For classifying cognitive activity, the sparse coding algorithm of L1
Regularized Learning consistently outperformed 4 variations of ICA across
different numbers of networks and noise levels (p < 0.001). The NMF algorithms,
which suppressed negative BOLD signal, had the poorest accuracy. Within each
algorithm, encodings using sparser spatial networks (containing more
zero-valued voxels) had higher classification accuracy (p < 0.001). The success
of sparse coding algorithms may suggest that algorithms which enforce sparse
coding, discourage multitasking, and promote local specialization may better
capture the underlying source processes than those which allow inexhaustible
local processes, such as ICA.
Analysis, Visualization, and Transformation of Audio Signals Using Dictionary-based Methods
Parametric dictionary design for sparse coding
This paper introduces a new dictionary design method for sparse coding of a class of signals. It has been shown that one can sparsely approximate some natural signals using an overcomplete set of parametric functions, e.g. [1], [2]. A problem in using these parametric dictionaries is how to choose the parameters. In practice, these parameters have been chosen by an expert or through a set of experiments. In the sparse approximation context, it has been shown that an incoherent dictionary is appropriate for sparse approximation methods. In this paper we first characterize the dictionary design problem, subject to a constraint on the dictionary. Then we briefly explain that equiangular tight frames have minimum coherence. The complexity of the problem does not allow it to be solved exactly, so we introduce a practical method to solve it approximately. Some experiments show the advantages one gets by using these dictionaries.
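To make the notion of coherence concrete, the sketch below computes the mutual coherence of a small dictionary (the largest absolute inner product between distinct unit-norm atoms) and compares it with the Welch lower bound, which equiangular tight frames attain. The function names and the example frame are illustrative choices, not taken from the paper, and the paper's own design method is not reproduced here.

```python
import math

def normalize(atom):
    """Scale an atom to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in atom))
    return [x / norm for x in atom]

def mutual_coherence(atoms):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    unit = [normalize(a) for a in atoms]
    worst = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            inner = abs(sum(x * y for x, y in zip(unit[i], unit[j])))
            worst = max(worst, inner)
    return worst

def welch_bound(dim, n_atoms):
    """Lower bound on the coherence of n_atoms unit vectors in dimension dim."""
    return math.sqrt((n_atoms - dim) / (dim * (n_atoms - 1)))

# Three unit vectors in the plane, 120 degrees apart: an equiangular tight frame.
mercedes = [
    [math.cos(math.pi / 2 + 2 * math.pi * k / 3),
     math.sin(math.pi / 2 + 2 * math.pi * k / 3)]
    for k in range(3)
]
mu = mutual_coherence(mercedes)
bound = welch_bound(2, 3)
```

For this frame, `mu` and `bound` both come out at approximately 0.5. A less carefully chosen set such as `[[1, 0], [0, 1], [1, 1]]` has a coherence strictly above the bound.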
Frame Theory for Signal Processing in Psychoacoustics
This review chapter aims to strengthen the link between frame theory and
signal processing tasks in psychoacoustics. On the one hand, the basic concepts
of frame theory are presented and some proofs are provided to explain those
concepts in some detail. The goal is to reveal to hearing scientists how this
mathematical theory could be relevant for their research. In particular, we
focus on frame theory in a filter bank approach, which is probably the most
relevant viewpoint for audio signal processing. On the other hand, basic
psychoacoustic concepts are presented to stimulate mathematicians to apply
their knowledge in this field.
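One basic frame-theory concept can be shown in a few lines. A finite set of vectors f_k is a frame with bounds A and B when A·||x||² ≤ Σ|⟨x, f_k⟩|² ≤ B·||x||² for every x, and the optimal bounds are the smallest and largest eigenvalues of the frame operator. The sketch below is a hypothetical two-dimensional example, not taken from the chapter, and uses the closed-form eigenvalues of a symmetric 2×2 matrix.

```python
import math

def frame_operator_2d(vectors):
    """Entries (a, b, d) of the symmetric frame operator [[a, b], [b, d]]."""
    a = sum(x * x for x, _ in vectors)
    b = sum(x * y for x, y in vectors)
    d = sum(y * y for _, y in vectors)
    return a, b, d

def frame_bounds_2d(vectors):
    """Optimal frame bounds: extreme eigenvalues of the frame operator."""
    a, b, d = frame_operator_2d(vectors)
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius

def analysis_energy(x, vectors):
    """Sum of squared inner products of x with every frame vector."""
    return sum((x[0] * fx + x[1] * fy) ** 2 for fx, fy in vectors)

# A redundant, non-tight frame for the plane (illustrative choice).
frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
lower, upper = frame_bounds_2d(frame)

# Unit test vectors along the two eigen-directions of the frame operator.
s = 1.0 / math.sqrt(2.0)
energy_min = analysis_energy((s, -s), frame)
energy_max = analysis_energy((s, s), frame)
```

Here the bounds are approximately 1 and 3, and the two unit test vectors attain them. A tight frame would give equal lower and upper bounds.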