A Generative Product-of-Filters Model of Audio
We propose the product-of-filters (PoF) model, a generative model that
decomposes audio spectra as sparse linear combinations of "filters" in the
log-spectral domain. PoF makes similar assumptions to those used in the classic
homomorphic filtering approach to signal processing, but replaces hand-designed
decompositions built of basic signal processing operations with a learned
decomposition based on statistical inference. This paper formulates the PoF
model and derives a mean-field method for posterior inference and a variational
EM algorithm to estimate the model's free parameters. We demonstrate PoF's
potential for audio processing on a bandwidth expansion task, and show that PoF
can serve as an effective unsupervised feature extractor for a speaker
identification task. Comment: ICLR 2014 conference-track submission. Added link to the source code.
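The core idea in the abstract above, that a sum of filters in the log-spectral domain is a product of filters in the linear domain, can be checked numerically. This is a minimal sketch and not the paper's inference procedure: the dictionary `U`, the activations `a`, and the sizes are hypothetical values chosen for illustration, and the noise model and variational EM are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
F, L = 8, 3                      # frequency bins and number of filters (illustrative)
U = rng.normal(size=(F, L))      # hypothetical filter dictionary, log-spectral domain
a = np.array([1.5, 0.0, 0.7])    # sparse activations for one frame (second filter off)

# Linear combination of filters in the log domain, mapped back to magnitudes.
log_spectrum = U @ a
spectrum = np.exp(log_spectrum)

# The same spectrum as an elementwise product of filters,
# each filter raised to the power of its activation.
product_of_filters = np.prod(np.exp(U) ** a, axis=1)

print(np.allclose(spectrum, product_of_filters))
```

A filter with zero activation contributes a factor of one at every frequency, which is why sparsity in `a` corresponds to using only a few filters per frame.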
Advancements of MultiRate Signal processing for Wireless Communication Networks: Current State Of the Art
With the rapid growth of internet access and of voice- and data-centric communications, many access technologies have been developed to meet the stringent demand for high-speed data transmission and to bridge the wide bandwidth gap between ever-faster, high-data-rate core networks and bandwidth-hungry end-user networks. To make efficient use of the limited bandwidth of available access routes and to cope with difficult channel environments, several standards have been proposed for a variety of broadband access schemes over different access media (twisted pairs, coaxial cables, optical fibers, and fixed or mobile wireless access). These access media create different channel impairments and dictate distinct sets of signal processing algorithms and techniques to combat specific impairments. Many research issues arise in the design and implementation of such systems. In this paper we present advancements in multi-rate signal processing methodologies that are motivated by this design trend. The paper reviews the current literature on interference suppression using multi-rate signal processing in wireless communication networks.
Collaborative adaptive filtering for machine learning
Quantitative performance criteria for the analysis of machine learning architectures
and algorithms have long been established. However, qualitative performance criteria,
which identify fundamental signal properties and ensure any processing preserves the
desired properties, are still emerging. In many cases, whilst offline statistical tests
exist such as assessment of nonlinearity or stochasticity, online tests which not only
characterise but also track changes in the nature of the signal are lacking. To that end,
by employing recent developments in signal characterisation, criteria are derived for
the assessment of the changes in the nature of the processed signal.
Through the fusion of the outputs of adaptive filters a single collaborative hybrid
filter is produced. By tracking the dynamics of the mixing parameter of this filter,
rather than the actual filter performance, a clear indication as to the current nature of
the signal is given. Implementations of the proposed method show that it is possible to
quantify the degree of nonlinearity within both real- and complex-valued data. This is
then extended (in the real domain) from dealing with nonlinearity in general, to a more
specific example, namely sparsity. Extensions of adaptive filters from the real to the
complex domain are non-trivial and the differences between the statistics in the real
and complex domains need to be taken into account. In terms of signal characteristics,
nonlinearity can be both split- and fully-complex and complex-valued data can be
considered circular or noncircular. Furthermore, by combining the information obtained
from hybrid filters of different natures it is possible to use this method to gain a more
complete understanding of the nature of the nonlinearity within a signal. This also
paves the way for building multidimensional feature spaces and their application in
data/information fusion.
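The hybrid-filter idea described above can be sketched as a convex combination of a linear and a nonlinear adaptive filter whose mixing parameter is itself adapted online. This is an illustrative sketch under stated assumptions, not the thesis's exact algorithm: the test signal, the tanh sub-filter, the sigmoid parameterisation of the mixing parameter, and all step sizes are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 8000, 3
x = rng.normal(size=N)                       # white input (assumed test signal)
h = np.array([1.5, -1.0, 0.5])               # hypothetical system coefficients
d = np.tanh(np.convolve(x, h)[:N]) + 0.01 * rng.normal(size=N)  # nonlinear target

w_lin = np.zeros(L)                          # linear sub-filter (NLMS)
w_nl = np.zeros(L)                           # nonlinear sub-filter (tanh output)
a = 0.0                                      # lam = sigmoid(a) keeps lam in (0, 1)
mu_a = 0.5                                   # step size for the mixing parameter
lam_hist = np.empty(N - L + 1)

for i, n in enumerate(range(L - 1, N)):
    u = x[n - L + 1:n + 1][::-1]             # regressor [x[n], x[n-1], x[n-2]]
    y_lin = w_lin @ u
    y_nl = np.tanh(w_nl @ u)
    lam = 1.0 / (1.0 + np.exp(-a))
    y = lam * y_lin + (1.0 - lam) * y_nl     # hybrid (collaborative) output
    e = d[n] - y

    # Each sub-filter adapts on its own error.
    w_lin += 0.5 * (d[n] - y_lin) * u / (u @ u + 1e-6)
    w_nl += 0.05 * (d[n] - y_nl) * (1.0 - y_nl ** 2) * u

    # Gradient descent on the squared hybrid error with respect to a;
    # clipping keeps the sigmoid from saturating completely.
    a = float(np.clip(a + mu_a * e * (y_lin - y_nl) * lam * (1.0 - lam), -4.0, 4.0))
    lam_hist[i] = lam

print("mean weight on the linear filter, last 200 samples:", lam_hist[-200:].mean())
```

Here `lam` is the weight given to the linear sub-filter, so a trajectory that settles near zero suggests the nonlinear sub-filter explains the signal better; it is this trajectory, rather than the output error, that serves as the indicator of the signal's nature.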
To produce online tests for sparsity, adaptive filters for sparse environments are
investigated and a unifying framework for the derivation of proportionate normalised
least mean square (PNLMS) algorithms is presented. This is then extended to derive
variants with an adaptive step-size. In order to create an online test for noncircularity,
a study of widely linear autoregressive modelling is presented, from which a proof of
the convergence of the test for noncircularity can be given. Applications of this method
are illustrated on examples such as biomedical signals, speech and wind data.
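For readers unfamiliar with PNLMS, the sketch below shows one common textbook formulation applied to a sparse system identification problem. It is not the unifying framework or the adaptive step-size variants derived in the thesis; the function name `pnlms`, the parameters `rho` and `delta_p`, and the test system are assumptions made for illustration.

```python
import numpy as np

def pnlms(x, d, L, mu=0.5, rho=0.01, delta_p=0.01, eps=1e-6):
    """One common PNLMS formulation: each coefficient gets a gain
    proportional to its current magnitude, with a floor set by rho."""
    w = np.zeros(L)
    for n in range(L - 1, len(x)):
        u = x[n - L + 1:n + 1][::-1]          # regressor [x[n], ..., x[n-L+1]]
        e = d[n] - w @ u                      # a priori error
        floor = rho * max(delta_p, np.max(np.abs(w)))
        gamma = np.maximum(floor, np.abs(w))  # per-coefficient proportionate terms
        g = gamma / gamma.mean()              # gains normalised to mean one
        w = w + mu * e * g * u / (u @ (g * u) + eps)
    return w

rng = np.random.default_rng(1)
N, L = 3000, 32
h = np.zeros(L)
h[3], h[17] = 1.0, -0.5                       # sparse system: two active taps
x = rng.normal(size=N)
d = np.convolve(x, h)[:N] + 0.01 * rng.normal(size=N)

w = pnlms(x, d, L)
misalignment = np.linalg.norm(w - h) / np.linalg.norm(h)
print("relative misalignment:", misalignment)
```

With all coefficients at zero the gains are uniform and the update reduces to NLMS; as the large taps emerge they receive larger gains, which is the mechanism that favours sparse systems.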
Deep Learning for Single Image Super-Resolution: A Brief Review
Single image super-resolution (SISR) is a notoriously challenging ill-posed
problem, which aims to obtain a high-resolution (HR) output from one of its
low-resolution (LR) versions. Powerful deep learning algorithms have recently
been applied to the SISR problem and have achieved state-of-the-art
performance. In this survey, we review representative deep learning-based SISR
methods, and group them into two categories according to their major
contributions to two essential aspects of SISR: the exploration of efficient
neural network architectures for SISR, and the development of effective
optimization objectives for deep SISR learning. For each category, a baseline
is first established and several critical limitations of the baseline are
summarized. Then representative works on overcoming these limitations are
presented based on their original contents as well as our critical
understandings and analyses, and relevant comparisons are conducted from a
variety of perspectives. Finally, we conclude this review with some vital
current challenges and future trends in SISR leveraging deep learning
algorithms. Comment: Accepted by IEEE Transactions on Multimedia (TMM).
Complex Neural Networks for Audio
Audio is represented in two mathematically equivalent ways: the real-valued time domain (i.e., waveform) and the complex-valued frequency domain (i.e., spectrum). There are advantages to the frequency-domain representation, e.g., the human auditory system is known to process sound in the frequency domain. Furthermore, linear time-invariant systems are convolved with sources in the time domain, whereas they may be factorized in the frequency domain. Neural networks have become rather useful when applied to audio tasks such as machine listening and audio synthesis, which are related by their dependencies on high-quality acoustic models. They ideally encapsulate fine-scale temporal structure, such as that encoded in the phase of frequency-domain audio, yet there are no authoritative deep learning methods for complex audio. This manuscript is dedicated to addressing this shortcoming. Chapter 2 motivates complex networks by their affinity with complex-domain audio, while Chapter 3 contributes methods for building and optimizing complex networks. We show that the naive implementation of Adam optimization is incorrect for complex random variables and show that selection of input and output representation has a significant impact on the performance of a complex network. Experimental results with novel complex neural architectures are provided in the second half of this manuscript. Chapter 4 introduces a complex model for binaural audio source localization. We show that, like humans, the complex model can generalize to different anatomical filters, which is important in the context of machine listening. The complex model's performance is better than that of the real-valued models, as well as real- and complex-valued baselines. Chapter 5 proposes a two-stage method for speech enhancement. In the first stage, a complex-valued stochastic autoencoder projects complex vectors to a discrete space.
In the second stage, long-term temporal dependencies are modeled in the discrete space. The autoencoder raises the performance ceiling for state-of-the-art speech enhancement, but the dynamic enhancement model does not outperform other baselines. We discuss areas for improvement and note that the complex Adam optimizer improves training convergence over the naive implementation.
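The abstract does not state how the naive Adam implementation fails for complex variables. One plausible reading, shown here purely as an assumption, is that squaring a complex gradient directly gives a complex second-moment estimate, whereas using the squared magnitude keeps it real and non-negative. The function `adam_step` and the toy objective below are hypothetical and not taken from the manuscript.

```python
import numpy as np

def adam_step(z, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, naive=False):
    """One Adam update for a complex parameter z with gradient g.
    naive=True squares g directly; naive=False uses |g|^2 (assumed correction)."""
    m = b1 * m + (1 - b1) * g
    second = g * g if naive else (g * np.conj(g)).real
    v = b2 * v + (1 - b2) * second
    m_hat = m / (1 - b1 ** t)                 # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)                 # bias-corrected second moment
    z = z - lr * m_hat / (np.sqrt(v_hat) + eps)
    return z, m, v

# Toy objective |z - target|^2; its gradient with respect to conj(z) is z - target.
target = 1.0 + 2.0j
z, m, v = 0.0 + 0.0j, 0.0 + 0.0j, 0.0
for t in range(1, 3001):
    z, m, v = adam_step(z, z - target, m, v, t)
print("distance to target:", abs(z - target))

# A single naive step: the second-moment estimate picks up an imaginary part,
# so the normalisation rotates the update instead of only rescaling it.
_, _, v_naive = adam_step(0.0 + 0.0j, 1.0 + 1.0j, 0.0 + 0.0j, 0.0, 1, naive=True)
print("naive second moment:", v_naive)
```

With the magnitude-based estimate the update direction stays aligned with the gradient and only its scale is adapted, which matches the intent of Adam in the real-valued case.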