1,003 research outputs found

A Generative Product-of-Filters Model of Audio

Author: Hoffman Matthew D.
Liang Dawen
Mysore Gautham J.
Publication date: 25/11/2014

We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain. PoF makes similar assumptions to those used in the classic homomorphic filtering approach to signal processing, but replaces hand-designed decompositions built of basic signal processing operations with a learned decomposition based on statistical inference. This paper formulates the PoF model and derives a mean-field method for posterior inference and a variational EM algorithm to estimate the model's free parameters. We demonstrate PoF's potential for audio processing on a bandwidth expansion task, and show that PoF can serve as an effective unsupervised feature extractor for a speaker identification task. Comment: ICLR 2014 conference-track submission. Added link to the source cod

arXiv.org e-Print Archive

CiteSeerX
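
The abstract above describes spectra as sparse linear combinations of filters in the log-spectral domain. The following minimal sketch illustrates only that algebraic identity (a sum in the log domain is a product in the linear domain); it is not the authors' inference or learning procedure. The sizes, the matrix `U`, and the activation vector `a` are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_freq, n_filters = 8, 3                   # hypothetical sizes, for illustration only
U = rng.normal(size=(n_freq, n_filters))   # columns play the role of log-domain "filters"
a = np.array([1.5, 0.0, 0.7])              # sparse activations: the second filter is off

# A linear combination in the log-spectral domain ...
log_spectrum = U @ a
spectrum = np.exp(log_spectrum)

# ... is an elementwise product of exponentiated filters in the linear domain.
product_of_filters = np.prod(np.exp(U * a), axis=1)

print(np.allclose(spectrum, product_of_filters))
```

A filter with zero activation contributes a factor of one at every frequency, which is why sparsity in `a` corresponds to leaving most filters out of the product.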

Advancements of MultiRate Signal processing for Wireless Communication Networks: Current State Of the Art

Author: Dr. D V Srihari Babu
Publication venue: Global Journals Inc. (US)
Publication date: 02/06/2012

With the rapid growth of internet access and of voice- and data-centric communications, many access technologies have been developed to meet the stringent demand for high-speed data transmission and to bridge the wide bandwidth gap between the ever-growing high-data-rate core network and the bandwidth-hungry end-user network. To make efficient use of the limited bandwidth of the available access routes and to cope with difficult channel environments, several standards have been proposed for a variety of broadband access schemes over different access media (twisted pairs, coaxial cables, optical fibers, and fixed or mobile wireless access). These access media may introduce different channel impairments and dictate unique sets of signal processing algorithms and techniques to combat specific impairments. In the design and implementation of those systems, many research issues arise. In this paper we present advancements in multirate signal processing methodologies that are motivated by this design trend. The paper surveys the current literature on interference suppression using multirate signal processing in wireless communication networks.

Global Journal of Computer Science and Technology (GJCST)

Fast Algorithms for High-Order Sparse Linear Prediction with Applications to Speech Processing

Author: Christensen Mads Græsbøll
Giacobello Daniele
Jensen Tobias Lindstrøm
van Waterschoot Toon
Publication venue: 'Elsevier BV'
Publication date: 01/02/2016

Crossref

VBN

Collaborative adaptive filtering for machine learning

Author: Jelfs Beth
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/04/2010

Quantitative performance criteria for the analysis of machine learning architectures and algorithms have long been established. However, qualitative performance criteria, which identify fundamental signal properties and ensure any processing preserves the desired properties, are still emerging. In many cases, whilst offline statistical tests exist such as assessment of nonlinearity or stochasticity, online tests which not only characterise but also track changes in the nature of the signal are lacking. To that end, by employing recent developments in signal characterisation, criteria are derived for the assessment of the changes in the nature of the processed signal. Through the fusion of the outputs of adaptive filters a single collaborative hybrid filter is produced. By tracking the dynamics of the mixing parameter of this filter, rather than the actual filter performance, a clear indication as to the current nature of the signal is given. Implementations of the proposed method show that it is possible to quantify the degree of nonlinearity within both real- and complex-valued data. This is then extended (in the real domain) from dealing with nonlinearity in general, to a more specific example, namely sparsity. Extensions of adaptive filters from the real to the complex domain are non-trivial and the differences between the statistics in the real and complex domains need to be taken into account. In terms of signal characteristics, nonlinearity can be both split- and fully-complex and complex-valued data can be considered circular or noncircular. Furthermore, by combining the information obtained from hybrid filters of different natures it is possible to use this method to gain a more complete understanding of the nature of the nonlinearity within a signal. This also paves the way for building multidimensional feature spaces and their application in data/information fusion. 
To produce online tests for sparsity, adaptive filters for sparse environments are investigated and a unifying framework for the derivation of proportionate normalised least mean square (PNLMS) algorithms is presented. This is then extended to derive variants with an adaptive step-size. In order to create an online test for noncircularity, a study of widely linear autoregressive modelling is presented, from which a proof of the convergence of the test for noncircularity can be given. Applications of this method are illustrated on examples such as biomedical signals, speech and wind data

Spiral - Imperial College Digital Repository
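
The abstract above mentions proportionate normalised least mean square (PNLMS) algorithms for sparse environments. As a rough illustration, here is a minimal sketch of one common textbook-style PNLMS update applied to a noise-free sparse system identification problem. It is not the unifying framework derived in the thesis; the function name `pnlms`, the parameters `mu`, `rho`, `delta`, `eps`, and the test system `h` are assumptions chosen for this example.

```python
import numpy as np

def pnlms(x, d, n_taps, mu=0.5, rho=0.01, delta=0.01, eps=1e-8):
    """Estimate FIR weights with a proportionate NLMS update (illustrative sketch)."""
    w = np.zeros(n_taps)
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]   # regressor, most recent sample first
        e = d[n] - w @ u                    # a priori estimation error
        # Per-tap gains proportional to coefficient magnitude, with a floor
        # so that small or zero taps keep adapting.
        floor = rho * max(delta, np.max(np.abs(w)))
        gamma = np.maximum(floor, np.abs(w))
        g = gamma / gamma.mean()
        # Normalised update with the diagonal gain matrix diag(g).
        w = w + mu * g * u * e / (u @ (g * u) + eps)
    return w

rng = np.random.default_rng(0)
h = np.zeros(16)
h[2], h[9] = 1.0, -0.5                      # made-up sparse system to identify
x = rng.normal(size=5000)                   # white input signal
d = np.convolve(x, h)[:len(x)]              # noise-free desired signal
w_hat = pnlms(x, d, n_taps=16)
print(np.round(w_hat, 3))
```

The proportionate gains give large coefficients a larger effective step size, which is the property that makes this family of filters attractive when most taps are zero.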

Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering

Author: Airaksinen Manu
Alku Paavo
Juvela Lauri
Kameoka Hirokazu
Yamagishi Junichi
Publication venue: 'International Speech Communication Association'
Publication date: 12/09/2016

Edinburgh Research Explorer

Deep Learning for Single Image Super-Resolution: A Brief Review

Author: Liao Q
Tian Y
Wang W
Xue J-H
Yang W
Zhang X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/07/2019

Single image super-resolution (SISR) is a notoriously challenging ill-posed problem, which aims to obtain a high-resolution (HR) output from one of its low-resolution (LR) versions. To solve the SISR problem, recently powerful deep learning algorithms have been employed and achieved the state-of-the-art performance. In this survey, we review representative deep learning-based SISR methods, and group them into two categories according to their major contributions to two essential aspects of SISR: the exploration of efficient neural network architectures for SISR, and the development of effective optimization objectives for deep SISR learning. For each category, a baseline is firstly established and several critical limitations of the baseline are summarized. Then representative works on overcoming these limitations are presented based on their original contents as well as our critical understandings and analyses, and relevant comparisons are conducted from a variety of perspectives. Finally we conclude this review with some vital current challenges and future trends in SISR leveraging deep learning algorithms. Comment: Accepted by IEEE Transactions on Multimedia (TMM

arXiv.org e-Print Archive

UCL Discovery

Complex Neural Networks for Audio

Author: Sarroff Andy M
Publication venue: Dartmouth Digital Commons
Publication date: 01/05/2018

Audio is represented in two mathematically equivalent ways: the real-valued time domain (i.e., waveform) and the complex-valued frequency domain (i.e., spectrum). There are advantages to the frequency-domain representation, e.g., the human auditory system is known to process sound in the frequency-domain. Furthermore, linear time-invariant systems are convolved with sources in the time-domain, whereas they may be factorized in the frequency-domain. Neural networks have become rather useful when applied to audio tasks such as machine listening and audio synthesis, which are related by their dependencies on high quality acoustic models. They ideally encapsulate fine-scale temporal structure, such as that encoded in the phase of frequency-domain audio, yet there are no authoritative deep learning methods for complex audio. This manuscript is dedicated to addressing the shortcoming. Chapter 2 motivates complex networks by their affinity with complex-domain audio, while Chapter 3 contributes methods for building and optimizing complex networks. We show that the naive implementation of Adam optimization is incorrect for complex random variables and show that selection of input and output representation has a significant impact on the performance of a complex network. Experimental results with novel complex neural architectures are provided in the second half of this manuscript. Chapter 4 introduces a complex model for binaural audio source localization. We show that, like humans, the complex model can generalize to different anatomical filters, which is important in the context of machine listening. The complex model's performance is better than that of the real-valued models, as well as real- and complex-valued baselines. Chapter 5 proposes a two-stage method for speech enhancement. In the first stage, a complex-valued stochastic autoencoder projects complex vectors to a discrete space.
In the second stage, long-term temporal dependencies are modeled in the discrete space. The autoencoder raises the performance ceiling for state of the art speech enhancement, but the dynamic enhancement model does not outperform other baselines. We discuss areas for improvement and note that the complex Adam optimizer improves training convergence over the naive implementation

Dartmouth Digital Commons (Dartmouth College)
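
The abstract above notes that linear time-invariant systems are convolved with sources in the time domain but factorize in the frequency domain. A minimal NumPy sketch of that property (the convolution theorem) follows; the random `source` and `impulse` arrays are placeholders for illustration, not data from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
source = rng.normal(size=64)    # stand-in for a source signal
impulse = rng.normal(size=8)    # stand-in for an LTI system's impulse response

# Time domain: the system acts on the source by convolution.
time_domain = np.convolve(source, impulse)

# Frequency domain: the same operation is an elementwise product of spectra,
# provided both signals are zero-padded to the full output length.
n = len(source) + len(impulse) - 1
freq_domain = np.fft.irfft(np.fft.rfft(source, n) * np.fft.rfft(impulse, n), n)

print(np.allclose(time_domain, freq_domain))
```

The spectra here are complex-valued, so the product combines magnitudes multiplicatively and phases additively, which is the structure the abstract refers to when it mentions phase.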