184 research outputs found

    Probabilistic Modeling Paradigms for Audio Source Separation

    This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007. Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of two general paradigms: linear modeling or variance modeling. They compare the merits of each paradigm and report objective performance figures. They conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems.

    Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

    Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

    Massive MIMO is a Reality -- What is Next? Five Promising Research Directions for Antenna Arrays

    Massive MIMO (multiple-input multiple-output) is no longer a "wild" or "promising" concept for future cellular networks - in 2018 it became a reality. Base stations (BSs) with 64 fully digital transceiver chains were commercially deployed in several countries, the key ingredients of Massive MIMO have made it into the 5G standard, the signal processing methods required to achieve unprecedented spectral efficiency have been developed, and the limitation due to pilot contamination has been resolved. Even the development of fully digital Massive MIMO arrays for mmWave frequencies - once viewed as prohibitively complicated and costly - is well underway. In a few years, Massive MIMO with fully digital transceivers will be a mainstream feature at both sub-6 GHz and mmWave frequencies. In this paper, we explain how the first chapter of the Massive MIMO research saga has come to an end, while the story has just begun. The coming wide-scale deployment of BSs with massive antenna arrays opens the door to a brand new world where spatial processing capabilities are omnipresent. In addition to mobile broadband services, the antennas can be used for other communication applications, such as low-power machine-type or ultra-reliable communications, as well as non-communication applications such as radar, sensing and positioning. We outline five new Massive MIMO related research directions: Extremely large aperture arrays, Holographic Massive MIMO, Six-dimensional positioning, Large-scale MIMO radar, and Intelligent Massive MIMO. Comment: 20 pages, 9 figures, submitted to Digital Signal Processing

    Mixture of beamformers for speech separation and extraction

    In many audio applications, the signal of interest is corrupted by acoustic background noise, interference, and reverberation. The presence of these contaminations can significantly degrade the quality and intelligibility of the audio signal. This makes it important to develop signal processing methods that can separate the competing sources and extract a source of interest. The estimated signals may then be directly listened to, transmitted, or further processed, giving rise to a wide range of applications such as hearing aids, noise-cancelling headphones, human-computer interaction, surveillance, and hands-free telephony. Many of the existing approaches to speech separation/extraction rely on beamforming techniques. These techniques approach the problem from a spatial point of view; a microphone array is used to form a spatial filter which can extract a signal from a specific direction and reduce the contamination of signals from other directions. However, when there are fewer microphones than sources (the underdetermined case), perfect attenuation of all interferers becomes impossible and only partial interference attenuation is possible. In this thesis, we present a framework which extends the use of beamforming techniques to underdetermined speech mixtures. We describe a frequency-domain non-linear mixture of beamformers that can extract a speech source from a known direction. Our approach models the data in each frequency bin via Gaussian mixture distributions, which can be learned using the expectation maximization algorithm. The model learning is performed using the observed mixture signals only, and no prior training is required. The signal estimator comprises a set of minimum mean square error (MMSE), minimum variance distortionless response (MVDR), or minimum power distortionless response (MPDR) beamformers. In order to estimate the signal, all beamformers are concurrently applied to the observed signal, and the weighted sum of the beamformers' outputs is used as the signal estimator, where the weights are the estimated posterior probabilities of the Gaussian mixture states. These weights are specific to each time-frequency point. The resulting non-linear beamformers do not need to know or estimate the number of sources, and can be applied to microphone arrays with two or more microphones in an arbitrary array configuration. We test and evaluate the described methods on underdetermined speech mixtures. Experimental results for the non-linear beamformers in underdetermined mixtures with room reverberation confirm their capability to successfully extract speech sources.
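
The posterior-weighted combination described in this abstract can be illustrated for a single time-frequency point. The following is only a minimal sketch: the function names, the two-microphone example, and the weights and posteriors are hypothetical placeholders, and the thesis's own learning of the Gaussian mixture (via expectation maximization) and design of the MMSE/MVDR/MPDR weights are not reproduced here.

```python
def beamformer_output(weights, observation):
    """Apply one beamformer at a single time-frequency point: w^H x."""
    return sum(w.conjugate() * x for w, x in zip(weights, observation))


def mixture_estimate(beamformers, posteriors, observation):
    """Posterior-weighted sum of the outputs of all beamformers.

    beamformers: list of weight vectors, one per mixture state (assumed given)
    posteriors:  posterior probability of each mixture state at this
                 time-frequency point (assumed given, must sum to 1)
    observation: microphone signals at this time-frequency point
    """
    if len(beamformers) != len(posteriors):
        raise ValueError("need one posterior per beamformer")
    if abs(sum(posteriors) - 1.0) > 1e-9:
        raise ValueError("posteriors must sum to 1")
    return sum(
        p * beamformer_output(w, observation)
        for w, p in zip(beamformers, posteriors)
    )


# Hypothetical two-microphone example with two mixture states.
example_beamformers = [[1 + 0j, 0j], [0j, 1 + 0j]]
example_posteriors = [0.25, 0.75]
example_observation = [2 + 0j, 4 + 0j]
estimate = mixture_estimate(
    example_beamformers, example_posteriors, example_observation
)
```

Because the posteriors change from one time-frequency point to the next, the overall estimator is non-linear in the observed signal even though each individual beamformer is linear.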

    MIMO Systems

    In recent years, it has been realized that MIMO communication systems appear inevitable in the accelerated evolution of high data rate applications, owing to their potential to dramatically increase spectral efficiency while simultaneously sending individual information to the corresponding users in wireless systems. This book intends to provide highlights of current research topics in the field of MIMO systems and to offer a snapshot of the recent advances and major issues faced today by researchers in MIMO-related areas. The book is written by specialists working in universities and research centers all over the world and covers the fundamental principles and main advanced topics of high data rate wireless communication systems over MIMO channels. Moreover, the book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Dynamic Compressive Sensing of Time-Varying Signals via Approximate Message Passing

    In this work the dynamic compressive sensing (CS) problem of recovering sparse, correlated, time-varying signals from sub-Nyquist, non-adaptive, linear measurements is explored from a Bayesian perspective. While there have been a handful of previously proposed Bayesian dynamic CS algorithms in the literature, the ability to perform inference on high-dimensional problems in a computationally efficient manner remains elusive. In response, we propose a probabilistic dynamic CS signal model that captures both amplitude and support correlation structure, and describe an approximate message passing algorithm that performs soft signal estimation and support detection with a computational complexity that is linear in all problem dimensions. The algorithm, DCS-AMP, can perform either causal filtering or non-causal smoothing, and is capable of learning model parameters adaptively from the data through an expectation-maximization learning procedure. We provide numerical evidence that DCS-AMP performs within 3 dB of oracle bounds on synthetic data under a variety of operating conditions. We further describe the result of applying DCS-AMP to two real dynamic CS datasets, as well as a frequency estimation task, to bolster our claim that DCS-AMP is capable of offering state-of-the-art performance and speed on real-world high-dimensional problems. Comment: 32 pages, 7 figures

    Adaptive Communications for Next Generation Broadband Wireless Access Systems

    In Broadband Wireless Access systems, the efficient use of radio resources is crucial from many points of view. From the operator's point of view, bandwidth is a scarce, valuable, and expensive resource that must be exploited efficiently while guaranteeing the Quality of Service (QoS) provided to the users. From the user's point of view, the quality of service must be comparable to that of fixed networks, which imposes tight delay and packet loss constraints on each data flow between the network and the user. During the last few years, many techniques have been developed to increase spectral efficiency and throughput. Notable among them are the use of multiple antennas at the transmitter and the receiver (exploiting spatial multiplexing to transmit several data streams simultaneously without increasing the bandwidth) and the joint optimization of the medium access control layer and the physical layer, which uses channel state information to manage resources. In this Ph.D. thesis, different adaptive techniques for B3G multicarrier wireless systems are developed and proposed, focusing on the SS-MC-MA and OFDM(A) (IEEE 802.16a/e/m standards) communication schemes. The research lines focus on adapting the transmission given (partial) knowledge of the Channel State Information, for both single antenna and multiple antenna links. For single antenna links, the implementation of a joint (cross-layer) resource allocation and scheduling strategy that includes adaptive modulation and coding is investigated. A low complexity resource allocation and scheduling algorithm is proposed with the objective of coping with real-time and/or non-real-time requirements and constraints. Special attention is also devoted to reducing the required signalling and to combining high- and low-mobility users on the same radio access. For multiple antenna links, the performance of a proposed adaptive transmit antenna selection scheme, applied jointly with space-time block coding selection, is investigated and compared with conventional structures. In this research line, two main optimization criteria are proposed for spatial link adaptation: one based on the minimum error rate for a fixed throughput, and the other on maximising the rate for a fixed error rate. Finally, some indications are given on how to include the spatial adaptation in the resource allocation and scheduling process developed for single antenna transmission, together with a review of techniques proposed by other authors that partly combine radio resource management with multiple antenna transmission schemes.

    Protocols for multi-antenna ad-hoc wireless networking in interference environments

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 231-242). A fundamental question for the design of future wireless networks concerns the nature of spectrum management and the protocols that govern use of the spectrum. In the oligopoly model, spectrum is owned and centrally managed, and the protocols tend to reflect this centralized nature. In the commons model, spectrum is a public good, and protocols must support ad hoc communication. This work presents the design, tradeoffs and parameter optimization for a new protocol (Simultaneous Transmissions in Interference (STI-MAC)) for ad hoc wireless networks. The key idea behind the STI-MAC protocol is 'channel stuffing,' that is, allowing network nodes to more efficiently use spatial, time and frequency degrees of freedom. This is achieved in three key ways. First, 'channel stuffing' is achieved through multiple antennas that are used at the receiver to mitigate interference using Minimum-Mean-Squared-Error (MMSE) receivers, allowing network nodes to transmit simultaneously in interference limited environments. The protocol also supports the use of multiple transmit antennas to beamform to the target receiver. Secondly, 'channel stuffing' is achieved through the use of a control channel that is orthogonal in time to the data channel, where nodes contend in order to participate on the data channel. And thirdly, 'channel stuffing' is achieved through a protest scheme that prevents data channel overloading. The STI-MAC protocol is analyzed via Monte-Carlo simulations in which transmitter nodes are uniformly distributed in a plane, each at a fixed distance from its target receiver, and as a function of network parameters including the number of transmit and receive antennas, the distance between a transmitter-receiver pair (link-length), the average number of transmitters whose received signal is stronger at a given receiver than its target transmitter (link-rank), the number of transmitter-receiver pairs, the distribution of the requested rate, the offered load, and the transmit scheme. The STI-MAC protocol is benchmarked relative to simulations of the 802.11(n) (Wi-Fi) protocol. The key results of this work show a 3X gain in throughput relative to 802.11(n) in typical multi-antenna wireless networks that have 20 transmitter-receiver pairs, a link-length of 10 meters, four receive antennas and a single transmit antenna. We also show a reduction in delay by a factor of two when the networks are heavily loaded. We find that the link-rank is a key parameter affecting STI-MAC gains over Wi-Fi. In simulations of networks with 40 transmitter-receiver pairs, a link-rank of three, a link-length of 10 meters, and eight transmit and receive antennas in which the transmitter beamforms to its target receiver in its strongest target channel mode, we find gains in throughput of at least 5X over the 802.11(n) protocol. By Danielle A. Hinton. Ph.D.
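
The link-rank parameter defined in this abstract can be estimated with a small Monte-Carlo experiment. This is a rough sketch under stated assumptions, not the thesis's simulator: it assumes received power depends only on distance (so a "stronger" interferer is simply one closer to the receiver than the link-length), places transmitters uniformly in a square, and ignores antennas, fading and the MAC protocol entirely. All function names and parameter values are hypothetical.

```python
import math
import random


def link_rank(receiver, interferers, link_length):
    """Count interferers closer to the receiver than its own transmitter.

    Assumes distance-only path loss, so "closer" stands in for "stronger".
    """
    rx, ry = receiver
    return sum(
        1
        for ix, iy in interferers
        if math.hypot(ix - rx, iy - ry) < link_length
    )


def average_link_rank(num_pairs, side, link_length, trials=200, seed=0):
    """Average link-rank of one receiver over random node placements."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Transmitters uniformly distributed in a side x side square.
        transmitters = [
            (rng.uniform(0.0, side), rng.uniform(0.0, side))
            for _ in range(num_pairs)
        ]
        # The receiver of pair 0 sits at a fixed distance from its
        # transmitter, in a random direction.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        tx, ty = transmitters[0]
        receiver = (
            tx + link_length * math.cos(angle),
            ty + link_length * math.sin(angle),
        )
        total += link_rank(receiver, transmitters[1:], link_length)
    return total / trials


# Hypothetical setting: 20 pairs, 10 m links, 100 m x 100 m area.
estimated_rank = average_link_rank(num_pairs=20, side=100.0, link_length=10.0)
```

Shrinking the area or lengthening the links raises the estimated link-rank, which is the regime in which the abstract reports the largest differences between STI-MAC and Wi-Fi.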

    Measuring Directed Functional Connectivity Using Non-Parametric Directionality Analysis : Validation and Comparison with Non-Parametric Granger Causality

    BACKGROUND: 'Non-parametric directionality' (NPD) is a novel method for estimation of directed functional connectivity (dFC) in neural data. The method has previously been verified in its ability to recover causal interactions in simulated spiking networks in Halliday et al. (2015). METHODS: This work presents a validation of NPD in continuous neural recordings (e.g. local field potentials). Specifically, we use autoregressive models to simulate time delayed correlations between neural signals. We then test for the accurate recovery of networks in the face of several confounds typically encountered in empirical data. We examine the effects of NPD under varying: a) signal-to-noise ratios, b) asymmetries in signal strength, c) instantaneous mixing, d) common drive, e) data length, and f) parallel/convergent signal routing. We also apply NPD to data from a patient who underwent simultaneous magnetoencephalography and deep brain recording. RESULTS: We demonstrate that NPD can accurately recover directed functional connectivity from simulations with known patterns of connectivity. The performance of the NPD measure is compared with non-parametric estimators of Granger causality (NPG), a well-established methodology for model-free estimation of dFC. A series of simulations investigating synthetically imposed confounds demonstrate that NPD provides estimates of connectivity that are equivalent to NPG, albeit with an increased sensitivity to data length. However, we provide evidence that: i) NPD is less sensitive than NPG to degradation by noise; ii) NPD is more robust to the generation of false positive identification of connectivity resulting from SNR asymmetries; iii) NPD is more robust to corruption via moderate amounts of instantaneous signal mixing. 
CONCLUSIONS: The results in this paper highlight that, to be practically applied to neural data, connectivity metrics should not only be accurate in their recovery of causal networks but also resistant to the confounding effects often encountered in experimental recordings of multimodal data. Taken together, these findings position NPD at the state of the art with respect to the estimation of directed functional connectivity in neuroimaging.
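
The kind of test data described in the METHODS above, autoregressive signals with a time-delayed coupling, can be sketched as follows. This is an illustrative assumption-laden example only: the coefficients, noise levels and function names are hypothetical, and the lagged covariance used to check the simulated delay is a plain sanity check, not the NPD or NPG estimator evaluated in the paper.

```python
import random


def simulate_delayed_ar(n, delay, a=0.5, b=0.3, c=0.8, seed=0):
    """Simulate two AR(1) signals where x drives y after `delay` samples.

    x[t] = a * x[t-1] + noise
    y[t] = b * y[t-1] + c * x[t-delay] + noise
    """
    rng = random.Random(seed)
    x = [0.0] * n
    y = [0.0] * n
    for t in range(1, n):
        x[t] = a * x[t - 1] + rng.gauss(0.0, 1.0)
        driven = c * x[t - delay] if t >= delay else 0.0
        y[t] = b * y[t - 1] + driven + rng.gauss(0.0, 1.0)
    return x, y


def lagged_covariance(x, y, lag):
    """Sample covariance between x[t - lag] and y[t] for lag >= 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    terms = ((x[t - lag] - mean_x) * (y[t] - mean_y) for t in range(lag, n))
    return sum(terms) / (n - lag)


def peak_lag(x, y, max_lag):
    """Lag in [0, max_lag] at which |cov(x[t - lag], y[t])| is largest."""
    return max(
        range(max_lag + 1),
        key=lambda lag: abs(lagged_covariance(x, y, lag)),
    )


# Hypothetical check: a 3-sample delay should show up as the peak lag.
x_sim, y_sim = simulate_delayed_ar(n=5000, delay=3)
recovered_delay = peak_lag(x_sim, y_sim, max_lag=10)
```

A covariance peak of this kind does not by itself establish direction or rule out confounds such as common drive or instantaneous mixing, which is why the paper evaluates dedicated directed-connectivity estimators.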