4,920 research outputs found

    Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

    Get PDF
    We tackle the multi-party speech recovery problem through modeling the acoustic of the reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustic from the unknown competing speech sources relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization exploiting joint sparsity model formulated upon spatio-spectral sparsity of concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.Comment: 31 page

    A Non-Local Structure Tensor Based Approach for Multicomponent Image Recovery Problems

    Full text link
    Non-Local Total Variation (NLTV) has emerged as a useful tool in variational methods for image recovery problems. In this paper, we extend the NLTV-based regularization to multicomponent images by taking advantage of the Structure Tensor (ST) resulting from the gradient of a multicomponent image. The proposed approach allows us to penalize the non-local variations, jointly for the different components, through various 1,p\ell_{1,p} matrix norms with p1p \ge 1. To facilitate the choice of the hyper-parameters, we adopt a constrained convex optimization approach in which we minimize the data fidelity term subject to a constraint involving the ST-NLTV regularization. The resulting convex optimization problem is solved with a novel epigraphical projection method. This formulation can be efficiently implemented thanks to the flexibility offered by recent primal-dual proximal algorithms. Experiments are carried out for multispectral and hyperspectral images. The results demonstrate the interest of introducing a non-local structure tensor regularization and show that the proposed approach leads to significant improvements in terms of convergence speed over current state-of-the-art methods

    Spherical deconvolution of multichannel diffusion MRI data with non-Gaussian noise models and spatial regularization

    Get PDF
    Spherical deconvolution (SD) methods are widely used to estimate the intra-voxel white-matter fiber orientations from diffusion MRI data. However, while some of these methods assume a zero-mean Gaussian distribution for the underlying noise, its real distribution is known to be non-Gaussian and to depend on the methodology used to combine multichannel signals. Indeed, the two prevailing methods for multichannel signal combination lead to Rician and noncentral Chi noise distributions. Here we develop a Robust and Unbiased Model-BAsed Spherical Deconvolution (RUMBA-SD) technique, intended to deal with realistic MRI noise, based on a Richardson-Lucy (RL) algorithm adapted to Rician and noncentral Chi likelihood models. To quantify the benefits of using proper noise models, RUMBA-SD was compared with dRL-SD, a well-established method based on the RL algorithm for Gaussian noise. Another aim of the study was to quantify the impact of including a total variation (TV) spatial regularization term in the estimation framework. To do this, we developed TV spatially-regularized versions of both RUMBA-SD and dRL-SD algorithms. The evaluation was performed by comparing various quality metrics on 132 three-dimensional synthetic phantoms involving different inter-fiber angles and volume fractions, which were contaminated with noise mimicking patterns generated by data processing in multichannel scanners. The results demonstrate that the inclusion of proper likelihood models leads to an increased ability to resolve fiber crossings with smaller inter-fiber angles and to better detect non-dominant fibers. The inclusion of TV regularization dramatically improved the resolution power of both techniques. The above findings were also verified in brain data

    A superior edge preserving filter with a systematic analysis

    Get PDF
    A new, adaptive, edge preserving filter for use in image processing is presented. It had superior performance when compared to other filters. Termed the contiguous K-average, it aggregates pixels by examining all pixels contiguous to an existing cluster and adding the pixel closest to the mean of the existing cluster. The process is iterated until K pixels were accumulated. Rather than simply compare the visual results of processing with this operator to other filters, some approaches were developed which allow quantitative evaluation of how well and filter performs. Particular attention is given to the standard deviation of noise within a feature and the stability of imagery under iterative processing. Demonstrations illustrate the performance of several filters to discriminate against noise and retain edges, the effect of filtering as a preprocessing step, and the utility of the contiguous K-average filter when used with remote sensing data

    A robust nonlinear scale space change detection approach for SAR images

    Get PDF
    In this paper, we propose a change detection approach based on nonlinear scale space analysis of change images for robust detection of various changes incurred by natural phenomena and/or human activities in Synthetic Aperture Radar (SAR) images using Maximally Stable Extremal Regions (MSERs). To achieve this, a variant of the log-ratio image of multitemporal images is calculated which is followed by Feature Preserving Despeckling (FPD) to generate nonlinear scale space images exhibiting different trade-offs in terms of speckle reduction and shape detail preservation. MSERs of each scale space image are found and then combined through a decision level fusion strategy, namely "selective scale fusion" (SSF), where contrast and boundary curvature of each MSER are considered. The performance of the proposed method is evaluated using real multitemporal high resolution TerraSAR-X images and synthetically generated multitemporal images composed of shapes with several orientations, sizes, and backscatter amplitude levels representing a variety of possible signatures of change. One of the main outcomes of this approach is that different objects having different sizes and levels of contrast with their surroundings appear as stable regions at different scale space images thus the fusion of results from scale space images yields a good overall performance
    corecore