
    Robust Nearfield Wideband Beamforming Design Based on Adaptive-Weighted Convex Optimization

    Nearfield wideband beamformers for microphone arrays have wide applications in multichannel speech enhancement. The nearfield wideband beamformer design based on convex optimization is a typical representative of the robust approaches. However, in this approach the weighting coefficient of the convex optimization is a constant, which does not exploit all the degrees of freedom that the weighting coefficient provides, so there is still room to improve performance. To solve this problem, we developed a robust nearfield wideband beamformer design approach based on adaptive-weighted convex optimization. The proposed approach defines an adaptive weighting function using adaptive array signal processing theory and adjusts its value flexibly, which improves the beamforming performance. During each adaptive update of the weighting function, the convex optimization problem can be formulated as a second-order cone program (SOCP), which can be solved efficiently using well-established interior-point methods. The method is suitable for sound sources in the nearfield range, works well in the presence of microphone mismatches, and is applicable to arbitrary array geometries. Several design examples are presented to verify the effectiveness of the proposed approach and the correctness of the theoretical analysis.
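
The SOCP formulation itself requires a conic solver, but the near-field signal model underlying such designs is simple to state. Below is a minimal sketch (array geometry, source position, and frequency are hypothetical, and this shows only the spherical-wave steering model with matched delay-and-sum weights, not the adaptive-weighted optimization of the paper):

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def nearfield_steering(mic_pos, src_pos, f):
    """Near-field steering vector: spherical-wave phase delay and 1/r attenuation."""
    r = np.linalg.norm(mic_pos - src_pos, axis=1)   # mic-to-source distances
    return np.exp(-2j * np.pi * f * r / C) / r

# Hypothetical 5-mic uniform linear array, 4 cm spacing, source 0.5 m away
mics = np.c_[np.arange(5) * 0.04, np.zeros(5), np.zeros(5)]
src = np.array([0.08, 0.5, 0.0])

a = nearfield_steering(mics, src, f=2000.0)
w = a / (a.conj() @ a)          # matched (delay-and-sum-like) weights
print(abs(w.conj() @ a))        # distortionless response toward the source: 1.0
```

The matched weights keep a unit response toward the near-field source; the SOCP designs in the paper additionally constrain the response over a region and bound the weight norm for robustness.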

    Effects of a near-field rigid sphere scatterer on the performance of linear microphone array beamformers

    © 2016 Acoustical Society of America. Beamformers enable a microphone array to capture acoustic signals from a sound source with a high signal-to-noise ratio in a noisy environment, and the linear microphone array is of particular practical importance due to its simplicity and easy implementation. A linear microphone array is sometimes used near scattering objects, which affect its beamforming performance. This paper develops a numerical model of a linear microphone array near a rigid sphere for both far-field plane-wave and near-field sources. The effects of the scatterer on two typical beamformers, i.e., the delay-and-sum beamformer and the superdirective beamformer, are investigated by both simulations and experiments. It is found that the directivity factor of both beamformers improves, due to the increased equivalent array aperture, when the size of the array is no larger than that of the scatterer. As the array size increases, the directivity factor tends to deteriorate at high frequencies because of rising side lobes. When the array size is significantly larger than that of the scatterer, the scattering has hardly any influence on the beamforming performance.
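
The two beamformers compared above can be reproduced in a few lines. A sketch (without the rigid-sphere scattering model, and with a hypothetical geometry) computing the directivity factor of delay-and-sum and superdirective weights under a spherically diffuse noise field:

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def diffuse_coherence(pos, f):
    """Spherically diffuse noise coherence sin(kd)/(kd) between mic pairs."""
    d = np.abs(pos[:, None] - pos[None, :])
    return np.sinc(2.0 * f * d / C)   # np.sinc(x) = sin(pi x)/(pi x)

def directivity_factor(w, d, gamma):
    """Gain toward the look direction over gain against diffuse noise."""
    return abs(w.conj() @ d) ** 2 / np.real(w.conj() @ gamma @ w)

# Hypothetical 4-mic line array, 5 cm spacing, endfire steering at 1 kHz
x = np.arange(4) * 0.05
f = 1000.0
d = np.exp(-2j * np.pi * f * x / C)                  # endfire steering vector
gamma = diffuse_coherence(x, f) + 1e-3 * np.eye(4)   # small diagonal loading

w_das = d / len(x)               # delay-and-sum
w_sd = np.linalg.solve(gamma, d)
w_sd /= d.conj() @ w_sd          # superdirective (MVDR weights in diffuse noise)

print(directivity_factor(w_das, d, gamma), directivity_factor(w_sd, d, gamma))
```

Since the superdirective weights maximize exactly this ratio (for the loaded coherence matrix), their directivity factor is never below that of delay-and-sum; the loading term stands in for the robustness constraints needed in practice.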

    Sparse Array Design for Wideband Beamforming with Reduced Complexity in Tapped Delay-lines

    Sparse wideband array design for sensor location optimization is highly nonlinear, and it is traditionally solved by genetic algorithms (GAs) or similar optimization methods. This is an extremely time-consuming process, and an optimum solution is not always guaranteed. In this work, the problem is studied from the viewpoint of compressive sensing (CS). Although CS-based methods have been proposed for the design of sparse narrowband arrays, their extension to the wideband case is not straightforward, as there are multiple coefficients associated with each sensor, and they have to be minimized simultaneously in order to discard the corresponding sensor locations. First, sensor location optimization for both general wideband beamforming and frequency-invariant beamforming is considered. Then, sparsity in the tapped delay-line (TDL) coefficients associated with each sensor is exploited in order to reduce the implementation complexity of each TDL. Finally, the design of robust wideband arrays against norm-bounded steering vector errors is addressed. Design examples verify the effectiveness of the proposed methods, with comparisons drawn against a GA-based design method.
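
The key mechanism here is the group-sparsity penalty: minimizing the sum of the l2 norms of each sensor's TDL coefficient vector drives whole groups, and hence whole sensors, to zero at once. A minimal sketch using a proximal-gradient (group-ISTA) solver as a stand-in for the paper's exact formulations; the matrices and groupings are toy data:

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Prox of tau * sum_g ||w_g||_2: shrink each group's norm, zeroing weak groups."""
    out = w.copy()
    for g in groups:
        n = np.linalg.norm(w[g])
        out[g] = 0.0 if n <= tau else w[g] * (1.0 - tau / n)
    return out

def sparse_wideband_design(A, p, groups, lam=0.1, iters=500):
    """Group-ISTA for  min 0.5*||A w - p||^2 + lam * sum_g ||w_g||_2,
    a convex surrogate that discards all TDL taps of a sensor together."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1/Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        w = group_soft_threshold(w - step * A.T @ (A @ w - p), groups, lam * step)
    return w

# Toy check of the group prox: the weak group is zeroed exactly
print(group_soft_threshold(np.array([3.0, 4.0, 0.1, 0.1]),
                           [np.arange(2), np.arange(2, 4)], 0.5))
```

Each group here plays the role of one sensor's TDL coefficients; a zeroed group means the sensor location can be discarded from the array.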

    Acoustic Speaker Localization with Strong Reverberation and Adaptive Feature Filtering with a Bayes RFS Framework

    The thesis investigates the challenges of speaker localization in the presence of strong reverberation, multi-speaker tracking, and multi-feature multi-speaker state filtering, using sound recordings from microphones. Novel reverberation-robust speaker localization algorithms are derived from the signal and room-acoustics models. A multi-speaker tracking filter and a multi-feature multi-speaker state filter are developed based upon the generalized labeled multi-Bernoulli random finite set framework. Experiments and comparative studies have verified and demonstrated the benefits of the proposed methods.

    A study into the design of steerable microphones arrays

    Beamforming, a multi-channel signal processing technique, can offer both spatial and temporal selective filtering, and it has much more potential than single-channel signal processing in various commercial applications. This thesis presents a study of steerable robust broadband beamformers together with a number of their design formulations. The design formulations allow a simple steering mechanism yet maintain a frequency-invariant property as well as robustness against practical imperfections.

    Robust Multichannel Microphone Beamforming

    In this thesis, a method for the design and implementation of a spatially robust multichannel microphone beamforming system is presented. A set of spatial correlation functions is derived for 2D and 3D far-field/near-field scenarios based on von Mises(-Fisher), Gaussian, and uniform source location distributions. These correlation functions are used to design spatially robust beamformers and blocking beamformers (nullformers) that enhance or suppress a known source whose location is not perfectly known, due either to an incorrect location estimate or to movement of the target while the beamformers are active. The spatially robust beam/null-formers form signal and interferer-plus-noise references, which can be further processed by a blind source separation algorithm to remove mutual components, removing the interference and sensor noise from the signal path and vice versa. The noise reduction performance of the combined beamforming and blind source separation system approaches that of a perfect-information MVDR beamformer under reverberant conditions. It is demonstrated that the proposed algorithm can be implemented with good performance on low-power hardware similar to current mobile platforms, using a four-element microphone array.
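
A sketch of the spatial-correlation idea for one of the distributions named above (a Gaussian over source direction; the geometry and parameters are hypothetical): averaging steering-vector outer products over the assumed location distribution yields a correlation matrix whose principal eigenvector is a location-robust beamformer.

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def steering(x, f, theta):
    """Far-field steering vector for a line array toward direction theta."""
    return np.exp(-2j * np.pi * f * x * np.cos(theta) / C)

def spatial_correlation(x, f, theta0, sigma, n=2000):
    """Monte-Carlo estimate of E[a(theta) a(theta)^H] under a Gaussian
    source-direction distribution: the spatial correlation matrix."""
    rng = np.random.default_rng(1)
    A = np.array([steering(x, f, t) for t in rng.normal(theta0, sigma, n)])
    return A.T @ A.conj() / n

x = np.arange(4) * 0.05                       # hypothetical 4-mic array, 5 cm spacing
Rs = spatial_correlation(x, 2000.0, np.pi / 3, 0.1)

# The principal eigenvector maximizes the average gain over the uncertain
# source region, so it can do no worse (in that average) than a unit-norm
# conventional beamformer pointed only at the nominal direction theta0.
w_rob = np.linalg.eigh(Rs)[1][:, -1]
w_das = steering(x, 2000.0, np.pi / 3) / 2.0  # unit norm, since ||a|| = sqrt(4)
print(np.real(w_rob.conj() @ Rs @ w_rob), np.real(w_das.conj() @ Rs @ w_das))
```

The thesis derives such correlation functions in closed form for von Mises(-Fisher), Gaussian, and uniform distributions; the sampling here only illustrates the construction.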

    Acoustic event detection and localization using distributed microphone arrays

    Automatic acoustic scene analysis is a complex task that involves several functionalities: detection (time), localization (space), separation, recognition, etc. This thesis focuses on both acoustic event detection (AED) and acoustic source localization (ASL) when several sources may be simultaneously present in a room. In particular, the experimental work is carried out in a meeting-room scenario. Unlike previous works, which either employed models of all possible sound combinations or additionally used video signals, in this thesis the time-overlapping sound problem is tackled by exploiting the signal diversity that results from the use of multiple microphone-array beamformers. The core of this thesis is a computationally efficient approach that consists of three processing stages. In the first, a set of (null-)steering beamformers carries out diverse partial signal separations, using multiple arbitrarily located linear microphone arrays, each composed of a small number of microphones. In the second stage, each beamformer output goes through a classification step, which uses models for all the targeted sound classes (HMM-GMM, in the experiments). In the third stage, the classifier scores, whether intra- or inter-array, are combined using a probabilistic criterion (such as MAP) or a machine-learning fusion technique (the fuzzy integral (FI), in the experiments). This processing scheme is applied to a set of problems of increasing complexity, defined by the assumptions made regarding the identities (plus time endpoints) and/or positions of the sounds. The thesis starts with the problem of unambiguously mapping identities to positions, continues with AED (positions assumed) and ASL (identities assumed), and ends with the integration of AED and ASL in a single system, which needs no assumption about identities or positions.
The evaluation experiments are carried out in a meeting-room scenario where two sources overlap in time; one of them is always speech and the other is an acoustic event from a pre-defined set. Two different databases are used: one produced by merging signals actually recorded in UPC's department smart-room, and another consisting of overlapping sound signals directly recorded in the same room in a rather spontaneous way. From the experimental results with a single array, it can be observed that the proposed detection system performs better than either the model-based system or a blind-source-separation-based system. Moreover, the product-rule combination and the FI-based fusion of the scores from the multiple arrays improve the accuracies further. On the other hand, the posterior position assignment is performed with a very small error rate. Regarding ASL, and assuming an accurate AED system output, the one-source localization performance of the proposed system is slightly better than that of the widely used SRP-PHAT system working in an event-based mode, and it performs significantly better than the latter in the more complex two-source scenario. Finally, although the joint system suffers a slight degradation in classification accuracy with respect to the case where the source positions are known, it offers the advantage of carrying out the two tasks, recognition and localization, with a single system, and it allows the inclusion of information about the prior probabilities of the source positions. It is also worth noting that, although the acoustic scenario used for experimentation is rather limited, the approach and its formalism were developed for the general case, where the number and identities of the sources are not constrained.
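
The SRP-PHAT baseline mentioned above rests on the GCC-PHAT correlation, which whitens the cross-spectrum so that only delay (phase) information survives. A minimal single-microphone-pair sketch with synthetic signals and a hypothetical sampling rate:

```python
import numpy as np

def gcc_phat(x, y, fs):
    """GCC-PHAT time-delay estimate: normalize the cross-spectrum to unit
    magnitude so its inverse FFT peaks at the inter-channel delay."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    cs = np.conj(X) * Y
    r = np.fft.irfft(cs / (np.abs(cs) + 1e-12), n)
    r = np.concatenate((r[-(n // 2):], r[:n // 2 + 1]))   # reorder lags to -n/2..n/2
    return (np.argmax(r) - n // 2) / fs                   # positive: y lags x

fs = 16000
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
x = s
y = np.concatenate((np.zeros(12), s))[:4096]   # y is x delayed by 12 samples
print(gcc_phat(x, y, fs) * fs)                 # recovers the 12-sample delay
```

SRP-PHAT then sums such correlations over all microphone pairs at the lags implied by each candidate source position and picks the position with the highest total power.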

    Broadband adaptive beamforming with low complexity and frequency invariant response

    This thesis proposes different methods to reduce the computational complexity and to increase the adaptation rate of adaptive broadband beamformers. This is demonstrated for the generalised sidelobe canceller (GSC) structure. The GSC is an alternative implementation of the linearly constrained minimum variance beamformer, which can utilise well-known adaptive filtering algorithms, such as least mean squares (LMS) or recursive least squares (RLS), to perform unconstrained adaptive optimisation.
A direct DFT implementation, by which broadband signals are decomposed into frequency bins and processed by independent narrowband beamforming algorithms, is thought to be computationally optimum. However, this setup fails to converge to the time-domain minimum mean square error (MMSE) if signal components are not aligned to frequency bins, resulting in a large worst-case error. To mitigate this problem of the so-called independent frequency bin (IFB) processor, overlap-save based GSC beamforming structures are explored. These structures address the minimisation of the time-domain MMSE, offer a significant reduction in computational complexity compared to time-domain implementations, and show better convergence behaviour than the IFB beamformer. By studying the effect of the blocking matrix on the adaptive process of the overlap-save beamformer, several modifications are made to improve both the simplicity of the algorithm and its convergence speed. These modifications leave the GSC beamformer with a significantly lower computational complexity than the time-domain approach while offering similar convergence characteristics.
In certain applications, especially in acoustics, there is a need to maintain constant resolution across a wide operating spectrum that may extend across several octaves.
Attaining a constant beamwidth is difficult, particularly if uniformly spaced linear sensor arrays are employed for beamforming, since spatial resolution is inversely proportional to both the array aperture and the frequency. A scaled-aperture arrangement is introduced for the subband-based GSC beamformer to achieve near-uniform resolution across a wide spectrum, whereby an octave-invariant design is achieved. This structure can also be operated in conjunction with adaptive beamforming algorithms. Frequency-dependent tapering of the sensor signals is proposed in combination with the overlap-save GSC structure in order to achieve an overall frequency-invariant characteristic, and an adaptive version of the frequency-invariant overlap-save GSC beamformer is also proposed.
Broadband adaptive beamforming algorithms based on the family of least mean squares (LMS) algorithms are known to exhibit slow convergence if the input signal is correlated. To improve the convergence of the GSC when based on LMS-type algorithms, we propose the use of a broadband eigenvalue decomposition (BEVD) to decorrelate the input of the adaptive algorithm in the spatial dimension, for which an increase in convergence speed can be demonstrated over other decorrelating measures, such as the Karhunen-Loève transform. In order to address the remaining temporal correlation after BEVD processing, this approach is combined with subband decomposition through the use of oversampled filter banks. The resulting spatially and temporally decorrelated GSC beamformer provides further enhanced convergence speed over spatial or temporal decorrelation methods on their own.
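
The GSC structure underlying all of the above splits the processing into a fixed beamformer, a blocking matrix, and an unconstrained adaptive canceller. A minimal time-domain sketch with an NLMS canceller (all signals, gains, and parameters are hypothetical toy data; the thesis's overlap-save, subband, and BEVD variants are not shown):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, L = 4, 20000, 8                     # mics, samples, adaptive taps per channel

s = np.sin(2 * np.pi * 0.01 * np.arange(N))    # target, presteered (aligned on all mics)
v = rng.standard_normal(N)                     # interfering source
gains = np.array([1.0, 0.6, -0.4, 0.8])        # hypothetical per-mic interference gains
X = s[None, :] + gains[:, None] * v[None, :]   # received mic signals

d = X.mean(axis=0)                             # fixed beamformer: target + leaked noise
B = np.array([[1., -1, 0, 0],                  # blocking matrix: rows sum to zero,
              [0, 1, -1, 0],                   # so the aligned target is cancelled
              [0, 0, 1, -1]])
U = B @ X                                      # noise-only reference channels

# NLMS adaptive canceller: subtract from d whatever is predictable from U
W = np.zeros((M - 1, L))
mu, out = 0.05, np.zeros(N)
for t in range(L - 1, N):
    u = U[:, t - L + 1:t + 1]                  # current block of reference samples
    e = d[t] - np.sum(W * u)                   # beamformer output (error signal)
    out[t] = e
    W += mu * e * u / (np.sum(u * u) + 1e-8)   # normalised LMS update

print(np.mean((d[N // 2:] - s[N // 2:]) ** 2),     # interference power before adaptation
      np.mean((out[N // 2:] - s[N // 2:]) ** 2))   # ... and after (much smaller)
```

Because the blocking matrix removes the target from the reference channels, the canceller can adapt without constraints, which is exactly what lets the GSC reuse standard LMS/RLS machinery.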