791 research outputs found

    Joint Pitch and DOA Estimation Using the ESPRIT method

    Get PDF

    Broadband adaptive beamforming with low complexity and frequency invariant response

    No full text
    This thesis proposes different methods to reduce the computational complexity as well as increasing the adaptation rate of adaptive broadband beamformers. This is performed exemplarily for the generalised sidelobe canceller (GSC) structure. The GSC is an alternative implementation of the linearly constrained minimum variance beamformer, which can utilise well-known adaptive filtering algorithms, such as the least mean square (LMS) or the recursive least squares (RLS) to perform unconstrained adaptive optimisation.A direct DFT implementation, by which broadband signals are decomposed into frequency bins and processed by independent narrowband beamforming algorithms, is thought to be computationally optimum. However, this setup fail to converge to the time domain minimum mean square error (MMSE) if signal components are not aligned to frequency bins, resulting in a large worst case error. To mitigate this problem of the so-called independent frequency bin (IFB) processor, overlap-save based GSC beamforming structures have been explored. This system address the minimisation of the time domain MMSE, with a significant reduction in computational complexity when compared to time-domain implementations, and show a better convergence behaviour than the IFB beamformer. By studying the effects that the blocking matrix has on the adaptive process for the overlap-save beamformer, several modifications are carried out to enhance both the simplicity of the algorithm as well as its convergence speed. These modifications result in the GSC beamformer utilising a significantly lower computational complexity compare to the time domain approach while offering similar convergence characteristics.In certain applications, especially in the areas of acoustics, there is a need to maintain constant resolution across a wide operating spectrum that may extend across several octaves. To attain constant beamwidth is difficult, particularly if uniformly spaced linear sensor array are employed for beamforming, since spatial resolution is reciprocally proportional to both the array aperture and the frequency. A scaled aperture arrangement is introduced for the subband based GSC beamformer to achieve near uniform resolution across a wide spectrum, whereby an octave-invariant design is achieved. This structure can also be operated in conjunction with adaptive beamforming algorithms. Frequency dependent tapering of the sensor signals is proposed in combination with the overlap-save GSC structure in order to achieve an overall frequency-invariant characteristic. An adaptive version is proposed for frequency-invariant overlap-save GSC beamformer. Broadband adaptive beamforming algorithms based on the family of least mean squares (LMS) algorithms are known to exhibit slow convergence if the input signal is correlated. To improve the convergence of the GSC when based on LMS-type algorithms, we propose the use of a broadband eigenvalue decomposition (BEVD) to decorrelate the input of the adaptive algorithm in the spatial dimension, for which an increase in convergence speed can be demonstrated over other decorrelating measures, such as the Karhunen-Loeve transform. In order to address the remaining temporal correlation after BEVD processing, this approach is combined with subband decomposition through the use of oversampled filter banks. The resulting spatially and temporally decorrelated GSC beamformer provides further enhanced convergence speed over spatial or temporal decorrelation methods on their own

    Polynomial eigenvalue decomposition for multichannel broadband signal processing

    Get PDF
    This article is devoted to the polynomial eigenvalue decomposition (PEVD) and its applications in broadband multichannel signal processing, motivated by the optimum solutions provided by the eigenvalue decomposition (EVD) for the narrow-band case [1], [2]. In general, the successful techniques from narrowband problems can also be applied to broadband ones, leading to improved solutions. Multichannel broadband signals arise at the core of many essential commercial applications such as telecommunications, speech processing, healthcare monitoring, astronomy and seismic surveillance, and military technologies like radar, sonar and communications [3]. The success of these applications often depends on the performance of signal processing tasks, including data compression [4], source localization [5], channel coding [6], signal enhancement [7], beamforming [8], and source separation [9]. In most cases and for narrowband signals, performing an EVD is the key to the signal processing algorithm. Therefore, this paper aims to introduce PEVD as a novel mathematical technique suitable for many broadband signal processing applications

    Graph Signal Processing: Overview, Challenges and Applications

    Full text link
    Research in Graph Signal Processing (GSP) aims to develop tools for processing data defined on irregular graph domains. In this paper we first provide an overview of core ideas in GSP and their connection to conventional digital signal processing. We then summarize recent developments in developing basic GSP tools, including methods for sampling, filtering or graph learning. Next, we review progress in several application areas using GSP, including processing and analysis of sensor network data, biological data, and applications to image processing and machine learning. We finish by providing a brief historical perspective to highlight how concepts recently developed in GSP build on top of prior research in other areas.Comment: To appear, Proceedings of the IEE

    ‘Did the speaker change?’: Temporal tracking for overlapping speaker segmentation in multi-speaker scenarios

    Get PDF
    Diarization systems are an essential part of many speech processing applications, such as speaker indexing, improving automatic speech recognition (ASR) performance and making single speaker-based algorithms available for use in multi-speaker domains. This thesis will focus on the first task of the diarization process, that being the task of speaker segmentation which can be thought of as trying to answer the question ‘Did the speaker change?’ in an audio recording. This thesis starts by showing that time-varying pitch properties can be used advantageously within the segmentation step of a multi-talker diarization system. It is then highlighted that an individual’s pitch is smoothly varying and, therefore, can be predicted by means of a Kalman filter. Subsequently, it is shown that if the pitch is not predictable, then this is most likely due to a change in the speaker. Finally, a novel system is proposed that uses this approach of pitch prediction for speaker change detection. This thesis then goes on to demonstrate how voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. A novel system is proposed to track multiple harmonics simultaneously, allowing for the determination of onsets and end-points of a speaker’s utterance in the presence of an additional active speaker. This thesis then extends this work to explore the use of a new multimodal approach for overlapping speaker segmentation that tracks both the fundamental frequency (F0) and direction of arrival (DoA) of each speaker simultaneously. The proposed multiple hypothesis tracking system, which simultaneously tracks both features, shows an improvement in segmentation performance when compared to tracking these features separately. Lastly, this thesis focuses on the DoA estimation part of the newly proposed multimodal approach. It does this by exploring a polynomial extension to the multiple signal classification (MUSIC) algorithm, spatio-spectral polynomial (SSP)-MUSIC, and evaluating its performance when using speech sound sources.Open Acces
    • 

    corecore