17 research outputs found

    Audio source separation into the wild

    This review chapter is dedicated to multichannel audio source separation in real-life environments. We explore some of the major achievements in the field and discuss some of the remaining challenges. We cover several important practical scenarios, e.g. moving sources and/or microphones, varying numbers of sources and sensors, high reverberation levels, spatially diffuse sources, and synchronization problems. Several applications, such as smart assistants, cellular phones, hearing aids and robots, are discussed. Our perspectives on the future of the field are given as concluding remarks of this chapter.

    Distributed GSC beamforming using the relative transfer function

    A speech enhancement algorithm for a wireless acoustic sensor network (WASN) in a noisy and reverberant enclosure is derived. The proposed algorithm is structured as a two-stage beamformer (BF) scheme, where the outputs of the first stage are transmitted over the network. Designing the second-stage BF requires estimating the desired signal components in the transmitted signals. The contribution here is twofold. First, in spatially static scenarios, the first-stage BFs are designed to maintain a fixed response towards the desired signal, as opposed to competing algorithms, where the response changes and must be re-estimated repeatedly. Second, the proposed algorithm is implemented in a generalized sidelobe canceler (GSC) form, separating the treatment of the desired speech and the interferences and enabling a simple time-recursive implementation of the algorithm. A comprehensive experimental study demonstrates the equivalent performance of the centralized GSC and of the proposed algorithm for both narrowband and speech signals.

    A consolidated perspective on multi-microphone speech enhancement and source separation

    Speech enhancement and separation are core problems in audio signal processing, with commercial applications in devices as diverse as mobile phones, conference call systems, hands-free systems, or hearing aids. In addition, they are crucial pre-processing steps for noise-robust automatic speech and speaker recognition. Many devices now have two to eight microphones. The enhancement and separation capabilities offered by these multichannel interfaces are usually greater than those of single-channel interfaces. Research in speech enhancement and separation has followed two convergent paths, starting with microphone array processing and blind source separation, respectively. These communities are now strongly interrelated and routinely borrow ideas from each other. Yet, a comprehensive overview of the common foundations and the differences between these approaches is lacking at present. In this article, we propose to fill this gap by analyzing a large number of established and recent techniques according to four transverse axes: a) the acoustic impulse response model, b) the spatial filter design criterion, c) the parameter estimation algorithm, and d) optional postfiltering. We conclude this overview paper by providing a list of software and data resources and by discussing perspectives and future trends in the field.

    Optimal distributed minimum-variance beamforming approaches for speech enhancement in wireless acoustic sensor networks

    © 2014 Elsevier B.V. In multiple speaker scenarios, the linearly constrained minimum variance (LCMV) beamformer is a popular microphone array-based speech enhancement technique, as it allows minimizing the noise power while maintaining a set of desired responses towards different speakers. Here, we address the algorithmic challenges arising when applying the LCMV beamformer in wireless acoustic sensor networks (WASNs), which are a next-generation technology for audio acquisition and processing. We review three optimal distributed LCMV-based algorithms, which compute a network-wide LCMV beamformer output at each node without centralizing the microphone signals. Optimality here refers to equivalence to a centralized realization where a single processor has access to all signals. We derive and motivate the algorithms in an accessible top-down framework that reveals their underlying relations. We explain how their differences result from their different design criteria (node-specific versus common constraint sets), and their different priorities for communication bandwidth, computational power, and adaptivity. Furthermore, although originally proposed for a fully connected WASN, we also explain how to extend the reviewed algorithms to the case of a partially connected WASN, which is assumed to be pruned to a tree topology. Finally, we discuss the advantages and disadvantages of the various algorithms.
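
    For orientation, the centralized LCMV criterion that the abstract above refers to can be written in its standard textbook form. The symbols are assumptions introduced here, not notation taken from the paper: w is the stacked filter vector, R_n the noise covariance matrix, C the constraint matrix whose columns are the steering vectors (or transfer functions) of the speakers, and g the vector of desired responses.

```latex
\begin{aligned}
\hat{\mathbf{w}} &= \arg\min_{\mathbf{w}} \; \mathbf{w}^{H} \mathbf{R}_{n} \mathbf{w}
\quad \text{subject to} \quad \mathbf{C}^{H} \mathbf{w} = \mathbf{g}, \\
\hat{\mathbf{w}} &= \mathbf{R}_{n}^{-1} \mathbf{C}
\left( \mathbf{C}^{H} \mathbf{R}_{n}^{-1} \mathbf{C} \right)^{-1} \mathbf{g}.
\end{aligned}
```

    The closed form follows from the method of Lagrange multipliers and assumes that R_n is invertible and that C has full column rank. The distributed algorithms reviewed in the paper aim to reproduce the output of this beamformer at each node without gathering all microphone signals in one place.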

    Near-field source extraction using speech presence probabilities for ad hoc microphone arrays

    Ad hoc wireless acoustic sensor networks (WASNs) hold great potential for improved performance in speech processing applications, thanks to better coverage and higher diversity of the received signals. We consider a multiple speaker scenario where each of the WASN nodes, an autonomous system comprising sensing, processing and communicating capabilities, is positioned in the near-field of one of the speakers. Each node aims at extracting its nearest speaker while suppressing other speakers and noise. The ad hoc network is characterized by an arbitrary number of speakers/nodes with an uncontrolled microphone constellation. In this paper we propose a distributed algorithm which shares information between nodes. The algorithm requires each node to transmit a single audio channel in addition to a soft time-frequency (TF) activity mask for its nearest speaker. The TF activity masks are computed as a combination of estimates of a model-based speech presence probability (SPP), direct-to-reverberant ratio (DRR) and direction of arrival (DOA) per TF bin. The proposed algorithm, although sub-optimal compared to the centralized solution, is superior to the single-node solution.
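
    The abstract above does not say how the three per-bin estimates are combined, so the following is only a minimal sketch under stated assumptions: each estimate is first mapped to a score in [0, 1] and the scores are multiplied. The function names, the DRR/(1 + DRR) mapping, the Gaussian DOA window and its 20-degree width are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math
from typing import List

Grid = List[List[float]]  # indexed as [time frame][frequency bin]


def drr_to_score(drr_linear: float) -> float:
    """Map a linear-scale DRR (>= 0) to [0, 1). Hypothetical mapping."""
    return drr_linear / (1.0 + drr_linear)


def doa_to_score(doa_deg: float, target_deg: float, width_deg: float = 20.0) -> float:
    """Gaussian-shaped score that peaks when the DOA matches the target.

    The angular difference is wrapped to [-180, 180) degrees first.
    Both the window shape and its width are assumptions.
    """
    diff = (doa_deg - target_deg + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (diff / width_deg) ** 2)


def soft_mask(spp: Grid, drr: Grid, doa: Grid, target_deg: float) -> Grid:
    """Combine per-bin SPP, DRR and DOA estimates into one soft TF mask.

    Assumed rule: element-wise product of the three scores, so a bin is
    marked active only when all three cues agree.
    """
    return [
        [
            s * drr_to_score(r) * doa_to_score(d, target_deg)
            for s, r, d in zip(spp_row, drr_row, doa_row)
        ]
        for spp_row, drr_row, doa_row in zip(spp, drr, doa)
    ]


if __name__ == "__main__":
    # One frame, two frequency bins, target speaker at 30 degrees.
    mask = soft_mask([[1.0, 0.5]], [[1.0, 3.0]], [[30.0, 30.0]], target_deg=30.0)
    print(mask)
```

    A product is the most conservative combination, since any single low score suppresses the bin; a weighted average would be a more permissive alternative.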