175 research outputs found

    An iterative model-based approach to cochannel speech separation


    Toward the pre-cocktail party problem with TasTas++

    Deep neural networks with dual-path bi-directional long short-term memory (BiLSTM) blocks have proved very effective in sequence modeling, especially in speech separation, e.g. DPRNN-TasNet \cite{luo2019dual} and TasTas \cite{shi2020speech}. In this paper, we propose two improvements to TasTas \cite{shi2020speech} for end-to-end monaural speech separation in pre-cocktail party problems: 1) generating new training data from the original training batch in real time, and 2) training each module in TasTas separately. The new approach, called TasTas++, takes the mixed utterance of five speakers and maps it to five separated utterances, each containing only one speaker's voice. As the objective, we train the network by directly optimizing the utterance-level scale-invariant signal-to-distortion ratio (SI-SDR) in a permutation invariant training (PIT) style. Our experiments on the public WSJ0-5mix data corpus yield an 11.14 dB SDR improvement, which shows that the proposed networks improve performance on the speaker separation task. We have open-sourced our re-implementation of DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation, and TasTas++ is built on this implementation, so we believe the results in this paper can be reproduced with ease. Comment: arXiv admin note: substantial text overlap with arXiv:2001.08998, arXiv:1902.04891, arXiv:1902.00651, arXiv:2008.0314
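
    The SI-SDR objective with permutation invariant training mentioned in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `si_sdr` and `pit_si_sdr`, the zero-mean normalization, and the exhaustive search over permutations are assumptions made for the example. For five speakers the search covers 5! = 120 permutations.

```python
from itertools import permutations

import numpy as np


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals (illustrative helper)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; rescaling the estimate cancels out.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    noise = estimate - projection
    ratio = (np.dot(projection, projection) + eps) / (np.dot(noise, noise) + eps)
    return 10.0 * np.log10(ratio)


def pit_si_sdr(estimates, targets):
    """Return the best mean SI-SDR over all speaker permutations and that permutation.

    perm[i] is the index of the estimate assigned to target i.
    """
    n = len(targets)
    best_score, best_perm = -np.inf, None
    for perm in permutations(range(n)):
        score = np.mean([si_sdr(estimates[p], targets[i]) for i, p in enumerate(perm)])
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm


# Toy usage: three random "speakers", estimates shuffled, rescaled and slightly noisy.
rng = np.random.default_rng(0)
targets = rng.standard_normal((3, 1000))
estimates = 0.5 * targets[[2, 0, 1]] + 0.01 * rng.standard_normal((3, 1000))
score, perm = pit_si_sdr(estimates, targets)
```

    In a training loop, the negative of the best score would typically serve as the loss, so that minimizing the loss maximizes SI-SDR under the best speaker assignment.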

    Robust Transceiver Design for MISO Interference Channel with Energy Harvesting

    In this paper, we consider a multiuser multiple-input single-output (MISO) interference channel where the received signal is divided into two parts for information decoding and energy harvesting (EH), respectively. The transmit beamforming vectors and receive power splitting (PS) ratios are jointly designed to minimize the total transmission power subject to both signal-to-interference-plus-noise ratio (SINR) and EH constraints. Most joint beamforming and power splitting (JBPS) designs assume that perfect channel state information (CSI) is available; however, CSI errors are inevitable in practice. To overcome this limitation, we study the robust JBPS design problem assuming a norm-bounded error (NBE) model for the CSI. Three solution approaches are proposed for the robust JBPS problem, each leading to a different computational algorithm. First, an efficient semidefinite relaxation (SDR)-based approach is presented to solve the highly non-convex JBPS problem, where the latter can be formulated as a semidefinite programming (SDP) problem. A rank-one recovery method is provided to recover a robust feasible solution to the original problem. Second, based on second-order cone programming (SOCP) relaxation, we propose a low-complexity approach with the aid of a closed-form robust solution recovery method. Third, a new iterative method is provided that can achieve near-optimal performance when the SDR-based algorithm results in a higher-rank solution. We prove that this iterative algorithm monotonically converges to a Karush-Kuhn-Tucker (KKT) solution of the robust JBPS problem. Finally, simulation results are presented to validate the robustness and efficiency of the proposed algorithms. Comment: 13 pages, 8 figures. arXiv admin note: text overlap with arXiv:1407.0474 by other author
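
    The abstract does not state the optimization problem itself. One common way to write a JBPS problem of this kind is sketched below; every symbol is an assumption introduced for illustration and may differ from the paper's notation: $\mathbf{w}_k$ is the beamformer of transmitter $k$, $\mathbf{h}_{jk}$ the channel from transmitter $j$ to receiver $k$, $\rho_k$ the PS ratio, $\sigma_k^2$ and $\delta_k^2$ the antenna and processing noise powers, $\zeta_k$ the EH conversion efficiency, and $\gamma_k$, $e_k$ the SINR and EH targets.

```latex
\begin{align}
\min_{\{\mathbf{w}_k,\,\rho_k\}} \quad & \sum_{k=1}^{K} \|\mathbf{w}_k\|^2 \\
\text{s.t.} \quad
& \frac{\rho_k\,|\mathbf{h}_{kk}^{H}\mathbf{w}_k|^2}
       {\rho_k\Big(\sum_{j \neq k} |\mathbf{h}_{jk}^{H}\mathbf{w}_j|^2 + \sigma_k^2\Big) + \delta_k^2}
  \ge \gamma_k, \\
& \zeta_k (1-\rho_k)\Big(\sum_{j=1}^{K} |\mathbf{h}_{jk}^{H}\mathbf{w}_j|^2 + \sigma_k^2\Big) \ge e_k, \\
& 0 < \rho_k < 1, \qquad k = 1,\dots,K.
\end{align}
% Assumed norm-bounded error (NBE) model: the true channel lies in a ball
% around its estimate, and the constraints must hold for every such channel.
\begin{equation}
\mathbf{h}_{jk} = \hat{\mathbf{h}}_{jk} + \mathbf{e}_{jk}, \qquad \|\mathbf{e}_{jk}\| \le \varepsilon_{jk}.
\end{equation}
```

    Under these assumptions, the robust design would require the SINR and EH constraints to hold for all error vectors within the bound, which is what makes the problem harder than its perfect-CSI counterpart.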