119 research outputs found

    Grouping Based Blind Interference Alignment for K-user MISO Interference Channels

    We propose a blind interference alignment (BIA) scheme through staggered antenna switching that does not rely on the ideal channel assumption. Contrary to the ideal assumption that channels remain constant during the BIA symbol extension period, channel coefficients may change within a given symbol extension period when the coherence time of the channel is relatively short. To perform BIA perfectly under a realistic channel assumption, we propose a grouping based supersymbol structure for K-user interference channels that can adjust the supersymbol length to a given coherence time. It is proved that the supersymbol length can be reduced significantly by an appropriate grouping. Furthermore, it is shown that the grouping based supersymbol achieves higher degrees of freedom than the conventional method for a given coherence time. Comment: 5 pages, 3 figures, to appear in IEEE ISIT 201

    Retrospective Interference Alignment for Two-Cell Uplink MIMO Cellular Networks with Delayed CSIT

    In this paper, we propose a new retrospective interference alignment scheme for two-cell multiple-input multiple-output (MIMO) interfering multiple access channels (IMAC) with delayed channel state information at the transmitters (CSIT). It is shown that having delayed CSIT can strictly increase the sum degrees of freedom (sum-DoF) compared to the case of no CSIT. The key idea is to align multiple interfering signals from adjacent cells onto a low-dimensional subspace over time by fully exploiting the previously received signals as side information with outdated CSIT in a distributed manner. Remarkably, we show that retrospective interference alignment can achieve the optimal sum-DoF in the two-cell two-user scenario by providing a new outer bound. Comment: 7 pages, 2 figures, to appear in IEEE ICC 201

    Two Methods for Spoofing-Aware Speaker Verification: Multi-Layer Perceptron Score Fusion Model and Integrated Embedding Projector

    The use of deep neural networks (DNN) has dramatically elevated the performance of automatic speaker verification (ASV) over the last decade. However, ASV systems can be easily neutralized by spoofing attacks. Therefore, the Spoofing-Aware Speaker Verification (SASV) challenge was designed and held to promote the development of systems that can perform ASV while considering spoofing attacks by integrating ASV and spoofing countermeasure (CM) systems. In this paper, we propose two back-end systems: the multi-layer perceptron score fusion model (MSFM) and the integrated embedding projector (IEP). The MSFM, a score fusion back-end system, derives the SASV score from ASV and CM scores and embeddings. The IEP, on the other hand, combines ASV and CM embeddings into an SASV embedding and calculates the final SASV score based on cosine similarity. We effectively integrated ASV and CM systems through the proposed MSFM and IEP and achieved SASV equal error rates of 0.56% and 1.32% on the official evaluation trials of the SASV 2022 challenge. Comment: 5 pages, 4 figures, 5 tables, accepted to 2022 Interspeech as a conference paper
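
The abstract above mentions scoring a trial by the cosine similarity between embeddings. As a minimal sketch of that scoring step only, assuming embeddings are plain lists of floats, the following may help. The function names `cosine_similarity` and `accept` and the threshold argument are hypothetical and are not taken from the paper; the MSFM and IEP models themselves are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def accept(enroll_embedding, test_embedding, threshold):
    """Hypothetical decision rule: accept the trial if the score reaches the threshold."""
    return cosine_similarity(enroll_embedding, test_embedding) >= threshold
```

In a verification setting, the threshold would typically be tuned on held-out trials; the value is left to the caller here because the abstract does not state one.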

    Sea-Rise Flooding on Massive Dynamic Terrains

    One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

    The application of speech self-supervised learning (SSL) models has achieved remarkable performance in speaker verification (SV). However, the computational cost of employing them is a hurdle that makes development and deployment difficult. Several studies have simply compressed SSL models through knowledge distillation (KD) without considering the target task. Consequently, these methods could not extract SV-tailored features. This paper proposes One-Step Knowledge Distillation and Fine-Tuning (OS-KDFT), which incorporates KD and fine-tuning (FT). We optimize a student model for SV during KD training to avoid distilling information that is inappropriate for SV. OS-KDFT reduces the size of the Wav2Vec 2.0 based ECAPA-TDNN by approximately 76.2% and the SSL model's inference time by 79%, while presenting an EER of 0.98%. The proposed OS-KDFT is validated across the VoxCeleb1 and VoxCeleb2 datasets and the W2V2 and HuBERT SSL models. Experiments are available on our GitHub
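
Several abstracts in this listing report an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from two lists of scores by sweeping the observed scores as thresholds. It is an illustration under stated assumptions, not the evaluation protocol used by any of the papers: the function name `equal_error_rate` is hypothetical, higher scores are assumed to mean "more likely the target speaker", and the result is the midpoint of the two error rates at the threshold where they are closest.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold."""
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # False rejection: a target trial scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scoring at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer
```

With few trials the two error curves rarely cross exactly at an observed score, which is why the sketch reports the midpoint at the closest threshold rather than interpolating.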

    Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

    The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of building task-specific models for target tasks. In the field of audio research, task-agnostic pre-trained models with high transferability and adaptability have achieved state-of-the-art performance through fine-tuning for downstream tasks. Nevertheless, re-training all the parameters of these massive models entails an enormous amount of time and cost, along with a huge carbon footprint. To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain. We also propose an integrated parameter-efficient tuning (IPET) framework by aggregating the embedding prompt (a prompt-based learning approach) and the adapter (an effective transfer learning method). We demonstrate the efficacy of the proposed framework using two backbone pre-trained audio models with different characteristics: the audio spectrogram transformer and wav2vec 2.0. The proposed IPET framework exhibits remarkable performance compared to the fine-tuning method, with fewer trainable parameters, in four downstream tasks: sound event classification, music genre classification, keyword spotting, and speaker verification. Furthermore, the authors identify and analyze the shortcomings of the IPET framework, providing lessons and research directions for parameter-efficient tuning in the audio domain. Comment: 5 pages, 3 figures, submitted to ICASSP202