25 research outputs found

    Representation Selective Self-distillation and wav2vec 2.0 Feature Exploration for Spoof-aware Speaker Verification

    Text-to-speech and voice conversion studies are constantly improving, to the extent that they can produce synthetic speech almost indistinguishable from bona fide human speech. In this regard, countermeasures (CM) against synthetic voice attacks on automatic speaker verification (ASV) systems are becoming increasingly important. Nonetheless, most end-to-end spoofing detection networks are black-box systems, and it remains unclear which representation is effective for finding artifacts. In this paper, we examine which feature space can effectively represent synthetic artifacts using wav2vec 2.0, and study which architecture can effectively utilize that space. Our study allows us to analyze which attributes of speech signals are advantageous for CM systems. The proposed CM system achieved a 0.31% equal error rate (EER) on the ASVspoof 2019 LA evaluation set for the spoof detection task. We further propose a simple yet effective spoofing-aware speaker verification (SASV) methodology, which takes advantage of the disentangled representations from our countermeasure system. Evaluation performed with the SASV Challenge 2022 database shows an SASV EER of 1.08%. Quantitative analysis shows that using the explored feature space of wav2vec 2.0 benefits both spoofing CM and SASV. Comment: Submitted to Interspeech 202
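
The equal error rate (EER) quoted in the abstract above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how such a figure can be computed from raw detector scores, not the authors' evaluation code. The function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more bona fide'.

    Hypothetical helper for illustration; it sweeps every observed score as a
    candidate threshold and picks the one where the two error rates are closest.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    thresholds = np.unique(np.concatenate([bonafide, spoof]))  # sorted, unique

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores, invented for this example only.
    eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

With finite score sets the two error curves rarely cross exactly, so the sketch averages them at the closest point. Evaluation toolkits may interpolate differently, which can shift the reported value slightly.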

    Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data

    Music source separation (MSS) faces challenges due to the limited availability of correctly labeled individual instrument tracks. With the push to acquire larger datasets to improve MSS performance, encountering mislabeled individual instrument tracks becomes inevitable and is a significant challenge to address. This paper introduces an automated technique for refining the labels in a partially mislabeled dataset. Our proposed self-refining technique, employed with a noisy-labeled dataset, results in only a 1% accuracy degradation in multi-label instrument recognition compared to a classifier trained on a clean-labeled dataset. The study demonstrates the importance of refining noisy-labeled data in MSS model training and shows that utilizing the refined dataset leads to results comparable to those obtained with a clean-labeled dataset. Notably, when only a noisy dataset is available, MSS models trained on a self-refined dataset even outperform those trained on a dataset refined with a classifier trained on clean labels. Comment: 24th International Society for Music Information Retrieval Conference (ISMIR 2023)

    Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects

    We propose an end-to-end music mixing style transfer system that converts the mixing style of an input multitrack to that of a reference song. This is achieved with an encoder pre-trained with a contrastive objective to extract only audio-effects-related information from a reference music recording. All our models are trained in a self-supervised manner from an already-processed wet multitrack dataset, using an effective data preprocessing method that alleviates the scarcity of unprocessed dry data. We analyze the proposed encoder's capability to disentangle audio effects and also validate its performance for mixing style transfer through both objective and subjective evaluations. The results show that the proposed system not only converts the mixing style of multitrack audio close to that of a reference but is also robust for mixture-wise style transfer when used with a music source separation model.
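
The abstract above mentions an encoder pre-trained with a contrastive objective but does not say which one. As a rough illustration of the general idea, the sketch below implements an InfoNCE-style loss in NumPy. The choice of InfoNCE, the name `info_nce_loss`, the temperature value, and the toy embeddings are assumptions, not details taken from the paper. Each row of `anchors` is pulled toward the same-index row of `positives` (for example, two segments sharing the same audio effects) and pushed away from every other row.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of embedding pairs.

    Hypothetical helper: anchors[i] and positives[i] form a matching pair;
    all other rows of `positives` serve as negatives for anchors[i].
    """
    a = np.asarray(anchors, dtype=float)
    p = np.asarray(positives, dtype=float)
    # L2-normalise so the dot product is cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)

    logits = a @ p.T / temperature                # shape (N, N)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    emb = np.eye(4)  # toy embeddings, invented for this example
    print("matched pairs :", info_nce_loss(emb, emb))
    print("shuffled pairs:", info_nce_loss(emb, np.roll(emb, 1, axis=0)))
```

Matched pairs should give a loss near zero, while shuffled pairs give a much larger value. That gap is the training signal a contrastive encoder learns from.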

    Frequency and clinical implications of the isolation of rare nontuberculous mycobacteria

    Background: To date, more than 125 species of nontuberculous mycobacteria (NTM) have been identified. In this study, we investigated the frequency and clinical implications of rarely isolated NTM from respiratory specimens. Methods: Patients with NTM isolated from their respiratory specimens between July 1, 2010 and June 31, 2012 were screened for inclusion. Rare NTM were defined as those NTM not falling within the group of eight NTM species commonly identified at our institution: Mycobacterium avium, M. intracellulare, M. abscessus, M. massiliense, M. fortuitum, M. kansasii, M. gordonae, and M. peregrinum. Clinical, radiographic and microbiological data from patients with rare NTM were reviewed and analyzed. Results: During the study period, 73 rare NTM were isolated from the respiratory specimens of 68 patients. Among these, M. conceptionense was the most common (nine patients, 12.3%). The median age of the 68 patients with rare NTM was 68 years, and 39 of the patients were male. Rare NTM were isolated only once in the majority of patients (64 patients, 94.1%). Among the four patients from whom rare NTM were isolated two or more times, only two showed radiographic aggravation caused by rare NTM during the follow-up period. Conclusions: Most of the rarely identified NTM species were isolated from respiratory specimens only once per patient, without concomitant clinical aggravation. Clinicians could therefore observe such patients closely without invasive work-ups or treatment, provided the patients do not have decreased host immunity towards mycobacteria. Peer Reviewed
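
The percentages in the Results can be checked against the counts the abstract gives. The short sketch below does that arithmetic, using a hypothetical helper named `percent`. It suggests that the 12.3% for M. conceptionense is computed over the 73 rare NTM isolates rather than the 68 patients, while the 94.1% is computed over the 68 patients.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print(percent(9, 73))   # 12.3 -> matches the reported 12.3% (per isolate)
    print(percent(9, 68))   # 13.2 -> what a per-patient figure would be
    print(percent(64, 68))  # 94.1 -> matches the reported 94.1% (per patient)
```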

    Ultrasensitive Plasmon-Free Surface-Enhanced Raman Spectroscopy with Femtomolar Detection Limit from 2D van der Waals Heterostructure

    Two-dimensional (2D) materials have been promoted as an ideal platform for surface-enhanced Raman spectroscopy (SERS), as they mitigate the drawbacks of noble metal-based SERS substrates. However, their inferior limit of detection has restricted the practical applicability of 2D material-based SERS substrates. Here, we synthesize uniform large-area ReOxSy thin films via solution-phase deposition without post-treatments and demonstrate a graphene/ReOxSy vertical heterostructure as an ultrasensitive SERS platform. The electronic structure of ReOxSy can be modulated by changing the oxygen concentration in the lattice structure, yielding efficient complementary resonance effects between ReOxSy and the probe molecule. In addition, the oxygen atoms in the ReOxSy lattice generate a dipole moment on the thin-film surface, which increases the electron transition probability. These synergistic effects markedly enhance the Raman effect in the ReOxSy thin film. When ReOxSy forms a vertical heterostructure on graphene as the SERS substrate, the enhanced charge-transfer and exciton resonances improve the limit of detection to the femtomolar level, while achieving remarkable flexibility, reproducibility, and operational stability. Our results provide important insights into 2D material-based ultrasensitive SERS based on chemical mechanisms.