1,379 research outputs found

Convolution channel separation and frequency sub-bands aggregation for music genre classification

Author: Heo Jungwoo
Kim Ju-ho
Lim Chan-yeong
Shin Hyun-seo
Yu Ha-Jin
Publication date: 03/11/2022

In music, short-term features such as pitch and tempo constitute long-term semantic features such as melody and narrative. A music genre classification (MGC) system should be able to analyze these features. In this research, we propose a novel framework that can extract and aggregate both short- and long-term features hierarchically. Our framework is based on ECAPA-TDNN, where all the layers that extract short-term features are affected by the layers that extract long-term features because of back-propagation training. To prevent the distortion of short-term features, we devised the convolution channel separation technique, which separates short-term features from long-term feature extraction paths. To extract more diverse features from our framework, we incorporated the frequency sub-bands aggregation method, which divides the input spectrogram along frequency bandwidths and processes each segment. We evaluated our framework using the Melon Playlist dataset, a large-scale dataset containing 600 times more data than GTZAN, which is a widely used dataset in MGC studies. As a result, our framework achieved 70.4% accuracy, a 16.9% improvement over a conventional framework.

arXiv.org e-Print Archive
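
The frequency sub-bands aggregation idea in the abstract above (divide the input spectrogram along the frequency axis, then process each segment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-height split, and the per-band averaging processor are all assumptions chosen to keep the example self-contained.

```python
from typing import Callable, List

# Rows are frequency bins, columns are time frames.
Spectrogram = List[List[float]]


def split_sub_bands(spec: Spectrogram, num_bands: int) -> List[Spectrogram]:
    """Split a spectrogram into contiguous frequency sub-bands of near-equal height."""
    num_bins = len(spec)
    if not 1 <= num_bands <= num_bins:
        raise ValueError("num_bands must be between 1 and the number of frequency bins")
    base, extra = divmod(num_bins, num_bands)
    bands: List[Spectrogram] = []
    start = 0
    for i in range(num_bands):
        # The first `extra` bands take one additional bin so every bin is used.
        size = base + (1 if i < extra else 0)
        bands.append(spec[start:start + size])
        start += size
    return bands


def mean_over_frequency(band: Spectrogram) -> List[float]:
    """Hypothetical per-band processor: average the band's bins for each time frame."""
    return [sum(column) / len(band) for column in zip(*band)]


def aggregate_sub_bands(
    spec: Spectrogram,
    num_bands: int,
    process: Callable[[Spectrogram], List[float]] = mean_over_frequency,
) -> List[float]:
    """Process each sub-band independently and concatenate the per-band outputs."""
    features: List[float] = []
    for band in split_sub_bands(spec, num_bands):
        features.extend(process(band))
    return features
```

In a real system the per-band processor would be a learned network rather than an average; the sketch only shows the split, process, and aggregate flow.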

Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

Author: Heo Jungwoo
Kim Ju-ho
Lim Chan-yeong
Shin Hyun-seo
Yu Ha-Jin
Publication date: 03/11/2022

The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of building task-specific models for target tasks. In the field of audio research, task-agnostic pre-trained models with high transferability and adaptability have achieved state-of-the-art performance through fine-tuning for downstream tasks. Nevertheless, re-training all the parameters of these massive models entails an enormous amount of time and cost, along with a huge carbon footprint. To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain. We also propose an integrated parameter-efficient tuning (IPET) framework by aggregating the embedding prompt (a prompt-based learning approach) and the adapter (an effective transfer learning method). We demonstrate the efficacy of the proposed framework using two backbone pre-trained audio models with different characteristics: the audio spectrogram transformer and wav2vec 2.0. The proposed IPET framework exhibits remarkable performance compared to the fine-tuning method, with fewer trainable parameters, in four downstream tasks: sound event classification, music genre classification, keyword spotting, and speaker verification. Furthermore, the authors identify and analyze the shortcomings of the IPET framework, providing lessons and research directions for parameter-efficient tuning in the audio domain.
Comment: 5 pages, 3 figures, submit to ICASSP202

arXiv.org e-Print Archive

One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

Author: Heo Jungwoo
Kim Ju-ho
Lim Chan-yeong
Shin Hyun-seo
Yu Ha-Jin
Publication date: 27/05/2023

The application of speech self-supervised learning (SSL) models has achieved remarkable performance in speaker verification (SV). However, their computational cost makes development and deployment difficult. Several studies have simply compressed SSL models through knowledge distillation (KD) without considering the target task. Consequently, these methods could not extract SV-tailored features. This paper proposes One-Step Knowledge Distillation and Fine-Tuning (OS-KDFT), which combines KD and fine-tuning (FT). We optimize a student model for SV during KD training to avoid distilling information that is inappropriate for SV. OS-KDFT reduces the size of the Wav2Vec 2.0 based ECAPA-TDNN by approximately 76.2% and the SSL model's inference time by 79% while achieving an EER of 0.98%. The proposed OS-KDFT is validated on the VoxCeleb1 and VoxCeleb2 datasets with the W2V2 and HuBERT SSL models. Experiments are available on our GitHub.

arXiv.org e-Print Archive
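
The abstract above describes optimizing the student for the target task during distillation rather than afterwards. One common way to express that idea is a single objective that sums a distillation term and a task term. The sketch below is an assumption-laden illustration, not OS-KDFT itself: the mean-squared-error distillation term, the cross-entropy task term, and the `kd_weight` parameter are all hypothetical choices.

```python
import math
from typing import Sequence


def mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Distillation term (assumed): mean squared error between feature vectors."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher features must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Task term (assumed): softmax cross-entropy for one example."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def joint_loss(
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    logits: Sequence[float],
    target: int,
    kd_weight: float = 1.0,
) -> float:
    """One-step objective: weighted distillation loss plus task loss."""
    return kd_weight * mse(student_feat, teacher_feat) + cross_entropy(logits, target)
```

Setting `kd_weight` to 0 reduces the objective to plain fine-tuning, which makes the trade-off between the two terms easy to probe.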

Nucleotide sequence of the vmhA gene encoding hemolysin from Vibrio mimicus

Author: Huh Sung-Hoi
Kim Gu-Taek
Kong In-Soo
Lee Jong-Young
Yu Ju-Hyun
Publication venue: Published by Elsevier B.V.
Publication date: 12/04/1997

The structural gene (vmhA) of hemolysin from Vibrio mimicus (ATCC33653) was cloned and sequenced. The vmhA gene contains an open reading frame of 2232 nucleotides, which can code for a protein of 744 amino acids with a predicted molecular mass of 83 059. The amino acid sequence shows 81.6% identity with Vibrio cholerae El Tor hemolysin.

Elsevier - Publisher Connector

PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

Author: Heo Jungwoo
Kim Ju-ho
Kim Wonbin
Lim Chan-yeong
Shin Hyun-seo
Yu Ha-Jin
Publication date: 20/07/2023

Background noise reduces speech intelligibility and quality, making speaker verification (SV) in noisy environments a challenging task. To improve the noise robustness of SV systems, additive noise data augmentation has been commonly used. In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments. The experimental results demonstrate that PAS outperforms traditional additive noise in terms of equal error rate (EER), with relative improvements of 4.64% and 5.01% observed in SE-ResNet34 and ECAPA-TDNN, respectively. We also show the effectiveness of the proposed method by analyzing attention modules and visualizing speaker embeddings.
Comment: 5 pages, 2 figures, 1 table, accepted to CKAIA2023 as a conference paper

arXiv.org e-Print Archive
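
For context on the baseline named in the abstract above, conventional additive noise augmentation mixes a noise clip into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below shows only that conventional baseline; it does not implement PAS, whose details the abstract does not give. The function names are hypothetical, and signals are plain lists of samples of equal length.

```python
import math
from typing import List, Sequence


def power(signal: Sequence[float]) -> float:
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def add_noise_at_snr(
    speech: Sequence[float], noise: Sequence[float], snr_db: float
) -> List[float]:
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR (dB) = 10 * log10(P_speech / P_noise), solved for the target noise power.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(speech, noise)]
```

Lower `snr_db` values produce noisier training examples; augmentation pipelines typically sample the SNR from a range for each utterance.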

High-speed infrared phase modulators using short helical pitch ferroelectric liquid crystals

Author: Kim Dong-Woo
Lee Ju-Hyun
Lee Sin-Doo
Wu Shin-Tson
Wu Yung-Hsun
Yu Chang-Jae
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2005

A fast phase modulator based on ferroelectric liquid crystal (FLC) is demonstrated and its performance is characterized. For uniform alignment and pure phase modulation, we propose a new FLC device configuration using short helical pitch material and a homeotropic alignment structure. This device is driven by periodic in-plane electrode stripes implemented on the surface of both cell substrates. As a result, we have obtained large phase modulation (> 2π at λ = 1.55 μm) and fast response (< 200 μs).

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Effect of rapid influenza diagnostic tests on patient management in an emergency department

Author: Hanjin Cho
Jong-Hak Park
Ju Young Kim
Ju-Hyun Song
Sungwoo Moon
Yu-Sang Ahn
Publication venue: 'The Korean Society of Emergency Medicine'
Publication date: 01/03/2019

Objective: We evaluated the effect of rapid influenza diagnostic tests (RIDTs) on patient management in an emergency department for 3 years after 2009, and also identified factors associated with the choice of treatment for patients with influenza-like illnesses. Methods: The study period consisted of three influenza epidemic seasons. Patients older than 15 years who underwent RIDTs in the emergency department and were then discharged without admission were included. Results: A total of 453 patients were enrolled, 114 of whom had positive RIDT results and 339 had negative results. Antiviral medication was prescribed to 103 patients (90.4%) who had positive RIDT results, while 1 patient (0.3%) who tested negative was treated with antivirals (P<0.001). Conservative care was administered to 11 RIDT-positive patients (9.6%) and 244 RIDT-negative patients (72.0%) (P<0.001). Symptom onset in less than 48 hours, being older than 65 years, and the presence of comorbidities were not associated with the administration of antiviral therapy. Conclusion: RIDT results had a critical effect on physician decision-making regarding antiviral treatment for patients with influenza-like illnesses in the emergency department. However, symptom onset in less than 48 hours, old age, and comorbidities, which are all indications for antiviral therapy, were not found to influence the administration of antiviral treatment.

Directory of Open Access Journals
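
The percentages reported in the abstract above can be reproduced from the stated counts. The short check below uses only the numbers given in the abstract; the helper name `percent` and the dictionary layout are illustrative assumptions.

```python
def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract.
positive_total = 114
negative_total = 339
enrolled = positive_total + negative_total

rates = {
    "antivirals, RIDT-positive": percent(103, positive_total),
    "antivirals, RIDT-negative": percent(1, negative_total),
    "conservative care, RIDT-positive": percent(11, positive_total),
    "conservative care, RIDT-negative": percent(244, negative_total),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

The two RIDT-positive groups (103 and 11) account for all 114 positive patients, and the positive and negative groups sum to the 453 enrolled.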

Sequences of the Cytochrome C Oxidase Subunit I (COI) Gene are Suitable for Species Identification of Korean Calliphorinae Flies of Forensic Importance (Diptera: Calliphoridae)

Author: Hwang Juck-Joon
Jeong Hyun Ju
Jo Tae-Ho
Park Seong Hwan
Piao Huguo
Yoo Ga Young
Yu Dong Ha
Zhang Yong
Publication venue: 'Wiley'
Publication date: 01/09/2009

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/75068/1/j.1556-4029.2009.01126.x.pd

Deep Blue Documents at the University of Michigan