587 research outputs found

    Vocoder-free End-to-End Voice Conversion with Transformer Network

    Mel-frequency filter bank (MFB) based approaches make speech easier to learn than the raw spectrum because MFB features are smaller. However, speech generators built on MFB require an additional vocoder, which is computationally expensive to train. The extra pre- and post-processing, such as MFB extraction and the vocoder, is not essential for converting one person's speech to another's. It is possible to use only the raw spectrum along with the phase to generate different voice styles with clear pronunciation. In this regard, we propose a fast and effective approach that converts realistic voices from the raw spectrum in a parallel manner. Our transformer-based model architecture, which has no CNN or RNN layers, learns quickly and removes the sequential-computation limitation of conventional RNNs. In this paper, we introduce a vocoder-free end-to-end voice conversion method using a transformer network. The presented conversion model can also be used for speaker adaptation in speech recognition. Our approach converts the source voice to a target voice without using MFB or a vocoder. We can obtain an adapted MFB for speech recognition by multiplying the converted magnitude with the phase. We perform our voice conversion experiments on the TIDIGITS dataset and evaluate naturalness, similarity, and clarity with mean opinion scores. Comment: Work in progress
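
The step of multiplying a converted magnitude with the phase can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single 256-sample toy frame, and the identity "conversion" standing in for a trained model are all assumptions made for illustration.

```python
import numpy as np

def split_magnitude_phase(spectrum):
    """Split a complex spectrum into magnitude and phase (radians)."""
    return np.abs(spectrum), np.angle(spectrum)

def combine_magnitude_phase(magnitude, phase):
    """Rebuild a complex spectrum as magnitude * exp(j * phase)."""
    return magnitude * np.exp(1j * phase)

# Toy input: one frame of random samples (hypothetical stand-in for speech).
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)

# Raw spectrum of the frame, with no mel filter bank applied.
spectrum = np.fft.rfft(frame)
magnitude, phase = split_magnitude_phase(spectrum)

# Placeholder for a conversion model's output; the identity is used here
# only so the round trip can be checked.
converted_magnitude = magnitude

# Multiply the (converted) magnitude with the phase and invert the transform.
rebuilt = np.fft.irfft(combine_magnitude_phase(converted_magnitude, phase), n=len(frame))
```

With the identity placeholder, `rebuilt` matches `frame` up to floating-point error, which shows that magnitude and phase together carry the full signal without a vocoder.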

    Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance

    Deep generative models have emerged as a promising approach in the medical image domain to address data scarcity. However, their use for sequential data such as respiratory sounds is less explored. In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder. We also demonstrate a simple yet effective adversarial fine-tuning method that aligns features between synthetic and real respiratory sound samples to improve respiratory sound classification performance. Our experimental results on the ICBHI dataset demonstrate that the proposed adversarial fine-tuning is effective, whereas using only the conventional augmentation method degrades performance. Moreover, our method outperforms the baseline by 2.24% on the ICBHI Score and improves the accuracy of the minority classes by up to 26.58%. For the supplementary material, we provide the code at https://github.com/kaen2891/adversarial_fine-tuning_using_generated_respiratory_sound. Comment: accepted in NeurIPS 2023 Workshop on Deep Generative Models for Health (DGM4H)

    Application of the Blister Test to Assess Reliability of Polyimide Based Retinal Electrode

    NBS-ERC supported by KOSEF & Korea Health 21 R&D Project (A050251) supported by Ministry of Health & Welfare

    Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

    Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, respiratory sound samples are collected from a variety of electronic stethoscopes, which could introduce biases into the trained models. When a significant distribution shift occurs within the test dataset or in a practical scenario, it can substantially decrease performance. To tackle this issue, we introduce cross-domain adaptation techniques, which transfer knowledge from a source domain to a distinct target domain. In particular, by treating different stethoscope types as individual domains, we propose a novel stethoscope-guided supervised contrastive learning approach. This method can mitigate domain-related disparities and thus enables the model to distinguish respiratory sounds despite the recording variation across stethoscopes. The experimental results on the ICBHI dataset demonstrate that the proposed methods are effective in reducing the domain dependency and achieve an ICBHI Score of 61.71%, which is a significant improvement of 2.16% over the baseline. Comment: accepted to ICASSP 202
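
As a rough illustration of supervised contrastive learning with stethoscope types as labels, the sketch below implements a generic supervised contrastive loss in NumPy. It is not the authors' method: the function name, the temperature value, the toy embeddings, and the use of integer stethoscope IDs as the labels are assumptions, and the abstract does not specify how the paper combines domain and class labels.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss: pull same-label samples together.

    `labels` could be hypothetical stethoscope (domain) IDs or class IDs.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # L2-normalise so the dot product is a cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor's similarity with itself from the softmax.
    sim = np.where(self_mask, -np.inf, sim)
    # Numerically stable log-softmax over all other samples.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    # Positives: other samples that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())

# Toy embeddings forming two tight clusters (hypothetical data).
toy = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss_aligned = supervised_contrastive_loss(toy, [0, 0, 1, 1])  # labels follow clusters
loss_mixed = supervised_contrastive_loss(toy, [0, 1, 0, 1])    # labels cut across clusters
```

The loss is small when samples with the same label already sit close together and large when they do not, which is the signal a training loop would minimise.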

    Electron−hole separation in ferroelectric oxides for efficient photovoltaic responses

    Despite their potential to exceed the theoretical Shockley−Queisser limit, ferroelectric photovoltaics (FPVs) have performed inefficiently due to their extremely low photocurrents. Incorporating Bi₂FeCrO₆ (BFCO) as the light absorber in FPVs has recently led to impressively high and record photocurrents [Nechache R, et al. (2015) Nat Photonics 9:61–67], which has revived the FPV field. However, our understanding of this remarkable phenomenon is far from satisfactory. Here, we use first-principles calculations to determine that such excellent performance mainly lies in the efficient separation of electron−hole (e-h) pairs. We show that photoexcited electrons and holes in BFCO are spatially separated on the Fe and Cr sites, respectively. This separation is much more pronounced in disordered BFCO phases, which adequately explains the observed exceptional PV responses. We further establish a design strategy to discover next-generation FPV materials. By exploring 44 additional Bi-based double-perovskite oxides, we suggest five active-layer materials that offer a combination of strong e-h separations and visible-light absorptions for FPV applications. Our work indicates that charge separation is the most important issue to be addressed for FPVs to compete with conventional devices. Keywords: ferroelectrics; double perovskites; photovoltaics; e-h separation; density functional theory

    Chiral magnetoresistance in Pt/Co/Pt zigzag wires

    The Rashba effect leads to a chiral precession of the spins of moving electrons, while the Dzyaloshinskii-Moriya interaction (DMI) generates a preference towards a chiral profile of local spins. We predict that the exchange interaction between these two spin systems results in a 'chiral' magnetoresistance that depends on the chirality of the local spin texture. We observe this magnetoresistance by measuring the domain wall (DW) resistance in a uniquely designed Pt/Co/Pt zigzag wire and by changing the chirality of the DW through an applied in-plane magnetic field. A chirality-dependent DW resistance is found, and a quantitative analysis shows good agreement with a theory based on the Rashba model. Moreover, the DW resistance measurement allows us to independently determine the strength of the Rashba effect and the DMI simultaneously, and the result implies a possible correlation between the Rashba effect, the DMI, and the symmetric Heisenberg exchange.

    Enhanced magnetic and thermoelectric properties in epitaxial polycrystalline SrRuO3 thin film

    Transition metal oxide thin films show versatile electrical, magnetic, and thermal properties, which can be tailored by deliberately introducing macroscopic grain boundaries via polycrystalline solids. In this study, we focus on modifying the magnetic and thermal transport properties by fabricating single- and polycrystalline epitaxial SrRuO3 thin films using pulsed laser epitaxy. Using an epitaxial stabilization technique with an atomically flat polycrystalline SrTiO3 substrate, we realize an epitaxial polycrystalline SrRuO3 thin film in which the crystalline quality of each grain is comparable to that of its single-crystalline counterpart. In particular, alleviated compressive strain near the grain boundaries due to coalescence is evidenced structurally, and it enhances the ferromagnetic ordering of the polycrystalline epitaxial thin film. The structural variations associated with the grain boundaries further reduce the thermal conductivity without deteriorating the electronic transport, and lead to enhanced thermoelectric efficiency in the epitaxial polycrystalline thin films compared with their single-crystalline counterpart. Comment: 24 pages, 5 figures

    Electrically Evoked Cortical Potentials (EECP) in Rabbits Using Implantable Retinal Stimulation System

    NBS-ERC Supported by KOSEF (Grant R11-2000-075-01001-0) & Korea Health 21 R&D Project MOHW A05025