3 research outputs found

    AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion

    This paper proposes a simple and robust zero-shot voice conversion system with a cycle structure and mel-spectrogram pre-processing. Previous works suffer from information loss and poor synthesis quality because they rely on a carefully designed bottleneck structure. Moreover, models that rely solely on a self-reconstruction loss struggle to reproduce different speakers' voices. To address these issues, we propose a cycle-consistency loss that considers conversion back and forth between target and source speakers. Additionally, stacked random-shuffled mel-spectrograms and a label smoothing method are used during speaker encoder training to extract a time-independent global speaker representation from speech, which is the key to zero-shot conversion. Our model outperforms existing state-of-the-art results in both subjective and objective evaluations. Furthermore, it facilitates cross-lingual voice conversion and enhances the quality of synthesized speech.
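
    The cycle-consistency idea in this abstract can be illustrated with a minimal sketch. It is not the paper's model: `toy_convert`, the list-of-frames mel representation, the per-bin speaker "embeddings", and the L1 distance are all assumptions chosen only to show the source -> target -> source round trip.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized frame sequences."""
    total = sum(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    count = sum(len(fa) for fa in a)
    return total / count


def cycle_consistency_loss(convert, mel, src_emb, tgt_emb):
    """Convert source -> target -> source and compare with the original input."""
    converted = convert(mel, src_emb, tgt_emb)     # source voice to target voice
    cycled = convert(converted, tgt_emb, src_emb)  # and back to the source voice
    return l1(mel, cycled)


def toy_convert(mel, from_emb, to_emb):
    """Hypothetical stand-in converter: swaps a per-bin speaker offset."""
    return [[x - f + t for x, f, t in zip(frame, from_emb, to_emb)]
            for frame in mel]


mel = [[1.0, 2.0], [3.0, 4.0]]       # two frames, two mel bins (toy values)
src_emb, tgt_emb = [0.5, 0.5], [1.0, -1.0]
print(cycle_consistency_loss(toy_convert, mel, src_emb, tgt_emb))
```

    A converter that loses information on the way there or back yields a positive loss, which is the signal the cycle term would penalize during training.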

    Real-time Fall Detection Using Wi-Fi Channel State Information


    Recycling Sampling Timing Offset of Wi-Fi for Estimating Multiple ToFs of Superimposed Signal

    Many Wi-Fi-based device-free localization (DFL) methods have been proposed for indoor location-based services. Unfortunately, the received signal is a superposition of the line-of-sight signal and its reflections, so multi-target DFL is possible only by estimating the time of flight (ToF) of each signal. To estimate multiple ToFs, we utilize the sampling timing offset (STO) that inherently arises from asynchronous sampling timing between the transmitter and receiver. By utilizing the STO, we can generate signals that mimic oversampled signals. We feed these signals into our correlation-based ToF estimation algorithm. We achieved a median error of 0.75-4.65 ns when 2-5 signals are superimposed.
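
    The oversampling idea in this abstract can be sketched as follows. This is a simplified illustration under strong assumptions, not the authors' algorithm: it handles a single path rather than superimposed signals, it uses a hypothetical triangular `pulse`, and it assumes the sampling offsets are known and evenly spaced at 0, 1/K, ..., (K-1)/K of a sample period.

```python
def pulse(t, width=2.0):
    """Hypothetical triangular pulse centred at t = 0 (assumed known shape)."""
    return max(0.0, 1.0 - abs(t) / width)


def sample_packet(tof, offset, n_samples, period=1.0):
    """Sample a pulse delayed by `tof` on a grid shifted by `offset`."""
    return [pulse(i * period + offset - tof) for i in range(n_samples)]


def interleave(packets):
    """Merge K offset packets into one signal on a K-times finer grid."""
    return [s for group in zip(*packets) for s in group]


def estimate_tof(signal, factor, period=1.0):
    """Return the lag (in time units) that maximizes correlation with the pulse."""
    step = period / factor
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(signal)):
        score = sum(signal[i] * pulse((i - lag) * step)
                    for i in range(len(signal)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * step


K, TRUE_TOF = 4, 3.25
packets = [sample_packet(TRUE_TOF, k / K, 16) for k in range(K)]
coarse_estimate = estimate_tof(packets[0], 1)        # one packet, native grid
fine_estimate = estimate_tof(interleave(packets), K)  # K packets, finer grid
print(coarse_estimate, fine_estimate)
```

    In this toy setting the interleaved signal resolves the delay on a grid K times finer than a single packet allows, which is the sense in which offset samples mimic an oversampled signal.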