
    Natural Model Reduction for Kinetic Equations

    A promising approach to investigating high-dimensional problems is to identify their intrinsically low-dimensional features, which can be achieved through recently developed techniques, such as machine learning, for the effective low-dimensional representation of functions. Based on available finite-dimensional approximate solution manifolds, this paper proposes a novel model reduction framework for kinetic equations. The method employs projections onto tangent bundles of approximate manifolds, naturally resulting in first-order hyperbolic systems. Under certain conditions on the approximate manifolds, the reduced models preserve several crucial properties, including hyperbolicity, conservation laws, entropy dissipation, finite propagation speed, and linear stability. For the first time, this paper rigorously discusses the relation between the H-theorem of kinetic equations and the linear stability conditions of reduced systems, determining the choice of Riemannian metrics involved in the model reduction. The framework is widely applicable for the model reduction of many models in kinetic theory.
    Comment: 46 pages
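    As a rough illustration of the tangent-bundle projection (notation ours, not necessarily the paper's): for a kinetic equation $\partial_t f + v \cdot \nabla_x f = Q(f)$ and an approximate manifold $\mathcal{M} = \{ f(\cdot\,; u) : u \in \mathbb{R}^n \}$, projecting the dynamics onto the tangent space $\mathrm{span}\{ \partial f / \partial u_i \}$ with respect to a Riemannian metric $\langle \cdot, \cdot \rangle$ yields a first-order system in the parameters $u(t, x)$:

        \sum_j \Big\langle \frac{\partial f}{\partial u_i}, \frac{\partial f}{\partial u_j} \Big\rangle \, \partial_t u_j
          + \sum_j \Big\langle \frac{\partial f}{\partial u_i}, \, v \, \frac{\partial f}{\partial u_j} \Big\rangle \, \partial_x u_j
          = \Big\langle \frac{\partial f}{\partial u_i}, \, Q(f(\cdot\,; u)) \Big\rangle, \qquad i = 1, \dots, n.

    The Gram matrix on the left is symmetric positive definite for a genuine metric, and for an $L^2$-type metric the transport matrix is symmetric as well, making the system symmetrizable hyperbolic; how the paper's conditions on the manifold and metric secure the remaining properties, including the H-theorem link, is developed in the full text.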

    Lax Equivalence for Hyperbolic Relaxation Approximations

    This paper investigates the zero relaxation limit for general linear hyperbolic relaxation systems and establishes the asymptotic convergence of slow variables under the weakest, unimprovable stability condition, akin to the Lax equivalence theorem for hyperbolic relaxation approximations. Despite potential high oscillations, the convergence of macroscopic variables is established in the strong $L^\infty_t L^2_x$ sense rather than in the sense of weak convergence, time averaging, or ensemble averaging.
    Comment: 32 pages
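    A minimal sketch of the setting, in standard notation of ours: a linear hyperbolic relaxation system with relaxation parameter $\varepsilon > 0$ reads

        \partial_t U + A \, \partial_x U = \frac{1}{\varepsilon} \, Q U,

    where $Q$ has a nontrivial null space and the slow (macroscopic) variables are $u = P U$, with $P$ the projection onto $\ker Q$. The zero relaxation limit asks whether $u$ converges, as $\varepsilon \to 0$, to the solution of the reduced equilibrium system; the analogy with the Lax equivalence theorem is that a sharp stability condition on the pair $(A, Q)$, together with consistency, should be equivalent to convergence.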

    High Order Numerical Homogenization for Dissipative Ordinary Differential Equations

    We propose a high order numerical homogenization method for dissipative ordinary differential equations (ODEs) containing two time scales. Essentially, only a first order homogenized model that is valid globally in time can be derived analytically. To achieve a high order method, we adopt a numerical approach in the framework of the heterogeneous multiscale method (HMM). By a successively refined microscopic solver, accuracy improvement up to arbitrary order is attained, provided the input data are smooth enough. Based on the formulation of the high order microscopic solver we derive, an iterative formula for computing it is then proposed. Using the iterative formula, we develop an efficient implementation of the method for practical applications. Several numerical examples are presented to validate the new models and numerical methods.
    Comment: 29 pages, 8 figures
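    For intuition, here is a hedged first-order sketch of the HMM macro-micro pattern for a two-scale dissipative ODE (function names and step sizes are our illustration; the paper's high order, successively refined microscopic solver is considerably more elaborate):

        def hmm_step(x, y, f, g, eps, H, micro_steps=200):
            """One macro step for the two-scale system
                 x' = f(x, y),   eps * y' = g(x, y)."""
            h = eps / 10.0                    # micro step, resolves the fast scale
            for _ in range(micro_steps):      # micro solver: relax the fast variable
                y = y + (h / eps) * g(x, y)   # toward its quasi-equilibrium at frozen x
            return x + H * f(x, y), y         # macro Euler step with the effective force

        # Example: g drives y toward x, so the effective slow dynamics is x' = -x.
        x, y, eps, H = 1.0, 0.0, 1e-4, 0.1
        for _ in range(50):
            x, y = hmm_step(x, y, lambda x, y: -y, lambda x, y: x - y, eps, H)
        print(x)  # decays toward 0, tracking the homogenized solution

    This sketch is only first order accurate in the macro step H; raising the order while keeping the micro solver cheap is the paper's contribution.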

    HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

    Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to transform recorded speech to sound as though it had been recorded in a studio. We use an end-to-end feed-forward WaveNet architecture, trained with multi-scale adversarial discriminators in both the time domain and the time-frequency domain. It relies on the deep feature matching losses of the discriminators to improve the perceptual quality of enhanced speech. The proposed model generalizes well to new speakers, new speech content, and new environments. It significantly outperforms state-of-the-art baseline methods in both objective and subjective experiments.
    Comment: Accepted by INTERSPEECH 202
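    As a hedged illustration of the deep feature matching idea (module shapes and names are hypothetical, not the paper's architecture): the generator is trained to match the discriminator's intermediate activations on enhanced audio to those on clean audio, in addition to the adversarial objective.

        import torch
        import torch.nn as nn

        class TinyDiscriminator(nn.Module):
            # Toy stand-in for one of the paper's multi-scale discriminators.
            def __init__(self):
                super().__init__()
                self.layers = nn.ModuleList([
                    nn.Conv1d(1, 16, 15, stride=2, padding=7),
                    nn.Conv1d(16, 32, 15, stride=2, padding=7),
                    nn.Conv1d(32, 1, 3, padding=1),
                ])

            def forward(self, x):
                feats = []
                for layer in self.layers:
                    x = torch.relu(layer(x))
                    feats.append(x)          # keep intermediate "deep features"
                return feats

        def feature_matching_loss(disc, clean, enhanced):
            # L1 distance between discriminator activations on clean and
            # enhanced waveforms, summed over layers.
            return sum(torch.mean(torch.abs(a - b))
                       for a, b in zip(disc(clean), disc(enhanced)))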

    Efficient Spoken Language Recognition via Multilabel Classification

    Spoken language recognition (SLR) is the task of automatically identifying the language present in a speech signal. Existing SLR models are either too computationally expensive or too large to run effectively on devices with limited resources. For real-world deployment, a model should also gracefully handle unseen languages outside of the target language set, yet prior work has focused on closed-set classification where all input languages are known a priori. In this paper we address these two limitations: we explore efficient model architectures for SLR based on convolutional networks, and propose a multilabel training strategy to handle non-target languages at inference time. Using the VoxLingua107 dataset, we show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods, and that our multilabel strategy is more robust to unseen non-target languages compared to multiclass classification.
    Comment: Accepted to InterSpeech 202
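    A minimal sketch of why a multilabel head helps with open-set inputs (the threshold value is our assumption, not the paper's): with independent per-language sigmoid scores, an utterance from outside the target set can score low on every class and be rejected, whereas a softmax multiclass head always assigns its mass to some target language.

        import torch

        def predict_language(logits, languages, threshold=0.5):
            scores = torch.sigmoid(logits)   # independent per-language scores
            best = int(torch.argmax(scores))
            if scores[best] < threshold:
                return "non-target"          # no language is confident: reject
            return languages[best]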

    F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

    Non-parallel many-to-many voice conversion remains an interesting but challenging speech processing task. Many style-transfer-inspired methods, such as generative adversarial networks (GANs) and variational autoencoders (VAEs), have been proposed. Recently, AutoVC, a method based on conditional autoencoders (CAEs), achieved state-of-the-art results by disentangling speaker identity and speech content using information-constraining bottlenecks; it achieves zero-shot conversion by swapping in a different speaker's identity embedding to synthesize a new voice. However, we found that while speaker identity is disentangled from speech content, a significant amount of prosodic information, such as source F0, leaks through the bottleneck, causing the target F0 to fluctuate unnaturally. Furthermore, AutoVC has no control over the converted F0 and is thus unsuitable for many applications. In this paper, we modify and improve autoencoder-based voice conversion to disentangle content, F0, and speaker identity at the same time. As a result, we can control the F0 contour, generate speech with F0 consistent with the target speaker, and significantly improve quality and similarity. We support our improvement through quantitative and qualitative analysis.
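    A hedged sketch of the conditioning idea (all dimensions and layer choices are our assumptions, not the paper's model): the decoder receives the content code, the target speaker embedding, and an explicit per-frame F0 signal, so pitch no longer has to leak through the content bottleneck and can be set freely at conversion time.

        import torch
        import torch.nn as nn

        class F0ConditionedDecoder(nn.Module):
            def __init__(self, content_dim=32, spk_dim=256, f0_dim=257, hidden=512):
                super().__init__()
                self.rnn = nn.LSTM(content_dim + spk_dim + f0_dim, hidden,
                                   batch_first=True)
                self.out = nn.Linear(hidden, 80)   # mel-spectrogram frames

            def forward(self, content, spk_emb, f0):
                # content: (B, T, content_dim), f0: (B, T, f0_dim) per-frame
                # pitch code, spk_emb: (B, spk_dim) broadcast over time.
                spk = spk_emb.unsqueeze(1).expand(-1, content.size(1), -1)
                h, _ = self.rnn(torch.cat([content, spk, f0], dim=-1))
                return self.out(h)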