Search CORE

7 research outputs found

Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training

Author: Li Xu
Meng Helen
Shan Ying
Wu Zhiyong
Zhou Shaohuan
Publication date: 01/09/2023

Single-speaker singing voice synthesis (SVS) usually underperforms at pitch values that lie outside the singer's vocal range or are associated with limited training samples. Building on our previous work, this work proposes a melody-unsupervised multi-speaker pre-training method, conducted on a multi-singer dataset, to enhance the vocal range of the single speaker without degrading timbre similarity. This pre-training method can be deployed on a large-scale multi-singer dataset that contains only audio-and-lyrics pairs, without phonemic timing information or pitch annotation. Specifically, in the pre-training step, we design a phoneme predictor to produce frame-level phoneme probability vectors as the phonemic timing information and a speaker encoder to model the timbre variations of different singers, and we directly estimate frame-level f0 values from the audio to provide the pitch information. The pre-trained model parameters are passed to the fine-tuning step as prior knowledge to enhance the single speaker's vocal range. Moreover, this work also improves the sound quality and rhythm naturalness of the synthesized singing voices. It is the first to introduce a differentiable duration regulator to improve the rhythm naturalness of the synthesized voice, and a bi-directional flow model to improve the sound quality. Experimental results verify that the proposed SVS system outperforms the baseline on both sound quality and naturalness.

arXiv.org e-Print Archive

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

Author: Kang Shiyin
Lei Shun
Meng Helen
Tuo Deyi
Wu Zhiyong
You Weiya
You Yuren
Zhou Shaohuan
Publication date: 31/08/2023

This paper presents an end-to-end, high-quality singing voice synthesis (SVS) system that uses semantic embeddings derived from Bidirectional Encoder Representations from Transformers (BERT) to improve the expressiveness of the synthesized singing voice. Based on the main architecture of the recently proposed VISinger, we put forward several specific designs for expressive singing voice synthesis. First, unlike previous SVS models, we use a text representation of the lyrics extracted from pre-trained BERT as additional input to the model. The representation contains information about the semantics of the lyrics, which could help the SVS system produce a more expressive and natural voice. Second, we introduce an energy predictor to stabilize the synthesized voice and model the wider range of energy variations that also contribute to the expressiveness of the singing voice. Last but not least, to attenuate off-key issues, the pitch predictor is redesigned to predict the real-to-note pitch ratio. Both objective and subjective experimental results indicate that the proposed SVS system produces a higher-quality singing voice and outperforms VISinger.

arXiv.org e-Print Archive
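
The abstract above says the pitch predictor is redesigned to predict the ratio of the real pitch to the note pitch, but it does not say how that ratio is computed. The sketch below shows one plausible reading, under stated assumptions: the note pitch is taken as a MIDI note number converted to Hz with the standard equal-temperament formula (A4 = 440 Hz), and the ratio is the frame-level f0 divided by that frequency. The function names are hypothetical and are not taken from the paper.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to Hz (equal temperament, A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def pitch_ratio_target(f0_hz, midi_note):
    """Assumed training target: sung f0 divided by the score note's frequency.

    A value near 1.0 means the singer is on the written note.
    """
    return f0_hz / midi_to_hz(midi_note)


def f0_from_ratio(ratio, midi_note):
    """Recover an f0 value in Hz from a (predicted) ratio and the score note."""
    return ratio * midi_to_hz(midi_note)


if __name__ == "__main__":
    # A frame sung slightly sharp of A4.
    ratio = pitch_ratio_target(445.0, 69)
    print(ratio)
    print(f0_from_ratio(ratio, 69))
```

Under this reading, the prediction target stays close to 1.0 regardless of register, so the predicted f0 remains anchored to the score note. That is one way such a design could limit off-key output.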

Three-Operator Proximal Splitting Scheme for 3-D Seismic Data Reconstruction

Author: Hui Zhou
Shaohuan Zu
Weijian Mao
Yangkang Chen
Yufeng Wang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Quantitative characterization of shale gas reservoir properties based on BiLSTM with attention mechanism

Author: Chao Li
Huailai Zhou
Kangkang Guo
Lihui Wu
Shaohuan Zu
Xingye Liu
Publication venue: 'Elsevier BV'
Publication date: 01/07/2023

Evaluating the potential of shale gas reservoirs is inseparable from predicting reservoir properties. Accurate characterization of total organic carbon, porosity and permeability is necessary to understand shale gas reservoirs. Seismic data can help to estimate these parameters in the areas between wells. We develop an improved deep learning method for estimating shale gas reservoir properties. The relationship between elastic attributes and reservoir properties is established by training a deep bidirectional long short-term memory network, which is suitable for time/depth sequence prediction, on the logging and core data. In addition to commonly used techniques such as layer normalization and dropout, we introduce an attention mechanism to further enhance the prediction accuracy. We also propose applying the normal scores transform to the input features, which aims to make the relationship between inputs and targets clear and easy to learn. During training, we construct a quantile loss function and then use the Adam algorithm to optimize the network. The method outputs not only the characterization results but also a confidence interval, which is meaningful for uncertainty analysis. The well experiment indicates that the method is promising for reducing prediction errors when training samples are insufficient. After the analysis at the wells, the established model is applied to seismically inverted elastic attributes to characterize shale gas reservoirs across the whole study area. The estimation results agree well with the actual development results, showing the feasibility of the novel method for characterizing shale gas reservoirs.

Directory of Open Access Journals
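
The abstract above mentions training with a quantile loss so that the network can output a confidence interval alongside the point estimate. It does not give the exact formulation, so the sketch below uses the standard pinball (quantile) loss as an assumption. The function name and the sample values are hypothetical and are not taken from the paper.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1).

    Under-prediction (y_true > y_pred) is weighted by q and over-prediction
    by (1 - q), so minimizing it pushes y_pred toward the q-th quantile.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


if __name__ == "__main__":
    # Hypothetical porosity values, for illustration only.
    observed = [0.04, 0.06, 0.05]
    predicted = [0.05, 0.05, 0.05]
    for q in (0.05, 0.5, 0.95):
        print(q, pinball_loss(observed, predicted, q))
```

Training one output per quantile level (for example 0.05, 0.5 and 0.95) would give a median estimate together with a lower and an upper bound. This is one common way a quantile loss yields an interval; the paper may construct its interval differently.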

Hybrid-Sparsity Constrained Dictionary Learning for Iterative Deblending of Extremely Noisy Simultaneous-Source Data

Author: Hui Zhou
Rushan Wu
Shaohuan Zu
Weijian Mao
Yangkang Chen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Deblending of Simultaneous-source Seismic Data using Fast Iterative Shrinkage-thresholding Algorithm with Firm-thresholding

Author: A Beck
A Chambolle
A Mahdad
AG Bruce
CJ Beasley
DL Donoho
EJ Candes
H-Y Gao
Hui Zhou
I Daubechies
Jiang Yuan
MA Figueiredo
P Akerberg
P Doulgeris
R Chartrand
Renwu Liu
RL Abma
S Huo
S Qu
Sa Yu
Shan Qu
Shaohuan Zu
TTY Lin
Y Chen
Yahui Yang
Yangkang Chen
Publication venue: 'Walter de Gruyter GmbH'

Crossref

The lysine catabolite saccharopine impairs development by disrupting mitochondrial homeostasis

Author: Arribere
Breckenridge
Bruce Alberts
Carson
Cederbaum
Chen
Chonglin Yang
Cong
Cox
Dancis
Davis
de Wet
Dickinson
Fengxia Zhang
Fengyang Wang
Fiermonte
Guodong Wang
Houten
Hudson
Ichishita
Jinek
Junxiang Zhou
Kanazawa
Labrousse
Lagido
Liyuan Zhao
Mali
Markovitz
Min Wang
Mishra
Pagliarini
Papes
Pink
Porcelli
Qian Zhang
Qiwen Gan
Ruofeng Tang
Sacksteder
Shaohuan Wu
Shen
Simell
Vianey-Liaud
Wai
Weixiang Guo
Wenfeng Qian
Xin Wang
Xu
Ye Guo
Youle
Yudong Jing
Yuwei Chang
Zhang
Zhaonan Ban
Publication venue: 'Rockefeller University Press'

Crossref