181 research outputs found

VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement

Author: Cui Chenye
Huang Rongjie
Liu Jinglin
Ren Yi
Zhao Zhou
Publication date: 19/11/2022

Video-to-sound generation aims to generate realistic and natural sound given a video input. However, previous video-to-sound generation methods can only generate a random or average timbre, with no control over or specialization of the generated timbre, so users cannot always obtain the timbre they want. In this paper, we pose the task of generating sound with a specific timbre given a video input and a reference audio sample. To solve this task, we disentangle each target sound audio into three components: temporal information, acoustic information, and background information. We first use three encoders to encode these components respectively: 1) a temporal encoder to encode temporal information, which is fed with video frames since the input video shares the same temporal information as the original audio; 2) an acoustic encoder to encode timbre information, which takes the original audio as input and discards its temporal information by a temporal-corrupting operation; and 3) a background encoder to encode the residual or background sound, which uses the background part of the original audio as input. To improve the quality and temporal alignment of the generated result, we also adopt a mel discriminator and a temporal discriminator for adversarial training. Our experimental results on the VAS dataset demonstrate that our method can generate high-quality audio samples with good synchronization with events in the video and high timbre similarity to the reference audio.

arXiv.org e-Print Archive

Adaptive VSG control strategy considering energy storage SOC constraints

Author: Chunguang He
Hui Zhao
Jinglin Han
Ping Hu
Xichun Feng
Publication venue: Frontiers Media S.A.
Publication date: 01/09/2023

The virtual synchronous generator (VSG) control strategy has been proposed to mitigate the low-inertia problem that arises in power systems from the high share of grid-connected distributed generation and the use of power electronic devices. To take full advantage of the flexible, adjustable parameters of VSG control, this paper proposes an adaptive VSG control strategy that considers the state-of-charge (SOC) constraint of the energy storage unit. Because operating the energy storage unit at its limit state significantly shortens its service life, the strategy adjusts the inertia factor and damping factor online, based on the rate and degree of change in system frequency, under different perturbation conditions when the system frequency oscillates. Finally, this adaptive VSG control method is compared with the conventional VSG control method by simulation in PLECS. Hardware-in-the-loop (HIL) experimental platforms and semi-physical simulation experiments are constructed on RTBOX, and the feasibility and validity of the adaptive VSG control strategy are verified.

Directory of Open Access Journals

PEG Precipitation Coupled with Chromatography is a New and Sufficient Method for the Purification of Botulinum Neurotoxin Type B

Author: Gao Shan
Gao Xing
Kang Lin
Wang Jinglin
Xin Wenwen
Zhao Yao
Publication venue: Public Library of Science
Publication date: 28/06/2012

Clostridium botulinum neurotoxins are used to treat a variety of neuro-muscular disorders, as well as in cosmetology. The increased demand requires efficient methods for the production and purification of these toxins. In this study, a new purification process was developed for purifying type B neurotoxin. The kinetics of C. botulinum strain growth and neurotoxin production were determined for maximum yield of toxin. The neurotoxin was purified by polyethylene glycol (PEG) precipitation and chromatography. Based on a full factorial experimental design, 20% (w/v) PEG-6000, 4°C, pH 5.0 and 0.3 M NaCl were the optimal conditions, giving a high recovery rate of 87% for the type B neurotoxin complex with a purification factor of 61.5-fold. Furthermore, residual bacterial cells, impurity proteins and some nucleic acids were removed by PEG precipitation. The subsequent purification of the neurotoxin was accomplished by two chromatography techniques using Sephacryl™ S-100 and phenyl HP columns. The neurotoxin was recovered with an overall yield of 21.5%, and the purification factor increased to 216.7-fold. In addition, a mouse bioassay determined that the purified neurotoxin complex possessed a specific toxicity (LD50) of 4.095 ng/kg.

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

DopplerBAS: Binaural Audio Synthesis Addressing Doppler Effect

Author: Chen Qian
Liu Jinglin
Wang Wen
Ye Zhenhui
Zhang Qinglin
Zhao Zhou
Zheng Siqi
Publication date: 01/06/2023

Recently, binaural audio synthesis (BAS) has emerged as a promising research field for its applications in augmented and virtual realities. Binaural audio helps users orient themselves and establish immersion by providing the brain with interaural time differences reflecting spatial information. However, existing BAS methods are limited in terms of phase estimation, which is crucial for spatial hearing. In this paper, we propose the DopplerBAS method to explicitly address the Doppler effect of the moving sound source. Specifically, we calculate the radial relative velocity of the moving speaker in spherical coordinates, which further guides the synthesis of binaural audio. This simple method introduces no additional hyper-parameters and does not modify the loss functions, and is plug-and-play: it scales well to different types of backbones. DopplerBAS distinctly improves the representative WarpNet and BinauralGrad backbones in the phase error metric and reaches a new state of the art (SOTA): 0.780 (versus the current SOTA 0.807). Experiments and ablation studies demonstrate the effectiveness of our method. Comment: Accepted to ACL 2023 short paper; key words: binaural audio, stereophonic soun
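The radial relative velocity mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a stationary listener at the origin, works in Cartesian rather than spherical coordinates, and uses the textbook Doppler formula for a moving source. The function names and the 343 m/s speed of sound are illustrative assumptions.

```python
import math


def radial_velocity(position, velocity):
    """Radial component of the source velocity as seen from a listener at the origin.

    Positive values mean the source is moving away from the listener.
    Both arguments are (x, y, z) tuples; names are illustrative assumptions.
    """
    distance = math.sqrt(sum(p * p for p in position))
    if distance == 0.0:
        raise ValueError("source coincides with the listener; radial direction is undefined")
    # Project the velocity onto the unit vector pointing from listener to source.
    return sum(p * v for p, v in zip(position, velocity)) / distance


def doppler_factor(v_radial, speed_of_sound=343.0):
    """Ratio of observed to emitted frequency for a moving source and a stationary listener.

    Classical formula f_obs / f_src = c / (c + v_radial), with v_radial > 0 for a
    receding source. The default speed of sound is an assumed value for air.
    """
    return speed_of_sound / (speed_of_sound + v_radial)


if __name__ == "__main__":
    v_r = radial_velocity((3.0, 4.0, 0.0), (-3.0, -4.0, 0.0))  # source approaching
    print(v_r, doppler_factor(v_r))
```

A source approaching the listener has a negative radial velocity and a factor above 1, so the perceived pitch rises; a source moving tangentially has zero radial velocity and no shift.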

arXiv.org e-Print Archive

Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech

Author: Jiang Ziyue
Liu Jinglin
Ren Yi
Yang Qian
Ye Zhenhui
Zhao Zhou
Zhe Su
Publication date: 09/10/2022

Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional efforts from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). Specifically, we design a semantics-to-pronunciation attention (S2PA) module to match the semantic patterns between the input text sequence and the prior semantics in the dictionary and obtain the corresponding pronunciations. The S2PA module can be easily trained with the end-to-end TTS model without any annotated phoneme labels. Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy and improves the prosody modeling of TTS systems. Further extensive analyses demonstrate that each design in Dict-TTS is effective. The code is available at https://github.com/Zain-Jiang/Dict-TTS. Comment: Accepted by NeurIPS 202

arXiv.org e-Print Archive

Three planets orbiting Wolf 1061

Author: Bentley J. S.
Tinney C. G.
Wittenmyer R. A.
Wright D. J.
Zhao Jinglin
Publication venue: American Astronomical Society
Publication date: 16/12/2015

We use archival HARPS spectra to detect three planets orbiting the M3 dwarf Wolf 1061 (GJ 628). We detect a 1.36M⊕ minimum-mass planet with an orbital period P = 4.888 days (Wolf 1061b), a 4.25M⊕ minimum-mass planet with orbital period P = 17.867 days (Wolf 1061c), and a likely 5.21M⊕ minimum-mass planet with orbital period P = 67.274 days (Wolf 1061d). All of the planets are of sufficiently low mass that they may be rocky in nature. The 17.867-day planet falls within the habitable zone for Wolf 1061, and the 67.274-day planet falls just outside the outer boundary of the habitable zone. There are no signs of activity observed in the bisector spans, cross-correlation FWHMs, calcium H & K indices, NaD indices, or Hα indices near the planetary periods. We use custom methods to generate a cross-correlation template tailored to the star. The resulting velocities do not suffer the strong annual variation observed in the HARPS DRS velocities. This differential technique should deliver better exploitation of the archival HARPS data for the detection of planets at extremely low amplitudes.

arXiv.org e-Print Archive

University of Southern Queensland ePrints