Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis
This paper proposes a forward attention method for the sequence-to-sequence
acoustic modeling of speech synthesis. This method is motivated by the nature
of the monotonic alignment from phone sequences to acoustic sequences. Only the
alignment paths that satisfy the monotonic condition are taken into
consideration at each decoder timestep. The modified attention probabilities at
each timestep are computed recursively using a forward algorithm. A transition
agent for forward attention is further proposed, which helps the attention
mechanism decide whether to move forward or stay at each decoder
timestep. Experimental results show that the proposed forward attention method
achieves faster convergence speed and higher stability than the baseline
attention method. Besides, the method of forward attention with transition
agent can also help improve the naturalness of synthetic speech and control the
speed of synthetic speech effectively.
Comment: 5 pages, 3 figures, 2 tables. Published in IEEE International
Conference on Acoustics, Speech and Signal Processing 2018 (ICASSP 2018)
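The recursion described in this abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code assumes that a monotonic path may only stay at the same input position or advance by one position per decoder timestep. The function name `forward_attention_step`, its parameters, and the scalar `u` (standing in for a transition agent's "move forward" probability) are hypothetical names chosen for illustration, not the authors' implementation.

```python
def forward_attention_step(prev_alpha, y, u=None):
    """One decoder timestep of a monotonic forward recursion (illustrative sketch).

    prev_alpha: attention probabilities over input positions at the previous step.
    y:          ordinary attention probabilities at the current step.
    u:          assumed probability of moving forward by one position;
                if None, staying and moving are weighted equally.
    """
    new_alpha = []
    for n in range(len(prev_alpha)):
        stay = prev_alpha[n]                          # path stays at position n
        move = prev_alpha[n - 1] if n > 0 else 0.0    # path arrives from n - 1
        if u is None:
            mass = stay + move
        else:
            mass = (1.0 - u) * stay + u * move
        # Only mass reachable by a monotonic path is reweighted by y.
        new_alpha.append(mass * y[n])

    total = sum(new_alpha)
    if total == 0.0:
        raise ValueError("no monotonic path has non-zero probability")
    return [a / total for a in new_alpha]             # renormalise to sum to 1


# Usage: start with all mass on the first input position.
alpha = [1.0, 0.0, 0.0, 0.0]
uniform = [0.25, 0.25, 0.25, 0.25]
alpha = forward_attention_step(alpha, uniform)
alpha = forward_attention_step(alpha, uniform)
```

Under these assumptions, a position can only receive probability after enough timesteps have elapsed to reach it, which is one way to read the monotonic condition in the abstract. Setting `u` close to 1 pushes the alignment forward faster, and setting it close to 0 holds it in place.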
Whisper-to-speech conversion using restricted Boltzmann machine arrays
Whispers are a natural vocal communication mechanism, in which vocal cords do not vibrate normally. Lack of glottal-induced pitch leads to low energy, and an inherent noise-like spectral distribution reduces intelligibility. Much research has been devoted to processing of whispers, including conversion of whispers to speech. Unfortunately, among several approaches, the best reconstructed speech to date still contains obviously artificial muffles and suffers from an unnatural prosody. To address these issues, the novel use of multiple restricted Boltzmann machines (RBMs) is reported as a statistical conversion model between whisper and speech spectral envelopes. Moreover, the accuracy of estimated pitch is improved using machine learning techniques for pitch estimation within only voiced (V) regions. Both objective and subjective evaluations show that this new method improves the quality of whisper-reconstructed speech compared with the state-of-the-art approaches.
Adsorption of phenylacetylene on Si(100)-2×1: Reaction mechanism and formation of a styrene-like π-conjugation system
This is the published version. Copyright 2003 American Physical Society.
The interactions of phenylacetylene and phenylacetylene-α-d1 with Si(100)-2×1 have been studied as a model system to mechanistically understand the adsorption of conjugated π-electron aromatic substitutions on Si(100)-2×1. Vibrational signatures show that phenylacetylene covalently binds to the surface through a [2+2]-like cycloaddition pathway between the external C≡C and the Si=Si dimer, forming a styrene-like conjugation structure, which was further supported by the chemical shift of the C 1s core level. These experimental results are consistent with the density-functional theory [B3LYP/6-311//+G(d)] calculations. The resulting styrene-like conjugation structures may be employed as an intermediate for further organic syntheses and fabrication of molecular architecture for modification and functionalization of Si surfaces, or as a monomer for polymerization on Si surfaces.
Numerical Simulation on the Gas Explosion Propagation Related to Roadway
Drawing on combustion, explosion, and aerodynamic theory, this paper describes a mathematical model of gas explosions in detail. In combination with the propagation mechanism of gas explosions, it examines the two-wave, three-zone structure of a gas explosion and how energy changes at the precursor wave front and the flame wave front. Using the fluid dynamics software Fluent, it then numerically simulates and analyzes how overpressure propagates when a gas explosion occurs in different types of roadway. The results show that Fluent can simulate gas explosion conditions accurately; that when the explosion wave travels through a roadway bend, the higher the overpressure at the corner, the greater the destructive power; and that when the tunnel bifurcates, the overpressure is released at the bifurcation, but the explosion wave together with the flame wave produces a more powerful destructive effect. The results can serve as a reference for preventing and controlling gas explosions and for reducing their power.
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision
This paper presents methods of making use of text supervision to improve
the performance of sequence-to-sequence (seq2seq) voice conversion. Compared
with conventional frame-to-frame voice conversion approaches, the seq2seq
acoustic modeling method proposed in our previous work achieved higher
naturalness and similarity. In this paper, we further improve its performance
by utilizing the text transcriptions of parallel training data. First, a
multi-task learning structure is designed which adds auxiliary classifiers to
the middle layers of the seq2seq model and predicts linguistic labels as a
secondary task. Second, a data-augmentation method is proposed which utilizes
text alignment to produce extra parallel sequences for model training.
Experiments are conducted to evaluate our proposed method with training sets at
different sizes. Experimental results show that the multi-task learning with
linguistic labels is effective at reducing the errors of seq2seq voice
conversion. The data-augmentation method can further improve the performance of
seq2seq voice conversion when only 50 or 100 training utterances are available.
Comment: 5 pages, 4 figures, 2 tables. Submitted to IEEE ICASSP 201
5′-Adenosine Monophosphate-Induced Hypothermia Attenuates Brain Ischemia/Reperfusion Injury in a Rat Model by Inhibiting the Inflammatory Response
- …