Neural Speech Synthesis with Transformer Network
Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2)
have been proposed and achieve state-of-the-art performance, they still suffer
from two problems: 1) low efficiency during training and inference; 2) difficulty
modeling long-range dependencies with current recurrent neural networks (RNNs).
Inspired by the success of the Transformer network in neural machine translation (NMT), in this
paper, we introduce and adapt the multi-head attention mechanism to replace the
RNN structures and also the original attention mechanism in Tacotron2. With the
help of multi-head self-attention, the hidden states in the encoder and decoder
are constructed in parallel, which improves the training efficiency. Meanwhile,
any two inputs at different times are connected directly by the self-attention
mechanism, which solves the long-range dependency problem effectively. Using
phoneme sequences as input, our Transformer TTS network generates mel
spectrograms, followed by a WaveNet vocoder to output the final audio results.
Experiments are conducted to test the efficiency and performance of our new
network. In terms of efficiency, our Transformer TTS network trains about
4.25 times faster than Tacotron2. In terms of performance,
rigorous human tests show that our proposed model achieves state-of-the-art
performance (outperforming Tacotron2 by a gap of 0.048) and is very close to
human quality (4.39 vs 4.44 in MOS).
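The claim that self-attention connects any two time steps directly can be illustrated with a minimal sketch. It assumes a single head with no learned projections (queries, keys and values are all the raw inputs), which is a simplification of the multi-head mechanism described in the abstract; the function names and toy inputs are hypothetical.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention with Q = K = V = x.

    Assumption: no learned projection matrices, unlike a real Transformer.
    """
    d = len(x[0])
    # Every position is scored against every other position in one step,
    # so no recurrence over time is needed.
    scores = [[sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
              for q in x]
    weights = [softmax(r) for r in scores]
    out = [[sum(w * v[j] for w, v in zip(wr, x)) for j in range(d)]
           for wr in weights]
    return weights, out

# Three toy input vectors standing in for encoder states.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, out = self_attention(frames)
```

Each row of `weights` is a distribution over all positions, so the first and last inputs interact through a single weighted sum regardless of their distance.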
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search
To speed up the inference of neural speech synthesis, non-autoregressive
models have received increasing attention recently. In non-autoregressive models,
additional durations of text tokens are required to make a hard alignment
between the encoder and the decoder. The duration-based alignment plays a
crucial role since it controls the correspondence between text tokens and
spectrum frames and determines the rhythm and speed of synthesized audio. To
get better duration-based alignment and improve the quality of
non-autoregressive speech synthesis, in this paper, we propose a novel neural
alignment model named MoboAligner. Given the pairs of the text and mel
spectrum, MoboAligner tries to identify the boundaries of text tokens in the
given mel spectrum frames based on the token-frame similarity in the neural
semantic space with an end-to-end framework. With these boundaries, durations
can be extracted and used in the training of non-autoregressive TTS models.
Compared with the duration extracted by TransformerTTS, MoboAligner brings
improvement for the non-autoregressive TTS model on MOS (3.74 compared to
FastSpeech's 3.44). In addition, MoboAligner is task-specific and lightweight,
reducing the number of parameters by 45% and the training time by
30%.
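How token boundaries turn into durations, and how durations give a hard alignment between text tokens and spectrum frames, can be sketched as follows. This is only an illustration under stated assumptions: the boundary convention (exclusive end frame per token), the function names and the toy phoneme sequence are hypothetical and are not taken from MoboAligner itself.

```python
def durations_from_boundaries(boundaries):
    """Convert per-token end frames into per-token durations.

    Assumption: boundaries[i] is the exclusive end frame of token i,
    and the first token starts at frame 0.
    """
    durations = []
    prev = 0
    for end in boundaries:
        if end < prev:
            raise ValueError("boundaries must be non-decreasing (monotonic)")
        durations.append(end - prev)
        prev = end
    return durations

def expand_tokens(tokens, durations):
    """Repeat each token by its duration to get one token per frame."""
    frames = []
    for tok, dur in zip(tokens, durations):
        frames.extend([tok] * dur)
    return frames

tokens = ["HH", "AH", "L", "OW"]   # toy phoneme sequence
boundaries = [3, 5, 9, 12]         # toy boundaries over 12 mel frames
durations = durations_from_boundaries(boundaries)
aligned = expand_tokens(tokens, durations)
```

Here `durations` sums to the number of mel frames, which is the property a duration-based non-autoregressive model relies on when it expands encoder outputs to the decoder length.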
Criteria of Biholomorphic Convex Mappings on the bounded convex balanced domain
In this paper, we first establish several general sufficient conditions for
the biholomorphic convex mappings on the bounded convex balanced domain
in , which extend some related
results of earlier authors. From these, some concrete examples of biholomorphic
convex mappings on are also provided. Comment: 16 pages
Universal Einstein Relation Model in Disordered Organic Semiconductors under Quasi-equilibrium
It is still under debate whether the classical Einstein relation is valid in
disordered organic semiconductors. We investigated the Einstein relation
in disordered organic semiconductors theoretically. The results show that the
classical Einstein relation deviates dramatically with disorder and electric
field, even in the case of thermal equilibrium.
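For reference, the classical Einstein relation and the commonly used generalized form can be written as below. The symbols are assumptions introduced here, not taken from the abstract: $D$ is the diffusion coefficient, $\mu$ the mobility, $q$ the elementary charge, $n$ the carrier concentration and $E_F$ the (quasi-)Fermi level. The abstract does not say which generalized expression the authors derive.

```latex
\frac{D}{\mu} = \frac{k_B T}{q}
\quad \text{(classical, non-degenerate limit)},
\qquad
\frac{D}{\mu} = \frac{n}{q \, \partial n / \partial E_F}
\quad \text{(generalized form)}
```

With Boltzmann statistics, $n \propto \exp(E_F / k_B T)$, the generalized form reduces to the classical one; deviations arise when the density of states and occupation make $\partial n / \partial E_F$ differ from $n / k_B T$.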
A Novel Demodulation and Estimation Algorithm for Blackout Communication: Extract Principal Components with Deep Learning
For reentry or near space communication, owing to the influence of the
time-varying plasma sheath channel environment, the received IQ baseband
signals are severely rotated on the constellation. Research has shown that
the frequency of electron density variation ranges from 20 kHz to 100 kHz, which is on the
same order as the symbol rate of most TT&C communication systems, so a large amount of
bandwidth is consumed to track the time-varying channel with traditional
estimation. In this paper, motivated by principal curve analysis, we propose a
deep learning (DL) algorithm called the symmetric manifold network (SMN) to
extract the curves on the constellation and classify the signals based on the
curves. The key advantage is that SMN can achieve joint optimization of
demodulation and channel estimation. From our simulation results, the new
algorithm significantly reduces the symbol error rate (SER) compared to
existing algorithms and enables accurate estimation of fading with an extremely
high bandwidth utilization rate.
Measuring and Discovering Correlations in Large Data Sets
In this paper, a class of statistics named ART (the alternant recursive
topology statistics) is proposed to measure the properties of correlation
between two variables. A wide range of bi-variable correlations both linear and
nonlinear can be evaluated by ART efficiently and equitably even if nothing is
known about the specific types of those relationships. ART compensates for the
disadvantages of Reshef's model, in which no polynomial-time precise algorithm
exists and the "local random" phenomenon cannot be identified. As a class of
nonparametric exploration statistics, ART is applied to analyze a dataset of
10 American classical indexes; as a result, many bi-variable correlations
are discovered. Comment: 6 pages
Self-supporting Topology Optimization for Additive Manufacturing
The paper presents a topology optimization approach that designs an optimal
structure, called a self-supporting structure, which is ready to be fabricated
via additive manufacturing without the usage of additional support structures.
Such supports in general have to be created during the fabricating process so
that the primary object can be manufactured layer by layer without collapse,
which is very time-consuming and wastes material.
The proposed approach resolves this problem by formulating the
self-supporting requirements as a novel explicit quadratic continuous
constraint in the topology optimization problem, or specifically, requiring the
number of unsupported elements (in terms of the sum of squares of their
densities) to be zero. Benefiting from this novel formulation, computing
the sensitivity of the self-supporting constraint with respect to the design
density is straightforward, which would otherwise require considerable research
effort in general topology optimization studies. The derived sensitivity for
each element depends only linearly on its own density, which, unlike
previous layer-based sensitivities, allows for a parallel
implementation and possibly a higher convergence rate. In addition, a discrete
convolution operator is designed to detect the unsupported elements
involved in each step of the optimization iteration, and it speeds up the detection
process by a factor of 100 compared with simply enumerating these elements. The
approach works for cases of general overhang angle and general domain, and
produces optimized structures whose optimal compliance is very
close to that of the reference structure obtained without considering the
self-supporting constraint, as demonstrated by extensive 2D and 3D benchmark
examples. Comment: submitted ou
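The idea of measuring unsupported material as a sum of squared densities can be sketched on a small 2D grid. This is a rough illustration, not the paper's formulation: it assumes a 45-degree overhang rule (an element is supported if any of the three elements directly below it is solid), a hard density threshold of 0.5, and a sliding three-element window standing in for the discrete convolution operator. All names are hypothetical.

```python
def unsupported_penalty(density, threshold=0.5):
    """Return (sum of squared densities of unsupported elements, their indices).

    Assumptions: density is a list of rows, row 0 is the top layer and the
    last row rests on the build plate; an element counts as solid when its
    density exceeds `threshold`.
    """
    rows, cols = len(density), len(density[0])
    penalty = 0.0
    unsupported = []
    for i in range(rows - 1):          # the bottom row is supported by the plate
        for j in range(cols):
            if density[i][j] <= threshold:
                continue               # void elements need no support
            # Window of up to three elements in the layer below.
            below = density[i + 1][max(0, j - 1): j + 2]
            if max(below) <= threshold:
                unsupported.append((i, j))
                penalty += density[i][j] ** 2
    return penalty, unsupported

# A column on the right with a long horizontal overhang at the top.
grid = [
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
penalty, unsupported = unsupported_penalty(grid)
```

A self-supporting design under these assumptions is one where `penalty` is zero. Because each element's contribution depends only on its own density once the window check is done, the loop over elements could be evaluated in parallel.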
Physical Origin of Nonlinear transport in organic semiconductor at high carrier densities
The charge transport in some organic semiconductors demonstrates nonlinear
properties and further universal power-law scaling with both bias and
temperature. The physical origin of this behavior is investigated here using
variable range hopping theory. The results show that this universal power-law
scaling can be well explained by variable range hopping theory. The relation to
recent experimental data is also discussed.
Validity of Transport Energy in Disordered Organic Semiconductors
A systematic study of the transport energy in disordered organic
semiconductors based on variable range hopping theory is presented here.
The dependence of the transport energy on temperature, electric field, material
disorder and carrier concentration is extensively discussed. We demonstrate that
transport energy is not a general concept and is invalid even in the low electric
field and low concentration regime.
Comments on "Unusual Thermoelectric Behavior Indicating a Hopping to Bandlike Transport Transition in Pentacene"
W. Chr. Germs, K. Guo, R. A. J. Janssen, and M. Kemerink [1] recently
measured the temperature and concentration dependent Seebeck coefficient in
organic thin-film transistors and found that the Seebeck coefficient increases with
carrier concentration (corresponding to the gate voltage) in the low
temperature regime. They further concluded that this unusual behavior is due to
a transition from hopping transport in static localized states to bandlike
transport, occurring at low temperatures. This is obviously in contrast to the
previous theoretical prediction because it is widely accepted that hopping
transport is more pronounced at low temperature. We will discuss the reason for
this unusual behavior here and suggest that the density of states function
plays an important role in the concentration dependent Seebeck coefficient.