
    Neural Speech Synthesis with Transformer Network

    Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) have been proposed and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency during training and inference; 2) difficulty modeling long-range dependencies with current recurrent neural networks (RNNs). Inspired by the success of the Transformer network in neural machine translation (NMT), in this paper we introduce and adapt the multi-head attention mechanism to replace the RNN structures and also the original attention mechanism in Tacotron2. With the help of multi-head self-attention, the hidden states in the encoder and decoder are constructed in parallel, which improves the training efficiency. Meanwhile, any two inputs at different times are connected directly by the self-attention mechanism, which effectively solves the long-range dependency problem. Using phoneme sequences as input, our Transformer TTS network generates mel spectrograms, followed by a WaveNet vocoder to output the final audio results. Experiments are conducted to test the efficiency and performance of our new network. For efficiency, our Transformer TTS network speeds up training by about 4.25 times compared with Tacotron2. For performance, rigorous human tests show that our proposed model achieves state-of-the-art performance (outperforming Tacotron2 by a gap of 0.048) and is very close to human quality (4.39 vs 4.44 in MOS).
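
The claim that self-attention connects any two time steps directly can be illustrated with a minimal sketch. This is not the paper's network: it is a single attention head with no learned projections (queries, keys and values are all the raw inputs, an assumption made for brevity), and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention.

    x is a list of T vectors of dimension d. Assumption: Q = K = V = x,
    i.e. no learned projection matrices, unlike a real Transformer layer.
    Returns (outputs, weights), where weights[i][j] is how much position i
    attends to position j.
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Every position is scored against every other position in one step,
        # regardless of how far apart they are in time.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[t] * x[t][j] for t in range(len(x))) for j in range(d)]
        )
    return outputs, weights


inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(inputs)
```

Each row of `weights` is independent of the others, so all rows can be computed in parallel, whereas an RNN has to step through the sequence one position at a time.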

    MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search

    To speed up the inference of neural speech synthesis, non-autoregressive models have received increasing attention recently. In non-autoregressive models, additional durations of text tokens are required to make a hard alignment between the encoder and the decoder. The duration-based alignment plays a crucial role since it controls the correspondence between text tokens and spectrum frames and determines the rhythm and speed of the synthesized audio. To get better duration-based alignment and improve the quality of non-autoregressive speech synthesis, in this paper we propose a novel neural alignment model named MoboAligner. Given pairs of text and mel spectrum, MoboAligner tries to identify the boundaries of text tokens in the given mel spectrum frames based on the token-frame similarity in the neural semantic space, within an end-to-end framework. With these boundaries, durations can be extracted and used in the training of non-autoregressive TTS models. Compared with the durations extracted by TransformerTTS, MoboAligner improves the MOS of the non-autoregressive TTS model (3.74 compared to FastSpeech's 3.44). Besides, MoboAligner is task-specific and lightweight, which reduces the number of parameters by 45% and the training time by 30%.
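
How durations follow from token boundaries can be sketched as below. The boundary convention (each entry is the exclusive end frame of a token) and both function names are assumptions made for illustration; the abstract does not specify MoboAligner's actual output format.

```python
def durations_from_boundaries(boundaries, num_frames):
    """Turn monotonic token boundaries into per-token frame counts.

    Assumption: boundaries[i] is the exclusive end frame of token i, and the
    last boundary coincides with the total number of mel frames.
    """
    if not boundaries or boundaries[-1] != num_frames:
        raise ValueError("last boundary must equal the number of frames")
    durations = []
    start = 0
    for end in boundaries:
        if end < start:
            raise ValueError("boundaries must be monotonically non-decreasing")
        durations.append(end - start)
        start = end
    return durations


def expand_tokens(tokens, durations):
    """Repeat each token by its duration, giving one token per frame."""
    frames = []
    for token, duration in zip(tokens, durations):
        frames.extend([token] * duration)
    return frames


durations = durations_from_boundaries([3, 5, 9], num_frames=9)
frame_tokens = expand_tokens(["h", "e", "y"], durations)
```

The durations always sum to the number of frames, which is what lets a non-autoregressive decoder expand the token sequence to exactly the length of the target mel spectrum.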

    Criteria of Biholomorphic Convex Mappings on the bounded convex balanced domain $D_{p}^n$

    In this paper, we first establish several general sufficient conditions for the biholomorphic convex mappings on the bounded convex balanced domain $D_{p}^n$ ($p_{j}\geq 2$, $j=1,\cdots,n$) in $C^{n}$, which extend some related results of earlier authors. From these, some concrete examples of biholomorphic convex mappings on $D_{p}^n$ are also provided. Comment: 16 page

    Universal Einstein Relation Model in Disordered Organic Semiconductors under Quasi-equilibrium

    Whether the classical Einstein relation is valid in disordered organic semiconductors is still under debate. We investigated the Einstein relation in disordered organic semiconductors theoretically. The results show that the classical Einstein relation deviates dramatically with disorder and electric field, even in the case of thermal equilibrium.
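
For reference, the classical Einstein relation under discussion is the textbook form below, shown with the commonly used generalized form for comparison. The symbols are not defined in the abstract and are assumed here: $D$ is the diffusion coefficient, $\mu$ the mobility, $k_B$ the Boltzmann constant, $T$ the temperature, $q$ the elementary charge, $n$ the carrier concentration and $E_F$ the (quasi-)Fermi level. The abstract does not say which generalized form the authors use.

```latex
% Classical (non-degenerate) Einstein relation
\[
  \frac{D}{\mu} = \frac{k_B T}{q}
\]

% Generalized Einstein relation; it reduces to the classical form
% when n \propto \exp(E_F / k_B T)
\[
  \frac{D}{\mu} = \frac{n}{q \, \partial n / \partial E_F}
\]
```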

    A Novel Demodulation and Estimation Algorithm for Blackout Communication: Extract Principal Components with Deep Learning

    For reentry or near-space communication, owing to the influence of the time-varying plasma sheath channel environment, the received IQ baseband signals are severely rotated on the constellation. Research has shown that the frequency of electron density variation ranges from 20 kHz to 100 kHz, which is on the same order as the symbol rate of most TT&C communication systems, so a large amount of bandwidth would be consumed to track the time-varying channel with traditional estimation. In this paper, motivated by principal curve analysis, we propose a deep learning (DL) algorithm called the symmetric manifold network (SMN) to extract the curves on the constellation and classify the signals based on the curves. The key advantage is that SMN can achieve joint optimization of demodulation and channel estimation. In our simulation results, the new algorithm significantly reduces the symbol error rate (SER) compared to existing algorithms and enables accurate estimation of fading with an extremely high bandwidth utilization rate.

    Measuring and Discovering Correlations in Large Data Sets

    In this paper, a class of statistics named ART (the alternant recursive topology statistics) is proposed to measure the properties of correlation between two variables. A wide range of bi-variable correlations, both linear and nonlinear, can be evaluated by ART efficiently and equitably even if nothing is known about the specific types of those relationships. ART compensates for the disadvantages of Reshef's model, in which no polynomial-time precise algorithm exists and the "local random" phenomenon cannot be identified. As a class of nonparametric exploration statistics, ART is applied to analyze a dataset of 10 American classical indexes; as a result, many bi-variable correlations are discovered. Comment: 6 page

    Self-supporting Topology Optimization for Additive Manufacturing

    The paper presents a topology optimization approach that designs an optimal structure, called a self-supporting structure, which is ready to be fabricated via additive manufacturing without the use of additional support structures. Such supports generally have to be created during the fabrication process so that the primary object can be manufactured layer by layer without collapse, which is very time-consuming and wastes material. The proposed approach resolves this problem by formulating the self-supporting requirements as a novel explicit quadratic continuous constraint in the topology optimization problem, or specifically, by requiring the number of unsupported elements (in terms of the sum of squares of their densities) to be zero. Benefiting from this novel formulation, computing the sensitivity of the self-supporting constraint with respect to the design density is straightforward, which would otherwise require considerable research effort in general topology optimization studies. The derived sensitivity for each element depends only linearly on its own density, which, unlike previous layer-based sensitivities, allows for a parallel implementation and a possibly higher convergence rate. In addition, a discrete convolution operator is designed to detect the unsupported elements involved in each step of the optimization iteration, and speeds up the detection process by a factor of 100 compared with simply enumerating these elements. The approach works for cases of general overhang angle or general domain, and produces optimized structures, and their associated optimal compliance, very close to those of the reference structure obtained without considering the self-supporting constraint, as demonstrated by extensive 2D and 3D benchmark examples. Comment: submitted ou
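
The idea of detecting unsupported elements with a discrete convolution, and of measuring them by the sum of squared densities, can be sketched as follows. Everything specific here is an assumption for illustration and not the paper's operator: a 2D grid with row 0 on the build plate, a fixed 45-degree overhang rule (an element is supported if any of the three elements below it is solid), a density threshold of 0.5, and hypothetical function names.

```python
def support_counts(solid_row_below):
    """1D discrete convolution of the row below with the kernel [1, 1, 1].

    Zero padding is used at both ends, so counts[c] is the number of solid
    elements among columns c-1, c and c+1 of the row below.
    """
    padded = [0.0] + list(solid_row_below) + [0.0]
    return [
        padded[c] + padded[c + 1] + padded[c + 2]
        for c in range(len(solid_row_below))
    ]


def unsupported_mask(density, threshold=0.5):
    """Mark solid elements that have no solid neighbour in the row below.

    density[r][c] is the element density; row 0 rests on the build plate and
    is treated as always supported (an assumption of this sketch).
    """
    rows, cols = len(density), len(density[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(1, rows):
        below = [1.0 if d > threshold else 0.0 for d in density[r - 1]]
        counts = support_counts(below)
        for c in range(cols):
            mask[r][c] = density[r][c] > threshold and counts[c] == 0
    return mask


def constraint_value(density, mask):
    """Sum of squared densities over the unsupported elements."""
    return sum(
        density[r][c] ** 2
        for r in range(len(density))
        for c in range(len(density[0]))
        if mask[r][c]
    )


grid = [
    [1.0, 0.0, 0.0, 0.0],  # row 0: on the build plate
    [1.0, 1.0, 0.0, 1.0],  # row 1: last element has nothing beneath it
    [0.0, 0.0, 1.0, 0.0],  # row 2: supported diagonally from row 1
]
mask = unsupported_mask(grid)
value = constraint_value(grid, mask)
```

A structure is self-supporting in this sketch exactly when `constraint_value` is zero. Because each term is a squared density, its derivative with respect to that element's density is linear in the density, which is consistent with the sensitivity property described in the abstract.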

    Physical Origin of Nonlinear transport in organic semiconductor at high carrier densities

    The charge transport in some organic semiconductors demonstrates nonlinear properties and, further, universal power-law scaling with both bias and temperature. The physical origin of this behavior is investigated here using variable range hopping theory. The results show that this universal power-law scaling can be well explained by variable range hopping theory. The relation to recent experimental data is also discussed.

    Validity of Transport Energy in Disordered Organic Semiconductors

    A systematic study of the transport energy in disordered organic semiconductors, based on variable range hopping theory, is presented here. The dependence of the transport energy on temperature, electric field, material disorder and carrier concentration is extensively discussed. We demonstrate that the transport energy is not a general concept and is invalid even in the low electric field and low concentration regime.

    Comments on "Unusual Thermoelectric Behavior Indicating a Hopping to Bandlike Transport Transition in Pentacene"

    W. Chr. Germs, K. Guo, R. A. J. Janssen, and M. Kemerink [1] recently measured the temperature- and concentration-dependent Seebeck coefficient in organic thin-film transistors and found that the Seebeck coefficient increases with carrier concentration (corresponding to the gate voltage) in the low-temperature regime. They further concluded that this unusual behavior is due to a transition from hopping transport in static localized states to bandlike transport, occurring at low temperatures. This is obviously in contrast to the previous theoretical prediction, because it is widely accepted that hopping transport is more pronounced at low temperature. We discuss the reason for this unusual behavior here and suggest that the density of states function plays an important role in the concentration-dependent Seebeck coefficient.