15,413 research outputs found

    Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis

    This paper proposes a forward attention method for the sequence-to-sequence acoustic modeling of speech synthesis. This method is motivated by the nature of the monotonic alignment from phone sequences to acoustic sequences. Only the alignment paths that satisfy the monotonic condition are taken into consideration at each decoder timestep. The modified attention probabilities at each timestep are computed recursively using a forward algorithm. A transition agent for forward attention is further proposed, which helps the attention mechanism decide whether to move forward or stay at each decoder timestep. Experimental results show that the proposed forward attention method achieves faster convergence and higher stability than the baseline attention method. In addition, forward attention with the transition agent can help improve the naturalness of synthetic speech and control its speed effectively. Comment: 5 pages, 3 figures, 2 tables. Published in IEEE International Conference on Acoustics, Speech and Signal Processing 2018 (ICASSP2018)
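The recursion described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the base attention probabilities `y` are already available as a NumPy array, that the alignment starts on the first phone, and it omits the transition agent. The function name `forward_attention` and the `eps` parameter are hypothetical.

```python
import numpy as np

def forward_attention(y, eps=1e-8):
    """Sketch of a monotonic forward recursion over attention weights.

    y: array of shape (T, N); y[t, n] is the base attention probability
       of phone n at decoder step t (assumed input, each row sums to 1).
    Returns an array of shape (T, N) with the modified weights.
    """
    T, N = y.shape
    alpha = np.zeros((T, N))
    prev = np.zeros(N)
    prev[0] = 1.0  # assumption: the alignment starts on the first phone
    for t in range(T):
        # Mass can either stay on phone n or arrive from phone n-1.
        shifted = np.concatenate(([0.0], prev[:-1]))
        unnorm = (prev + shifted) * y[t]
        # Renormalize so the weights at each step sum to (almost) 1.
        alpha[t] = unnorm / (unnorm.sum() + eps)
        prev = alpha[t]
    return alpha
```

Because each step only combines the previous weight of a phone with that of its left neighbour, the attention can advance by at most one phone per decoder step, which is one way to read the monotonic condition mentioned above.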

    Integrated Deep and Shallow Networks for Salient Object Detection

    Deep convolutional neural network (CNN) based salient object detection methods have achieved state-of-the-art performance and outperform unsupervised methods by a wide margin. In this paper, we propose to integrate deep and unsupervised saliency for salient object detection under a unified framework. Specifically, our method takes the results of unsupervised saliency (Robust Background Detection, RBD) and normalized color images as inputs, and directly learns an end-to-end mapping between the inputs and the corresponding saliency maps. The color images are fed into a Fully Convolutional Neural Network (FCNN) adapted from semantic segmentation to exploit high-level semantic cues for salient object detection. The results from the deep FCNN and RBD are then concatenated and fed into a shallow network that maps the concatenated feature maps to saliency maps. Finally, to obtain a spatially consistent saliency map with sharp object boundaries, we fuse superpixel-level saliency maps at multiple scales. Extensive experimental results on 8 benchmark datasets demonstrate that the proposed method outperforms the state-of-the-art approaches by a margin. Comment: Accepted by IEEE International Conference on Image Processing (ICIP) 201

    Correcting for the solar wind in pulsar timing observations: the role of simultaneous and low-frequency observations

    The primary goal of the pulsar timing array projects is to detect ultra-low-frequency gravitational waves. The pulsar data sets are affected by numerous noise processes, including varying dispersive delays in the interstellar medium and from the solar wind. The solar wind can lead to rapidly changing variations that, with existing telescopes, can be hard to measure and then remove. In this paper we study the possibility of using a low-frequency telescope to aid in such correction for the Parkes Pulsar Timing Array (PPTA) and also discuss whether the ultra-wide-bandwidth receiver for the FAST telescope is sufficient to model the solar wind variations. Our key result is that a single wide-bandwidth receiver can be used to model and remove the effect of the solar wind. However, for pulsars that pass close to the Sun, such as PSR J1022+1022, the solar wind is so variable that observations at two telescopes separated by a day are insufficient to correct the solar wind effect. Comment: accepted by RA

    Estimating and predicting the distribution of the number of visits to the medical doctor

    In many countries the demand for health care services is of increasing importance. Especially in the industrialized world, with its changing demographic structure, social insurance systems and policymakers face real challenges. Reliable predictors of those demand functions will therefore become invaluable tools. This article proposes a prediction method for the distribution of the number of visits to the medical doctor for a given population, based on a sample that is not necessarily taken from that population. It uses the estimated conditional sample distribution, and it can be applied to forecast scenarios. The methods are illustrated with data from Sidney. The introduced methodology can also be applied to any other prediction problem of discrete distributions in a real, future or fictitious population. It is therefore also a useful tool for future predictions, scenarios and policy evaluation.

    Multimodal Storytelling via Generative Adversarial Imitation Learning

    Deriving event storylines is an effective summarization method to succinctly organize extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of a widely recognized definition of a storyline metric. Prior studies have developed various approaches based on different assumptions about users' interests. These works can extract interesting patterns, but their assumptions do not guarantee that the derived patterns will match users' preferences. Moreover, their exclusive reliance on a single-modality source misses cross-modality information. This paper proposes a method, multimodal imitation learning via generative adversarial networks (MIL-GAN), to directly model users' interests as reflected by various data. In particular, the proposed model addresses the critical challenge by imitating users' demonstrated storylines. The model is designed to learn the reward patterns from user-provided storylines and then apply the learned policy to unseen data. In a user study, the proposed approach is shown to be capable of acquiring the user's implicit intent and to outperform competing methods by a substantial margin. Comment: IJCAI 201

    Graphene Nanoribbons with Smooth Edges Behave as Quantum Wires

    Graphene nanoribbons with perfect edges are predicted to exhibit interesting electronic and spintronic properties, notably quantum-confined bandgaps and magnetic edge states. However, graphene nanoribbons produced by lithography have, to date, exhibited rough edges and low-temperature transport characteristics dominated by defects, mainly variable range hopping between localized states in a transport gap near the Dirac point. Here, we report that one- and two-layer nanoribbon quantum dots made by unzipping carbon nanotubes exhibit well-defined quantum transport phenomena, including Coulomb blockade, the Kondo effect, clear excited states up to ~20 meV, and inelastic co-tunnelling. Along with signatures of intrinsic quantum-confined bandgaps and high conductivities, our data indicate that the nanoribbons behave as clean quantum wires at low temperatures, and are not dominated by defects. Comment: To appear in Nature Nanotechnolog