
    Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets

    In this paper, we describe a new database of audio recordings of non-native (L2) speakers of English, together with a perceptual evaluation experiment conducted with native English speakers to assess the prosody of each recording. These annotations are then used to compute the gold standard using different methods, and a series of regression experiments is conducted to evaluate their impact on the performance of a regression model predicting the degree of naturalness of L2 speech. Further, we compare the relevance of different feature groups: features modelling prosody in general (without speech tempo), speech rate and pauses modelling speech tempo (fluency), voice quality, and a variety of spectral features. We also discuss the impact of various fusion strategies on performance. Overall, our results demonstrate that the prosody of non-native speakers of English can be reliably assessed using suprasegmental audio features; prosodic features appear to be the most important ones.
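    As a rough illustration of the pipeline this abstract describes, the Python sketch below averages per-recording naturalness ratings from several annotators into a gold standard (the paper compares several aggregation methods; the mean is only one of them) and evaluates a simple regressor against it with cross-validation. The synthetic data, feature dimensions, and the choice of a ridge regressor are illustrative assumptions, not the authors' actual database, features, or model.

        # Sketch of gold-standard aggregation plus regression, with synthetic data.
        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import cross_val_predict
        from scipy.stats import pearsonr

        rng = np.random.default_rng(0)
        n_recordings, n_raters, n_features = 200, 5, 30

        # Hypothetical stand-ins: acoustic features per recording and
        # naturalness ratings (1-5) from several native-speaker annotators.
        features = rng.normal(size=(n_recordings, n_features))
        ratings = rng.integers(1, 6, size=(n_recordings, n_raters)).astype(float)

        # One common way to build the gold standard: average the raters.
        gold = ratings.mean(axis=1)

        # Cross-validated predictions from a simple linear regressor,
        # scored with Pearson's correlation against the gold standard.
        pred = cross_val_predict(Ridge(alpha=1.0), features, gold, cv=5)
        r, _ = pearsonr(gold, pred)
        print(f"Pearson r between gold standard and predictions: {r:.3f}")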

    Reaction of olive cultivars to Meloidogyne mayaguensis.


    Olive oil and table olive markets worldwide and in Brazil.


    Relationship between green pruning and the use of reflective material and the quality of 'Eldorado' peaches.


    The Munich LSTM-RNN approach to the MediaEval 2014 "Emotion in Music" Task

    In this paper we describe TUM's approach to the MediaEval 2014 "Emotion in Music" task. The goal of this task is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. Our system consists of Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression. We used two different sets of acoustic and psychoacoustic features that have previously proven effective for emotion prediction in music and speech. The best model yielded an average Pearson's correlation coefficient of 0.354 (Arousal) and 0.198 (Valence), and an average Root Mean Squared Error of 0.102 (Arousal) and 0.079 (Valence).
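    A minimal sketch of this kind of time-continuous emotion regression, assuming PyTorch and illustrative feature dimensions (the actual TUM system, feature sets, and training setup are not reproduced here): an LSTM maps a sequence of acoustic feature frames to per-frame Arousal and Valence values, trained with an MSE loss whose square root corresponds to the RMSE metric reported above.

        # Sketch of per-frame Arousal/Valence regression with an LSTM.
        import torch
        import torch.nn as nn

        class EmotionLSTM(nn.Module):
            def __init__(self, n_features=65, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
                self.out = nn.Linear(hidden, 2)  # per-frame Arousal and Valence

            def forward(self, x):        # x: (batch, time, n_features)
                h, _ = self.lstm(x)
                return self.out(h)       # (batch, time, 2)

        model = EmotionLSTM()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()

        # Dummy batch: 8 clips, 60 frames of acoustic features each,
        # with continuous Arousal/Valence targets per frame.
        x = torch.randn(8, 60, 65)
        y = torch.rand(8, 60, 2)

        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        print(f"training RMSE on dummy batch: {loss.sqrt().item():.3f}")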