
    Automatic detection and classification of head movements in face-to-face conversations

    This paper presents an approach to automatic head movement detection and classification in data from a corpus of video-recorded face-to-face conversations in Danish involving 12 different speakers. A number of classifiers were trained with different combinations of visual, acoustic and word features and tested in a leave-one-out cross-validation scenario. The visual movement features were extracted from the raw video data using OpenPose, the acoustic ones from the sound files using Praat, and the word features from the transcriptions. The best results were obtained by a Multilayer Perceptron classifier, which reached an average F1 score of 0.68 across the 12 speakers for head movement detection, and 0.40 for head movement classification given four different classes. In both cases, the classifier outperformed a simple most-frequent-class baseline, a more advanced baseline relying only on velocity features, and linear classifiers using different combinations of features.
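The evaluation scheme described in this abstract can be illustrated with a minimal sketch. It assumes that "leave-one-out" means holding out one speaker at a time and that the reported average is the mean of per-speaker F1 scores; the abstract does not spell out either detail. The function names, labels and toy data are hypothetical, and only the most-frequent-class baseline is shown, not the paper's classifiers or features.

```python
from collections import Counter


def f1_score(gold, pred, positive):
    """F1 for one positive class, computed from parallel label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_speaker_out_baseline(labels_by_speaker, positive):
    """Score a most-frequent-class baseline, holding out one speaker per fold."""
    scores = {}
    for held_out, gold in labels_by_speaker.items():
        # "Train" on every other speaker: just find their most frequent label.
        training = [
            label
            for speaker, labels in labels_by_speaker.items()
            if speaker != held_out
            for label in labels
        ]
        majority = Counter(training).most_common(1)[0][0]
        pred = [majority] * len(gold)
        scores[held_out] = f1_score(gold, pred, positive)
    mean_f1 = sum(scores.values()) / len(scores)
    return scores, mean_f1


# Hypothetical frame-level labels for three speakers (not data from the paper).
toy_labels = {
    "A": ["move", "move", "none", "move"],
    "B": ["move", "none", "move", "move"],
    "C": ["none", "move", "move", "move"],
}
per_speaker, mean_f1 = leave_one_speaker_out_baseline(toy_labels, positive="move")
print(per_speaker, round(mean_f1, 3))
```

A trained classifier would replace the majority-label step; the fold structure and the per-speaker averaging would stay the same.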

    How suitable are TED talks for academic listening?

    To investigate the suitability of TED talks for academic listening in EAP contexts, this research paper compares Academic Vocabulary List (AVL) representation (Gardner & Davies, 2014), lexical density, and speech rate in a TED talk corpus and a lecture discourse corpus, which were both compiled for this study. 28 lecture series (727 lectures total) and 49 TED talks were analysed for AVL representation. TED talks were found to have lower AVL representation than the university lectures (t(75) = 4.95, p < 0.0001). 43 one-minute samples from the Lecture Discourse Corpus and 47 one-minute samples from the TED Talk Corpus were analysed for lexical density, where no differences were found, and for speech rate, which was found to be significantly faster in TED talks, in terms of both syllables per second (t(98) = 4.23, p < 0.0001) and words per minute (t(98) = 4.20, p < 0.0001). A negative correlation was found between lexical density and syllables per second in the lecture discourse corpus (r = −0.343, p < 0.05), whereas none was found in the TED talk corpus (r = −0.031, ns), perhaps due to TED talks being a scripted genre. It is concluded that TED talk variation enables a range of academic listening applications.
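The two kinds of statistic reported in this abstract, a two-sample t value and a Pearson correlation, can be sketched in plain Python. This is an illustration only: it assumes an equal-variance (pooled) Student's t-test, which the abstract does not confirm, and every number in the toy samples is invented rather than taken from either corpus.

```python
import math


def pooled_t(a, b):
    """Two-sample Student's t with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical syllables-per-second values for one-minute samples (invented).
ted_rate = [4.6, 4.9, 5.1, 4.8, 5.0]
lecture_rate = [4.1, 4.4, 4.0, 4.3, 4.2]
t_value, df = pooled_t(ted_rate, lecture_rate)

# Hypothetical lexical density values paired with the lecture samples (invented).
lecture_density = [0.52, 0.47, 0.53, 0.49, 0.50]
r_value = pearson_r(lecture_density, lecture_rate)
print(round(t_value, 2), df, round(r_value, 2))
```

Turning t into a p-value needs the t distribution's cumulative function, which the standard library does not provide; a statistics package would normally handle that step.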