Teaching the pronunciation of sentence final and word boundary stops to French learners of English: distracted imitation versus audio-visual explanations.
Studies on stop unrelease in second language acquisition have hitherto focused on the productions of Slavic learners of English (Šimáčková & Podlipský, 2015) and on experiments with Polish learners of English; the latter show a tendency to release stops more regularly depending on the type of stop combination (Rojczyk et al., 2013). In the present study, we aim to test the efficiency of audio-visual explanations as opposed to distracted imitation in pronunciation teaching amongst French learners of English. While unreleased stops are rather frequent in both French and English, especially in plosive clusters (Byrd, 1993; Davidson, 2010), unreleased plosives in final position are less common in French (Van Dommelen, 1983). During phase 1 of the experiment, three groups of 12 native French learners of English (levels A1/A2, B1/B2 and C1/C2) were asked to read idiomatic expressions containing both homogeneous and heterogeneous sequences of voiceless stops straddling word boundaries, as in “that cat” [ðæt˺ kæt˺], as well as stops at the end of sentences, as in “I told him to speak” [tə spiːk˺]. In phase 2 of the experiment, each group was split in half and given one of two tasks. The first half heard recorded versions of the phase 1 sentences and, before reading them out loud, counted up to five in their L1; the stimuli for imitation contained no release in the contexts under scrutiny. The other half watched a video explaining the phenomenon of unreleased stops, in which the expressions were produced with accompanying hand gestures, and were then asked to re-read the sentences from phase 1. Based on these results, the study makes recommendations about which working environment should be prioritized in pronunciation teaching, both in class and online (Kröger et al., 2010), and suggests ways to assess students and to track their progress visually.
Coherent Multi-Sentence Video Description with Variable Level of Detail
Humans can easily describe what they see in a coherent way and at varying levels of detail. However, existing approaches for automatic video description are mainly focused on single-sentence generation and produce descriptions at a fixed level of detail. In this paper, we address both of these limitations: for a variable level of detail, we produce coherent multi-sentence descriptions of complex videos. We follow a two-step approach where we first learn to predict a semantic representation (SR) from video and then generate natural language descriptions from the SR. To produce consistent multi-sentence descriptions, we model cross-sentence consistency at the level of the SR by enforcing a consistent topic. We also contribute both to the visual recognition of objects, proposing a hand-centric approach, and to the robust generation of sentences, using a word lattice. Human judges rate our multi-sentence descriptions as more readable, correct, and relevant than related work. To understand the difference between more detailed and shorter descriptions, we collect and analyze a video description corpus with three levels of detail.
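The two-step approach in the abstract above can be sketched in miniature. This is a hypothetical illustration, not the authors' model: the SR fields (topic, activity, object), the score dictionaries standing in for visual recognition, and the template-based generator are all placeholder assumptions chosen to show how a shared topic can be enforced across the SRs of a multi-sentence description.

```python
# Minimal sketch of a two-step video description pipeline:
# step 1 predicts a semantic representation (SR) per video segment,
# step 2 generates one sentence per SR. All labels and scores below
# are illustrative placeholders, not the paper's actual recognizers.

from dataclasses import dataclass

@dataclass
class SR:
    topic: str     # shared across segments to enforce cross-sentence consistency
    activity: str
    obj: str

def predict_sr(segment_scores, topic):
    # Stand-in for visual recognition: take the highest-scoring labels.
    activity = max(segment_scores["activities"], key=segment_scores["activities"].get)
    obj = max(segment_scores["objects"], key=segment_scores["objects"].get)
    return SR(topic=topic, activity=activity, obj=obj)

def generate_sentence(sr):
    # Stand-in for the generation step (the paper uses a word lattice).
    return f"The person {sr.activity} the {sr.obj}."

def describe(video_segments, topic):
    # One SR per segment, all constrained to the same topic.
    return [generate_sentence(predict_sr(seg, topic)) for seg in video_segments]

video = [
    {"activities": {"takes": 0.9, "cuts": 0.1},
     "objects": {"knife": 0.2, "bowl": 0.8}},
    {"activities": {"cuts": 0.7, "takes": 0.3},
     "objects": {"cucumber": 0.9, "bowl": 0.1}},
]
print(describe(video, topic="cooking"))
# → ['The person takes the bowl.', 'The person cuts the cucumber.']
```

Keeping the topic fixed in every SR is the simplest reading of "enforcing a consistent topic"; in the actual system this constraint is modeled within the SR prediction rather than passed in as an argument.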
Movie Description
Audio Description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset of transcribed ADs that are temporally aligned to full-length movies. In addition, we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions. In total, the Large Scale Movie Description Challenge (LSMDC) contains a parallel corpus of 118,114 sentences and video clips from 202 movies. First, we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are indeed more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in a challenge organized in the context of the ICCV 2015 workshop "Describing and Understanding Video & The Large Scale Movie Description Challenge (LSMDC)".
Storytelling with objects to explore digital archives
Finding media in archives is difficult, while storytelling with photos can be fun and supports memory retrieval. Could the search for media become a natural part of the storytelling experience? This study investigates spatial interactions with objects as a means to encode information for retrieval while remaining embedded in the story flow. An experiment was carried out in which participants watched a short video and re-told its story using cards, each showing a character or object occurring in the video, arranging the cards as they told the story. The analysis examines what information the interactions with the cards carry and how this information relates to the language of storytelling. Most participants aligned their interactions with objects with the sentences of the story, while some arranged the cards to correspond to the video scene. Spatial interactions with objects can thus carry information on their own or when complemented by language.