Movie Description
Audio Description (AD) provides linguistic descriptions of movies and allows
visually impaired people to follow a movie along with their peers. Such
descriptions are by design mainly visual and thus naturally form an interesting
data source for computer vision and computational linguistics. In this work we
propose a novel dataset of transcribed ADs that are temporally aligned to
full-length movies. We also collected and aligned movie scripts used in prior
work, and we compare the two sources of descriptions. In
total the Large Scale Movie Description Challenge (LSMDC) contains a parallel
corpus of 118,114 sentences and video clips from 202 movies. First we
characterize the dataset by benchmarking different approaches for generating
video descriptions. Comparing ADs to scripts, we find that ADs are indeed more
visual and describe precisely what is shown rather than what should happen
according to the scripts created prior to movie production. Furthermore, we
present and compare the results of several teams who participated in a
challenge organized in the context of the workshop "Describing and
Understanding Video & The Large Scale Movie Description Challenge (LSMDC)" at
ICCV 2015.
Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs
Conversational participants tend to immediately and unconsciously adapt to
each other's language styles: a speaker will even adjust the number of articles
and other function words in their next utterance in response to the number in
their partner's immediately preceding utterance. This striking level of
coordination is thought to have arisen as a way to achieve social goals, such
as gaining approval or emphasizing difference in status. But has the adaptation
mechanism become so deeply embedded in the language-generation process as to
become a reflex? We argue that fictional dialogs offer a way to study this
question, since authors create the conversations but don't receive the social
benefits (rather, the imagined characters do). Indeed, we find significant
coordination across many families of function words in our large movie-script
corpus. We also report suggestive preliminary findings on the effects of gender
and other features; e.g., surprisingly, for articles, on average, characters
adapt more to females than to males.
Comment: data available at http://www.cs.cornell.edu/~cristian/movie
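The coordination effect described in this abstract can be illustrated with a small sketch. The abstract does not spell out how coordination is measured, so everything here is an assumption made for illustration: the measure (how much more likely a reply is to contain an article when the preceding utterance contained one, compared with replies overall), the hypothetical names `has_marker` and `coordination`, the three-word article list, and the toy exchanges.

```python
from typing import Iterable, List, Set, Tuple

# Assumption: a tiny marker family standing in for "articles".
ARTICLES: Set[str] = {"a", "an", "the"}


def has_marker(utterance: str, marker_words: Iterable[str] = ARTICLES) -> bool:
    """Return True if the utterance contains at least one marker word."""
    markers = set(marker_words)
    tokens = utterance.lower().split()
    # Strip simple surrounding punctuation before comparing.
    return any(tok.strip(".,!?;:\"'") in markers for tok in tokens)


def coordination(exchanges: List[Tuple[str, str]],
                 marker_words: Iterable[str] = ARTICLES) -> float:
    """Illustrative coordination score (an assumption, not the paper's
    confirmed definition):
    P(reply has marker | prompt has marker) - P(reply has marker).
    A positive value means replies use the marker more often right after
    a prompt that used it.
    """
    if not exchanges:
        raise ValueError("need at least one (prompt, reply) pair")
    markers = set(marker_words)
    reply_flags = [has_marker(reply, markers) for _, reply in exchanges]
    triggered = [flag for (prompt, _), flag in zip(exchanges, reply_flags)
                 if has_marker(prompt, markers)]
    if not triggered:
        raise ValueError("no prompt contains the marker")
    conditional = sum(triggered) / len(triggered)
    baseline = sum(reply_flags) / len(reply_flags)
    return conditional - baseline


# Toy (prompt, reply) pairs invented for this sketch.
exchanges = [
    ("Where is the car?", "The car is outside."),
    ("Go now.", "Fine."),
    ("Take a seat.", "Thanks for the offer."),
    ("Hurry up!", "I am coming."),
]
print(coordination(exchanges))
```

On real dialog data the same computation would be repeated per family of function words; a binary "contains a marker" test is only one possible simplification of the count-based adjustment the abstract describes.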
A Dataset for Movie Description
Descriptive video service (DVS) provides linguistic descriptions of movies
and allows visually impaired people to follow a movie along with their peers.
Such descriptions are by design mainly visual and thus naturally form an
interesting data source for computer vision and computational linguistics. In
this work we propose a novel dataset of transcribed DVS that is temporally
aligned to full-length HD movies. We also collected the aligned movie scripts
used in prior work, and we compare the two sources of descriptions. In total
the Movie Description dataset
contains a parallel corpus of over 54,000 sentences and video snippets from 72
HD movies. We characterize the dataset by benchmarking different approaches for
generating video descriptions. Comparing DVS to scripts, we find that DVS is
far more visual and describes precisely what is shown rather than what should
happen according to the scripts created prior to movie production.