Who is the director of this movie? Automatic style recognition based on shot features
We show how low-level formal features, such as shot duration, meant as length
of camera takes, and shot scale, i.e. the distance between the camera and the
subject, are distinctive of a director's style in art movies. So far, such
features were thought to lack enough variety to be distinctive of
an author. However, our investigation of the full filmographies of six different
authors (Scorsese, Godard, Tarr, Fellini, Antonioni, and Bergman) for a total
number of 120 movies analysed second by second, confirms that these
shot-related features do not appear as random patterns in movies from the same
director. For feature extraction we adopt methods based on both conventional
and deep learning techniques. Our findings suggest that feature sequential
patterns, i.e. how features evolve in time, are at least as important as the
related feature distributions. To the best of our knowledge this is the first
study dealing with automatic attribution of movie authorship, which opens up
interesting lines of cross-disciplinary research on the impact of style on the
aesthetic and emotional effects on the viewers
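
The contrast drawn in the abstract above, between a feature's distribution and its sequential pattern, can be illustrated with a minimal sketch. It is not the authors' method: the three shot-scale classes, the per-second label sequences, and the function names `distribution` and `transition_matrix` are all assumptions made for illustration. The two toy "movies" spend the same share of time at each shot scale, yet differ in how the scale evolves from one second to the next.

```python
from collections import Counter

# Hypothetical shot-scale classes; the abstract does not list the ones used.
CLASSES = ["close", "medium", "long"]

def distribution(labels, classes):
    """Fraction of seconds spent at each shot scale."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]

def transition_matrix(labels, classes):
    """Row-normalised counts of transitions between consecutive seconds."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for current, following in zip(labels, labels[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([v / total if total else 0.0 for v in row])
    return matrix

# Toy per-second label sequences (made up, not data from the study).
movie_a = ["close"] * 4 + ["medium"] * 4 + ["long"] * 4  # long, stable blocks
movie_b = ["close", "medium", "long"] * 4                # rapid alternation

print(distribution(movie_a, CLASSES) == distribution(movie_b, CLASSES))  # True
print(transition_matrix(movie_a, CLASSES)[0])  # [0.75, 0.25, 0.0]
print(transition_matrix(movie_b, CLASSES)[0])  # [0.0, 1.0, 0.0]
```

A classifier that sees only the distribution cannot tell these two sequences apart, whereas the transition matrix separates them immediately.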
6 Seconds of Sound and Vision: Creativity in Micro-Videos
The notion of creativity, as opposed to related concepts such as beauty or
interestingness, has not been studied from the perspective of automatic
analysis of multimedia content. Meanwhile, short online videos shared on social
media platforms, or micro-videos, have arisen as a new medium for creative
expression. In this paper we study creative micro-videos in an effort to
understand the features that make a video creative, and to address the problem
of automatic detection of creative content. Defining creative videos as those
that are novel and have aesthetic value, we conduct a crowdsourcing experiment
to create a dataset of over 3,800 micro-videos labelled as creative and
non-creative. We propose a set of computational features that we map to the
components of our definition of creativity, and conduct an analysis to
determine which of these features correlate most with creative video. Finally,
we evaluate a supervised approach to automatically detect creative video, with
promising results, showing that it is necessary to model both aesthetic value
and novelty to achieve optimal classification accuracy. Comment: 8 pages, 1 figure, conference IEEE CVPR 201
Investigating facial animation production through artistic inquiry
Studies into dynamic facial expressions tend to make use of experimental methods based on objectively manipulated stimuli. New techniques for displaying increasingly realistic facial movement and methods of measuring observer responses are typical of computer animation and psychology facial expression research. However, few projects focus on the artistic nature of performance production. Instead, most concentrate on the naturalistic appearance of posed or acted expressions. In this paper, the authors discuss a method for exploring the creative process of emotional facial expression animation, and ask whether anything can be learned about authentic dynamic expressions through artistic inquiry
"Sitting too close to the screen can be bad for your ears": A study of audio-visual location discrepancy detection under different visual projections
In this work, we look at the perception of event locality under conditions of disparate audio and visual cues. We address an aspect of the so-called "ventriloquism effect" relevant for multi-media designers; namely, how auditory perception of event locality is influenced by the size and scale of the accompanying visual projection of those events. We observed that recalibration of the visual axes of an audio-visual animation (by resizing and zooming) exerts a recalibrating influence on the auditory space perception. In particular, sensitivity to audio-visual discrepancies (between a centrally located visual stimulus and a laterally displaced audio cue) increases near the edge of the screen on which the visual cue is displayed. In other words, discrepancy detection thresholds are not fixed for a particular pair of stimuli, but are influenced by the size of the display space. Moreover, the discrepancy thresholds are influenced by scale as well as size. That is, the boundary of auditory space perception is not rigidly fixed on the boundaries of the screen; it also depends on the spatial relationship depicted. For example, the ventriloquism effect will break down within the boundaries of a large screen if zooming is used to exaggerate the proximity of the audience to the events. The latter effect appears to be much weaker than the former
Content-Based Video Retrieval in Historical Collections of the German Broadcasting Archive
The German Broadcasting Archive (DRA) maintains the cultural heritage of
radio and television broadcasts of the former German Democratic Republic (GDR).
The uniqueness and importance of the video material stimulate considerable
scientific interest in the video content. In this paper, we present an
automatic video analysis and retrieval system for searching in historical
collections of GDR television recordings. It consists of video analysis
algorithms for shot boundary detection, concept classification, person
recognition, text recognition and similarity search. The performance of the
system is evaluated from a technical and an archival perspective on 2,500 hours
of GDR television recordings. Comment: TPDL 2016, Hannover, Germany. Final version is available at Springer
via DO
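
Shot boundary detection, the first analysis step listed in the abstract above, can be sketched with a classic histogram-difference baseline. The abstract does not say which algorithm the system uses, so this is a swapped-in illustration only: the function names, the bin count, the threshold of 0.5, and the toy frames (flat lists of grey values in 0..255) are all assumptions.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalised grey-level histogram of one frame (a flat list of pixels)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]

def shot_boundaries(frames, threshold=0.5, bins=4):
    """Indices i where frame i starts a new shot.

    A hard cut is declared when half the L1 distance between the histograms
    of consecutive frames (a value in 0..1) exceeds the threshold.
    """
    hists = [histogram(f, bins) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        diff = sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i])) / 2
        if diff > threshold:
            cuts.append(i)
    return cuts

# Toy frames: three dark frames followed by two bright ones (made-up data).
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 250]
frames = [dark, dark, dark, bright, bright]

print(shot_boundaries(frames))  # [3]
```

A fixed threshold like this only catches hard cuts; gradual transitions such as fades and dissolves usually need a windowed or learned detector.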
Scene extraction in motion pictures
This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of media descriptions that can be computed in today's content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Film Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, which offer useful insights into the limitations of our method
Vision-Based Production of Personalized Video
In this paper we present a novel vision-based system for the automated production of personalised video souvenirs for visitors in leisure and cultural heritage venues. Visitors are visually identified and tracked through a camera network. The system produces a personalized DVD souvenir at the end of a visitor's stay allowing visitors to relive their experiences. We analyze how we identify visitors by fusing facial and body features, how we track visitors, how the tracker recovers from failures due to occlusions, as well as how we annotate and compile the final product. Our experiments demonstrate the feasibility of the proposed approach