Face detection and clustering for video indexing applications
This paper describes a method for automatically detecting human faces in generic video sequences. We employ an iterative algorithm in order to give a confidence measure for the presence or absence of faces within video shots. Skin colour filtering is carried out on a selected number of frames per video shot, followed by the application of shape and size heuristics. Finally, the remaining candidate regions are normalized and projected into an eigenspace, with the reconstruction error serving as the measure of confidence for the presence or absence of a face. Following this, the confidence score for the entire video shot is calculated. In order to cluster extracted faces into a set of face classes, we employ an incremental procedure using a PCA-based dissimilarity measure in conjunction with spatio-temporal correlation. Experiments were carried out on a representative broadcast news test corpus.
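As a rough illustration of the eigenspace step described in this abstract, the sketch below computes the reconstruction error of a candidate vector with respect to a small orthonormal basis. It is not the authors' implementation: the function name `reconstruction_error`, the toy mean `TOY_MEAN`, and the toy basis `TOY_BASIS` are all assumptions made for the example, and a real system would learn the mean and basis from normalized face images via PCA.

```python
from math import sqrt


def reconstruction_error(x, mean, basis):
    """Distance between x and its projection onto the eigenspace.

    `basis` is assumed to be a list of orthonormal vectors (hypothetical
    stand-ins for PCA eigenvectors); `mean` is the assumed mean vector.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    # Coefficients of the centered vector along each basis vector.
    coeffs = [sum(c * b for c, b in zip(centered, vec)) for vec in basis]
    # Rebuild the vector from those coefficients only.
    recon = [
        sum(a * vec[i] for a, vec in zip(coeffs, basis))
        for i in range(len(centered))
    ]
    # Whatever the eigenspace cannot explain is the residual.
    return sqrt(sum((c - r) ** 2 for c, r in zip(centered, recon)))


# Toy 3-D example: the "eigenspace" is the plane spanned by the x and y axes.
TOY_MEAN = [0.0, 0.0, 0.0]
TOY_BASIS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

in_plane = reconstruction_error([3.0, 4.0, 0.0], TOY_MEAN, TOY_BASIS)
off_plane = reconstruction_error([1.0, 2.0, 5.0], TOY_MEAN, TOY_BASIS)
```

A small error means the candidate is well explained by the eigenspace, so under this reading a low value would correspond to higher confidence that the region is a face. How the error is mapped to a confidence score is not stated in the abstract.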
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance.
Multi-camera analysis of soccer sequences
The automatic detection of meaningful phases in a soccer game depends on the accurate localization of players and the ball at each moment. However, the automatic analysis of soccer sequences is a challenging task due to the presence of multiple fast-moving objects. For this purpose, we present a multi-camera analysis system that yields the position of the ball and players on a common ground plane. The detection in each camera is based on a code-book algorithm, and different features are used to classify the detected blobs. The detection results of each camera are transformed using a homography to a virtual top-view of the playing field. Within this virtual top-view we merge the trajectory information of the different cameras, which allows the estimated positions to be refined. In this paper, we evaluate the system on a public SOCCER dataset and end with a discussion of possible improvements of the dataset.
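The homography step mentioned in this abstract can be sketched as follows: an image point is lifted to homogeneous coordinates, multiplied by a 3x3 matrix, and divided by the last coordinate to obtain a ground-plane position. The names `apply_homography` and `merge_positions` are hypothetical, the matrix values are made up for the example, and merging by simple averaging is an assumption rather than the paper's actual fusion method.

```python
def apply_homography(H, point):
    """Map an image point (x, y) to the top view using a 3x3 homography H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get plane coordinates.
    return (u / w, v / w)


def merge_positions(positions):
    """Average several top-view estimates of the same object (assumed fusion rule)."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n, sum(p[1] for p in positions) / n)


# Two hypothetical cameras observing the same player at different pixels.
H_CAM_A = [[2.0, 0.0, 1.0], [0.0, 2.0, 3.0], [0.0, 0.0, 1.0]]
H_CAM_B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

top_a = apply_homography(H_CAM_A, (1.0, 1.0))
top_b = apply_homography(H_CAM_B, (5.0, 7.0))
fused = merge_positions([top_a, top_b])
```

In practice each camera's homography would be estimated from known field markings, and the fusion would likely weight cameras by their reliability. Neither detail is specified in the abstract.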
Video-rate computational super-resolution and integral imaging at longwave-infrared wavelengths
We report the first computational super-resolved, multi-camera integral
imaging at long-wave infrared (LWIR) wavelengths. A synchronized array of FLIR
Lepton cameras was assembled, and computational super-resolution and
integral-imaging reconstruction employed to generate video with light-field
imaging capabilities, such as 3D imaging and recognition of partially obscured
objects, while also providing a four-fold increase in effective pixel count.
This approach to high-resolution imaging enables a fundamental reduction in the
track length and volume of an imaging system, while also enabling use of
low-cost lens materials. Comment: Supplementary multimedia material at
http://dx.doi.org/10.6084/m9.figshare.530302
3D video coding and transmission
The capture, transmission, and display of
3D content has gained a lot of attention in the last few
years. 3D multimedia content is no longer confined to
cinema theatres but is being transmitted using stereoscopic
video over satellite, shared on Blu-Ray™ disks,
or sent over Internet technologies. Stereoscopic displays
are needed at the receiving end and the viewer needs to
wear special glasses to present the two versions of the
video to the human vision system that then generates
the 3D illusion. To be more effective and improve the
immersive experience, more views are acquired from a
larger number of cameras and presented on different displays,
such as autostereoscopic and light field displays.
These multiple views, combined with depth data, also
allow enhanced user experiences and new forms of interaction
with the 3D content from virtual viewpoints.
This type of audiovisual information is represented by a
huge amount of data that needs to be compressed and
transmitted over bandwidth-limited channels. Part of
the COST Action IC1105 "3D Content Creation, Coding
and Transmission over Future Media Networks" (3DConTourNet)
focuses on this research challenge.
3D-TV Production from Conventional Cameras for Sports Broadcast
3DTV production of live sports events presents a challenging problem involving conflicting requirements of maintaining broadcast stereo picture quality with practical problems in developing robust systems for cost-effective deployment. In this paper we propose an alternative approach to stereo production in sports events using the conventional monocular broadcast cameras for 3D reconstruction of the event and subsequent stereo rendering. This approach has the potential advantage over stereo camera rigs of recovering full scene depth, allowing inter-ocular distance and convergence to be adapted according to the requirements of the target display and enabling stereo coverage from both existing and "virtual" camera positions without additional cameras. A prototype system is presented with results of sports TV production trials for rendering of stereo and free-viewpoint video sequences of soccer and rugby.
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a "piece" of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
- …