47 research outputs found
Circulant temporal encoding for video retrieval and temporal alignment
We address the problem of specific video event retrieval. Given a query video
of a specific event, e.g., a concert of Madonna, the goal is to retrieve other
videos of the same event that temporally overlap with the query. Our approach
encodes the frame descriptors of a video to jointly represent their appearance
and temporal order. It exploits the properties of circulant matrices to
efficiently compare the videos in the frequency domain. This offers a
significant gain in complexity and accurately localizes the matching parts of
videos. The descriptors can be compressed in the frequency domain with a
product quantizer adapted to complex numbers. In this case, video retrieval is
performed without decompressing the descriptors. We also consider the temporal
alignment of a set of videos. We exploit the matching confidence and an
estimate of the temporal offset computed for all pairs of videos by our
retrieval approach. Our robust algorithm aligns the videos on a global timeline
by maximizing the set of temporally consistent matches. The global temporal
alignment enables synchronous playback of the videos of a given scene.
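As a minimal sketch of the frequency-domain comparison described in the abstract above, the Python snippet below scores every temporal offset between two sequences of frame descriptors with one FFT-based cross-correlation. It is an illustration under stated assumptions, not the authors' implementation: the function names `offset_scores` and `best_offset` are hypothetical, the descriptors are plain NumPy arrays, and the compression with a product quantizer mentioned in the abstract is left out.

```python
import numpy as np


def offset_scores(query, database):
    """Score all temporal offsets between two descriptor sequences.

    query:    array of shape (n_q, d), one descriptor per frame
    database: array of shape (n_b, d), one descriptor per frame
    Returns an array s of length n_q + n_b where s[k] is the sum over t of
    the dot product query[t] . database[t + k] (indices taken circularly).
    """
    # Zero-pad to n_q + n_b so circular wrap-around never mixes real frames.
    n = query.shape[0] + database.shape[0]
    q_hat = np.fft.rfft(query, n=n, axis=0)
    b_hat = np.fft.rfft(database, n=n, axis=0)
    # Correlation theorem: conj(Q) * B is the spectrum of the cross-correlation.
    # Summing over descriptor dimensions first means only one inverse FFT.
    spectrum = (np.conj(q_hat) * b_hat).sum(axis=1)
    return np.fft.irfft(spectrum, n=n)


def best_offset(query, database):
    """Return (offset, score) of the strongest temporal match."""
    scores = offset_scores(query, database)
    k = int(np.argmax(scores))
    # Indices past the database length encode negative offsets
    # (the query starts before the database video).
    if k >= database.shape[0]:
        k -= scores.shape[0]
    return k, float(scores[k])
```

For example, a query cut from frames 12 to 31 of a database video of random descriptors should produce its peak at offset 12, with a score equal to the summed squared norms of the query descriptors.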
Slice Matching for Accurate Spatio-Temporal Alignment
Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that aligns videos pixel-wise and thus allows their differences to be highlighted automatically. This primarily aims at visual surveillance, but the method can be adopted as is by other related video applications, such as object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, and we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare it with related previous work.
Continuous gesture recognition from articulated poses
This paper addresses the problem of continuous gesture recognition from articulated poses. Unlike the common isolated recognition scenario, the gesture boundaries are unknown here, and one has to solve two problems: segmentation and recognition. This is cast into a labeling framework, namely every site (frame) must be assigned a label (gesture ID). The inherent constraint for a piece-wise constant labeling is satisfied by solving a global optimization problem with a smoothness term. For efficiency reasons, we suggest a dynamic programming (DP) solver that seeks the optimal path in a recursive manner. To quantify the consistency between the labels and the observations, we build on a recent method that encodes sequences of articulated poses into Fisher vectors using short skeletal descriptors. A sliding window makes it possible to build such Fisher vectors frame by frame; they are then classified by a multi-class SVM, whereby each label is assigned to each frame at some cost. The evaluation in the ChalearnLAP-2014 challenge shows that the method outperforms other participants that rely only on skeleton data. We also show that the proposed method competes with the top-ranking methods when colour and skeleton features are jointly used.
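The labeling formulation above can be sketched as a small dynamic program in Python. This is a sketch under assumptions that the abstract does not confirm: the smoothness term is taken to be a constant penalty for every label change between consecutive frames (a Potts-style term), the per-frame costs are given as a plain array (for instance negated classifier scores), and the names `label_frames`, `unary` and `switch_cost` are hypothetical.

```python
import numpy as np


def label_frames(unary, switch_cost):
    """Assign one label per frame by dynamic programming.

    unary:       array of shape (T, K); unary[t, k] is the cost of giving
                 label k to frame t (assumed input, e.g. negated SVM scores)
    switch_cost: non-negative penalty paid whenever consecutive frames
                 receive different labels (assumed Potts-style smoothness)
    Returns the label sequence minimizing total unary cost plus penalties.
    """
    unary = np.asarray(unary, dtype=float)
    T, K = unary.shape
    cost = unary[0].copy()              # best cost of a path ending in each label
    back = np.zeros((T, K), dtype=int)  # back-pointers for path recovery
    same = np.arange(K)
    for t in range(1, T):
        # With a constant penalty, the best label to switch from is simply
        # the globally cheapest previous label.
        best_prev = int(np.argmin(cost))
        switch = cost[best_prev] + switch_cost
        take_switch = switch < cost
        back[t] = np.where(take_switch, best_prev, same)
        cost = np.where(take_switch, switch, cost) + unary[t]
    # Trace the optimal path backwards from the cheapest final label.
    labels = np.empty(T, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for t in range(T - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels
```

With a single noisy frame that favours the wrong label, a sufficiently large `switch_cost` keeps the labeling piece-wise constant, while `switch_cost = 0` reduces to an independent per-frame decision.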
Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video
We propose an automatic system for organizing the content of a collection of
unstructured videos of an articulated object class (e.g. tiger, horse). By
exploiting the recurring motion patterns of the class across videos, our
system: 1) identifies its characteristic behaviors; and 2) recovers
pixel-to-pixel alignments across different instances. Our system can be useful
for organizing video collections for indexing and retrieval. Moreover, it can
be a platform for learning the appearance or behaviors of object classes from
Internet video. Traditional supervised techniques cannot exploit this wealth of
data directly, as they require a large amount of time-consuming manual
annotations.
The behavior discovery stage generates temporal video intervals, each
automatically trimmed to one instance of the discovered behavior, clustered by
type. It relies on our novel motion representation for articulated motion based
on the displacement of ordered pairs of trajectories (PoTs). The alignment
stage aligns hundreds of instances of the class to a great accuracy despite
considerable appearance variations (e.g. an adult tiger and a cub). It uses a
flexible Thin Plate Spline deformation model that can vary through time. We
carefully evaluate each step of our system on a new, fully annotated dataset.
On behavior discovery, we outperform the state-of-the-art Improved DTF
descriptor. On spatial alignment, we outperform the popular SIFT Flow
algorithm.
Comment: 19 pages, 19 figures, 3 tables. arXiv admin note: substantial text
overlap with arXiv:1411.788
Augmentieren von Personen in Monokularen Videodaten
When aiming at realistic video augmentation, i.e. the embedding of virtual, 3-dimensional objects into a scene's original content, a series of challenging problems has to be solved.
This is especially the case when working with solely monocular input material, as important additional 3D information is missing and has to be recovered during the process, if necessary.
In this work, I will present a semi-automatic strategy to tackle this task by providing solutions to individual problems in the context of virtual clothing as an example for realistic video augmentation.
Starting with two different approaches for monocular pose and motion estimation, I will show how to build a 3D human body model by estimating detailed shape information as well as basic surface material properties.
This information further allows a dynamic illumination model to be extracted from the provided input material.
The illumination model is particularly important for rendering a realistic virtual object and adds a lot of realism to the final video augmentation.
The animated human model is able to interact with virtual 3D objects and is used in the context of virtual clothing to animate simulated garments.
To achieve the desired realism, I present an additional image-based compositing approach that realistically embeds the simulated garment into the original scene content.
Combining the presented approaches provides an integrated strategy for realistic augmentation of actors in monocular video sequences.
With the goal of realistic video augmentation by embedding virtual, three-dimensional objects into existing video footage, a series of interesting and difficult problems has to be solved. Especially when processing monocular input data, important spatial information is missing and must be reconstructed from the two-dimensional input. In this work, I present a semi-automatic procedure that makes it possible to solve the individual subproblems of a comprehensive video augmentation one after another within an integrated strategy. I demonstrate this using virtual clothing as an example. Starting with two different approaches to pose and motion reconstruction, a realistic 3D body model of a human is created. To this end, the detailed body shape is approximated by a suitable method and the surface materials are reconstructed. This information is used, among other things, to reconstruct a dynamic scene illumination from the input video. The illumination information is particularly important for realistic video augmentation, because correct lighting in particular increases the degree of realism of the virtually generated object. Thanks to its level of detail, the reconstructed and animated body model is able to interact with virtual objects. This matters especially in the application of virtual clothing. To achieve the desired degree of realism, I introduce an additional image-based correction method that helps to optimize the final image composition.
The combination of all presented sub-methods forms a comprehensive strategy for augmenting monocular video material, which can be used for the realistic simulation and embedding of virtual clothing on an actor in the original video.
Video object segmentation and applications in temporal alignment and aspect learning
Modern computer vision has recently seen significant progress in learning visual concepts
from examples. This progress has been fuelled by recent models of visual appearance
as well as recently collected large-scale datasets of manually annotated still
images. Video is a promising alternative, as it inherently contains much richer information
compared to still images. For instance, in video we can observe an object move
which allows us to differentiate it from its surroundings, or we can observe a smooth
transition between different viewpoints of the same object instance. This richness in
information allows us to effectively tackle tasks that would otherwise be very difficult
if we only considered still images, or even address tasks that are video-specific.
Our first contribution is a computationally efficient technique for video object segmentation.
Our method relies solely on motion in order to rapidly create a rough initial
estimate of the foreground object. This rough initial estimate is then refined through
an energy formulation to be spatio-temporally smooth. The method is able to handle
rapidly moving backgrounds and objects, as well as non-rigid deformations and articulations
without having prior knowledge about the object's appearance, size or location.
In addition to this class-agnostic method, we present a class-specific method that incorporates
additional class-specific appearance cues when the class of the foreground
object is known in advance (e.g. a video of a car).
For our second contribution, we propose a novel model for temporal video alignment
with regard to the viewpoint of the foreground object (i.e., a pair of aligned
frames shows the same object viewpoint). Our work relies on our video object segmentation
technique to automatically localise the foreground objects and extract appearance
measurements solely from them instead of the background. Our model is able
to temporally align realistic videos, where events may occur in a different order, or
occur only in one of the videos. This is in contrast to previous works that typically
assume that the videos show a scripted sequence of events and can simply be aligned
by stretching or compressing one of the videos.
As a final contribution, we once again use our video object segmentation technique
as a basis for automatic visual aspect discovery from videos of an object class. Compared
to previous works, we use a broader definition of an aspect that considers four
factors of variation: viewpoint, articulated pose, occlusions and cropping by the image
border. We pose the aspect discovery task as a clustering problem and provide an
extensive experimental exploration of the benefits of object segmentation for this task.
Mariner Mars 1971 project. Volume 3: Mission operations system implementation and standard mission flight operations
This report covers the Mariner Mars 1971 mission, another step in the continuing program of planetary exploration in search of evidence of exobiological activity, information on the origin and evolution of the solar system, and basic science data related to the study of planetary physics, geology, planetology, and cosmology. The mission plan was designed for two spacecraft, each performing a separate but complementary mission. However, a single mission plan was actually used for Mariner 9 because the launch vehicle for the first spacecraft failed. The implementation of the Mission Operations System is described, including organization, training, and data processing development and operations. Mariner 9 spacecraft cruise and orbital operations are discussed through completion of the standard mission, from launch to solar occultation in April 1972.
Earth resources technology satellite operations control center and data processing facility. Book 2 - Systems studies Final report
Systems analysis for ERTS NASA Data Processing Facility system and subsystem