Enhanced tracking and recognition of moving objects by reasoning about spatio-temporal continuity.
A framework for the logical and statistical analysis and annotation of dynamic scenes containing occlusion and other uncertainties is presented. This framework consists of three elements: an object tracker module, an object recognition/classification module, and a logical consistency, ambiguity and error reasoning engine. The principle behind the object tracker and object recognition modules is to reduce error by increasing ambiguity (by merging objects in close proximity and presenting multiple hypotheses). The reasoning engine deals with error, ambiguity and occlusion in a unified framework to produce a hypothesis that satisfies fundamental constraints on the spatio-temporal continuity of objects. Our algorithm finds a globally consistent model of an extended video sequence that is maximally supported by a voting function based on the output of a statistical classifier. The system produces an annotation that is significantly more accurate than what would be obtained by frame-by-frame evaluation of the classifier output. The framework has been implemented and applied successfully to the analysis of team sports with a single camera.
Key words: Visua
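The contrast between frame-by-frame labelling and a vote accumulated over a whole track can be sketched as follows. This is only a minimal illustration under assumed inputs: the function names, the per-frame score dictionaries and the labels are hypothetical, and the sketch omits the logical continuity constraints and global consistency search that the abstract describes.

```python
from collections import Counter


def label_frame_by_frame(frame_scores):
    """Pick the top-scoring label independently in every frame."""
    return [max(scores, key=scores.get) for scores in frame_scores]


def label_track_by_vote(frame_scores):
    """Sum classifier scores over the whole track and pick a single label.

    Assumes the tracked object keeps one identity for the entire track.
    """
    totals = Counter()
    for scores in frame_scores:
        for label, score in scores.items():
            totals[label] += score
    return max(totals, key=totals.get)


# Hypothetical classifier output for one tracked object over three frames.
frames = [
    {"player_a": 0.7, "player_b": 0.3},
    {"player_a": 0.4, "player_b": 0.6},  # a single noisy frame
    {"player_a": 0.8, "player_b": 0.2},
]

per_frame = label_frame_by_frame(frames)
voted = label_track_by_vote(frames)
```

In this toy input the per-frame labels flip in the noisy middle frame, while the accumulated vote assigns one label to the whole track.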
Focusing and Compression of Ultrashort Pulses through Scattering Media
Light scattering in inhomogeneous media induces wavefront distortions which
pose an inherent limitation in many optical applications. Examples range from
microscopy and nanosurgery to astronomy. In recent years, ongoing efforts have
made the correction of spatial distortions possible by wavefront shaping
techniques. However, when ultrashort pulses are employed, scattering induces
temporal distortions which hinder their use in nonlinear processes such as in
multiphoton microscopy and quantum control experiments. Here we show that
correction of both spatial and temporal distortions can be attained by
manipulating only the spatial degrees of freedom of the incident wavefront.
Moreover, by optimizing a nonlinear signal, the refocused pulse can be shorter
than the input pulse. We demonstrate focusing of 100 fs pulses through 1 mm
thick brain tissue, and 1000-fold enhancement of a localized two-photon
fluorescence signal. Our results open up new possibilities for optical
manipulation and nonlinear imaging in scattering media.
Online real-time crowd behavior detection in video sequences
Automatically detecting events in crowded scenes is a challenging task in Computer Vision. A number of offline approaches have been proposed for solving the problem of crowd behavior detection; however, the offline assumption limits their application in real-world video surveillance systems. In this paper, we propose an online and real-time method for detecting events in crowded video sequences. The proposed approach combines visual feature extraction with image segmentation and works without a training phase. A quantitative experimental evaluation has been carried out on multiple publicly available video sequences, containing data from various crowd scenarios and different types of events, to demonstrate the effectiveness of the approach.
Online Video Deblurring via Dynamic Temporal Blending Network
State-of-the-art video deblurring methods are capable of removing non-uniform
blur caused by unwanted camera shake and/or object motion in dynamic scenes.
However, most existing methods are based on batch processing and thus need
access to all recorded frames, rendering them computationally demanding and
time consuming and thus limiting their practical use. In contrast, we propose
an online (sequential) video deblurring method based on a spatio-temporal
recurrent network that allows for real-time performance. In particular, we
introduce a novel architecture which extends the receptive field while keeping
the overall size of the network small to enable fast execution. In doing so,
our network is able to remove even large blur caused by strong camera shake
and/or fast moving objects. Furthermore, we propose a novel network layer that
enforces temporal consistency between consecutive frames by dynamic temporal
blending which compares and adaptively (at test time) shares features obtained
at different time steps. We show the superiority of the proposed method in an
extensive experimental evaluation. Comment: 10 page
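The idea of comparing features from two time steps and sharing them adaptively can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract describes a learned network layer, whereas this sketch uses a fixed, hand-written similarity weight, and the names `blend_features` and `sharpness` are hypothetical.

```python
import math


def blend_features(prev, curr, sharpness=1.0):
    """Blend two feature vectors element by element.

    The weight given to the previous time step is largest (0.5) where the
    two features agree and decays towards 0 where they differ, so features
    are shared only where consecutive frames are consistent.
    """
    blended = []
    for p, c in zip(prev, curr):
        # Similarity-based weight in (0, 0.5]; a stand-in for a learned comparison.
        w = 0.5 * math.exp(-sharpness * abs(p - c))
        blended.append(w * p + (1.0 - w) * c)
    return blended


# Identical features are left unchanged; strongly differing ones keep the current value.
same = blend_features([1.0], [1.0])
different = blend_features([100.0], [0.0])
```

The weight is computed from the inputs at run time, which mirrors the "at test time" adaptivity mentioned in the abstract without claiming to reproduce the actual layer.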
An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time Video Enhancement
Video enhancement is a challenging problem, more so than still-image enhancement,
mainly due to high computational cost, larger data volumes and the difficulty of
achieving consistency in the spatio-temporal domain. In practice, these
challenges are often coupled with the lack of example pairs, which inhibits the
application of supervised learning strategies. To address these challenges, we
propose an efficient adversarial video enhancement framework that learns
directly from unpaired video examples. In particular, our framework introduces
new recurrent cells that consist of interleaved local and global modules for
implicit integration of spatial and temporal information. The proposed design
allows our recurrent cells to efficiently propagate spatio-temporal information
across frames and reduces the need for high complexity networks. Our setting
enables learning from unpaired videos in a cyclic adversarial manner, where the
proposed recurrent units are employed in all architectures. Efficient training
is accomplished by introducing one single discriminator that learns the joint
distribution of source and target domain simultaneously. The enhancement
results demonstrate clear superiority of the proposed video enhancer over
state-of-the-art methods in terms of visual quality, quantitative metrics,
and inference speed. Notably, our video enhancer is capable of enhancing over
35 frames per second of FullHD video (1080x1920).