On using gait to enhance frontal face extraction
Visual surveillance is increasingly deployed for monitoring urban environments. Operators need to be able to determine identity from surveillance images and often use face recognition for this purpose. In surveillance environments, it is necessary to handle pose variation of the human head, low frame rates, and low-resolution input images. We describe the first use of gait to enable face acquisition and recognition, by analysis of 3-D head motion and gait trajectory, with super-resolution analysis. We use region- and distance-based refinement of head pose estimation. We develop a direct mapping to relate the 2-D image to a 3-D model. In gait trajectory analysis, we model the looming effect so as to obtain the correct face region. Based on head position and the gait trajectory, we can reconstruct high-quality frontal face images which are demonstrated to be suitable for face recognition. The contributions of this research include the construction of a 3-D model for pose estimation from planar imagery and the first use of gait information to enhance the face extraction process allowing for deployment in surveillance scenario
Image enhancement from a stabilised video sequence
The aim of video stabilisation is to create a new video sequence where the motions (i.e. rotations, translations) and scale differences between frames (or parts of a frame) have effectively been removed. These stabilisation effects can be obtained via digital video processing techniques which use the information extracted from the video sequence itself, with no need for additional hardware or knowledge about camera physical motion.
A video sequence usually contains a large overlap between successive frames, and regions of the same scene are sampled at different positions. In this paper, these multiple samples are combined to produce images with a higher spatial resolution. Higher-resolution imagery plays an important role in assisting in the identification of people, vehicles, structures or objects of interest captured by surveillance cameras or by video cameras used in face recognition, traffic monitoring, traffic law enforcement, driver assistance and automatic vehicle guidance systems
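The idea of combining multiple samplings of the same scene can be illustrated with a minimal shift-and-add sketch in one dimension. This is not the paper's method: it assumes the frames are already registered, that each frame's offset is a known whole number of high-resolution samples, and the names `shift_and_add`, `frames`, `offsets`, and `factor` are hypothetical.

```python
def shift_and_add(frames, offsets, factor):
    """Fuse low-resolution 1-D frames onto a grid `factor` times finer.

    frames  : list of equal-length lists of samples
    offsets : per-frame shift, in high-resolution samples (0 <= offset < factor)
    factor  : upsampling factor between the low- and high-resolution grids
    """
    n_high = len(frames[0]) * factor
    total = [0.0] * n_high
    count = [0] * n_high
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            pos = i * factor + offset  # where this sample lands on the fine grid
            total[pos] += value
            count[pos] += 1
    # Average overlapping samples; None marks positions that no frame observed.
    return [t / c if c else None for t, c in zip(total, count)]


# Toy scene sampled by two frames that are offset by one fine-grid sample.
scene = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
frame_a = scene[0::2]
frame_b = scene[1::2]
fused = shift_and_add([frame_a, frame_b], [0, 1], 2)
```

In this idealised case the two frames interleave exactly, so `fused` reproduces `scene`. Real video would additionally need sub-pixel registration, interpolation onto the fine grid, and deblurring, none of which the sketch attempts.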
Maximum likelihood estimation of cloud height from multi-angle satellite imagery
We develop a new estimation technique for recovering depth-of-field from
multiple stereo images. Depth-of-field is estimated by determining the shift in
image location resulting from different camera viewpoints. When this shift is
not divisible by pixel width, the multiple stereo images can be combined to
form a super-resolution image. By modeling this super-resolution image as a
realization of a random field, one can view the recovery of depth as a
likelihood estimation problem. We apply these modeling techniques to the
recovery of cloud height from multiple viewing angles provided by the MISR
instrument on the Terra Satellite. Our efforts are focused on a two layer cloud
ensemble where both layers are relatively planar, the bottom layer is optically
thick and textured, and the top layer is optically thin. Our results
demonstrate that with relative ease, we get comparable estimates to the M2
stereo matcher which is the same algorithm used in the current MISR standard
product (details can be found in [IEEE Transactions on Geoscience and Remote
Sensing 40 (2002) 1547--1559]). Moreover, our techniques provide the
possibility of modeling all of the MISR data in a unified way for cloud height
estimation. Research is underway to extend this framework for fast, quality
global estimates of cloud height.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS243 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
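The link between image shift and height described in the cloud-height abstract can be sketched in a much simpler setting than the paper's random-field model. The sketch below swaps in a plain least-squares search over whole-pixel shifts, which coincides with maximum likelihood only under the assumption of i.i.d. Gaussian noise, and converts the shift to a height with the flat-geometry parallax relation h = d / (tan a - tan b). Cloud motion between views is ignored, and the function names, pixel size, and view angles are all illustrative assumptions.

```python
import math


def best_shift(reference, shifted, max_shift):
    """Return the integer shift s that minimises the mean squared difference
    between reference[i] and shifted[i + s] over their overlap."""
    best_s, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [(reference[i], shifted[i + s])
                 for i in range(len(reference))
                 if 0 <= i + s < len(shifted)]
        if not pairs:
            continue
        cost = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s


def height_from_shift(shift_pixels, pixel_size_m, angle_a_deg, angle_b_deg):
    """Convert a parallax shift into a height (metres) for two view zenith angles."""
    parallax_per_metre = (math.tan(math.radians(angle_a_deg))
                          - math.tan(math.radians(angle_b_deg)))
    return shift_pixels * pixel_size_m / parallax_per_metre


# Toy 1-D texture seen from two angles; the second view is displaced by 2 pixels.
view_a = [0, 0, 1, 3, 7, 3, 1, 0, 0, 0, 0, 0]
view_b = [0, 0] + view_a[:-2]
shift = best_shift(view_a, view_b, 4)
# Illustrative values only: 275 m pixels, views at 26.1 and 0 degrees.
height_m = height_from_shift(shift, 275.0, 26.1, 0.0)
```

The search is restricted to whole pixels for brevity; the abstract's point is that fractional shifts are what allow the views to be combined into a super-resolution image, which this sketch does not model.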
Temporal shape super-resolution by intra-frame motion encoding using high-fps structured light
One solution for depth imaging of a moving scene is to project a static
pattern on the object and use just a single image for reconstruction. However,
if the motion of the object is too fast with respect to the exposure time of
the image sensor, patterns on the captured image are blurred and reconstruction
fails. In this paper, we impose multiple projection patterns into each single
captured image to realize temporal super resolution of the depth image
sequences. With our method, multiple patterns are projected onto the object
with higher fps than possible with a camera. In this case, the observed pattern
varies depending on the depth and motion of the object, so we can extract
temporal information of the scene from each single image. The decoding process
is realized using a learning-based approach where no geometric calibration is
needed. Experiments confirm the effectiveness of our method where sequential
shapes are reconstructed from a single image. Both quantitative evaluations and
comparisons with recent techniques were also conducted.
Comment: 9 pages. Published at the International Conference on Computer Vision
(ICCV 2017