3,393 research outputs found
Perceptual modelling for 2D and 3D
Livrable D1.1 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D1.1 du projet
Methods for reducing visual discomfort in stereoscopic 3D: A review
This work was supported by the EPSRC Grant EP/M01469X/1, “Geometric Evaluation of Stereoscopic Video”
Perceptual modelling for 2D and 3D
Livrable D1.1 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D1.1 du projet
Recommended from our members
Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.Virtual camera realisation and the proposition of trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple camera and their arrangement constitute a critical component which affect the integrity of visual content acquisition for multi-view video. Currently, linear, convergence, and divergence arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known
camera structures, hence adequately reducing some of the other implementation issues. This thesis explores to use image-based rendering with and without geometry in the implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of depth map (geometry) and use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth map was investigated using region match measures widely known for solving image point correspondence problem. The constructed depth maps have been compare with the ones generated
using the dynamic programming approach. In both the geometry and no geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, construction of 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and computation
of virtual scene from a stereo pair of panoramic images. The quality of these rendered images were assessed through the use of either objective or subjective analysis in Imatest software. Further more, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with
a single centre of projection. This was done using sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in Meshlab software environment, hence providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to resolve strongly the hidden part of any scene objects.
To adequately meet the visibility condition for scene objects and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explore trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential
of several physical cameras of the same model being sparsely arranged on the edge of an efficient trapezoid graph. This is implemented both Matlab and Maya. The quality of the depth maps rendered in Matlab are better in Quality
Space-variant picture coding
PhDSpace-variant picture coding techniques exploit the strong spatial non-uniformity of
the human visual system in order to increase coding efficiency in terms of perceived quality
per bit. This thesis extends space-variant coding research in two directions. The first of
these directions is in foveated coding. Past foveated coding research has been dominated
by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer
and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing
an additive multi-viewer sensitivity function based on an established eye resolution
model, and, from this, a blur map that is optimal in the sense of discarding frequencies in
least-noticeable- rst order. Furthermore, for the application of a blur map, a novel algorithm
is presented for the efficient computation of high-accuracy smoothly space-variant
Gaussian blurring, using a specialised filter bank which approximates perfect space-variant
Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to
the brute force approach of employing a separate low-pass filter at each image location.
The second direction is that of artifi cially increasing the depth-of- field of an image, an
idea borrowed from photography with the advantage of allowing an image to be reduced
in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth of field algorithms are presented herein, with the desirable properties of aiming to mimic
occlusion eff ects as occur in natural blurring, and of handling any number of blurring
and occlusion levels with the same level of computational complexity. The merits of this
coding approach have been investigated by subjective experiments to compare it with
single-viewer foveated image coding. The results found the depth-based preblurring to
generally be significantly preferable to the same level of foveation blurring
Disparity compensation using geometric transforms
This dissertation describes the research and development of some techniques to enhance
the disparity compensation in 3D video compression algorithms. Disparity compensation
is usually performed using a block matching technique between views, disregarding the
various levels of disparity present for objects at different depths in the scene. An alternative
coding scheme is proposed, taking advantage of the cameras setup information and
the object’s depth in the scene, to compensate more complex spatial distortions, being
able to improve disparity compensation even with convergent cameras.
In order to perform a more accurate disparity compensation, the reference picture
list is enriched with additional geometrically transformed images, for the most relevant
object’s levels of depth in the scene, resulting from projections of one view to another.
This scheme can be implemented in any state-of-the-art video codec, as H.264/AVC or
HEVC, in order to improve the disparity matching accuracy between views.
Experimental results, using MV-HEVC extension, show the efficiency of the proposed
method for coding stereo video, presenting bitrate savings up to 2.87%, for convergent
camera sequences, and 1.52% for parallel camera sequences. Also a method to choose
the geometrically transformed inter view reference pictures was developed, in order to
reduce unnecessary overhead for unused reference pictures. By selecting and adding to
the reference picture list, only the most useful pictures, all results improved, presenting
bitrate savings up to 3.06% for convergent camera sequences, and 2% for parallel camera
sequences
Change blindness: eradication of gestalt strategies
Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task
Human detection in surveillance videos and its applications - a review
Detecting human beings accurately in a visual surveillance system is crucial for diverse application areas including abnormal event detection, human gait characterization, congestion analysis, person identification, gender classification and fall detection for elderly people. The first step of the detection process is to detect an object which is in motion. Object detection could be performed using background subtraction, optical flow and spatio-temporal filtering techniques. Once detected, a moving object could be classified as a human being using shape-based, texture-based or motion-based features. A comprehensive review with comparisons on available techniques for detecting human beings in surveillance videos is presented in this paper. The characteristics of few benchmark datasets as well as the future research directions on human detection have also been discussed
Recent Developments in Video Surveillance
With surveillance cameras installed everywhere and continuously streaming thousands of hours of video, how can that huge amount of data be analyzed or even be useful? Is it possible to search those countless hours of videos for subjects or events of interest? Shouldn’t the presence of a car stopped at a railroad crossing trigger an alarm system to prevent a potential accident? In the chapters selected for this book, experts in video surveillance provide answers to these questions and other interesting problems, skillfully blending research experience with practical real life applications. Academic researchers will find a reliable compilation of relevant literature in addition to pointers to current advances in the field. Industry practitioners will find useful hints about state-of-the-art applications. The book also provides directions for open problems where further advances can be pursued
Recommended from our members
Image based human body rendering via regression & MRF energy minimization
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.A machine learning method for synthesising human images is explored to create new images without relying on 3D modelling. Machine learning allows the creation of new images through prediction from existing data based on the use of training images. In the present study, image synthesis is performed at two levels: contour and pixel. A class of learning-based methods is formulated to create object contours from the training image for the synthetic image that allow pixel synthesis within the contours in the second level. The methods rely on applying robust object descriptions, dynamic learning models after appropriate motion segmentation, and machine learning-based frameworks.
Image-based human image synthesis using machine learning is a research focus that has recently gained considerable attention in the field of computer graphics. It makes use of techniques from image/motion analysis in computer vision. The problem lies in the estimation of methods for image-based object configuration (i.e. segmentation, contour outline). Using the results of these analysis methods as bases, the research adopts the machine learning approach, in which human images are synthesised by executing the synthesis of contour and pixels through the learning from training image.
Firstly, thesis shows how an accurate silhouette is distilled using developed background subtraction for accuracy and efficiency. The traditional vector machine approach is used to avoid ambiguities within the regression process. Images can be represented as a class of accurate and efficient vectors for single images as well as sequences. Secondly, the framework is explored using a unique view of machine learning methods, i.e., support vector regression (SVR), to obtain the convergence result of vectors for contour allocation. The changing relationship between the synthetic image and the training image is expressed as a vector and represented in functions. Finally, a pixel synthesis is performed based on belief propagation.
This thesis proposes a novel image-based rendering method for colour image synthesis using SVR and belief propagation for generalisation to enable the prediction of contour and colour information from input colour images. The methods rely on using appropriately defined and robust input colour images, optimising the input contour images within a sparse SVR framework. Firstly, the thesis shows how contour can effectively and efficiently be predicted from small numbers of input contour images. In addition, the thesis exploits the sparse properties of SVR efficiency, and makes use of SVR to estimate regression function. The image-based rendering method employed in this study enables contour synthesis for the prediction of small numbers of input source images. This procedure avoids the use of complex models and geometry information. Secondly, the method used for human body contour colouring is extended to define eight differently connected pixels, and construct a link distance field via the belief propagation method. The link distance, which acts as the message in propagation, is transformed by improving the low-envelope method in fast distance transform. Finally, the methodology is tested by considering human facial and human body clothing information. The accuracy of the test results for the human body model confirms the efficiency of the proposed method
- …