
    CYCLOP: A stereo color image quality assessment metric

    In this work, a reduced-reference (RR) perceptual quality metric for color stereoscopic images is presented. Given a reference stereo pair of images and its distorted version, we first compute the disparity maps of both the reference and the distorted stereoscopic images. To this end, we define a method for color image disparity estimation based on the properties of the structure tensor and an analysis of its eigenvalues and eigenvectors. We then compute the cyclopean images of both the reference and the distorted pairs and apply a multispectral wavelet decomposition to the two cyclopean color images in order to describe the different channels of the human visual system (HVS). Next, contrast sensitivity function (CSF) filtering is performed to obtain the same visual sensitivity information in the original and the distorted cyclopean images. Based on the properties of the HVS, rational sensitivity thresholding is then performed to obtain the sensitivity coefficients of the cyclopean images. Finally, RR stereo color image quality assessment (SCIQA) is performed by comparing the sensitivity coefficients of the cyclopean images and studying the coherence between the disparity maps of the reference and the distorted pairs. Experiments performed on color stereoscopic images indicate that the objective scores obtained by the proposed metric agree well with subjective assessment scores.
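    The disparity stage above rests on the classical structure tensor. As a rough illustration of that building block (simplified to grayscale, whereas the paper works with color tensors), the sketch below computes the per-pixel tensor and its eigenvalues and dominant orientation; the function name, smoothing scale, and gradient operator are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_eigen(gray, sigma=1.5):
    """Per-pixel eigenvalues/orientation of the 2x2 structure tensor (illustrative)."""
    Ix = sobel(gray, axis=1)   # horizontal gradient
    Iy = sobel(gray, axis=0)   # vertical gradient
    # Smooth the outer products of the gradient to form the tensor entries.
    Jxx = gaussian_filter(Ix * Ix, sigma)
    Jxy = gaussian_filter(Ix * Iy, sigma)
    Jyy = gaussian_filter(Iy * Iy, sigma)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[Jxx, Jxy], [Jxy, Jyy]].
    trace = Jxx + Jyy
    diff = Jxx - Jyy
    root = np.sqrt(diff**2 + 4.0 * Jxy**2)
    lam1 = 0.5 * (trace + root)   # larger eigenvalue: strength of dominant structure
    lam2 = 0.5 * (trace - root)   # smaller eigenvalue
    # Orientation of the dominant eigenvector.
    theta = 0.5 * np.arctan2(2.0 * Jxy, diff)
    return lam1, lam2, theta
```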

    Algorithms for VLSI stereo vision circuits applied to autonomous robots

    Since the inception of robotics, visual information has been incorporated to allow robots to perform tasks that require interaction with their environment, particularly when that environment changes. Depth perception is among the most useful information for a mobile robot navigating its environment and interacting with its surroundings. Among the different methods capable of measuring the distance to objects in a scene, stereo vision is the most advantageous for a small mobile robot with limited energy and computational power. Stereoscopy implies low power consumption because it uses passive sensors and does not require the robot to move. Furthermore, it is more robust, because it does not require a complex optical system with moving elements. On the other hand, stereo vision is computationally intensive: objects in the scene have to be detected and matched across images. Biological sensory systems are based on simple computational elements that process information in parallel and communicate among themselves. Analog VLSI chips are an ideal substrate to mimic the massive parallelism and collective computation present in biological nervous systems. For mobile robotics they have the added advantages of low power consumption and high computational power, thus freeing the CPU for other tasks. This dissertation discusses two stereoscopic methods that are based on simple, parallel calculations requiring communication only among neighboring processing units (local communication). Algorithms with these properties are easy to implement in analog VLSI and are also very convenient for digital systems. The first algorithm is phase-based: disparity, i.e., the spatial shift between left and right images, is recovered as a phase shift in the spatial-frequency domain. Gábor functions are used to recover the frequency spectrum of the image because of their optimal joint spatial and spatial-frequency properties. The Gábor-based algorithm is discussed and tested on a Khepera miniature mobile robot, and two further approximations are introduced to ease the analog VLSI and digital implementations. The second stereoscopic algorithm is difference-based: disparity is recovered by a simple calculation using the image differences and their spatial derivatives. The algorithm is simulated on a digital system, and an analog VLSI implementation is proposed and discussed. The thesis concludes with a description of some tools used in this research project. A stereo vision system has been developed for the Webots mobile robotics simulator to simplify the testing of different stereo algorithms. Similarly, two stereo vision turrets have been built for the Khepera robot.
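    To make the difference-based idea concrete: for a small disparity d, a first-order expansion gives I_R(x) ≈ I_L(x) − d·∂I/∂x, so d can be recovered as the image difference divided by the spatial derivative. The sketch below is a minimal, hedged rendering of that relation; the function name, the eps guard, and the sign convention are illustrative assumptions, not the thesis' implementation.

```python
import numpy as np

def difference_based_disparity(left, right, eps=1e-3):
    """First-order disparity estimate: d ~ (I_L - I_R) / (dI/dx)."""
    # Horizontal derivative, averaged over both images for symmetry.
    Ix = 0.5 * (np.gradient(left, axis=1) + np.gradient(right, axis=1))
    diff = left - right            # I_L(x) - I_R(x) ~ d * dI/dx
    # Flat regions (tiny gradient) carry no disparity information; return 0 there.
    safe_Ix = np.where(np.abs(Ix) > eps, Ix, 1.0)
    return np.where(np.abs(Ix) > eps, diff / safe_Ix, 0.0)
```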

    3-D Scene Reconstruction from Multiple Photometric Images

    This thesis deals with the problem of three-dimensional scene reconstruction from multiple camera images. This is a well-established problem in computer vision and has been significantly researched. In recent years some excellent results have been achieved; however, existing algorithms often fall short of many biological systems in terms of robustness and generality. The aim of this research was to develop improved algorithms for reconstructing 3D scenes, with a focus on accurate system modelling and correctly dealing with occlusions. In scene reconstruction the objective is to infer, from the data given by camera images, scene parameters describing the 3D structure of the scene. This is an ill-posed inverse problem, where an exact solution cannot be guaranteed. A statistical approach to the scene reconstruction problem is introduced, and the differences between the maximum a posteriori (MAP) and minimum mean-square error (MMSE) estimates are considered. It is discussed how traditional stereo matching can be performed using a volumetric scene model. An improved model describing the relationship between the camera data and a discrete model of the scene is presented, highlighting some of the common causes of modelling errors and enabling them to be dealt with objectively. The problems posed by occlusions are considered: using a greedy algorithm, the scene is progressively reconstructed to account for visibility interactions between regions, and the idea of a complete scene estimate is established. Some simple, improved techniques for reliably assigning opaque voxels are developed, making use of prior information. Problems with variations in the imaging convolution kernel between images motivate the development of a pixel dissimilarity measure. Belief propagation is then applied to better utilise prior information and obtain an improved global optimum. A new volumetric factor-graph model is presented which represents the joint probability distribution of the scene and imaging system. By utilising the structure of the local compatibility functions, an efficient procedure for updating the messages is detailed. To help convergence, a novel approach of accentuating beliefs is shown. Results demonstrate the validity of this approach; however, the reconstruction error is similar to or slightly higher than that of the greedy algorithm. To simplify the volumetric model, a new approach to belief propagation is demonstrated by applying it to a dynamic model. This approach is developed as an alternative to the full volumetric model because it is less memory- and computationally intensive. Using a factor graph, a volumetric known-visibility model is presented which ensures the scene is complete with respect to all the camera images. Dynamic updating is also applied to a simpler single depth-map model. Results show this approach is unsuitable for the volumetric known-visibility model; however, improved results are obtained with the simple depth-map model.
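    As a rough illustration of the message passing involved, the sketch below runs min-sum belief propagation over a 1-D chain of depth labels with a linear smoothness cost. On a chain the forward and backward sweeps are exact; on the thesis' loopy volumetric factor graph the corresponding updates would be iterated. All names and costs here are illustrative assumptions, not the thesis' model.

```python
import numpy as np

def bp_depth_chain(unary, smooth_weight=1.0):
    """unary: (n_pixels, n_labels) data costs; returns a depth label per pixel."""
    n, L = unary.shape
    labels = np.arange(L)
    # Linear (absolute-difference) smoothness cost between neighboring labels.
    pairwise = smooth_weight * np.abs(labels[:, None] - labels[None, :])
    msg_fwd = np.zeros((n, L))   # message arriving at pixel i from pixel i-1
    msg_bwd = np.zeros((n, L))   # message arriving at pixel i from pixel i+1
    for i in range(1, n):              # forward sweep
        m = (unary[i - 1] + msg_fwd[i - 1])[:, None] + pairwise
        msg_fwd[i] = m.min(axis=0)     # min-sum update over the sender's labels
    for i in range(n - 2, -1, -1):     # backward sweep
        m = (unary[i + 1] + msg_bwd[i + 1])[:, None] + pairwise
        msg_bwd[i] = m.min(axis=0)
    beliefs = unary + msg_fwd + msg_bwd
    return beliefs.argmin(axis=1)      # lowest-cost label per pixel
```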

    GPU data structures for graphics and vision

    Graphics hardware has in recent years become increasingly programmable, and its programming APIs use the stream-processor model to expose massive parallelism to the programmer. Unfortunately, the inherent restrictions of the stream-processor model, which the GPU relies on to maintain high performance, often pose a problem when porting CPU algorithms for video and volume processing to graphics hardware: serial data dependencies that accelerate CPU processing are counterproductive for the data-parallel GPU. This thesis demonstrates new ways of tackling well-known problems of large-scale video/volume analysis. In some instances, we enable processing on the restricted hardware model by re-introducing algorithms from early computer-graphics research. On other occasions, we use newly discovered hierarchical data structures to circumvent the random-access-read/fixed-write restriction that had previously kept sophisticated analysis algorithms from running solely on graphics hardware. For 3D processing, we apply known game-graphics concepts such as mip-maps, projective texturing, and dependent texture lookups to show how video/volume processing can benefit algorithmically from being implemented in a graphics API. The novel GPU data structures provide drastically increased processing speed and lift processing-heavy operations to real-time performance levels, paving the way for new and interactive vision/graphics applications.
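    To illustrate the mip-map idea mentioned above: repeatedly halving a buffer turns a global operation into log2(n) data-parallel passes, each of which maps onto one GPU render pass with only local reads. The numpy sketch below emulates such a hierarchical reduction on the CPU; it illustrates the general technique, and is not code from the thesis.

```python
import numpy as np

def mipmap_reduce_max(img):
    """Reduce a square 2^k x 2^k buffer to its global max via halving passes."""
    level = img.astype(np.float32)
    while level.shape[0] > 1:
        h, w = level.shape
        # Each output texel combines a 2x2 block of the previous level,
        # exactly like generating one mip-map level on the GPU.
        level = np.maximum.reduce([
            level[0:h:2, 0:w:2], level[1:h:2, 0:w:2],
            level[0:h:2, 1:w:2], level[1:h:2, 1:w:2],
        ])
    return level[0, 0]
```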