668 research outputs found
The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).
This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream -- from V1, V2, V4 and to IT -- is to discount image transformations, after learning them during development
Noncentral catadioptric systems with quadric mirrors : geometry and calibration
PhD thesis in Electrical Engineering (Informatics) presented to the Faculty of Sciences and Technology of the University of Coimbra. In this PhD thesis we study and analyze the geometry of noncentral catadioptric systems composed of a pinhole or orthographic camera and a mirror shaped as a non-ruled quadric, that is, an ellipsoid (which can be a sphere), a hyperboloid, or a paraboloid. The geometry of these vision systems is parameterized by analyzing the image formation and comprises the intrinsic parameters of the camera, the parameters of the mirror surface, and the poses of the camera in relation to the mirror and to the world reference frame. Image formation is studied in a purely geometrical way, focusing mainly on the projection model and on the calibration of the vision system.
The main contributions include the proof that, in a noncentral catadioptric system with a perspective camera and a non-degenerate quadric mirror, the reflection point on the mirror surface (which projects any given 3D world point to the image) lies on a quartic curve that is the intersection of two quadrics. The corresponding projection model is also derived and is expressed as an implicit nonlinear equation in a single unknown. Regarding calibration, we developed a method that assumes knowledge of the intrinsic parameters of the perspective camera and of a set of 3D points expressed in a local reference frame (the 3D structure of the world). Information about the apparent contour of the mirror is also used to improve the accuracy of the estimation. Another calibration method is proposed that assumes a previous calibration of the system in the sense of a general camera model (correspondences between image points and incident lines in space). Additionally, the camera-mirror and camera-world poses are estimated using algebraic metrics and linear equations (derived for a calibration method that is also presented). The camera is considered to be pre-calibrated. Experiments with extensive simulations and with real images are performed to test the robustness and accuracy of the methods presented. The main conclusions are that these vision systems are highly nonlinear and that their calibration is possible with good accuracy but difficult to achieve with very high accuracy, especially if the vision system is intended for accuracy-driven applications. Nevertheless, information about the structure of the world can be complemented with additional information, such as the apparent contour of the quadric, to improve the quality of the calibration results. In fact, the use of the apparent contour of the mirror can, by itself, dramatically improve the accuracy of the estimation
On the popularization of digital close-range photogrammetry: a handbook for new users.
National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Programme of Postgraduate Studies (D.P.M.S.) “Geoinformatics”
Geometry and Photometry in 3D Visual Recognition
The report addresses the problem of visual recognition under two sources of variability: geometric and photometric. The geometric deals with the relation between 3D objects and their views under orthographic and perspective projection. The photometric deals with the relation between 3D matte objects and their images under changing illumination conditions. Taken together, an alignment-based method is presented for recognizing objects viewed from arbitrary viewing positions and illuminated by arbitrary settings of light sources
Proceedings of the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory
This book is a collection of 15 reviewed technical reports summarizing the presentations at the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory. The covered topics include image processing, optical signal processing, visual inspection, pattern recognition and classification, human-machine interaction, world and situation modeling, autonomous system localization and mapping, information fusion, and trust propagation in sensor networks
Vision-based Navigation and Mapping Using Non-central Catadioptric Omnidirectional Camera
Omnidirectional catadioptric cameras find their use in navigation and mapping, owing to their wide field of view. Having a wider field of view, or rather a potential 360 degree field of view, allows the user to see and move more freely in the navigation space.
A catadioptric camera system is a low-cost system that consists of a mirror and a camera. A calibration method was developed to obtain the relative position and orientation between the two components so that they can be treated as one monolithic system. The position of the system within an environment was determined using the conditions obtained from the reflective properties of the mirror. Object control points were set up and experiments were performed at different sites to test the mathematical models and the achieved location and mapping accuracy of the system. The obtained positions were then used to map the environment
View recommendation for multi-camera demonstration-based training
While humans can effortlessly pick a view from multiple streams, automatically choosing the best view is a challenge. Choosing the best view from multi-camera streams raises the question of which objective metrics should be considered. Existing works on view selection lack consensus about which metrics should be used to select the best view. The literature on view selection describes diverse possible metrics, and strategies such as information-theoretic, instructional-design, or aesthetics-motivated ones fail to incorporate all approaches. In this work, we postulate a strategy incorporating information-theoretic and instructional design-based objective metrics to select the best view from a set of views. Traditionally, information-theoretic measures have been used to find the goodness of a view, such as in 3D rendering. We adapted a similar measure known as the viewpoint entropy for real-world 2D images. Additionally, we incorporated similarity penalization to get a more accurate measure of the entropy of a view, which is one of the metrics for the best view selection. Since the choice of the best view is domain-dependent, we chose demonstration-based training scenarios as our use case. The limitation of our chosen scenarios is that they do not include collaborative training and solely feature a single trainer. To incorporate instructional design considerations, we included the visibility of the trainer’s body pose, face, face when instructing, and hands as metrics. To incorporate domain knowledge, we included the visibility of predetermined regions as another metric. All of those metrics are taken into account to produce a parameterized view recommendation approach for demonstration-based training. An online study using recorded multi-camera video streams from a simulation environment was used to validate those metrics.
Furthermore, the responses from the online study were used to optimize the view recommendation performance, reaching a normalized discounted cumulative gain (NDCG) value of 0.912, which shows good performance with respect to matching user choices
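As a reading aid for the NDCG figure quoted above, the following is a minimal sketch of how a normalized discounted cumulative gain is commonly computed. It assumes the standard log2 rank discount and linear gains; the abstract does not say which NDCG variant the authors used, and the relevance scores in the example are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    # Rank 1 gets discount log2(2) = 1, rank 2 gets log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical example: user-assigned relevance of four camera views,
# listed in the order in which a recommender ranked them.
ranked_relevance = [3, 1, 2, 0]
print(round(ndcg(ranked_relevance), 3))
```

A value of 1.0 means the recommended ordering matches the ideal ordering exactly; lower values indicate that highly relevant views were ranked too low.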
Detailed and Practical 3D Reconstruction with Advanced Photometric Stereo Modelling
Object 3D reconstruction has always been one of the main objectives of computer vision. After many decades of research, most techniques are still unsuccessful at recovering high resolution surfaces, especially for objects with limited surface texture. Moreover, most shiny materials are particularly hard to reconstruct.
Photometric Stereo (PS), which operates by capturing multiple images under changing illumination has traditionally been one of the most successful techniques at recovering a large amount of surface details, by exploiting the relationship between shading and local shape. However, using PS has been highly impractical because most approaches are only applicable in a very controlled lab setting and limited to objects experiencing diffuse reflection.
Nevertheless, recent advances in differential modelling have made complicated Photometric Stereo models possible and variational optimisations for these kinds of models show remarkable resilience to real world imperfections such as non-Gaussian noise and other outliers. Thus, a highly accurate, photometric-based reconstruction system is now possible.
The contribution of this thesis is threefold. First of all, the Photometric Stereo model is extended in order to be able to deal with arbitrary ambient lighting. This is a step towards acquisition in a non-fully controlled lab setting. Secondly, the need for a priori knowledge of the light source brightness and attenuation characteristics is relaxed as an alternating optimisation procedure is proposed which is able to estimate these parameters. This extension allows for quick acquisition with inexpensive LEDs that exhibit unpredictable illumination characteristics (flickering etc). Finally, a volumetric parameterisation is proposed which allows one to tackle the multi-view Photometric Stereo problem in a similar manner, in a simple unified differential model. This final extension allows for complete object reconstruction merging information from multiple images taken from multiple viewpoints and variable illumination.
The theoretical work in this thesis is experimentally evaluated in a number of challenging real world experiments, with data captured by custom-made hardware. In addition, the generality of the proposed models is demonstrated by presenting a differential model for the shape from polarisation problem, which leads to a unified optimisation problem, fusing information from both methods. This allows for the acquisition of geometrical information about objects such as semi-transparent glass, hitherto hard to deal with
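To illustrate the relationship between shading and local shape that the abstract above refers to, here is a minimal sketch of classical calibrated Lambertian photometric stereo at a single pixel. It is not the differential or variational model of the thesis: it assumes a purely diffuse surface, three known distant light directions, no ambient light, and no shadows, and all function names and numbers are hypothetical.

```python
import math


def solve3(A, b):
    """Solve the 3x3 linear system A x = b with Cramer's rule."""
    def det(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
                - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
                + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

    d = det(A)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    x = []
    for col in range(3):
        M = [row[:] for row in A]      # copy A, then swap in b as one column
        for r in range(3):
            M[r][col] = b[r]
        x.append(det(M) / d)
    return x


def lambertian_ps(lights, intensities):
    """Recover albedo and unit normal from I_k = albedo * dot(l_k, n)."""
    g = solve3(lights, intensities)    # g = albedo * n
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights; normal undefined")
    return albedo, [c / albedo for c in g]


# Hypothetical pixel: true normal (0, 0, 1), albedo 0.5, three unit lights.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
intensities = [0.5, 0.4, 0.4]
albedo, normal = lambertian_ps(lights, intensities)
print(albedo, normal)
```

With more than three lights the same system is usually solved in the least-squares sense, which is where the sensitivity to outliers and non-diffuse reflection mentioned in the abstract comes from.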
Ordinal depth from SFM and its application in robust scene recognition
Ph.D., Doctor of Philosophy
Nonlinear probabilistic estimation of 3-D geometry from images
Thesis (Ph.D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1997. Includes bibliographical references (p. 159-164). By Ali Jerome Azarbayejani. Ph.D.