
    Colour Helmholtz Stereopsis for Reconstruction of Complex Dynamic Scenes

    Helmholtz Stereopsis (HS) is a powerful technique for reconstruction of scenes with arbitrary reflectance properties. However, previous formulations have been limited to static objects due to the requirement to sequentially capture reciprocal image pairs (i.e. two images with the camera and light source positions mutually interchanged). In this paper, we propose colour HS, a novel variant of the technique based on wavelength multiplexing. To address the new set of challenges introduced by multispectral data acquisition, the proposed novel pipeline for colour HS uniquely combines a tailored photometric calibration for multiple camera/light source pairs, a novel procedure for surface chromaticity calibration and state-of-the-art Bayesian HS suitable for reconstruction from a minimal number of reciprocal pairs. Experimental results, including quantitative and qualitative evaluation, demonstrate that the method is suitable for flexible (single-shot) reconstruction of static scenes and for reconstruction of dynamic scenes with complex surface reflectance properties.
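    A reciprocal pair exploits Helmholtz reciprocity: swapping the camera (at o1) and the point light source (at o2) leaves the BRDF term unchanged between the two images, so it cancels. For a surface point p with normal n and pixel intensities i1, i2 in the two images, this yields the standard HS constraint known from the general HS literature (a sketch of the classical formulation, not the colour variant proposed here):

\[
\left( i_1 \, \frac{\hat{v}_2}{\lVert o_2 - p \rVert^{2}} \;-\; i_2 \, \frac{\hat{v}_1}{\lVert o_1 - p \rVert^{2}} \right) \cdot n = 0,
\qquad
\hat{v}_\ell = \frac{o_\ell - p}{\lVert o_\ell - p \rVert},
\]

    The constraint is linear in n, so each reciprocal pair contributes one equation on the surface normal; wavelength multiplexing, as proposed in the paper, captures such pairs in separate colour channels of a single exposure rather than in sequential shots.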

    Computational Imaging for Shape Understanding

    Geometry is an essential property of real-world scenes. Understanding the shape of objects is critical to many computer vision applications. In this dissertation, we explore using computational imaging approaches to recover the geometry of real-world scenes. Computational imaging is an emerging technique that uses the co-design of imaging hardware and computational software to expand the capacity of traditional cameras. To tackle face recognition in uncontrolled environments, we study 2D color images and 3D shape to deal with body movement and self-occlusion. In particular, we use multiple RGB-D cameras to fuse varying poses and register the frontal face in a unified coordinate system. Deep color features and geodesic distance features are then used for face recognition. To handle underwater imaging applications, we study the angular-spatial encoding and polarization state encoding of light rays using computational imaging devices. Specifically, we use a light field camera to tackle the challenging problem of underwater 3D reconstruction. We leverage the angular sampling of the light field for robust depth estimation, and we develop a fast ray marching algorithm to improve the efficiency of the method. To deal with arbitrary reflectance, we investigate polarimetric imaging and develop polarimetric Helmholtz stereopsis, which uses reciprocal polarimetric image pairs for high-fidelity 3D surface reconstruction. We formulate new reciprocity and diffuse/specular polarimetric constraints to recover surface depths and normals using an optimization framework. To recover 3D shape under unknown, uncontrolled natural illumination, we use two circularly polarized spotlights to boost the polarization cues corrupted by the environment lighting, as well as to provide photometric cues. To mitigate the effect of uncontrolled environment light in photometric constraints, we estimate a lighting proxy map and iteratively refine the normal and lighting estimation. Through extensive experiments on simulated and real images, we demonstrate that our proposed computational imaging methods outperform traditional imaging approaches.
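    The per-pixel polarization cues mentioned above are commonly derived from the linear Stokes parameters, estimated from images taken behind a polarizer at four angles. The sketch below shows that standard formulation only; the function names are illustrative and this is not the dissertation's exact pipeline.

```python
import numpy as np

def stokes_from_polarizer_stack(i0, i45, i90, i135):
    """Estimate the linear Stokes parameters (s0, s1, s2) per pixel
    from four images taken with a linear polarizer at 0/45/90/135 deg."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # horizontal vs vertical
    s2 = i45 - i135                      # diagonal components
    return s0, s1, s2

def dolp_aolp(s0, s1, s2, eps=1e-12):
    """Degree and angle of linear polarization per pixel."""
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    aolp = 0.5 * np.arctan2(s2, s1)      # radians, in (-pi/2, pi/2]
    return dolp, aolp
```

    The DoLP map separates strongly polarized (typically specular) from weakly polarized (typically diffuse) regions, which is the kind of cue the diffuse/specular polarimetric constraints build on.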

    The Southampton-York Natural Scenes (SYNS) dataset: statistics of surface attitude

    Recovering 3D scenes from 2D images is an under-constrained task; optimal estimation depends upon knowledge of the underlying scene statistics. Here we introduce the Southampton-York Natural Scenes dataset (SYNS: https://syns.soton.ac.uk), which provides comprehensive scene statistics useful for understanding biological vision and for improving machine vision systems. In order to capture the diversity of environments that humans encounter, scenes were surveyed at random locations within 25 indoor and outdoor categories. Each survey includes (i) spherical LiDAR range data, (ii) high-dynamic-range spherical imagery and (iii) a panorama of stereo image pairs. We envisage many uses for the dataset and present one example: an analysis of surface attitude statistics, conditioned on scene category and viewing elevation. Surface normals were estimated using a novel adaptive scale selection algorithm. Across categories, surface attitude below the horizon is dominated by the ground plane (0° tilt). Near the horizon, probability density is elevated at 90°/270° tilt due to vertical surfaces (trees, walls). Above the horizon, probability density is elevated near 0° slant due to overhead structure such as ceilings and leaf canopies. These structural regularities represent potentially useful prior assumptions for human and machine observers, and may predict human biases in perceived surface attitude.
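    The slant/tilt parameterisation of surface attitude used above has a simple relation to a unit surface normal. The sketch below uses one common convention (slant measured from the viewing direction along z, tilt as the orientation of the normal's projection in the image plane); the exact axis conventions are an assumption, not taken from the paper.

```python
import numpy as np

def slant_tilt(normal):
    """Convert a 3D surface normal to (slant, tilt) in degrees.
    Slant: angle between the normal and the line of sight (+z axis).
    Tilt: orientation of the normal's image-plane projection,
    measured from the +x axis, in [0, 360)."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    slant = np.degrees(np.arccos(np.clip(n[2], -1.0, 1.0)))
    tilt = np.degrees(np.arctan2(n[1], n[0])) % 360.0
    return slant, tilt
```

    Under this convention, a fronto-parallel surface (normal along the line of sight) has 0° slant, and tilt is undefined there; vertical surfaces such as walls have 90° slant with tilt indicating which way they face.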

    Infinite Horizons: Le Corbusier, the Pavillon de l’Esprit Nouveau dioramas and the science of visual distance

    The Pavillon de l’Esprit Nouveau was a building central to the development of Le Corbusier’s architecture and key to the role played by painting in his work. Significantly, as a prototype living space and as a setting for Purist art, it not only established Le Corbusier’s vision for contemporary architecture and urbanism, it also served as a demonstration of principles developed in collaboration with Amédée Ozenfant through their joint editorship of L’Esprit Nouveau. In the pages of the journal are numerous references to the nature of visual sensation and to the science of vision, but to what extent do the paintings and other material displayed in the pavilion reflect these ideas? Concentrating primarily on the panoramic images of the city displayed in the pavilion’s dioramas and on the contrasting nature of Le Corbusier’s paintings at this time, this paper considers the influence of nineteenth-century science and visual culture on his work.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al., 2003, Vision Research 43, 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it lends further weight to the argument that objects may be stored in, and retrieved from, a pre-attentional store during this task.

    Spatial and temporal integration of binocular disparity in the primate brain

    The primate visual system strongly relies on the small differences between the two retinal projections to perceive depth. However, it is not fully understood how those binocular disparities are computed and integrated by the nervous system. On the one hand, single-unit recordings in the macaque give access to neuronal encoding of disparity at a very local level. On the other hand, functional neuroimaging (fMRI) studies in humans shed light on the cortical networks involved in disparity processing at a macroscopic level, but in a different species. In this thesis, we propose to use an fMRI approach in the macaque to bridge the gap between single-unit and fMRI recordings conducted in the non-human and human primate brain, respectively, allowing direct comparisons between the two species. More specifically, we focus on the temporal and spatial processing of binocular disparities at the cortical but also at the perceptual level. Investigating cortical activity in response to motion-in-depth, we show for the first time that 1) there is a dedicated network in the macaque that comprises areas beyond the MT cluster and its surroundings and that 2) there are homologies with the human network involved in processing very similar stimuli. In a second study, we tried to establish a link between perceptual biases that reflect statistical regularities in the three-dimensional visual environment and cortical activity, by investigating whether such biases exist and can be related to specific responses at a macroscopic level. We found stronger activity for the stimulus reflecting natural statistics in one subject, demonstrating a potential influence of spatial regularities on cortical activity. Further work is needed to draw firm conclusions about such a link. Nonetheless, we robustly confirmed the existence of a vast cortical network responding to correlated disparities in the macaque brain. Finally, we measured for the first time retinal corresponding points on the vertical meridian of a macaque subject performing a behavioural task (forced-choice procedure), and compared the results to data we also collected in several human observers with the very same protocol. In the discussion sections, we show how these findings open the door to new perspectives.

    Light field processor: a Lytro Illum imaging application

    Light field imaging technology is at the intersection of three main research areas: computer graphics, computational photography and computer vision. This technology has the potential to enable functionalities that were previously impracticable, if not impossible, such as refocusing photographic images after capture or moving around in a VR scene produced by a real-time game engine with six degrees of freedom (6DoF). Traditional photography produces one single output whenever a user presses the shutter button. Light field photography may have several different outputs because it collects much more data about a scene. It therefore requires post-processing in order to extract any piece of useful information, such as 2D images, and that characteristic makes this technology substantially different from all others in the field of image making. Post-processing means using a specialised application and, since this technology is still in its infancy, those applications are scarce. This context presented a good opportunity for such a development. Light Field Processor is the main outcome of this work. It is a computer application able to open and decode images from Lytro Illum light field cameras, which it can then store in a new file format (Decoded Light Field), proposed in this dissertation, for later use. It is able to extract 2D viewpoints, 2D maps of the viewpoints or of the microlens array, videos showing the intrinsic parallax of the light field, and metadata, as well as perform some basic image processing.
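    Extracting a 2D viewpoint from a lenslet image, as the application above does, amounts to picking the same pixel from under every microlens. The sketch below shows that idea for an idealised, axis-aligned square microlens grid; real Lytro Illum data additionally needs demosaicing, rotation correction and hexagonal-grid resampling, and the function name is illustrative rather than the application's API.

```python
import numpy as np

def subaperture_view(lenslet, nu, nv, u, v):
    """Extract one sub-aperture (viewpoint) image from an ideal lenslet
    image whose microlens images are nu x nv pixels: take pixel (v, u)
    under every microlens via strided slicing."""
    if not (0 <= u < nu and 0 <= v < nv):
        raise ValueError("angular index out of range")
    return lenslet[v::nv, u::nu]
```

    Sweeping (u, v) across the microlens extent produces the set of slightly shifted viewpoints whose parallax the application renders as video.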

    A right hemisphere advantage for processing blurred faces

    No description supplied