1,981 research outputs found

    A Novel Framework for Highlight Reflectance Transformation Imaging

    Get PDF
    We propose a novel pipeline and related software tools for processing the multi-light image collections (MLICs) acquired in different application contexts to obtain shape and appearance information of captured surfaces, as well as to derive compact relightable representations of them. Our pipeline extends the popular Highlight Reflectance Transformation Imaging (H-RTI) framework, which is widely used in the Cultural Heritage domain. We support, in particular, perspective camera modeling, per-pixel interpolated light direction estimation, as well as light normalization correcting vignetting and uneven non-directional illumination. Furthermore, we propose two novel easy-to-use software tools to simplify all processing steps. The tools, in addition to support easy processing and encoding of pixel data, implement a variety of visualizations, as well as multiple reflectance-model-fitting options. Experimental tests on synthetic and real-world MLICs demonstrate the usefulness of the novel algorithmic framework and the potential benefits of the proposed tools for end-user applications.Terms: "European Union (EU)" & "Horizon 2020" / Action: H2020-EU.3.6.3. - Reflective societies - cultural heritage and European identity / Acronym: Scan4Reco / Grant number: 665091DSURF project (PRIN 2015) funded by the Italian Ministry of University and ResearchSardinian Regional Authorities under projects VIGEC and Vis&VideoLa

    Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

    Get PDF
    The importance of depth perception in the interactions that humans have within their nearby space is a well established fact. Consequently, it is also well known that the possibility of exploiting good stereo information would ease and, in many cases, enable, a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents from relying on this kind of cue to visually guide robots' attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS) by A. Geiger et al. 2010 for computation of the disparity map is well suited to be used on a humanoid robotic platform as the iCub robot; second, we show how, provided with a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings can require much less effort. As a case of study we consider the common situation where the robot is asked to focus the attention on one object close in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. Indeed this example paves the way to a variety of other similar applications

    (SI10-124) Inverse Reconstruction Methodologies: A Review

    Get PDF
    The three-dimensional reconstruction problem is a longstanding ill-posed problem, which has made enormous progress in the field of computer vision. This field has attracted increasing interest and demonstrated an impressive performance. Due to a long era of increasing evolution, this paper presents an extensive review of the developments made in this field. For the three dimensional visualization, researchers have focused on the developments of three dimensional information and acquisition methodologies from two dimensional scenes or objects. These acquisition methodologies require a complex calibration procedure which is not practical in general. Hence, the requirement of flexibility was much needed in all these methods. Due to this emerging factors, many techniques were presented. The methodologies are organized on the basis of different aspects of the three dimensional reconstruction like active method, passive method, different geometrical shapes, etc. A brief analysis and comparison of the performance of these methodologies are also presented

    On the popularization of digital close-range photogrammetry: a handbook for new users.

    Get PDF
    Εθνικό Μετσόβιο Πολυτεχνείο--Μεταπτυχιακή Εργασία. Διεπιστημονικό-Διατμηματικό Πρόγραμμα Μεταπτυχιακών Σπουδών (Δ.Π.Μ.Σ.) “Γεωπληροφορική

    Surface analysis and visualization from multi-light image collections

    Get PDF
    Multi-Light Image Collections (MLICs) are stacks of photos of a scene acquired with a fixed viewpoint and a varying surface illumination that provides large amounts of visual and geometric information. Over the last decades, a wide variety of methods have been devised to extract information from MLICs and have shown its use in different application domains to support daily activities. In this thesis, we present methods that leverage a MLICs for surface analysis and visualization. First, we provide background information: acquisition setup, light calibration and application areas where MLICs have been successfully used for the research of daily analysis work. Following, we discuss the use of MLIC for surface visualization and analysis and available tools used to support the analysis. Here, we discuss methods that strive to support the direct exploration of the captured MLIC, methods that generate relightable models from MLIC, non-photorealistic visualization methods that rely on MLIC, methods that estimate normal map from MLIC and we point out visualization tools used to do MLIC analysis. In chapter 3 we propose novel benchmark datasets (RealRTI, SynthRTI and SynthPS) that can be used to evaluate algorithms that rely on MLIC and discusses available benchmark for validation of photometric algorithms that can be also used to validate other MLIC-based algorithms. In chapter 4, we evaluate the performance of different photometric stereo algorithms using SynthPS for cultural heritage applications. RealRTI and SynthRTI have been used to evaluate the performance of (Neural)RTI method. Then, in chapter 5, we present a neural network-based RTI method, aka NeuralRTI, a framework for pixel-based encoding and relighting of RTI data. In this method using a simple autoencoder architecture, we show that it is possible to obtain a highly compressed representation that better preserves the original information and provides increased quality of virtual images relighted from novel directions, particularly in the case of challenging glossy materials. Finally, in chapter 6, we present a method for the detection of crack on the surface of paintings from multi-light image acquisitions and that can be used as well on single images and conclude our presentation

    Towards binocular active vision in a robot head system

    Get PDF
    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly-cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yield a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a high-cluttered environment. Finally a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the Field of View resulting from any individual saccade

    Computational Imaging for Shape Understanding

    Get PDF
    Geometry is the essential property of real-world scenes. Understanding the shape of the object is critical to many computer vision applications. In this dissertation, we explore using computational imaging approaches to recover the geometry of real-world scenes. Computational imaging is an emerging technique that uses the co-designs of image hardware and computational software to expand the capacity of traditional cameras. To tackle face recognition in the uncontrolled environment, we study 2D color image and 3D shape to deal with body movement and self-occlusion. Especially, we use multiple RGB-D cameras to fuse the varying pose and register the front face in a unified coordinate system. The deep color feature and geodesic distance feature have been used to complete face recognition. To handle the underwater image application, we study the angular-spatial encoding and polarization state encoding of light rays using computational imaging devices. Specifically, we use the light field camera to tackle the challenging problem of underwater 3D reconstruction. We leverage the angular sampling of the light field for robust depth estimation. We also develop a fast ray marching algorithm to improve the efficiency of the algorithm. To deal with arbitrary reflectance, we investigate polarimetric imaging and develop polarimetric Helmholtz stereopsis that uses reciprocal polarimetric image pairs for high-fidelity 3D surface reconstruction. We formulate new reciprocity and diffuse/specular polarimetric constraints to recover surface depths and normals using an optimization framework. To recover the 3D shape in the unknown and uncontrolled natural illumination, we use two circularly polarized spotlights to boost the polarization cues corrupted by the environment lighting, as well as to provide photometric cues. To mitigate the effect of uncontrolled environment light in photometric constraints, we estimate a lighting proxy map and iteratively refine the normal and lighting estimation. Through expensive experiments on the simulated and real images, we demonstrate that our proposed computational imaging methods outperform traditional imaging approaches

    Semi-automatic 3D reconstruction of urban areas using epipolar geometry and template matching

    Get PDF
    WOS:000240143800002 (Nº de Acesso Web of Science)In this work we describe a novel technique for semi-automatic three-dimensional (3D) reconstruction of urban areas, from airborne stereo-pair images whose output is VRML or DXF. The main challenge is to compute the relevant information—building's height and volume, roof's description, and texture—algorithmically, because it is very time consuming and thus expensive to produce it manually for large urban areas. The algorithm requires some initial calibration input and is able to compute the above-mentioned building characteristics from the stereo pair and the availability of the 2D CAD and the digital elevation model of the same area, with no knowledge of the camera pose or its intrinsic parameters. To achieve this, we have used epipolar geometry, homography computation, automatic feature extraction and we have solved the feature correspondence problem in the stereo pair, by using template matching
    corecore