137 research outputs found

    Auto-Calibration and Three-Dimensional Reconstruction for Zooming Cameras

    Get PDF
    This dissertation proposes new algorithms to recover the calibration parameters and 3D structure of a scene, using 2D images taken by uncalibrated stationary zooming cameras. This is a common configuration, usually encountered in surveillance camera networks, stereo camera systems, and event monitoring vision systems. This problem is known as camera auto-calibration (also called self-calibration) and the motivation behind this work is to obtain the Euclidean three-dimensional reconstruction and metric measurements of the scene, using only the captured images. Under this configuration, the problem of auto-calibrating zooming cameras differs from the classical auto-calibration problem of a moving camera in two major aspects. First, the camera intrinsic parameters are changing due to zooming. Second, because cameras are stationary in our case, using classical motion constraints, such as a pure translation for example, is not possible. In order to simplify the non-linear complexity of this problem, i.e., auto-calibration of zooming cameras, we have followed a geometric stratification approach. In particular, we have taken advantage of the movement of the camera center, that results from the zooming process, to locate the plane at infinity and, consequently to obtain an affine reconstruction. Then, using the assumption that typical cameras have rectangular or square pixels, the calculation of the camera intrinsic parameters have become possible, leading to the recovery of the Euclidean 3D structure. Being linear, the proposed algorithms were easily extended to the case of an arbitrary number of images and cameras. Furthermore, we have devised a sufficient constraint for detecting scene parallel planes, a useful information for solving other computer vision problems

    Viewpoint-Free Photography for Virtual Reality

    Get PDF
    Viewpoint-free photography, i.e., interactively controlling the viewpoint of a photograph after capture, is a standing challenge. In this thesis, we investigate algorithms to enable viewpoint-free photography for virtual reality (VR) from casual capture, i.e., from footage easily captured with consumer cameras. We build on an extensive body of work in image-based rendering (IBR). Given images of an object or scene, IBR methods aim to predict the appearance of an image taken from a novel perspective. Most IBR methods focus on full or near-interpolation, where the output viewpoints either lie directly between captured images, or nearby. These methods are not suitable for VR, where the user has significant range of motion and can look in all directions. Thus, it is essential to create viewpoint-free photos with a wide field-of-view and sufficient positional freedom to cover the range of motion a user might experience in VR. We focus on two VR experiences: 1) Seated VR experiences, where the user can lean in different directions. This simplifies the problem, as the scene is only observed from a small range of viewpoints. Thus, we focus on easy capture, showing how to turn panorama-style capture into 3D photos, a simple representation for viewpoint-free photos, and also how to speed up processing so users can see the final result on-site. 2) Room-scale VR experiences, where the user can explore vastly different perspectives. This is challenging: More input footage is needed, maintaining real-time display rates becomes difficult, view-dependent appearance and object backsides need to be modelled, all while preventing noticeable mistakes. We address these challenges by: (1) creating refined geometry for each input photograph, (2) using a fast tiled rendering algorithm to achieve real-time display rates, and (3) using a convolutional neural network to hide visual mistakes during compositing. Overall, we provide evidence that viewpoint-free photography is feasible from casual capture. We thoroughly compare with the state-of-the-art, showing that our methods achieve both a numerical improvement and a clear increase in visual quality for both seated and room-scale VR experiences

    Proceedings of the 2021 Symposium on Information Theory and Signal Processing in the Benelux, May 20-21, TU Eindhoven

    Get PDF

    Identification through Finger Bone Structure Biometrics

    Get PDF

    Finger Vein Verification with a Convolutional Auto-encoder

    Get PDF

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Get PDF
    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on the screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise and ultimately predict and prevent abnormal behaviour or even reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first aims to study the state-of-the-art and identify the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. This thesis presents the application of a suite of 2D and highly novel 3D methodologies that go some way to overcome current limitations.The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene and start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes. Then, an investigation is presented into the use of richer 2.5D/3D surface normal data. In all cases the aim is to combine both 2D and 3D data to obtain a better understanding of the scene, aimed ultimately at capturing what is happening within the scene in order to be able to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, an example case of the railway station environment is used to represent typical real-world challenges, where the principles can be readily extended elsewhere, such as to airports, motorways, the households, shopping malls etc. The context of this research work, together with an overall presentation of existing methods used in video surveillance and their challenges are described in chapter 1.Common computer vision techniques such as VP detection, camera calibration, 3D reconstruction, segmentation etc., can be applied in an effort to extract meaning to video surveillance applications. According to the literature, these methods have been well researched and their use will be assessed in the context of current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition tasks is not sufficient to solve the key challenges encountered in general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, and shadows and in general complexity of the real-world environment. Chapter 3 presents the research and contribution on those topics – methods to extract optimal features for a specific CCTV application – as well as their strengths and weaknesses to highlight that the proposed algorithm obtains better results than most due to its specific design.The comparison of current surveillance systems and methods from the literature has shown that 2D data are however almost constantly used for many applications. Indeed, industrial systems as well as the research community have been improving intensively 2D feature extraction methods since image analysis and Scene understanding has been of interest. The constant progress on 2D feature extraction methods throughout the years makes it almost effortless nowadays due to a large variety of techniques. Moreover, even if 2D data do not allow solving all challenges in video surveillance or other applications, they are still used as starting stages towards scene understanding and image analysis. Chapter 4 will then explore 2D feature extraction via vanishing point detection and segmentation methods. A combination of most common techniques and a novel approach will be then proposed to extract vanishing points from video surveillance environments. Moreover, segmentation techniques will be explored in the aim to determine how they can be used to complement vanishing point detection and lead towards 3D data extraction and analysis. In spite of the contribution above, 2D data is insufficient for all but the simplest applications aimed at obtaining an understanding of a scene, where the aim is for a robust detection of, say, left luggage or abnormal behaviour; without significant a priori information about the scene geometry. Therefore, more information is required in order to be able to design a more automated and intelligent algorithm to obtain richer information from the scene geometry and so a better understanding of what is happening within. This can be overcome by the use of 3D data (in addition to 2D data) allowing opportunity for object “classification” and from this to infer a map of functionality, describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task and the various solutions investigated to recover 3D data, as well as some preliminary work towards plane extraction.It is apparent that VPs and planes give useful information about a scene’s perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection techniques alone allow the recovery of more complex generic object shapes - for example composed of spheres, cylinders etc - and any simple model will suffer in the presence of non-Manhattan features, e.g. introduced by the presence of an escalator. For this reason, a novel photometric stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it. Chapter 6 describes how photometric stereo allows recovery of 3D information in order to obtain a better understanding of a scene, as well as also partially overcoming some current surveillance challenges, such as difficulty in resolving fine detail, particularly at large standoff distances, and in isolating and recognising more complex objects in real scenes. Here items of interest may be obscured by complex environmental factors that are subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Here innovative use is made of an untapped latent capability offered within modern surveillance environments to introduce a form of environmental structuring to good advantage in order to achieve a richer form of data acquisition. This chapter also goes on to explore the novel application of photometric stereo in such diverse applications, how our algorithm can be incorporated into an existing surveillance system and considers a typical real commercial application.One of the most important aspects of this research work is its application. Indeed, while most of the research literature has been based on relatively simple structured environments, the approach here has been designed to be applied to real surveillance environments, such as railway stations, airports, waiting rooms, etc, and where surveillance cameras may be fixed or in the future form part of a mobile robotic free roaming surveillance device, that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been to apply this algorithm to railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and in motorway, shopping centre, street and home environments. All of these applications require a better understanding of the scene for security or safety purposes. Finally, chapter 7 presents a global conclusion and what will be achieved in the future

    From light rays to 3D models

    Get PDF

    3D high resolution techniques applied on small and medium size objects: from the analysis of the process towards quality assessment

    Get PDF
    The need for metric data acquisition is an issue strictly related to the human capability of describing the world with rigorous and repeatable methods. From the invention of photography to the development of advanced computers, the metric data acquisition has been subjected to rapid mutation, and nowadays there exists a strict connection between metric data acquisition and image processing, Computer Vision and Artificial Intelligence. The sensor devices for the 3D model generation are various and characterized by different functioning principles. In this work, optical passive and active sensors are treated, focusing specifically on close-range photogrammetry, Time of Flight (ToF) sensors and Structured-light scanners (SLS). Starting from the functioning principles of the techniques and showing some issues related to them, the work highlights their potentialities, analyzing the fundamental and most critical steps of the process leading to the quality assessment of the data. Central themes are the instruments calibration, the acquisition plan and the interpretation of the final results. The capability of the acquisition techniques to satisfy unconventional requirements in the field of Cultural Heritage is also shown. The thesis starts with an overview about the history and developments of 3D metric data acquisition. Chapter 1 treats the Human Vision System and presents a complete overview of 3D sensing devices. Chapter 2 starts from the enunciation of the basic principle of close-range photogrammetry considering digital cameras functioning principles, calibration issues, and the process leading to the 3D mesh reconstruction. The case of multi-image acquisition is analyzed, deepening the quality assessment of the photogrammetric process through a case study. Chapter 3 is devoted to the range-based acquisition techniques, namely ToF laser scanners and SLSs. Lastly, Chapter 4 focuses on unconventional applications of the mentioned high-resolution acquisition techniques showing some examples of study cases in the field of Cultural Heritage

    Applications of Derandomization Theory in Coding

    Get PDF
    Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. [This is a shortened version of the actual abstract in the thesis.]Comment: EPFL Phd Thesi
    • …
    corecore