22 research outputs found

    Super-Resolution Approaches for Depth Video Enhancement

    Get PDF
    Sensing using 3D technologies has seen a revolution in the past years where cost-effective depth sensors are today part of accessible consumer electronics. Their ability in directly capturing depth videos in real-time has opened tremendous possibilities for multiple applications in computer vision. These sensors, however, have major shortcomings due to their high noise contamination, including missing and jagged measurements, and their low spatial resolutions. In order to extract detailed 3D features from this type of data, a dedicated data enhancement is required. We propose a generic depth multi–frame super–resolution framework that addresses the limitations of state-of-theart depth enhancement approaches. The proposed framework doesnot need any additional hardware or coupling with different modalities. It is based on a new data model that uses densely upsampled low resolution observations. This results in a robust median initial estimation, further refined by a deblurring operation using a bilateraltotal variation as the regularization term. The upsampling operation ensures a systematic improvement in the registration accuracy. This is explored in different scenarios based on the motions involved in the depth video. For the general and most challenging case of objects deforming non-rigidly in full 3D, we propose a recursive dynamic multi–frame super-resolution algorithm where the relative local 3D motions between consecutive frames are directly accounted for. We rely on the assumption that these 3D motions can be decoupled into lateral motions and radial displacements. This allows to perform a simple local per–pixel tracking where both depth measurements and deformations are optimized. As compared to alternative approaches, the results show a clear improvement in reconstruction accuracy and in robustness to noise, to relative large non-rigid deformations, and to topological changes. Moreover, the proposed approach, implemented on a CPU, is shown to be computationally efficient and working in real-time

    Images and depth for high resolution, low-latency sensing and security applications

    Get PDF
    The thesis focuses on using images and depths for high resolution, low latency sensing, and then using these sensing techniques to build security applications. First, we introduce the usefulness of high quality depth sensing, and the difficulty to acquire such depth stream via pure hardware approach. Then, we propose our sensor fusion approach, which combines depth camera and color camera. Chapter 2 puts forward a low cost approach to use a high spatial resolution color stream to help aggressively increase the spatial resolution of the depth stream. Continuing this direction, Chapter 3 proposes to use optical ow to forward warp the depth stream according to a high frequency, low latency CMOS color stream. The warping can create a high frequency, low latency depth stream. In both Chapter 2 and Chapter 3, we show that the improved depth sensing can benefit lots of applications. In Chapter 4, we propose a SafetyNet, which can reliably detecting and rejecting adversarial examples. With the revolutionary SafetyNet architecture and the advanced depth sensing, we can reliably prove to users whether a picture of a scene is real or not. In sum, the thesis focuses on improving sensing technologies and building vision and security applications around the sensing technologies

    Fusing spatial and temporal components for real-time depth data enhancement of dynamic scenes

    Get PDF
    The depth images from consumer depth cameras (e.g., structured-light/ToF devices) exhibit a substantial amount of artifacts (e.g., holes, flickering, ghosting) that needs to be removed for real-world applications. Existing methods cannot entirely remove them and perform slow. This thesis proposes a new real-time spatio-temporal depth image enhancement filter that completely removes flickering and ghosting, and significantly reduces holes. This thesis also presents a novel depth-data capture setup and two data reduction methods to optimize the performance of the proposed enhancement method

    Markerless facial motion capture: deep learning approaches on RGBD data

    Get PDF
    Facial expressions are a series of fast, complex and interconnected movement that causes an array of deformations, such as stretching, compressing and folding of the skin. Identifying expression is a natural process in human vision, but due to the diversity of faces, it has many challenges for computer vision. Research in markerless facial motion capture using single Red Green Blue (RGB) camera has gained popularity due to the wide access of the data, such as from mobile phones. The motivation behind this work is much of the existing work attempts to infer the 3-Dimensional (3D) data from 2-Dimensional (2D) images, such as in motion capture multiple 2D cameras are calibration to allow some depth prediction. Whereas, the inclusion of Red Green Blue Depth (RGBD) sensors that give ground truth depth data could gain a better understanding of the human face and how expressions are visualised. The aim of this thesis is to investigate and develop novel methods of markerless facial motion capture, where the focus is on the inclusions of RGBD data to provide 3D data. The contributions are: A tool to aid in the annotation of 3D facial landmarks; A novel neural network that demonstrate the ability of predicting 2D and 3D landmarks by merging RGBD data; Working application that demonstrates complex deep learning network on portable handheld devices; A review of existing methods of denoising fine detail in depth maps using neural networks; A network for the complete analysis of facial landmarks and expressions in 3D. The 3D annotator was developed to overcome the issues of relying on existing 3D modelling software, which made feature identification difficult. The technique of predicting 2D and 3D with auxiliary information, allowed high accuracy 3D landmarking, without the need for full model generation. Also, it outperformed other recent techniques of landmarking. The networks running on the handheld devices show as a proof of concept that even without much optimisation, a complex task can be performed in near real-time. Denoising Time of Flight (ToF) depth maps, showed much more complexity than the tradition RGB denoising, where we reviewed and applied an array of techniques to the task. The full facial analysis showed that when neural networks perform on a wide range of related task for auxiliary information allow for deep understanding of the overall task. The research for facial processing is vast, but still with many new problems and challenges to face and improve upon. While RGB cameras are used widely, we see the inclusion of high accuracy and cost-effective depth sensing device available. The new devices allow better understanding of facial features and expression. By using and merging RGB data, the area of facial landmarking, and expression intensity recognition can be improved

    Efficient Methods for Computational Light Transport

    Get PDF
    En esta tesis presentamos contribuciones sobre distintos retos computacionales relacionados con transporte de luz. Los algoritmos que utilizan información sobre el transporte de luz están presentes en muchas aplicaciones de hoy en día, desde la generación de efectos visuales, a la detección de objetos en tiempo real. La luz es una valiosa fuente de información que nos permite entender y representar nuestro entorno, pero obtener y procesar esta información presenta muchos desafíos debido a la complejidad de las interacciones entre la luz y la materia. Esta tesis aporta contribuciones en este tema desde dos puntos de vista diferentes: algoritmos en estado estacionario, en los que se asume que la velocidad de la luz es infinita; y algoritmos en estado transitorio, que tratan la luz no solo en el dominio espacial, sino también en el temporal. Nuestras contribuciones en algoritmos estacionarios abordan problemas tanto en renderizado offline como en tiempo real. Nos enfocamos en la reducción de varianza para métodos offline,proponiendo un nuevo método para renderizado eficiente de medios participativos. En renderizado en tiempo real, abordamos las limitacionesde consumo de batería en dispositivos móviles proponiendo un sistema de renderizado que incrementa la eficiencia energética en aplicaciones gráficas en tiempo real. En el transporte de luz transitorio, formalizamos la simulación de este tipo transporte en este nuevo dominio, y presentamos nuevos algoritmos y métodos para muestreo eficiente para render transitorio. Finalmente, demostramos la utilidad de generar datos en este dominio, presentando un nuevo método para corregir interferencia multi-caminos en camaras Timeof- Flight, un problema patológico en el procesamiento de imágenes transitorias.n this thesis we present contributions to different challenges of computational light transport. Light transport algorithms are present in many modern applications, from image generation for visual effects to real-time object detection. Light is a rich source of information that allows us to understand and represent our surroundings, but obtaining and processing this information presents many challenges due to its complex interactions with matter. This thesis provides advances in this subject from two different perspectives: steady-state algorithms, where the speed of light is assumed infinite, and transient-state algorithms, which deal with light as it travels not only through space but also time. Our steady-state contributions address problems in both offline and real-time rendering. We target variance reduction in offline rendering by proposing a new efficient method for participating media rendering. In real-time rendering, we target energy constraints of mobile devices by proposing a power-efficient rendering framework for real-time graphics applications. In transient-state we first formalize light transport simulation under this domain, and present new efficient sampling methods and algorithms for transient rendering. We finally demonstrate the potential of simulated data to correct multipath interference in Time-of-Flight cameras, one of the pathological problems in transient imaging.<br /

    Variational image fusion

    Get PDF
    The main goal of this work is the fusion of multiple images to a single composite that offers more information than the individual input images. We approach those fusion tasks within a variational framework. First, we present iterative schemes that are well-suited for such variational problems and related tasks. They lead to efficient algorithms that are simple to implement and well-parallelisable. Next, we design a general fusion technique that aims for an image with optimal local contrast. This is the key for a versatile method that performs well in many application areas such as multispectral imaging, decolourisation, and exposure fusion. To handle motion within an exposure set, we present the following two-step approach: First, we introduce the complete rank transform to design an optic flow approach that is robust against severe illumination changes. Second, we eliminate remaining misalignments by means of brightness transfer functions that relate the brightness values between frames. Additional knowledge about the exposure set enables us to propose the first fully coupled method that jointly computes an aligned high dynamic range image and dense displacement fields. Finally, we present a technique that infers depth information from differently focused images. In this context, we additionally introduce a novel second order regulariser that adapts to the image structure in an anisotropic way.Das Hauptziel dieser Arbeit ist die Fusion mehrerer Bilder zu einem Einzelbild, das mehr Informationen bietet als die einzelnen Eingangsbilder. Wir verwirklichen diese Fusionsaufgaben in einem variationellen Rahmen. Zunächst präsentieren wir iterative Schemata, die sich gut für solche variationellen Probleme und verwandte Aufgaben eignen. Danach entwerfen wir eine Fusionstechnik, die ein Bild mit optimalem lokalen Kontrast anstrebt. Dies ist der Schlüssel für eine vielseitige Methode, die gute Ergebnisse für zahlreiche Anwendungsbereiche wie Multispektralaufnahmen, Bildentfärbung oder Belichtungsreihenfusion liefert. Um Bewegungen in einer Belichtungsreihe zu handhaben, präsentieren wir folgenden Zweischrittansatz: Zuerst stellen wir die komplette Rangtransformation vor, um eine optische Flussmethode zu entwerfen, die robust gegenüber starken Beleuchtungsänderungen ist. Dann eliminieren wir verbleibende Registrierungsfehler mit der Helligkeitstransferfunktion, welche die Helligkeitswerte zwischen Bildern in Beziehung setzt. Zusätzliches Wissen über die Belichtungsreihe ermöglicht uns, die erste vollständig gekoppelte Methode vorzustellen, die gemeinsam ein registriertes Hochkontrastbild sowie dichte Bewegungsfelder berechnet. Final präsentieren wir eine Technik, die von unterschiedlich fokussierten Bildern Tiefeninformation ableitet. In diesem Kontext stellen wir zusätzlich einen neuen Regularisierer zweiter Ordnung vor, der sich der Bildstruktur anisotrop anpasst
    corecore