Motion Segmentation Aided Super Resolution Image Reconstruction
This dissertation addresses Super Resolution (SR) Image Reconstruction with a focus on motion segmentation. The main thrust is Information Complexity guided Gaussian Mixture Models (GMMs) for Statistical Background Modeling. In developing our framework we also address two other topics: motion trajectory estimation for global and local scene change detection, and image reconstruction to obtain high resolution (HR) representations of the moving regions. Such a framework is used for dynamic scene understanding and for recognition of individuals and threats in image sequences recorded with either stationary or non-stationary camera systems.
We introduce a new technique called Information Complexity guided Statistical Background Modeling, which employs GMMs that are optimal with respect to information complexity criteria. Moving objects are segmented out through background subtraction using the computed background model. This technique produces results superior to those of competing background modeling strategies.
State-of-the-art SR Image Reconstruction studies combine the information from a set of slightly different low resolution (LR) images of a static scene to construct an HR representation. The crucial challenge not handled in these studies is accumulating the corresponding information from highly displaced moving objects. To address it, we develop a framework for SR Image Reconstruction of moving objects with such high levels of displacement. Our assumption is that LR images differ from each other due to the local motion of the objects and the global motion of the scene imposed by a non-stationary imaging system. Contrary to traditional SR approaches, we employ several steps: suppression of the global motion; motion segmentation accompanied by background subtraction to extract moving objects; suppression of the local motion of the segmented regions; and super-resolution of the accumulated information coming from the moving objects rather than the whole scene. This results in a reliable offline SR Image Reconstruction tool that handles several types of dynamic scene changes, compensates for the impact of camera systems, and provides data redundancy through removing the background. The framework proved superior to state-of-the-art algorithms that put no significant effort toward dynamic scene representation for non-stationary camera systems.
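The criterion-guided background modeling described in this abstract can be sketched for a single pixel as follows. This is only an illustration under stated assumptions: the abstract does not specify the information complexity criterion, so the Bayesian Information Criterion (BIC) stands in for it, and the helper names, the quantile initialisation, the variance floor and the 2.5-sigma foreground threshold are all hypothetical choices rather than the author's method.

```python
import math
import random


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in data:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against underflow to zero
    return total


def fit_gmm(data, k, iterations=60, var_floor=1.0):
    """Fit a k-component mixture with plain EM (hypothetical helper)."""
    ordered = sorted(data)
    n = len(ordered)
    # Deterministic start: means at evenly spaced quantiles, shared global variance.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(ordered) / n
    spread = max(sum((x - centre) ** 2 for x in ordered) / n, var_floor)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            norm = sum(dens) or 1e-300
            resp.append([d / norm for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return weights, means, variances


def bic(data, model):
    """BIC score (lower is better); used here in place of the thesis criterion."""
    k = len(model[0])
    free_params = 3 * k - 1  # k means, k variances, k - 1 independent weights
    return free_params * math.log(len(data)) - 2.0 * log_likelihood(data, *model)


def select_background_model(data, max_components=3):
    """Fit mixtures with 1..max_components components and keep the lowest BIC."""
    models = [fit_gmm(data, k) for k in range(1, max_components + 1)]
    return min(models, key=lambda m: bic(data, m))


def is_foreground(value, model, k_sigma=2.5):
    """A value is foreground if it lies far from every background component."""
    _, means, variances = model
    return all(abs(value - m) > k_sigma * math.sqrt(v)
               for m, v in zip(means, variances))


def demo_history(seed=0):
    """Synthetic intensity history of one pixel with two background modes."""
    rng = random.Random(seed)
    return ([rng.gauss(50.0, 3.0) for _ in range(100)] +
            [rng.gauss(120.0, 5.0) for _ in range(100)])
```

Calling `select_background_model(demo_history())` returns the mixture with the lowest score, and `is_foreground` then performs the background subtraction test for a new intensity value. In a full system the same selection would run per pixel or per region.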
Super-resolution from unregistered aliased images
Aliasing in images is often considered a nuisance. Artificial low frequency patterns and jagged edges appear when an image is sampled at too low a frequency. However, aliasing also conveys useful information about the high frequency content of the image, which is exploited in super-resolution applications. We use a set of input images of the same scene to extract such high frequency information and create a higher resolution aliasing-free image. Typically, there is a small shift or more complex motion between the different images, such that they contain slightly different information about the scene. Super-resolution image reconstruction can be formulated as a multichannel sampling problem with unknown offsets. This results in a set of equations that are linear in the unknown signal coefficients but nonlinear in the offsets. This thesis concentrates on the computation of these offsets, as they are an essential prerequisite for an accurate high resolution reconstruction. If a part of the image spectra is free of aliasing, the planar shift and rotation parameters can be computed using only this low frequency information. In such a case, the images can be registered pairwise to a reference image. Such a method is not applicable if the images are undersampled by a factor of two or larger. A higher number of images then needs to be registered jointly. Two subspace methods are discussed for such highly aliased images. The first approach is based on a Fourier description of the aliased signals as a sum of overlapping parts of the spectrum. It uses a rank condition to find the correct offsets. The second one uses a more general expansion in an arbitrary Hilbert space to compute the signal offsets. The sampled signal is represented as a linear combination of sampled basis functions. The offsets are computed by projecting the signal onto varying subspaces.
Under certain conditions, in particular for bandlimited signals, the nonlinear super-resolution equations can be written as a set of polynomial equations. Using Buchberger's algorithm, the solution can then be computed as a Gröbner basis for the corresponding polynomial ideal. After a description of a standard algorithm, adaptations are made for use with noisy measurements. The techniques presented in this thesis are tested in simulations and practical experiments. The experiments are performed on sets of real images taken with a digital camera. The results show the validity of the algorithms: registration parameters are computed with subpixel precision, and aliasing is accurately removed from the resulting high resolution image. This thesis is produced according to the concepts of reproducible research. All the results and examples used in this thesis are reproducible using the code and data available online.
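The idea of registering images from the aliasing-free low-frequency part of the spectrum can be illustrated with a 1-D sketch. It is an assumption-laden toy, not the thesis algorithm: the scene, the sample count `N`, and the function names are hypothetical, and only a pure shift (no rotation) is estimated. A shift `s` multiplies DFT bin `k` by a phase of `-2*pi*k*s/N`, so a least-squares line fitted to the phase differences of the clean bins yields `s`.

```python
import cmath
import math

N = 32  # samples per low-resolution signal (assumed)


def scene(t):
    """Hypothetical scene: three low frequencies plus one component above Nyquist."""
    low = sum(a * math.cos(2 * math.pi * k * t / N + p)
              for k, a, p in [(1, 1.0, 0.2), (2, 0.7, 1.1), (3, 0.4, -0.5)])
    # Frequency 20 exceeds N / 2 = 16, so after sampling it folds onto bin 12.
    high = 0.5 * math.cos(2 * math.pi * 20 * t / N + 0.3)
    return low + high


def sample(shift):
    """Sample the scene on the integer grid after shifting it by `shift` samples."""
    return [scene(n - shift) for n in range(N)]


def dft_bin(x, k):
    """Single DFT coefficient of the sequence x at bin k."""
    return sum(v * cmath.exp(-2j * math.pi * k * n / len(x)) for n, v in enumerate(x))


def estimate_shift(ref, moved, max_bin):
    """Least-squares shift estimate from the phase differences of bins 1..max_bin."""
    num = den = 0.0
    for k in range(1, max_bin + 1):
        omega = 2 * math.pi * k / len(ref)
        # Phase of moved relative to ref at this bin equals -omega * shift.
        phase = cmath.phase(dft_bin(moved, k) * dft_bin(ref, k).conjugate())
        num += omega * phase
        den += omega * omega
    return -num / den
```

Here `estimate_shift(sample(0.0), sample(0.3), max_bin=3)` uses only bins 1 to 3, which the folded component does not touch. Bin 12 carries the phase of the aliased component and is deliberately left out, which mirrors why pairwise registration stops working once the whole spectrum is aliased.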
Kinect based system applied to breast cancer conservative treatment
Integrated Master's thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 201
Construction of super-resolution mosaics from low-resolution video. Application to video summarization and transmission error concealment.
The digitization of existing videos and the explosive growth of multimedia services over networks, such as digital television broadcasting and mobile communications, have produced an enormous quantity of compressed video. This calls for efficient indexing and navigation tools, but indexing before encoding is not common practice. The usual approach is to decode these videos completely and then build indexes. This is very costly and therefore not feasible in real time. Moreover, important information such as motion, which is lost during decoding, is re-estimated even though it is already present in the compressed stream. Our goal in this thesis is therefore to reuse the data already present in the MPEG compressed stream for indexing and fast navigation. More precisely, we extract DC coefficients and motion vectors. Within this thesis, we are particularly interested in the construction of mosaics from the DC images extracted from I-frames. A mosaic is built by registering and fusing all the images of a video sequence in a single coordinate system, which is generally aligned with one of the images of the sequence: the reference image. The result is a single image that gives a global view of the sequence. We thus propose a complete system for constructing mosaics from the MPEG-1/2 stream that takes into account various problems arising in real video sequences, such as moving objects or illumination changes. An essential task in mosaic construction is the estimation of motion between each image of the sequence and the reference image. Our method is based on a robust estimation of the global camera motion from the motion vectors of P-frames.
However, the global camera motion estimated for a P-frame may be incorrect, because it depends strongly on the accuracy of the encoded vectors. We detect the affected P-frames by taking into account the DC coefficients of the associated encoded error, and propose two methods to correct these motions. A mosaic built from DC images has a very low resolution and suffers from aliasing effects due to the nature of DC images. To increase its resolution and improve its visual quality, we apply a super-resolution method based on iterative back-projection. Super-resolution methods are likewise based on registering and fusing the images of a video sequence, but are accompanied by image restoration. In this context, we developed a new method for estimating the blur due to camera motion, together with a corresponding spectral restoration method. Spectral restoration handles blur globally, but for objects whose motion is independent of the camera motion, local blurs appear. We therefore propose a new super-resolution algorithm, derived from the iterative spatial restoration of Van Cittert and Jansson, that can restore local blurs. Based on a segmentation of moving objects, we restore the background mosaic and the foreground objects separately, and we adapted our blur estimation method accordingly. We first applied our method to the construction of video summaries, with the aim of fast mosaic-based navigation in compressed video. We then show how reusing the intermediate results serves other indexing tasks, notably shot change detection for I-frames and characterization of the camera motion.
Finally, we explored the field of transmission error recovery. Our approach consists of building a mosaic while a shot is being decoded; in case of data loss, the missing information can be concealed using this mosaic.
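The Van Cittert iteration on which this thesis bases its local restoration can be sketched in one dimension. This is a minimal illustration, not the algorithm of the thesis: the blur kernel, the relaxation factor `beta`, and the function names are assumptions, and Jansson's amplitude-dependent relaxation is omitted. Each pass adds the mismatch between the observed signal and the re-blurred estimate back to the estimate.

```python
def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Circular convolution with a small symmetric kernel (assumed blur model)."""
    n = len(signal)
    half = len(kernel) // 2
    return [sum(w * signal[(i + j - half) % n] for j, w in enumerate(kernel))
            for i in range(n)]


def van_cittert(observed, iterations=50, beta=1.0):
    """Plain Van Cittert restoration: f <- f + beta * (g - h * f)."""
    estimate = list(observed)  # start from the blurred observation itself
    for _ in range(iterations):
        reblurred = blur(estimate)
        estimate = [e + beta * (g - r)
                    for e, g, r in zip(estimate, observed, reblurred)]
    return estimate


def residual(observed, estimate):
    """Euclidean norm of the mismatch between observation and re-blurred estimate."""
    return sum((g - r) ** 2 for g, r in zip(observed, blur(estimate))) ** 0.5


def distance(a, b):
    """Euclidean distance between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

As a usage example, blurring a box signal such as `[0.0] * 8 + [1.0] * 8 + [0.0] * 8` and passing the result to `van_cittert` gives an estimate whose edges are sharper than those of the blurred input. With this kernel and `beta = 1.0` the iteration is stable because the kernel's frequency response lies between 0 and 1.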
Super-resolution of 3-dimensional scenes
Super-resolution is an image enhancement method that increases the resolution of images and video. Previously this technique could only be applied to 2D scenes. The super-resolution algorithm developed in this thesis creates high-resolution views of 3-dimensional scenes, using low-resolution images captured from varying, unknown positions.
Automated system design for the efficient processing of solar satellite images. Developing novel techniques and software platform for the robust feature detection and the creation of 3D anaglyphs and super-resolution images for solar satellite images.
The Sun is of fundamental importance to life on Earth and is studied by scientists from many disciplines. It exhibits phenomena on a wide range of observable scales, timescales and wavelengths, and owing to technological developments the rate at which solar data becomes available for study continues to increase, which presents both opportunities and challenges. Two satellites recently launched to observe the Sun are STEREO (Solar TErrestrial RElations Observatory), which provides simultaneous views of the Sun from two different viewpoints, and SDO (Solar Dynamics Observatory), which aims to study the solar atmosphere on small spatial and temporal scales and in many wavelengths. The STEREO and SDO missions provide huge volumes of data, at rates of about 15 GB per day (initially 30 GB per day) and 1.5 terabytes per day respectively. Accessing these data volumes efficiently at both high spatial and high temporal resolution is important to support scientific discovery, but requires increasingly efficient tools to browse, locate and process specific data sets.
This thesis investigates the development of new technologies for processing information contained in multiple, overlapping images of the same scene to produce images of improved quality. This area is generally known as Super Resolution (SR) and offers techniques for reducing artefacts and increasing spatial resolution. Another challenge is to generate 3D images such as anaglyphs from uncalibrated pairs of SR images. An automated method to generate SR images is presented here. The SR technique consists of three stages: image registration, interpolation and filtration. A method to produce enhanced, near real-time, 3D solar images from uncalibrated pairs of images is then introduced.
Image registration is an essential enabling step in SR and anaglyph processing. An accurate point-to-point mapping between views is estimated, with multiple images registered using only information contained within the images themselves. The performance of the proposed methods is evaluated using benchmark evaluation techniques. A software application called SOLARSTUDIO has been developed to integrate and run all the methods introduced in this thesis. SOLARSTUDIO offers a number of useful image processing tools for activities focused on solar images, including Active Region (AR) segmentation, anaglyph creation, solar limb extraction, solar event tracking and video creation.
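The final composition step of anaglyph creation can be sketched as below. This is a hedged illustration only: it assumes the two views are grayscale, equally sized, and already registered, and it uses the common red-cyan convention (left view in the red channel, right view in green and blue). The abstract does not state which colour scheme SOLARSTUDIO uses, and `make_anaglyph` is a hypothetical name.

```python
def make_anaglyph(left, right):
    """Combine two registered grayscale views into red-cyan RGB pixels.

    `left` and `right` are lists of rows of intensities; the red channel
    comes from the left view, green and blue from the right view (assumed).
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("views must have the same size")
    return [[(l, r, r) for l, r in zip(left_row, right_row)]
            for left_row, right_row in zip(left, right)]
```

Registration matters here because any residual vertical misalignment between the two views ends up as colour fringing that is unrelated to depth.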
Image restoration in anatomical MRI for the preclinical study of markers of brain aging
Age-related neurovascular and neurodegenerative diseases are increasing significantly. While such pathological changes show effects on the brain before clinical symptoms appear, a better understanding of the normal brain aging process will help distinguish the impact of known pathologies on regional brain structure. Furthermore, knowledge of the patterns of brain shrinkage in normal aging could lead to a better understanding of its causes and perhaps to interventions reducing the loss of brain functions associated with cerebral atrophy. Therefore, this thesis project aims to detect biomarkers of normal and pathological brain aging in a non-human primate model, the marmoset monkey (Callithrix jacchus), whose anatomical characteristics are closer to those of humans than to those of rodents. However, structural changes (e.g., volumes, cortical thickness) that may occur during their adult life may be minimal with respect to the scale of observation. In this context, it is essential to have observation techniques that offer sufficiently high contrast and spatial resolution and allow detailed assessments of the morphometric brain changes associated with aging. However, imaging small brains in a 3T MRI platform dedicated to humans is a challenging task because the spatial resolution and the contrast obtained are insufficient compared to the size of the anatomical structures observed and the scale of the expected changes with age.
This thesis aims to develop image restoration methods for preclinical MR images that will improve the robustness of segmentation algorithms. Improving the resolution of the images at a constant signal-to-noise ratio will limit partial volume effects in voxels located at the border between two structures and allow better segmentation while increasing the reproducibility of the results. This computational imaging step is crucial for a reliable longitudinal voxel-based morphometric analysis and for the identification of anatomical markers of brain aging by following the volume changes in gray matter, white matter and cerebrospinal fluid.
Accurate 3D-reconstruction and -navigation for high-precision minimal-invasive interventions
Current lateral skull base surgery is largely invasive, since it requires wide exposure and direct visualization of anatomical landmarks to avoid damaging critical structures. A multi-port approach aiming to reduce this invasiveness has recently been investigated. In it, three canals are drilled from the skull surface to the surgical region of interest: the first for the instrument, the second for the endoscope, and the third for material removal or an additional instrument. The transition to minimally invasive approaches in lateral skull base surgery requires sub-millimeter accuracy and high outcome predictability, which places high demands on both image acquisition and navigation.
Computed tomography (CT) is a non-invasive imaging technique that allows visualization of the patient's internal organs. Planning optimal drill channels based on patient-specific models requires highly accurate three-dimensional (3D) CT images. This thesis focuses on the reconstruction of high quality CT volumes. To this end, two conventional imaging systems are investigated: spiral CT scanners and C-arm cone-beam CT (CBCT) systems. Spiral CT scanners acquire volumes with typically anisotropic resolution, i.e. the voxel spacing in the slice-selection direction is larger than the in-plane spacing. A new super-resolution reconstruction approach is proposed to recover images with high isotropic resolution from two orthogonal low-resolution CT volumes.
C-arm CBCT systems offer CT-like 3D imaging capabilities while being suitable for interventional suites. A main drawback of these systems is the commonly encountered CT artifacts caused by limitations of the imaging system, such as mechanical inaccuracies. This thesis contributes new methods to enhance CBCT reconstruction quality by addressing two main reconstruction artifacts: misalignment artifacts caused by mechanical inaccuracies, and metal artifacts caused by the presence of metal objects in the scanned region.
CBCT scanners are appropriate for intra-operative image-guided navigation. For instance, they can be used to control the drilling process based on intra-operatively acquired 2D fluoroscopic images. Successful navigation requires an accurate estimate of the C-arm pose relative to the patient anatomy and the associated surgical plan. A new algorithm has been developed to fulfill this task with high precision. The performance of the introduced methods is demonstrated on simulated and real data.
Depth-Map-Assisted Texture and Depth Map Super-Resolution
With the development of video technology, high definition video and 3D video applications are becoming increasingly accessible to customers. The interactive and vivid 3D video experience of realistic scenes relies greatly on the amount and quality of the texture and depth map data. However, due to the limitations of video capturing hardware and transmission bandwidth, transmitted video has to be compressed, which in general degrades the received video quality. This makes it hard to meet users' requirements for high definition and visual experience, and it also limits the development of future applications. Image/video super-resolution techniques have therefore been proposed to address this issue. Image super-resolution aims to reconstruct a high resolution image from single or multiple low resolution images of the same scene captured under different conditions. Based on the image type to be super-resolved, image super-resolution includes texture and depth image super-resolution. Classified by implementation method, there are three main categories: interpolation-based, reconstruction-based and learning-based super-resolution algorithms. This thesis focuses on exploiting depth data in interpolation-based super-resolution algorithms for texture video and depth maps. Two novel texture super-resolution algorithms and one depth super-resolution algorithm are proposed as the main contributions of this thesis. The first texture super-resolution algorithm is carried out in the Mixed Resolution (MR) multiview video system, where at least one of the views is captured at Low Resolution (LR) while the others are captured at Full Resolution (FR). In order to reduce visual discomfort and adapt the MR video format for free-viewpoint television, the low resolution views are super-resolved to the target full resolution by the proposed virtual view assisted super-resolution algorithm.
The inter-view similarity is used to determine whether to fill the missing pixels in the super-resolved frame with virtual view pixels or with spatially interpolated pixels. The decision mechanism is steered by the texture characteristics of the neighbors of each missing pixel. Thus, the proposed method can recover the details in regions with edges while maintaining good quality in smooth areas by properly exploiting the high quality virtual view pixels and the directional correlation of pixels. The second texture super-resolution algorithm is based on the Multiview Video plus Depth (MVD) system, which consists of textures and the associated per-pixel depth data. In order to further reduce the transmitted data and the quality degradation of the received video, a systematic framework is proposed to downsample the original MVD data and later super-resolve the LR views. At the encoder side, the rows of the two adjacent views are downsampled in an interlaced and complementary fashion, whereas at the decoder side the discarded pixels are recovered by fusing the virtual view pixels with the directionally interpolated pixels from the complementary downsampled views. Consequently, with the assistance of virtual views, the proposed approach can effectively achieve these two goals. From the previous two works, we can observe that depth data has great potential for use in 3D video enhancement. However, the low spatial resolution of depth images generated by Time-of-Flight (ToF) depth cameras has limited their applications. Hence, in the last contribution of this thesis, a planar-surface-based depth map super-resolution approach is presented, which interpolates depth images by exploiting the equation of each detected planar surface. Both quantitative and qualitative experimental results demonstrate the effectiveness and robustness of the proposed approach over benchmark methods.
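The planar-surface idea in the last contribution can be illustrated with a small sketch. It is a simplification under explicit assumptions: the whole low-resolution patch is taken to belong to one already-detected plane, the plane is fitted by ordinary least squares to `z = a*x + b*y + c`, and the names `fit_plane`, `upsample_planar_depth` and `solve3` are hypothetical. The surface detection and segmentation steps of the thesis are not shown.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular (e.g. collinear samples).
    """
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_plane(samples):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples."""
    sxx = sum(x * x for x, y, z in samples)
    sxy = sum(x * y for x, y, z in samples)
    syy = sum(y * y for x, y, z in samples)
    sx = sum(x for x, y, z in samples)
    sy = sum(y for x, y, z in samples)
    sxz = sum(x * z for x, y, z in samples)
    syz = sum(y * z for x, y, z in samples)
    sz = sum(z for x, y, z in samples)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(samples))]]
    return solve3(normal, [sxz, syz, sz])


def upsample_planar_depth(lr_depth, factor):
    """Upsample a planar depth patch by evaluating its fitted plane equation.

    Low-resolution pixel (row i, column j) is assumed to sit at
    high-resolution coordinates (x, y) = (j * factor, i * factor).
    """
    samples = [(j * factor, i * factor, z)
               for i, row in enumerate(lr_depth) for j, z in enumerate(row)]
    a, b, c = fit_plane(samples)
    height, width = len(lr_depth) * factor, len(lr_depth[0]) * factor
    return [[a * x + b * y + c for x in range(width)] for y in range(height)]
```

Evaluating the plane equation instead of interpolating neighbouring depth values keeps the surface flat at the new pixel positions, which is the property the abstract relies on. Pixels near the boundary between two planes would need the segmentation that this sketch leaves out.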