591 research outputs found

    Learned Multi-View Texture Super-Resolution

    Full text link
    We present a super-resolution method capable of creating a high-resolution texture map for a virtual 3D object from a set of lower-resolution images of that object. Our architecture unifies the concepts of (i) multi-view super-resolution based on the redundancy of overlapping views and (ii) single-view super-resolution based on a learned prior of high-resolution (HR) image structure. The principle of multi-view super-resolution is to invert the image formation process and recover the latent HR texture from multiple lower-resolution projections. We map that inverse problem into a block of suitably designed neural network layers, and combine it with a standard encoder-decoder network for learned single-image super-resolution. Wiring the image formation model into the network avoids having to learn perspective mapping from textures to images, and elegantly handles a varying number of input views. Experiments demonstrate that the combination of multi-view observations and learned prior yields improved texture maps.Comment: 11 pages, 5 figures, 2019 International Conference on 3D Vision (3DV

    Transformées basées graphes pour la compression de nouvelles modalités d’image

    Get PDF
    Due to the large availability of new camera types capturing extra geometrical information, as well as the emergence of new image modalities such as light fields and omni-directional images, a huge amount of high dimensional data has to be stored and delivered. The ever growing streaming and storage requirements of these new image modalities require novel image coding tools that exploit the complex structure of those data. This thesis aims at exploring novel graph based approaches for adapting traditional image transform coding techniques to the emerging data types where the sampled information are lying on irregular structures. In a first contribution, novel local graph based transforms are designed for light field compact representations. By leveraging a careful design of local transform supports and a local basis functions optimization procedure, significant improvements in terms of energy compaction can be obtained. Nevertheless, the locality of the supports did not permit to exploit long term dependencies of the signal. This led to a second contribution where different sampling strategies are investigated. Coupled with novel prediction methods, they led to very prominent results for quasi-lossless compression of light fields. The third part of the thesis focuses on the definition of rate-distortion optimized sub-graphs for the coding of omni-directional content. If we move further and give more degree of freedom to the graphs we wish to use, we can learn or define a model (set of weights on the edges) that might not be entirely reliable for transform design. The last part of the thesis is dedicated to theoretically analyze the effect of the uncertainty on the efficiency of the graph transforms.En raison de la grande disponibilité de nouveaux types de caméras capturant des informations géométriques supplémentaires, ainsi que de l'émergence de nouvelles modalités d'image telles que les champs de lumière et les images omnidirectionnelles, il est nécessaire de stocker et de diffuser une quantité énorme de hautes dimensions. Les exigences croissantes en matière de streaming et de stockage de ces nouvelles modalités d’image nécessitent de nouveaux outils de codage d’images exploitant la structure complexe de ces données. Cette thèse a pour but d'explorer de nouvelles approches basées sur les graphes pour adapter les techniques de codage de transformées d'image aux types de données émergents où les informations échantillonnées reposent sur des structures irrégulières. Dans une première contribution, de nouvelles transformées basées sur des graphes locaux sont conçues pour des représentations compactes des champs de lumière. En tirant parti d’une conception minutieuse des supports de transformées locaux et d’une procédure d’optimisation locale des fonctions de base , il est possible d’améliorer considérablement le compaction d'énergie. Néanmoins, la localisation des supports ne permettait pas d'exploiter les dépendances à long terme du signal. Cela a conduit à une deuxième contribution où différentes stratégies d'échantillonnage sont étudiées. Couplés à de nouvelles méthodes de prédiction, ils ont conduit à des résultats très importants en ce qui concerne la compression quasi sans perte de champs de lumière statiques. La troisième partie de la thèse porte sur la définition de sous-graphes optimisés en distorsion de débit pour le codage de contenu omnidirectionnel. Si nous allons plus loin et donnons plus de liberté aux graphes que nous souhaitons utiliser, nous pouvons apprendre ou définir un modèle (ensemble de poids sur les arêtes) qui pourrait ne pas être entièrement fiable pour la conception de transformées. La dernière partie de la thèse est consacrée à l'analyse théorique de l'effet de l'incertitude sur l'efficacité des transformées basées graphes

    Unsupervised spectral sub-feature learning for hyperspectral image classification

    Get PDF
    Spectral pixel classification is one of the principal techniques used in hyperspectral image (HSI) analysis. In this article, we propose an unsupervised feature learning method for classification of hyperspectral images. The proposed method learns a dictionary of sub-feature basis representations from the spectral domain, which allows effective use of the correlated spectral data. The learned dictionary is then used in encoding convolutional samples from the hyperspectral input pixels to an expanded but sparse feature space. Expanded hyperspectral feature representations enable linear separation between object classes present in an image. To evaluate the proposed method, we performed experiments on several commonly used HSI data sets acquired at different locations and by different sensors. Our experimental results show that the proposed method outperforms other pixel-wise classification methods that make use of unsupervised feature extraction approaches. Additionally, even though our approach does not use any prior knowledge, or labelled training data to learn features, it yields either advantageous, or comparable, results in terms of classification accuracy with respect to recent semi-supervised methods

    Learning sparse representations of depth

    Full text link
    This paper introduces a new method for learning and inferring sparse representations of depth (disparity) maps. The proposed algorithm relaxes the usual assumption of the stationary noise model in sparse coding. This enables learning from data corrupted with spatially varying noise or uncertainty, typically obtained by laser range scanners or structured light depth cameras. Sparse representations are learned from the Middlebury database disparity maps and then exploited in a two-layer graphical model for inferring depth from stereo, by including a sparsity prior on the learned features. Since they capture higher-order dependencies in the depth structure, these priors can complement smoothness priors commonly used in depth inference based on Markov Random Field (MRF) models. Inference on the proposed graph is achieved using an alternating iterative optimization technique, where the first layer is solved using an existing MRF-based stereo matching algorithm, then held fixed as the second layer is solved using the proposed non-stationary sparse coding algorithm. This leads to a general method for improving solutions of state of the art MRF-based depth estimation algorithms. Our experimental results first show that depth inference using learned representations leads to state of the art denoising of depth maps obtained from laser range scanners and a time of flight camera. Furthermore, we show that adding sparse priors improves the results of two depth estimation methods: the classical graph cut algorithm by Boykov et al. and the more recent algorithm of Woodford et al.Comment: 12 page

    Compression and Subjective Quality Assessment of 3D Video

    Get PDF
    In recent years, three-dimensional television (3D TV) has been broadly considered as the successor to the existing traditional two-dimensional television (2D TV) sets. With its capability of offering a dynamic and immersive experience, 3D video (3DV) is expected to expand conventional video in several applications in the near future. However, 3D content requires more than a single view to deliver the depth sensation to the viewers and this, inevitably, increases the bitrate compared to the corresponding 2D content. This need drives the research trend in video compression field towards more advanced and more efficient algorithms. Currently, the Advanced Video Coding (H.264/AVC) is the state-of-the-art video coding standard which has been developed by the Joint Video Team of ISO/IEC MPEG and ITU-T VCEG. This codec has been widely adopted in various applications and products such as TV broadcasting, video conferencing, mobile TV, and blue-ray disc. One important extension of H.264/AVC, namely Multiview Video Coding (MVC) was an attempt to multiple view compression by taking into consideration the inter-view dependency between different views of the same scene. This codec H.264/AVC with its MVC extension (H.264/MVC) can be used for encoding either conventional stereoscopic video, including only two views, or multiview video, including more than two views. In spite of the high performance of H.264/MVC, a typical multiview video sequence requires a huge amount of storage space, which is proportional to the number of offered views. The available views are still limited and the research has been devoted to synthesizing an arbitrary number of views using the multiview video and depth map (MVD). This process is mandatory for auto-stereoscopic displays (ASDs) where many views are required at the viewer side and there is no way to transmit such a relatively huge number of views with currently available broadcasting technology. Therefore, to satisfy the growing hunger for 3D related applications, it is mandatory to further decrease the bitstream by introducing new and more efficient algorithms for compressing multiview video and depth maps. This thesis tackles the 3D content compression targeting different formats i.e. stereoscopic video and depth-enhanced multiview video. Stereoscopic video compression algorithms introduced in this thesis mostly focus on proposing different types of asymmetry between the left and right views. This means reducing the quality of one view compared to the other view aiming to achieve a better subjective quality against the symmetric case (the reference) and under the same bitrate constraint. The proposed algorithms to optimize depth-enhanced multiview video compression include both texture compression schemes as well as depth map coding tools. Some of the introduced coding schemes proposed for this format include asymmetric quality between the views. Knowing that objective metrics are not able to accurately estimate the subjective quality of stereoscopic content, it is suggested to perform subjective quality assessment to evaluate different codecs. Moreover, when the concept of asymmetry is introduced, the Human Visual System (HVS) performs a fusion process which is not completely understood. Therefore, another important aspect of this thesis is conducting several subjective tests and reporting the subjective ratings to evaluate the perceived quality of the proposed coded content against the references. Statistical analysis is carried out in the thesis to assess the validity of the subjective ratings and determine the best performing test cases

    From nanometers to centimeters: Imaging across spatial scales with smart computer-aided microscopy

    Get PDF
    Microscopes have been an invaluable tool throughout the history of the life sciences, as they allow researchers to observe the miniscule details of living systems in space and time. However, modern biology studies complex and non-obvious phenotypes and their distributions in populations and thus requires that microscopes evolve from visual aids for anecdotal observation into instruments for objective and quantitative measurements. To this end, many cutting-edge developments in microscopy are fuelled by innovations in the computational processing of the generated images. Computational tools can be applied in the early stages of an experiment, where they allow for reconstruction of images with higher resolution and contrast or more colors compared to raw data. In the final analysis stage, state-of-the-art image analysis pipelines seek to extract interpretable and humanly tractable information from the high-dimensional space of images. In the work presented in this thesis, I performed super-resolution microscopy and wrote image analysis pipelines to derive quantitative information about multiple biological processes. I contributed to studies on the regulation of DNMT1 by implementing machine learning-based segmentation of replication sites in images and performed quantitative statistical analysis of the recruitment of multiple DNMT1 mutants. To study the spatiotemporal distribution of DNA damage response I performed STED microscopy and could provide a lower bound on the size of the elementary spatial units of DNA repair. In this project, I also wrote image analysis pipelines and performed statistical analysis to show a decoupling of DNA density and heterochromatin marks during repair. More on the experimental side, I helped in the establishment of a protocol for many-fold color multiplexing by iterative labelling of diverse structures via DNA hybridization. Turning from small scale details to the distribution of phenotypes in a population, I wrote a reusable pipeline for fitting models of cell cycle stage distribution and inhibition curves to high-throughput measurements to quickly quantify the effects of innovative antiproliferative antibody-drug-conjugates. The main focus of the thesis is BigStitcher, a tool for the management and alignment of terabyte-sized image datasets. Such enormous datasets are nowadays generated routinely with light-sheet microscopy and sample preparation techniques such as clearing or expansion. Their sheer size, high dimensionality and unique optical properties poses a serious bottleneck for researchers and requires specialized processing tools, as the images often do not fit into the main memory of most computers. BigStitcher primarily allows for fast registration of such many-dimensional datasets on conventional hardware using optimized multi-resolution alignment algorithms. The software can also correct a variety of aberrations such as fixed-pattern noise, chromatic shifts and even complex sample-induced distortions. A defining feature of BigStitcher, as well as the various image analysis scripts developed in this work is their interactivity. A central goal was to leverage the user's expertise at key moments and bring innovations from the big data world to the lab with its smaller and much more diverse datasets without replacing scientists with automated black-box pipelines. To this end, BigStitcher was implemented as a user-friendly plug-in for the open source image processing platform Fiji and provides the users with a nearly instantaneous preview of the aligned images and opportunities for manual control of all processing steps. With its powerful features and ease-of-use, BigStitcher paves the way to the routine application of light-sheet microscopy and other methods producing equally large datasets

    From nanometers to centimeters: Imaging across spatial scales with smart computer-aided microscopy

    Get PDF
    Microscopes have been an invaluable tool throughout the history of the life sciences, as they allow researchers to observe the miniscule details of living systems in space and time. However, modern biology studies complex and non-obvious phenotypes and their distributions in populations and thus requires that microscopes evolve from visual aids for anecdotal observation into instruments for objective and quantitative measurements. To this end, many cutting-edge developments in microscopy are fuelled by innovations in the computational processing of the generated images. Computational tools can be applied in the early stages of an experiment, where they allow for reconstruction of images with higher resolution and contrast or more colors compared to raw data. In the final analysis stage, state-of-the-art image analysis pipelines seek to extract interpretable and humanly tractable information from the high-dimensional space of images. In the work presented in this thesis, I performed super-resolution microscopy and wrote image analysis pipelines to derive quantitative information about multiple biological processes. I contributed to studies on the regulation of DNMT1 by implementing machine learning-based segmentation of replication sites in images and performed quantitative statistical analysis of the recruitment of multiple DNMT1 mutants. To study the spatiotemporal distribution of DNA damage response I performed STED microscopy and could provide a lower bound on the size of the elementary spatial units of DNA repair. In this project, I also wrote image analysis pipelines and performed statistical analysis to show a decoupling of DNA density and heterochromatin marks during repair. More on the experimental side, I helped in the establishment of a protocol for many-fold color multiplexing by iterative labelling of diverse structures via DNA hybridization. Turning from small scale details to the distribution of phenotypes in a population, I wrote a reusable pipeline for fitting models of cell cycle stage distribution and inhibition curves to high-throughput measurements to quickly quantify the effects of innovative antiproliferative antibody-drug-conjugates. The main focus of the thesis is BigStitcher, a tool for the management and alignment of terabyte-sized image datasets. Such enormous datasets are nowadays generated routinely with light-sheet microscopy and sample preparation techniques such as clearing or expansion. Their sheer size, high dimensionality and unique optical properties poses a serious bottleneck for researchers and requires specialized processing tools, as the images often do not fit into the main memory of most computers. BigStitcher primarily allows for fast registration of such many-dimensional datasets on conventional hardware using optimized multi-resolution alignment algorithms. The software can also correct a variety of aberrations such as fixed-pattern noise, chromatic shifts and even complex sample-induced distortions. A defining feature of BigStitcher, as well as the various image analysis scripts developed in this work is their interactivity. A central goal was to leverage the user's expertise at key moments and bring innovations from the big data world to the lab with its smaller and much more diverse datasets without replacing scientists with automated black-box pipelines. To this end, BigStitcher was implemented as a user-friendly plug-in for the open source image processing platform Fiji and provides the users with a nearly instantaneous preview of the aligned images and opportunities for manual control of all processing steps. With its powerful features and ease-of-use, BigStitcher paves the way to the routine application of light-sheet microscopy and other methods producing equally large datasets

    Fehlerkaschierte Bildbasierte Darstellungsverfahren

    Get PDF
    Creating photo-realistic images has been one of the major goals in computer graphics since its early days. Instead of modeling the complexity of nature with standard modeling tools, image-based approaches aim at exploiting real-world footage directly,as they are photo-realistic by definition. A drawback of these approaches has always been that the composition or combination of different sources is a non-trivial task, often resulting in annoying visible artifacts. In this thesis we focus on different techniques to diminish visible artifacts when combining multiple images in a common image domain. The results are either novel images, when dealing with the composition task of multiple images, or novel video sequences rendered in real-time, when dealing with video footage from multiple cameras.Fotorealismus ist seit jeher eines der großen Ziele in der Computergrafik. Anstatt die Komplexität der Natur mit standardisierten Modellierungswerkzeugen nachzubauen, gehen bildbasierte Ansätze den umgekehrten Weg und verwenden reale Bildaufnahmen zur Modellierung, da diese bereits per Definition fotorealistisch sind. Ein Nachteil dieser Variante ist jedoch, dass die Komposition oder Kombination mehrerer Quellbilder eine nichttriviale Aufgabe darstellt und häufig unangenehm auffallende Artefakte im erzeugten Bild nach sich zieht. In dieser Dissertation werden verschiedene Ansätze verfolgt, um Artefakte zu verhindern oder abzuschwächen, welche durch die Komposition oder Kombination mehrerer Bilder in einer gemeinsamen Bilddomäne entstehen. Im Ergebnis liefern die vorgestellten Verfahren neue Bilder oder neue Ansichten einer Bildsammlung oder Videosequenz, je nachdem, ob die jeweilige Aufgabe die Komposition mehrerer Bilder ist oder die Kombination mehrerer Videos verschiedener Kameras darstellt
    • …
    corecore