    Efficient Encoding of Wireless Capsule Endoscopy Images Using Direct Compression of Colour Filter Array Images

    Since its invention in 2001, wireless capsule endoscopy (WCE) has played an important role in the endoscopic examination of the gastrointestinal tract. During this period, WCE has undergone tremendous technological advances, making it the first-line modality for small-bowel diseases ranging from bleeding to cancer. Current research efforts focus on extending WCE with functionality such as drug delivery, biopsy, and active locomotion. Integrating these functions into WCE has two critical prerequisites: better image quality and lower power consumption. An efficient image compression solution is therefore required to retain the highest image quality while reducing transmission power. The problem is made harder by the fact that image sensors in WCE capture images in Bayer colour filter array (CFA) format, on which standard compression engines perform poorly. The focus of this thesis is to design an optimized image compression pipeline that encodes capsule endoscopic (CE) images efficiently in CFA format. To this end, the thesis proposes two image compression schemes. The first is a lossless image compression algorithm consisting of an optimum reversible colour transformation, a low-complexity prediction model, a corner clipping mechanism, and a single-context adaptive Golomb-Rice entropy encoder. Deriving the colour transformation that gives the best performance for a given prediction model is treated as an optimization problem. The low-complexity prediction model works in raster order and requires no buffer memory. Applying the colour transformation lowers the inter-colour correlation and allows the colour components to be encoded independently and efficiently. The second scheme is a lossy compression algorithm with an integer discrete cosine transformation at its core. Using statistics obtained from a large dataset of CE images, an optimum colour transformation is derived by principal component analysis (PCA). The transformed coefficients are quantized with an optimized quantization table designed to discard medically irrelevant information. A fast demosaicking algorithm is developed to reconstruct the colour image from the lossy CFA image in the decoder. Extensive experiments and comparisons with state-of-the-art lossless image compression methods establish the superiority of the proposed methods as simple and efficient image compression algorithms. The lossless algorithm can transmit the image losslessly within the available bandwidth, while the performance evaluation of the lossy algorithm indicates that it can deliver high-quality images at low transmission power and low computational cost.
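
    As an illustration of the entropy-coding stage named in the abstract above, the following is a minimal sketch of a single-context adaptive Golomb-Rice coder for prediction residuals. The abstract does not give the thesis's actual parameter-update rule, so the running-sum estimator (loosely modelled on JPEG-LS), its initial values, and the names `AdaptiveRice`, `zigzag`, `encode`, and `decode` are all assumptions made for illustration.

```python
def zigzag(e):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return 2 * e if e >= 0 else -2 * e - 1


def unzigzag(u):
    """Inverse of zigzag()."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


class AdaptiveRice:
    """One shared context that tracks the mean residual magnitude (assumed rule)."""

    def __init__(self):
        self.total = 4  # running sum of mapped residuals (assumed initial value)
        self.count = 1  # number of residuals seen so far

    def k(self):
        """Smallest k such that count * 2**k >= total."""
        k = 0
        while (self.count << k) < self.total:
            k += 1
        return k

    def update(self, u):
        self.total += u
        self.count += 1
        if self.count == 64:  # halve periodically so the estimate stays local
            self.total >>= 1
            self.count >>= 1


def encode(residuals):
    """Return the code as a string of '0'/'1' characters (for readability only)."""
    ctx, bits = AdaptiveRice(), []
    for e in residuals:
        u, k = zigzag(e), ctx.k()
        bits.append("1" * (u >> k) + "0")  # quotient in unary
        if k:
            bits.append(format(u & ((1 << k) - 1), f"0{k}b"))  # k-bit remainder
        ctx.update(u)
    return "".join(bits)


def decode(bitstring, n):
    """Decode n residuals; the decoder mirrors the encoder's context updates."""
    ctx, pos, out = AdaptiveRice(), 0, []
    for _ in range(n):
        k, q = ctx.k(), 0
        while bitstring[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating '0' of the unary part
        r = int(bitstring[pos:pos + k], 2) if k else 0
        pos += k
        u = (q << k) | r
        out.append(unzigzag(u))
        ctx.update(u)
    return out


residuals = [0, -1, 3, -7, 2, 0, 0, 12, -3]
code = encode(residuals)
restored = decode(code, len(residuals))
```

    Because the decoder repeats the same context updates as the encoder, the parameter k never has to be transmitted; this is what makes a single adaptive context cheap enough for a low-power device.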

    A graph learning approach for light field image compression

    In recent years, light field imaging has attracted the attention of the academic and industrial communities thanks to its enhanced rendering capabilities, which allow content to be visualised in a more immersive and interactive way. However, those capabilities come at the cost of a considerable increase in content size compared with traditional image and video applications. Advanced compression schemes are therefore needed to reduce the volume of data efficiently for the storage and delivery of light field content. In this paper, we introduce a novel method for the compression of light field images. The proposed solution is based on a graph learning approach that estimates the disparity among the views composing the light field. The graph is then used to reconstruct the entire light field from an arbitrary subset of encoded views. Experimental results show that our method is a promising alternative to current compression algorithms for light field images, with notable gains across all bitrates with respect to the state of the art.

    Dense light field coding: a survey

    Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.

    Near-Lossless Coding of Plenoptic Camera Sensor Images for Archiving Light Field Array of Views

    In this paper we propose a near-lossless encoder for sensor images acquired by plenoptic cameras, and we investigate its use for archiving all the information needed to reconstruct high-quality versions of the light field (LF) array of views (AoV). The near-lossless encoding of the plenoptic camera sensor image is realized by a modified version of the recently published sparse relevant regressors and contexts (SRRC) encoder. The lossy reconstruction is obtained in two nested loops: the outer loop operates over the sensor image patches (each patch corresponding to a microlens image), and the inner loop operates over the pixels in the patch. In the inner loop, we force the SRRC predictors to use the already reconstructed lossy version of the sensor image. Then, we examine the use of the near-lossless SRRC (NL-SRRC) codec as a building block for an archiving scheme that includes all the information needed to run the plenoptic processing pipeline and obtain the LF-AoV. Finally, we replace the NL-SRRC codec in the archiving scheme with other state-of-the-art lossy codecs and compare the results, which show that the NL-SRRC-based archiving scheme achieves better performance in the high-bitrate range.
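
    The point that the predictors must work from the already reconstructed lossy samples can be sketched in one dimension as follows. This is not the SRRC predictor: a plain previous-sample predictor is swapped in, and the residual quantiser follows the well-known near-lossless rule with step 2*delta + 1. The function names and the error bound `delta` are assumptions made for illustration, and reconstructed values are not clamped to the sample range.

```python
def nl_encode(samples, delta):
    """Near-lossless DPCM: every reconstructed sample is within +/- delta of the input."""
    step = 2 * delta + 1
    symbols, recon, prev = [], [], 0
    for x in samples:
        err = x - prev                   # predict from the RECONSTRUCTED neighbour
        q = (abs(err) + delta) // step   # uniform quantiser with a dead zone of +/- delta
        if err < 0:
            q = -q
        symbols.append(q)                # q is what an entropy coder would transmit
        prev += q * step                 # the encoder tracks the decoder's state
        recon.append(prev)
    return symbols, recon


def nl_decode(symbols, delta):
    """Rebuild the samples from the quantised residuals alone."""
    step = 2 * delta + 1
    out, prev = [], 0
    for q in symbols:
        prev += q * step
        out.append(prev)
    return out


samples = [100, 102, 101, 99, 140, 141, 90, 90]
symbols, recon = nl_encode(samples, delta=2)
decoded = nl_decode(symbols, delta=2)
max_error = max(abs(a - b) for a, b in zip(samples, decoded))
```

    If the encoder predicted from the original samples instead, the decoder's predictions would drift and the per-sample error bound would no longer hold. With `delta=0` the scheme degenerates to lossless coding.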

    Transformées basées graphes pour la compression de nouvelles modalités d’image (Graph-based transforms for the compression of new image modalities)

    Due to the wide availability of new camera types that capture extra geometrical information, as well as the emergence of new image modalities such as light fields and omni-directional images, a huge amount of high-dimensional data has to be stored and delivered. The ever-growing streaming and storage requirements of these new image modalities call for novel image coding tools that exploit the complex structure of the data. This thesis explores novel graph-based approaches for adapting traditional image transform coding techniques to emerging data types in which the sampled information lies on irregular structures. In a first contribution, novel local graph-based transforms are designed for compact light field representations. By combining a careful design of local transform supports with a local optimization procedure for the basis functions, significant improvements in energy compaction can be obtained. Nevertheless, the locality of the supports did not allow long-term dependencies of the signal to be exploited. This led to a second contribution, in which different sampling strategies are investigated. Coupled with novel prediction methods, they led to very strong results for quasi-lossless compression of light fields. The third part of the thesis focuses on the definition of rate-distortion optimized sub-graphs for the coding of omni-directional content. If we go further and give more degrees of freedom to the graphs we wish to use, we can learn or define a model (a set of weights on the edges) that might not be entirely reliable for transform design. The last part of the thesis is dedicated to a theoretical analysis of the effect of this uncertainty on the efficiency of graph transforms.
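
    To make the link between traditional transform coding and graph transforms concrete, the sketch below checks a standard fact: the eigenvectors of the Laplacian of an unweighted path graph are the DCT-II basis vectors, so the graph Fourier transform on that graph coincides with the DCT. This is a textbook special case, not one of the transforms designed in the thesis; the graph, the test signal, and all function names are assumptions made for illustration.

```python
import math


def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph with n nodes."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def dct_basis(n):
    """Unit-norm DCT-II vectors; row k has graph frequency 2 - 2*cos(pi*k/n)."""
    basis = []
    for k in range(n):
        v = [math.cos(math.pi * k * (i + 0.5) / n) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis


def eigen_residual(lap, v, lam):
    """Largest entry of |L v - lam v|; close to zero when (lam, v) is an eigenpair."""
    n = len(v)
    return max(
        abs(sum(lap[i][j] * v[j] for j in range(n)) - lam * v[i]) for i in range(n)
    )


def graph_fourier_transform(signal, basis):
    """Project the signal onto each basis vector, one coefficient per graph frequency."""
    return [sum(s * x for s, x in zip(signal, v)) for v in basis]


n = 8
lap, basis = path_laplacian(n), dct_basis(n)
worst = max(
    eigen_residual(lap, basis[k], 2 - 2 * math.cos(math.pi * k / n)) for k in range(n)
)

signal = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5]  # smooth ramp
coeffs = graph_fourier_transform(signal, basis)
total_energy = sum(c * c for c in coeffs)
low_fraction = sum(c * c for c in coeffs[:2]) / total_energy
```

    For a smooth signal almost all of the energy falls into the lowest graph frequencies, which is the energy compaction property the abstract refers to. On irregular supports the path graph is replaced by a graph that matches the data, and the transform basis changes with it.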

    Digital Image Processing

    This book presents several recent advances that relate to or fall under the umbrella of 'digital image processing', with the purpose of providing insight into the possibilities offered by digital image processing algorithms in various fields. The mathematical algorithms presented are accompanied by graphical representations and illustrative examples for enhanced readability. The chapters are written so that even a reader with basic experience and knowledge of digital image processing can properly understand the algorithms. At the same time, the information is structured so that fellow scientists will be able to use it to push the development of the presented subjects even further.

    Transmissão progressiva de imagens sintetizadas de light field (Progressive transmission of synthesized light field images)

    Dissertation (Master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2018. This work proposes a rate-distortion optimized method for transmitting synthesized light field images. Briefly, a light field image can be understood as four-dimensional (4D) data with both spatial and angular resolution, in which each two-dimensional sub-image is a particular perspective, that is, a sub-aperture image (SAI). This work aims to modify and improve a previous proposal named Progressive Light Field Communication (PLFC), which addresses the synthesis of images at different focal points requested by a user. Like PLFC, this work tries to provide the user with enough information that, as the transmission progresses, they can synthesize their own focal-point images without new images having to be sent. The first proposed modification concerns how the user's initial cache should be chosen, defining an ideal number of SAIs to send at the beginning of the transmission. An improvement to the selection of additional images is also proposed by means of a refinement algorithm, which is applied even during cache initialization. This new selection process works with dynamic quantization parameters (QPs) during encoding and takes into account not only the immediate gains for the synthesized image but also the subsequent syntheses. This idea was already presented in PLFC but had not been implemented satisfactorily. Moreover, this work proposes an automatic way to calculate the Lagrange multiplier that controls the influence of the future benefit associated with transmitting an SAI. Finally, a simplified way of obtaining this future benefit is described, which reduces the computational complexity involved. Such a system has many uses; for example, it can be used to identify an element in a light field image by adjusting the focus accordingly. The results obtained are presented and discussed, showing significant gains of up to 32.8% over the previous PLFC in terms of BD-rate. The gain reaches 85.8% in comparison with trivial light field data transmission.

    Compression and visual quality assessment for light field contents

    Since its invention in the 19th century, photography has made it possible to create durable images of the world around us by capturing the intensity of light that flows through a scene, first by analog means using light-sensitive material, and then, with the advent of electronic image sensors, digitally. However, one main limitation of both analog and digital photography lies in its inability to capture any information about the direction of light rays. In traditional photography, each three-dimensional scene is projected onto a 2D plane; consequently, no information about the position of the 3D objects in space is retained. Light field photography aims to overcome these limitations by recording the direction of light along with its intensity. Several acquisition technologies have been presented to capture light field information properly, and portable devices have been commercialized for the general public. However, a considerably larger volume of data is generated than in traditional photography. New solutions must therefore be designed to face the challenges light field photography poses in terms of storage, representation, and visualization of the acquired data. In particular, new and efficient compression algorithms are needed to substantially reduce the amount of data that has to be stored and transmitted while maintaining an adequate level of perceptual quality. In designing new solutions to address the unique challenges posed by light field photography, one cannot neglect the importance of reliable, reproducible means of evaluating their performance, especially in relation to the scenario in which the content will be consumed. To that end, subjective assessment of visual quality is of paramount importance for evaluating the impact of compression, representation, and rendering models on user experience. Yet the standardized methodologies commonly used to evaluate the visual quality of traditional media content, such as images and videos, are not equipped to tackle the challenges posed by light field photography. New subjective methodologies must be tailored to the possibilities this type of imaging offers in terms of rendering and visual experience. In this work, we address these problems both by designing new methodologies for the visual quality evaluation of light field content and by outlining a new compression solution that efficiently reduces the amount of data to be transmitted and stored. We first analyse how traditional methodologies for the subjective evaluation of multimedia content can be adapted to suit light field data, and we propose new methodologies to assess visual quality reliably while maintaining user engagement. Furthermore, we study how user behavior is affected by the visual quality of the data. We employ subjective quality assessment to compare several state-of-the-art solutions in light field coding, in order to find the most promising approaches for minimizing the volume of data without compromising perceptual quality. To that end, we define and inspect several coding approaches for light field compression, and we investigate the impact of color subsampling on the final rendered content. Lastly, we propose a new coding approach for light field compression that shows significant improvement with respect to the state of the art.

    Multiresolution image models and estimation techniques

