26 research outputs found

    Efficient Encoding of Wireless Capsule Endoscopy Images Using Direct Compression of Colour Filter Array Images

    Since its invention in 2001, wireless capsule endoscopy (WCE) has played an important role in the endoscopic examination of the gastrointestinal tract. During this period, WCE has undergone tremendous advances in technology, making it the first-line modality for small-bowel diseases ranging from bleeding to cancer. Current research efforts are focused on evolving WCE to include functionality such as drug delivery, biopsy, and active locomotion. For the integration of these functionalities into WCE, two critical prerequisites are image quality enhancement and power consumption reduction. An efficient image compression solution is required to retain the highest image quality while reducing the transmission power. The issue is more challenging because image sensors in WCE capture images in Bayer colour filter array (CFA) format, for which standard compression engines provide inferior compression performance. The focus of this thesis is to design an optimized image compression pipeline that encodes capsule endoscopic (CE) images efficiently in CFA format. To this end, this thesis proposes two image compression schemes. First, a lossless image compression algorithm is proposed, consisting of an optimum reversible colour transformation, a low-complexity prediction model, a corner clipping mechanism and a single-context adaptive Golomb-Rice entropy encoder. The derivation of the colour transformation that provides the best performance for a given prediction model is treated as an optimization problem. The low-complexity prediction model works in raster order and requires no buffer memory. The application of the colour transformation yields lower inter-colour correlation and allows efficient independent encoding of the colour components. The second compression scheme in this thesis is a lossy compression algorithm with an integer discrete cosine transformation at its core. Using the statistics obtained from a large dataset of CE images, an optimum colour transformation is derived using principal component analysis (PCA). The transformed coefficients are quantized using an optimized quantization table, which was designed to discard medically irrelevant information. A fast demosaicking algorithm is developed to reconstruct the colour image from the lossy CFA image in the decoder. Extensive experiments and comparisons with state-of-the-art lossless image compression methods establish the superiority of the proposed methods as simple and efficient image compression algorithms. The lossless algorithm can transmit the image in a lossless manner within the available bandwidth. The performance evaluation of the lossy compression algorithm, on the other hand, indicates that it can deliver high-quality images at low transmission power and low computation cost.
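
    The Golomb-Rice entropy coding step named in the abstract above can be illustrated with a minimal sketch. This is not the thesis's encoder: the parameter k is fixed here rather than adapted from a context, and the zigzag mapping of signed prediction residuals and the function names are assumptions made for illustration.

```python
def zigzag(e):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,... -> 0,1,2,3,..."""
    return 2 * e if e >= 0 else -2 * e - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(residuals, k):
    """Encode signed residuals as a bit string with Rice parameter k."""
    parts = []
    for e in residuals:
        u = zigzag(e)
        q, r = u >> k, u & ((1 << k) - 1)
        parts.append("1" * q + "0")          # quotient in unary, terminated by 0
        if k:
            parts.append(format(r, "0{}b".format(k)))  # remainder in k bits
    return "".join(parts)


def rice_decode(bits, k, count):
    """Decode `count` residuals from a bit string produced by rice_encode."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                              # skip the terminating 0
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | r))
    return out
```

    Small residuals, which a good predictor produces most of the time, cost only a few bits; an adaptive encoder would additionally update k from the running statistics of recent residuals.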

    Spectral Characterization of a Prototype SFA Camera for Joint Visible and NIR Acquisition

    Multispectral acquisition improves machine vision since it captures more information on object surface properties than color imaging. The concept of spectral filter arrays has been developed recently and allows multispectral single-shot acquisition with a compact camera design. Due to filter manufacturing difficulties, there was, until recently, no system available for a large span of the spectrum, i.e., joint visible and near-infrared acquisition. This article presents a prototype camera that captures seven visible bands and one near-infrared band on the same sensor chip. A calibration is proposed to characterize the sensor, and images are captured. Data are provided as supplementary material for further analysis and simulations. This opens a new range of applications in the security, robotics, automotive and medical fields.

    Hyperspectral Demosaicing of Snapshot Camera Images Using Deep Learning

    Spectral imaging technologies have rapidly evolved during the past decades. The recent development of single-camera-one-shot techniques for hyperspectral imaging allows multiple spectral bands to be captured simultaneously (3x3, 4x4 or 5x5 mosaic), opening up a wide range of applications. Examples include intraoperative imaging, agricultural field inspection and food quality assessment. To capture images across a wide spectrum range, i.e. to achieve high spectral resolution, the sensor design sacrifices spatial resolution. With increasing mosaic size, this effect becomes increasingly detrimental. Furthermore, demosaicing is challenging. Without incorporating edge, shape, and object information during interpolation, chromatic artifacts are likely to appear in the obtained images. Recent approaches use neural networks for demosaicing, enabling direct information extraction from image data. However, obtaining training data for these approaches poses a challenge as well. This work proposes a parallel neural network based demosaicing procedure trained on a new ground truth dataset captured in a controlled environment by a hyperspectral snapshot camera with a 4x4 mosaic pattern. The dataset is a combination of real captured scenes with images from publicly available data adapted to the 4x4 mosaic pattern. To obtain real-world ground-truth data, we performed multiple camera captures with 1-pixel shifts in order to compose the entire data cube. Experiments show that the proposed network outperforms state-of-the-art networks. Comment: German Conference on Pattern Recognition (GCPR) 202
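
    The idea of composing a full data cube from several captures with 1-pixel shifts can be sketched as follows. This is a simulation under idealized assumptions, not the authors' acquisition code: the band layout inside the 4x4 mosaic (`band_at`), the perfectly registered shifts, and all function names are hypothetical.

```python
P = 4  # mosaic period: a 4x4 pattern gives P * P = 16 bands


def band_at(r, c):
    """Assumed layout: band index of the filter above sensor pixel (r, c)."""
    return (r % P) * P + (c % P)


def capture(cube, dy, dx):
    """Simulate one snapshot with the sensor shifted by (dy, dx) pixels.

    cube[b][y][x] is the true scene. Each scene pixel is seen through
    exactly one filter, so a single shot holds one band per pixel.
    """
    h, w = len(cube[0]), len(cube[0][0])
    shot = {}
    for y in range(h):
        for x in range(w):
            b = band_at(y + dy, x + dx)
            shot[(y, x)] = (b, cube[b][y][x])
    return shot


def compose(shots, h, w):
    """Merge shifted snapshots into a full cube (None where never observed)."""
    full = [[[None] * w for _ in range(h)] for _ in range(P * P)]
    for shot in shots:
        for (y, x), (b, value) in shot.items():
            full[b][y][x] = value
    return full
```

    With all 16 shift combinations, every scene pixel is observed once through every filter, so the composed cube has no gaps and can serve as ground truth for a demosaicing network.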

    Lossless compression of color filter array mosaic images with visualization via JPEG 2000

    Digital cameras have become ubiquitous for amateur and professional applications. The raw images captured by digital sensors typically take the form of color filter array (CFA) mosaic images, which must be "developed" (via digital signal processing) before they can be viewed. Photographers and scientists often repeat the "development process" using different parameters to obtain images suitable for different purposes. Since the development process is generally not invertible, it is commonly desirable to store the raw (or undeveloped) mosaic images indefinitely. Uncompressed mosaic image file sizes can be more than 30 times larger than those of developed images stored in JPEG format. Thus, data compression is of interest. Several compression methods for mosaic images have been proposed in the literature. However, they all require a custom decompressor followed by development-specific software to generate a displayable image. In this paper, a novel compression pipeline that removes these requirements is proposed. Specifically, mosaic images can be losslessly recovered from the resulting compressed files, and, more significantly, images can be directly viewed (decompressed and developed) using only a JPEG 2000 compliant image viewer. Experiments reveal that the proposed pipeline attains excellent visual quality, while providing compression performance competitive with that of state-of-the-art compression algorithms for mosaic images.

    Joint Demosaicking / Rectification of Fisheye Camera Images using Multi-color Graph Laplacian Regulation

    To compose one 360-degree image from multiple viewpoint images taken by different fisheye cameras on a rig for viewing on a head-mounted display (HMD), a conventional processing pipeline first performs demosaicking on each fisheye camera's Bayer-patterned grid, then translates demosaicked pixels from the camera grid to a rectified image grid. When two image interpolation steps are performed in sequence, interpolation errors can accumulate, and acquisition noise in each captured pixel can pollute its neighbors, resulting in correlated noise. In this paper, a joint processing framework is proposed that performs demosaicking and grid-to-grid mapping simultaneously, thus limiting noise pollution to one interpolation. Specifically, a reverse mapping function is first obtained from a regular on-grid location in the rectified image to an irregular off-grid location in the camera's Bayer-patterned image. For each pair of adjacent pixels in the rectified grid, its gradient is estimated using the pair's neighboring pixel gradients in three colors in the Bayer-patterned grid. A similarity graph is constructed based on the estimated gradients, and pixels are interpolated in the rectified grid directly via graph Laplacian regularization (GLR). To establish ground truth for objective testing, a large dataset containing pairs of simulated images both in the fisheye camera grid and the rectified image grid is built. Experiments show that the proposed joint demosaicking / rectification method outperforms competing schemes that execute demosaicking and rectification in sequence in both objective and subjective measures.
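
    A toy version of interpolation on a gradient-weighted similarity graph may help make the GLR step concrete. This is a simplified sketch, not the paper's formulation: it holds the observed samples fixed instead of using a fidelity term, the exponential kernel in `edge_weight` is an assumed choice, and the Gauss-Seidel solver and all names are illustrative.

```python
import math


def edge_weight(gradient, sigma=1.0):
    """Assumed kernel: a large estimated gradient gives a weak edge."""
    return math.exp(-(gradient / sigma) ** 2)


def glr_interpolate(n, edges, known, sweeps=5000, tol=1e-12):
    """Fill unknown node values by minimising sum of w_ij * (x_i - x_j)^2.

    edges: iterable of (i, j, w); known: {node: value}, kept fixed.
    Setting the derivative to zero gives x_i = sum_j w_ij x_j / sum_j w_ij,
    which is iterated in place (Gauss-Seidel) until the values settle.
    """
    nbrs = [[] for _ in range(n)]
    for i, j, w in edges:
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    mean = sum(known.values()) / len(known)
    x = [known.get(i, mean) for i in range(n)]
    for _ in range(sweeps):
        change = 0.0
        for i in range(n):
            if i in known or not nbrs[i]:
                continue
            new = sum(w * x[j] for j, w in nbrs[i]) / sum(w for _, w in nbrs[i])
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            break
    return x
```

    With equal weights the result is plain linear interpolation; when one edge carries a small weight because a large gradient was estimated there, the interpolated values stay flat on either side of it, which is how the graph preserves image edges.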

    Side-Information For Steganography Design And Detection

    Today, the most secure steganographic schemes for digital images embed secret messages while minimizing a distortion function that describes the local complexity of the content. Distortion functions are heuristically designed to predict the modeling error, or in other words, how difficult it would be to detect a single change to the original image in any given area. This dissertation investigates how both the design and detection of such content-adaptive schemes can be improved with the use of side-information. We distinguish two types of side-information, public and private. Public side-information is available to the sender and, at least in part, also to anybody else who can observe the communication. Content complexity is a typical example of public side-information. While it is commonly used for steganography, it can also be used for detection. In this work, we propose a modification to the rich-model style feature sets in both the spatial and JPEG domains to inform such feature sets of the content complexity. Private side-information is available only to the sender. The previous use of private side-information in steganography was very successful but limited to steganography in JPEG images. Also, the constructions were based on heuristics with little theoretical foundation. This work tries to remedy this deficiency by introducing a scheme that generalizes the previous approach to an arbitrary domain. We also put forward a theoretical investigation of how to incorporate side-information based on a model of images. Third, we propose to use a novel type of side-information in the form of multiple exposures for JPEG steganography.

    Scalable light field representation and coding

    This Thesis aims to advance the state of the art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques are proposed and studied that exploit the inherent characteristics of the most popular types of light field representations, which are normally based on micro-images or sub-aperture-images. To encode micro-images, two solutions are proposed, aiming to exploit the redundancy between neighboring micro-images using a high order prediction model, where the model parameters are either explicitly transmitted or inferred at the decoder, respectively. In both cases, the proposed solutions are able to outperform low order prediction solutions. To encode sub-aperture-images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, which exploits the combined use of both micro-image and sub-aperture-image representation types instead of using each representation individually. In order to aid the fast deployment of the light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. Additionally, viewpoint random access, which improves light field navigation and reduces the decoding delay, is also enabled with a flexible trade-off between coding efficiency and viewpoint random access.
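
    To illustrate what a scanning order for a pseudo video sequence looks like, the sketch below arranges a grid of sub-aperture images in serpentine order. The abstract only states that the scanning order is signaled; the serpentine choice and the function name are assumptions for illustration, not the thesis's scheme.

```python
def serpentine_order(rows, cols):
    """Return view coordinates (row, col) in left-right, then right-left order.

    Consecutive frames are always adjacent views, which keeps the
    disparity between a frame and its predecessor small.
    """
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order
```

    Each view in the returned list would become one frame of the sequence handed to the video encoder, and the list itself is what would need to be signaled so that the decoder can map frames back to viewpoints.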