Perceptually lossless coding of medical images - from abstraction to reality
This work explores a novel vision-model-based coding approach that encodes medical images at perceptually lossless quality within the framework of the JPEG 2000 coding engine. Perceptually lossless encoding offers the best of both worlds: it delivers images free of visual distortions while providing significantly greater compression ratios than its information-lossless counterparts. This is achieved through a visual pruning function, embedded with an advanced model of the human visual system, that accurately identifies and efficiently removes visually irrelevant or insignificant information. In addition, it maintains bit-stream compliance with the JPEG 2000 coding framework and is consequently compliant with the Digital Imaging and Communications in Medicine (DICOM) standard. Equally, the pruning function is applicable to other Discrete Wavelet Transform based image coders, e.g., Set Partitioning in Hierarchical Trees (SPIHT). Further significant coding gains are exploited through an artificial edge segmentation algorithm and a novel arithmetic pruning algorithm. The coding effectiveness and qualitative consistency of the algorithm are evaluated through a double-blind subjective assessment with 31 medical experts, performed using a novel two-stage forced-choice assessment devised for medical experts, which offers greater robustness and accuracy in measuring subjective responses. The assessment showed that no statistically significant differences were perceivable between the original images and the images encoded by the proposed coder.
3D Wavelet Transformation for Visual Data Coding With Spatio and Temporal Scalability as Quality Artifacts: Current State Of The Art
Several techniques based on the three-dimensional (3-D) discrete cosine transform (DCT) have been proposed for visual data coding. These techniques fail to provide coding coupled with quality and resolution scalability, which is a significant drawback for contextual domains such as disease diagnosis and satellite image analysis. This paper gives an overview of several state-of-the-art 3-D wavelet coders that do meet these requirements, surveys the various types of compression techniques that exist, and draws these together into a conclusion on the scope for further research.
3D Medical Image Lossless Compressor Using Deep Learning Approaches
Accelerated information processing, communication, and storage are major requirements of the big-data era. With the extensive rise in data availability, easy information acquisition, and growing data rates, efficient data handling has become a critical challenge. Even with advances in hardware and the availability of multiple Graphics Processing Units (GPUs), there is still strong demand to utilise these technologies effectively. Health-care systems are one of the domains yielding explosive data growth, especially given the capabilities of modern scanners, which each year produce higher-resolution and more densely sampled medical images with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage can essentially be handled with an effective compression method. Since medical information is critical and plays an influential role in diagnostic accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. Given the revolutionary impact of Deep Learning (DL) methods in solving many tasks while achieving state-of-the-art results, including data compression, this opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy compression using learning-based approaches, less attention has been paid to lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly. Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given a sequence of samples from its spatially surrounding voxels.
Such 3D local sampling efficiently exploits spatial similarities and redundancies in a volumetric medical context. The proposed NN-based data predictor is trained to minimise the differences from the original data values, while the residual errors are encoded using arithmetic coding to allow lossless reconstruction. Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16-bit depth). We analyse the generalisability and robustness of Long Short-Term Memory (LSTM) models in capturing the 3D spatial dependencies of a voxel’s neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in losslessly compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities, compared to other state-of-the-art lossless compression standards. This work also investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16-bit depth) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution to the problem of non-deterministic environments is also proposed, allowing models to run in parallel without much drop in compression performance. Experimental evaluations against well-known lossless codecs were carried out on datasets acquired by different hospitals, representing different body segments and distinct scanning modalities (i.e. CT and MRI). To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models.
The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained with the presented importance sampling scheme was evaluated against alternative strategies such as uniform, Gaussian, and slice-based sampling.
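The prediction-plus-residual idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's MedZip model: a fixed average of three causal neighbours stands in for the trained LSTM predictor, the arithmetic coding stage is omitted, and the names `predict`, `encode`, and `decode` are hypothetical. What it shows is why the scheme is lossless: the decoder repeats the same predictions from already-decoded voxels and adds back the stored residuals.

```python
from typing import List

Volume = List[List[List[int]]]  # indexed as volume[z][y][x]


def predict(vol: Volume, z: int, y: int, x: int) -> int:
    """Stand-in predictor: average of the causal neighbours (left, up, previous slice)."""
    neighbours = []
    if x > 0:
        neighbours.append(vol[z][y][x - 1])
    if y > 0:
        neighbours.append(vol[z][y - 1][x])
    if z > 0:
        neighbours.append(vol[z - 1][y][x])
    return sum(neighbours) // len(neighbours) if neighbours else 0


def encode(vol: Volume) -> List[int]:
    """Return prediction residuals in raster order (these would be entropy coded)."""
    residuals = []
    for z in range(len(vol)):
        for y in range(len(vol[0])):
            for x in range(len(vol[0][0])):
                residuals.append(vol[z][y][x] - predict(vol, z, y, x))
    return residuals


def decode(residuals: List[int], depth: int, height: int, width: int) -> Volume:
    """Rebuild the volume by repeating the encoder's predictions voxel by voxel."""
    vol = [[[0] * width for _ in range(height)] for _ in range(depth)]
    stream = iter(residuals)
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                vol[z][y][x] = predict(vol, z, y, x) + next(stream)
    return vol


volume = [[[1000, 1004], [1010, 1012]], [[1001, 1003], [1011, 1015]]]
assert decode(encode(volume), 2, 2, 2) == volume
```

A better predictor does not change the round trip; it only makes the residuals smaller and therefore cheaper to entropy code.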
An FPGA-based LOCO-ANS implementation for lossless and near-lossless image compression using high-level synthesis
In this work, we present and evaluate a hardware architecture for the LOCO-ANS (Low Complexity Lossless Compression with Asymmetric Numeral Systems) lossless and near-lossless image compressor, which is based on the JPEG-LS standard. The design is implemented in two FPGA generations, evaluating its performance for different codec configurations. The tests show that the design is capable of up to 40.5 MPixels/s and 124 MPixels/s per lane for Zynq 7020 and UltraScale+ FPGAs, respectively. Compared to the single-thread LOCO-ANS software implementation running on a 1.2 GHz Raspberry Pi 3B, each hardware lane achieves 6.5 times higher throughput, even when implemented in an older and cost-optimized chip like the Zynq 7020. Results are also presented for a lossless-only version, which achieves a lower footprint and approximately 50% higher performance than the version that supports both lossless and near-lossless. Interestingly, these results were obtained by applying High-Level Synthesis, describing the coder with C++ code, which tends to establish a trade-off between design time and quality of results. These results show that the algorithm is very suitable for hardware implementation. Moreover, the implemented system is faster and achieves higher compression than the best previously available near-lossless JPEG-LS hardware implementation. This research was funded in part by the Spanish Research Agency under the project AgileMon (AEI PID2019-104451RB-C21
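To make the lossless versus near-lossless distinction concrete, here is a minimal sketch of two building blocks of JPEG-LS, the standard this abstract names as the basis of LOCO-ANS: the median edge detector predictor and the residual quantizer with error bound `near`. It assumes nothing about the paper's hardware architecture. Context modelling, bias correction, and the ANS entropy coder are all omitted, and the function names are hypothetical.

```python
def med_predict(left: int, above: int, above_left: int) -> int:
    """Median edge detector: picks min/max near an edge, a planar guess otherwise."""
    if above_left >= max(left, above):
        return min(left, above)
    if above_left <= min(left, above):
        return max(left, above)
    return left + above - above_left


def quantize_residual(err: int, near: int) -> int:
    """Uniform quantization of the prediction error; near == 0 is lossless."""
    step = 2 * near + 1
    if err > 0:
        return (err + near) // step
    return -((near - err) // step)


def reconstruct(pred: int, q: int, near: int) -> int:
    """Decoder-side sample value from the prediction and the quantized error."""
    return pred + q * (2 * near + 1)


# With near = 2, every reconstructed sample is within 2 of the original.
pred = med_predict(left=100, above=110, above_left=104)
sample = 113
q = quantize_residual(sample - pred, near=2)
assert abs(sample - reconstruct(pred, q, near=2)) <= 2
```

With `near = 0` the quantizer is the identity, so the same data path serves both modes, which is one reason a lossless-only build can be smaller.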
Scalable video compression with optimized visual performance and random accessibility
This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved.
The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling.
The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field.
The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate.
For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video.
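The distortion-scaling idea described in this abstract can be sketched in a few lines. This is a simplified illustration, not the thesis's algorithm: each code-block is assumed to offer coding passes with already decreasing distortion-length slopes, the visual weight is a hypothetical scalar per block, and a single fixed slope threshold stands in for the rate-control search. Scaling the distortion of a perceptually significant block raises its slopes, so more of its passes survive at the same threshold.

```python
from typing import List, Tuple

# One coding pass: (bytes added to the codestream, distortion removed).
CodingPass = Tuple[int, float]


def passes_kept(passes: List[CodingPass], weight: float, threshold: float) -> int:
    """Count the leading passes whose weighted distortion-length slope meets the threshold."""
    kept = 0
    for length, distortion in passes:
        if weight * distortion / length < threshold:
            break
        kept += 1
    return kept


block = [(10, 100.0), (10, 50.0), (10, 20.0)]  # slopes 10, 5, 2

plain = passes_kept(block, weight=1.0, threshold=4.0)
boosted = passes_kept(block, weight=2.5, threshold=4.0)
assert boosted > plain
```

In a real encoder the threshold would be adjusted until the total length meets the target bit-rate, so boosting one block takes bytes away from less sensitive ones.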
Distortion-constraint compression of three-dimensional CLSM images using image pyramid and vector quantization
Confocal microscopy imaging techniques, which allow optical sectioning, have
been successfully exploited in biomedical studies. Biomedical scientists can benefit
from more realistic visualization and much more accurate diagnosis by processing and
analysing three-dimensional image data. The lack of efficient image compression
standards makes such large volumetric image data slow to transfer over limited
bandwidth networks. It also imposes large storage space requirements and high cost in
archiving and maintenance.
Conventional two-dimensional image coders do not take into account inter-frame
correlations in three-dimensional image data. The standard multi-frame coders, like
video coders, although they have good performance in capturing motion information,
are not efficiently designed for coding multiple frames representing a stack of optical
planes of a real object. Therefore a real three-dimensional image compression
approach should be investigated.
Moreover, the reconstructed image quality is a very important concern in compressing
medical images, because it can be directly related to diagnostic accuracy. Most
state-of-the-art methods are based on transform coding; for instance, JPEG is based on
the discrete cosine transform (DCT) and JPEG2000 is based on the discrete
wavelet transform (DWT). However, in DCT and DWT methods, the control
of the reconstructed image quality is inconvenient, involving considerable costs in
computation, since they are fundamentally rate-parameterized methods rather than
distortion-parameterized methods. Therefore it is very desirable to develop a
transform-based distortion-parameterized compression method, which is expected to
have high coding performance and also to be able to conveniently and accurately control
the final distortion according to the user-specified quality requirement.
This thesis describes our work in developing a distortion-constraint three-dimensional
image compression approach, using vector quantization techniques combined with
image pyramid structures. We are expecting our method to have:
1. High coding performance in compressing three-dimensional microscopic
image data, compared to the state-of-the-art three-dimensional image coders
and other standardized two-dimensional image coders and video coders.
2. Distortion-control capability, a very desirable feature in medical
image compression applications, that is superior to the rate-parameterized methods
in achieving a user-specified quality requirement.
The result is a three-dimensional image compression method, which has outstanding
compression performance, measured objectively, for volumetric microscopic images.
The distortion-constraint feature, by which users can expect to achieve a target image
quality rather than the compressed file size, offers more flexible control of the
reconstructed image quality than its rate-constraint counterparts in medical image
applications. Additionally, it effectively reduces the artifacts present in other
approaches at low bit rates and also attenuates noise in the pre-compressed images.
Furthermore, its advantages in progressive transmission and fast decoding make it
suitable for bandwidth-limited telecommunications and web-based image browsing
applications.
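One way to see why a pyramid structure lends itself to distortion-parameterized coding is the closed-loop sketch below. It is an assumption-laden illustration rather than the thesis's method: the signal is one-dimensional, there are only two pyramid levels, and a uniform scalar quantizer stands in for vector quantization. Because the fine-level residual is computed against the decoded coarse level, the final error is exactly the quantization error of that residual, which is bounded by half the chosen step.

```python
from typing import List, Tuple


def quantize(values: List[float], step: float) -> List[int]:
    return [round(v / step) for v in values]


def dequantize(indices: List[int], step: float) -> List[float]:
    return [i * step for i in indices]


def upsample(values: List[float]) -> List[float]:
    """Repeat every coarse sample twice to return to the fine resolution."""
    out: List[float] = []
    for v in values:
        out.extend([v, v])
    return out


def encode(signal: List[float], coarse_step: float, fine_step: float) -> Tuple[List[int], List[int]]:
    """Two-level pyramid; the signal length is assumed to be even."""
    coarse = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    coarse_idx = quantize(coarse, coarse_step)
    # Closed loop: predict from the *decoded* coarse level, as the decoder will.
    prediction = upsample(dequantize(coarse_idx, coarse_step))
    residual = [s - p for s, p in zip(signal, prediction)]
    return coarse_idx, quantize(residual, fine_step)


def decode(coarse_idx: List[int], fine_idx: List[int], coarse_step: float, fine_step: float) -> List[float]:
    prediction = upsample(dequantize(coarse_idx, coarse_step))
    return [p + r for p, r in zip(prediction, dequantize(fine_idx, fine_step))]


signal = [12.0, 15.0, 40.0, 43.0, 90.0, 70.0, 5.0, 9.0]
coarse_idx, fine_idx = encode(signal, coarse_step=16.0, fine_step=4.0)
restored = decode(coarse_idx, fine_idx, coarse_step=16.0, fine_step=4.0)
# The maximum error is set directly by fine_step, whatever coarse_step is.
assert max(abs(a - b) for a, b in zip(signal, restored)) <= 2.0 + 1e-9
```

The user picks the error bound first and the file size follows, which is the opposite of a rate-parameterized coder where the size is fixed and the quality has to be found by repeated encoding.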
An Information-theoretic Framework for Visualization
In this paper, we examine whether or not information theory can be one of the theoretic frameworks for visualization. We formulate concepts and measurements for qualifying visual information. We illustrate these concepts with examples that manifest the intrinsic and implicit use of information theory in many existing visualization techniques. We outline the broad correlation between visualization and the major applications of information theory, while pointing out the difference in emphasis and some technical gaps. Our study provides compelling evidence that information theory can explain a significant number of phenomena or events in visualization, while no example has been found which is fundamentally in conflict with information theory. We also notice that the emphasis of some traditional applications of information theory, such as data compression or data communication, may not always suit visualization, as the former typically focuses on the efficient throughput of a communication channel, whilst the latter focuses on the effectiveness in aiding the perceptual and cognitive process for data understanding and knowledge discovery. These findings suggest that further theoretic developments are necessary for adopting and adapting information theory for visualization.
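As a point of reference for the kind of measurement information theory offers, the sketch below computes the Shannon entropy of a set of discrete values, for example the pixel values of a rendered image. It is a generic textbook quantity and is not claimed to be one of the specific measures the paper formulates; the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(values: Sequence[Hashable]) -> float:
    """Entropy in bits of the empirical distribution of the given values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A constant image carries no information; four equally likely values carry 2 bits.
assert abs(shannon_entropy([7, 7, 7, 7])) < 1e-12
assert abs(shannon_entropy([0, 1, 2, 3]) - 2.0) < 1e-12
```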
Processing and codification images based on jpg standard
This project addresses the current need for image compression and surveys the different methods of compression and coding. Specifically, it examines lossy compression standards in depth through the JPEG [1] standard. The main goal of the project is to implement a Matlab program that encodes and compresses an image of any format into a “jpg” image, following the premises of the JPEG standard. JPEG compresses images based on their spatial frequency, or level of detail. Areas with low levels of detail (that is, homogeneous areas such as blue sky) compress better than areas with high levels of detail, like hair, blades of grass, or hard-edged transitions. The JPEG algorithm takes advantage of the human eye's increased sensitivity to small differences in brightness versus small differences in color, especially at higher frequencies. The JPEG algorithm first transforms the image from RGB to the luminance/chrominance (Y-Cb-Cr) color space, separating brightness/grayscale (Y) from the two color components. The algorithm then downsamples the color components and leaves the brightness component alone. Next, the JPEG algorithm approximates 8x8 blocks of pixels with a base value representing the average, plus some frequency coefficients for nearby variations. Quantization then reduces the precision of these DCT coefficients. Higher frequencies and chroma are quantized with larger step sizes than lower frequencies and luminance. Thus more of the brightness and low-frequency information is kept than of the higher frequencies and color values. So the more homogeneous the image (the lower the level of detail and the fewer abrupt color or tonal transitions), the more efficient the JPEG algorithm becomes.
Ingeniería Técnica en Sistemas de Telecomunicació
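The pipeline steps named in this abstract (color transform, 8x8 block transform, quantization) can be sketched as follows. This is a minimal illustration in Python rather than the project's Matlab program. The color conversion uses the full-range coefficients commonly associated with JFIF files, the quantization table `QUANT` is a hypothetical one whose step simply grows with frequency (it is not the table from the JPEG standard), and chroma downsampling and entropy coding are left out.

```python
import math
from typing import List, Tuple

Block = List[List[float]]


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Full-range RGB to Y-Cb-Cr conversion for 8-bit samples."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def dct_8x8(block: Block) -> Block:
    """2-D DCT-II of an 8x8 block after the usual level shift by 128."""
    def scale(k: int) -> float:
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += ((block[x][y] - 128)
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


# Hypothetical table: coarser steps for higher frequencies (assumption, not the standard's).
QUANT = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs: Block, table: List[List[int]] = QUANT) -> List[List[int]]:
    """Divide each coefficient by its step and round; this is where information is lost."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)] for u in range(8)]


# A flat block keeps only its base (DC) value; every frequency coefficient is zero.
flat = [[136.0] * 8 for _ in range(8)]
levels = quantize(dct_8x8(flat))
assert levels[0][0] == 8
assert sum(abs(v) for row in levels for v in row) == 8
```

The flat block shows why homogeneous areas compress well: after quantization there is a single nonzero value left to store for all 64 pixels.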
A practical comparison between two powerful PCC codec’s
Recent advances in the consumption of 3D content create the need for efficient ways to visualize and transmit it. As a result, the methods used to obtain such content have been evolving, leading to new forms of representation, namely point clouds and light fields. A point cloud is a set of points, each with associated Cartesian coordinates (x, y, z) and possibly further information (color, material, texture, etc.). This kind of representation changes the way 3D content is consumed and has a wide range of applications, from video gaming and virtual reality to medicine. However, because this type of data carries so much information, it is data-heavy, which makes storing and transmitting the content a daunting task. To resolve this issue, MPEG created a point cloud coding standardization project, giving birth to V-PCC (Video-based Point Cloud Coding) and G-PCC (Geometry-based Point Cloud Coding) for static content. Firstly, a general analysis of point clouds is made, spanning from their possible uses to their acquisition. Secondly, point cloud codecs are studied, namely V-PCC and G-PCC from MPEG. Then, a state-of-the-art study of quality evaluation, both subjective and objective, is performed. Finally, a report is given on the JPEG Pleno Point Cloud activity, in which an active collaboration took place, with comparative results for the two codecs and the metrics used.
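Objective quality evaluation of point cloud geometry is often based on nearest-neighbour distances between the original and the decoded cloud. The sketch below is written in that spirit and is not taken from the dissertation or from any MPEG or JPEG reference software: the brute-force search, the symmetric maximum, the `peak` parameter, and the PSNR normalization are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def sq_dist(p: Point, q: Point) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def one_way_mse(src: List[Point], dst: List[Point]) -> float:
    """Mean squared distance from each source point to its nearest destination point."""
    return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)


def point_to_point_mse(reference: List[Point], degraded: List[Point]) -> float:
    """Symmetric error: the worse of the two directions."""
    return max(one_way_mse(reference, degraded), one_way_mse(degraded, reference))


def geometry_psnr(reference: List[Point], degraded: List[Point], peak: float) -> float:
    """PSNR in dB relative to a chosen peak value, e.g. the bounding-box size."""
    mse = point_to_point_mse(reference, degraded)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


original = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
shifted = [(1.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
assert point_to_point_mse(original, shifted) == 1.0
```

The brute-force nearest-neighbour search is quadratic in the number of points; real evaluation tools use a spatial index, but the measured quantity is the same.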