304 research outputs found

    Influence of Dictionary Size on the Lossless Compression of Microarray Images

    A key challenge in the management of microarray data is the large size of the images that constitute the output of microarray experiments. Therefore, only the expression values extracted from these experiments are generally made available. However, the extraction of expression data is affected by a variety of factors, such as the thresholds used for background intensity correction, the method used for grid determination, and the parameters used in foreground (spot)-background delineation. This information is not always available or consistent across experiments, and it impacts downstream data analysis. Furthermore, the lack of access to the image-based primary data often leads to costly replication of experiments. Both lossy and lossless compression techniques have been developed for microarray images. While lossy algorithms deliver better compression, a significant advantage of the lossless techniques is that they guarantee against loss of information that is putatively of biological importance. A key challenge therefore is the development of more efficacious lossless compression techniques. Dictionary-based compression is one of the critical methods used in lossless microarray compression. However, image-based microarray data has potentially infinite variability, so the choice of dictionary size and its effect on the compression rate are crucial. Our paper examines this problem and shows that increasing the dictionary size beyond a certain point does not lead to better compression. Our investigations also point to strategies for determining the optimal dictionary size.
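
The diminishing return from a larger dictionary can be illustrated with a small experiment. The sketch below is not the method of the paper: it assumes DEFLATE's sliding window (the `wbits` parameter of Python's `zlib`) as a stand-in for a dictionary, and it uses synthetic data (a pseudo-random 4096-byte block repeated eight times) rather than real microarray images. The helper names `compressed_size` and `make_repetitive_data` are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes, wbits: int) -> int:
    """Return the DEFLATE-compressed size of `data` for a 2**wbits-byte window."""
    compressor = zlib.compressobj(level=9, wbits=wbits)
    stream = compressor.compress(data) + compressor.flush()
    # Lossless check: decompression must reproduce the input exactly.
    assert zlib.decompress(stream) == data
    return len(stream)


def make_repetitive_data(block_len: int = 4096, repeats: int = 8, seed: int = 0) -> bytes:
    """Synthetic input: one pseudo-random block repeated several times."""
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_len))
    return block * repeats


data = make_repetitive_data()
# Window sizes of 512 B, 4 KiB, 8 KiB and 32 KiB.
sizes = {wbits: compressed_size(data, wbits) for wbits in (9, 12, 13, 15)}
for wbits, size in sizes.items():
    print(f"window 2**{wbits:>2} bytes -> {size} bytes")
```

In this toy setup the repeats lie 4096 bytes apart, so a window must be large enough to reach back that far before it helps. Once it does, a still larger window has nothing further to find, which is the kind of plateau the abstract describes.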

    Standard and specific compression techniques for DNA microarray images

    We review the state of the art in DNA microarray image compression and provide original comparisons between standard and microarray-specific compression techniques that validate and expand previous work. First, we describe the most relevant approaches published in the literature and classify them according to the stage of the typical image compression process where each approach makes its contribution. We then summarize the compression results reported for these microarray-specific image compression schemes. In a set of experiments conducted for this paper, we obtain new results for several popular image coding techniques, including the most recent coding standards. The prediction-based schemes CALIC and JPEG-LS are the best-performing standard compressors, but they are improved upon by the best microarray-specific technique, Battiato's CNN-based scheme.

    Compression of Microarray Images

    Lossy-to-Lossless Compression of Biomedical Images Based on Image Decomposition

    The use of medical imaging has increased in recent years, especially magnetic resonance imaging (MRI) and computed tomography (CT). Microarray imaging and images that can be extracted from RNA interference (RNAi) experiments also play an important role in large-scale gene sequence and gene expression analysis, allowing the study of gene function, regulation, and interaction across a large number of genes and even across an entire genome. These types of medical image modalities produce huge amounts of data that, for several reasons, need to be stored or transmitted at the highest possible fidelity between various hospitals, medical organizations, or research units.

    Effect of image compression and scaling on automated scoring of immunohistochemical stainings and segmentation of tumor epithelium

    Background: Digital whole-slide scanning of tissue specimens produces large images demanding increasing storage capacity. To reduce the need for extensive data storage systems, image files can be compressed and scaled down. The aim of this article is to study the effect of different levels of image compression and scaling on automated image analysis of immunohistochemical (IHC) stainings and automated tumor segmentation.
    Methods: Two tissue microarray (TMA) slides containing 800 samples of breast cancer tissue immunostained against Ki-67 protein and two TMA slides containing 144 samples of colorectal cancer immunostained against EGFR were digitized with a whole-slide scanner. The TMA images were JPEG2000 wavelet compressed with four compression ratios: lossless, and 1:12, 1:25 and 1:50 lossy compression. Each of the compressed breast cancer images was furthermore scaled down either to 1:1, 1:2, 1:4, 1:8, 1:16, 1:32, 1:64 or 1:128. Breast cancer images were analyzed using an algorithm that quantitates the extent of staining in Ki-67 immunostained images, and EGFR immunostained colorectal cancer images were analyzed with an automated tumor segmentation algorithm. The automated tools were validated by comparing the results from losslessly compressed and non-scaled images with results from conventional visual assessments. Percentage agreement and kappa statistics were calculated between results from compressed and scaled images and results from lossless and non-scaled images.
    Results: Both of the studied image analysis methods showed good agreement between visual and automated results. In the automated IHC quantification, an agreement of over 98% and a kappa value of over 0.96 were observed between losslessly compressed and non-scaled images and combined compression ratios up to 1:50 and scaling down to 1:8. In automated tumor segmentation, an agreement of over 97% and a kappa value of over 0.93 were observed between losslessly compressed images and compression ratios up to 1:25.
    Conclusions: The results of this study suggest that images stored for assessment of the extent of immunohistochemical staining can be compressed and scaled significantly, and images of tumors to be segmented can be compressed without compromising computer-assisted analysis results using the studied methods.
    Virtual slides: The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/2442925476534995
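
The two agreement measures named in the Methods can be sketched as follows. This assumes the kappa statistic is Cohen's kappa for two sets of categorical scores, which the abstract does not state explicitly. The function names and the toy labels are hypothetical, not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of samples on which the two score lists agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the marginal label frequencies of each list.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both lists use a single identical label
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example: scores from lossless images vs. scores from compressed images.
reference = [1, 1, 0, 0]
compressed = [1, 1, 0, 1]
print(percent_agreement(reference, compressed), cohen_kappa(reference, compressed))
```

Kappa is reported alongside raw agreement because it discounts the agreement that two scorers would reach by chance given how often each label occurs.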

    Analysis-driven lossy compression of DNA microarray images

    DNA microarrays are one of the fastest-growing new technologies in the field of genetic research, and DNA microarray images continue to grow in number and size. Since analysis techniques are under active and ongoing development, the storage, transmission and sharing of DNA microarray images need to be addressed, with compression playing a significant role. However, existing lossless coding algorithms yield only limited compression performance (compression ratios below 2:1), whereas lossy coding methods may introduce unacceptable distortions in the analysis process. This work introduces a novel Relative Quantizer (RQ), which employs non-uniform quantization intervals designed for improved compression while bounding the impact on the DNA microarray analysis. This quantizer constrains the maximum relative error introduced into quantized imagery, devoting higher precision to pixels critical to the analysis process. For suitable parameter choices, the resulting variations in the DNA microarray analysis are less than half of those inherent to the experimental variability. Experimental results reveal that appropriate analysis can still be performed for average compression ratios exceeding 4.5:1.
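
One way to bound the maximum relative error with non-uniform intervals is to make the interval widths grow geometrically. The sketch below is only an illustration of that idea under stated assumptions: it is not the RQ defined in the paper, whose interval design is not given in the abstract. The names `rq_ratio`, `rq_encode`, `rq_decode` and the bound `delta` are hypothetical.

```python
import math


def rq_ratio(delta: float) -> float:
    """Width ratio of consecutive intervals for a relative error bound `delta`."""
    return (1.0 + delta) / (1.0 - delta)


def rq_encode(value: int, delta: float) -> int:
    """Map a non-negative pixel value to a quantization index (0 is kept exact)."""
    if value == 0:
        return 0
    return 1 + math.floor(math.log(value) / math.log(rq_ratio(delta)))


def rq_decode(index: int, delta: float) -> float:
    """Reconstruct a value whose relative error is at most `delta` in its interval."""
    if index == 0:
        return 0.0
    low = rq_ratio(delta) ** (index - 1)  # lower edge of the interval
    return low * (1.0 + delta)            # equals upper edge * (1 - delta)


delta = 0.05
for pixel in (1, 100, 1000, 65535):
    approx = rq_decode(rq_encode(pixel, delta), delta)
    print(pixel, round(approx, 2), round(abs(approx - pixel) / pixel, 4))
```

With `delta = 0.05`, the 65536 possible 16-bit values collapse to a little over a hundred indices in this sketch, which gives an intuition for why a relative error bound leaves room for compression gains.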

    Web-based manipulation of multiresolution micro-CT images

    Micro Computed-Tomography (µ-CT) scanning is opening a new world for medical researchers. Scientific data of several tens of gigabytes per image is created and usually requires storage on a common server such as a Picture Archiving and Communication System (PACS). Previewing this data online in a meaningful way is an essential part of these systems. Radiologists who have been working with CT data for a long time commonly look at two-dimensional slices of 3D image stacks. Conventional web viewers such as Google Maps and Deep Zoom use tiled multiresolution images for faster display of large 2D data. In the medical area this approach is being adapted for high-resolution 2D images. Solutions that include basic image processing still rely on browser-external solutions and high-performance client machines. In this paper we optimized and modified the Brain Maps API to create an interactive orthogonal-sectioning image viewer for medical µ-CT scans, based on JavaScript and HTML5. We show that tiling of images reduces the processing time by a factor of two. Different file formats are compared regarding their quality and time to display. In addition, a sample end-to-end application demonstrates the feasibility of this solution for custom-made image acquisition systems.

    Lossless compression of images with specific characteristics

    Doctorate in Electrical Engineering. The compression of some types of images is a challenge for some standard compression techniques. This thesis investigates the lossless compression of images with specific characteristics, namely simple images, color-indexed images and microarray images. We are interested in the development of complete compression methods and in the study of preprocessing algorithms that could be used together with standard compression methods. Histogram sparseness, a property of simple images, is addressed in this thesis. We developed a preprocessing technique, called histogram packing, that exploits this property and can be used with standard compression methods to improve their efficiency significantly. Histogram packing and palette reordering algorithms can be used as a preprocessing step for improving the lossless compression of color-indexed images. This thesis presents several algorithms and a comprehensive study of the existing methods. Specific compression methods, such as binary tree decomposition, are also addressed. The use of microarray expression data in state-of-the-art biology is well established and, due to the significant volume of data generated per experiment, efficient lossless compression methods are needed. In this thesis, we explore the use of standard image coding techniques and we present new algorithms to efficiently compress this type of image, based on finite-context modeling and arithmetic coding.
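
The idea behind histogram packing can be shown in a few lines. This is a minimal sketch under assumptions, not the thesis's algorithm: the intensities that actually occur are mapped to consecutive indices, and the sorted list of occurring levels is kept so the mapping can be inverted. The function names and the sample pixel values are hypothetical.

```python
from typing import List, Sequence, Tuple


def pack_histogram(pixels: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Replace each intensity by its rank among the intensities that occur."""
    levels = sorted(set(pixels))                      # side information for the decoder
    index_of = {level: i for i, level in enumerate(levels)}
    packed = [index_of[p] for p in pixels]
    return packed, levels


def unpack_histogram(packed: Sequence[int], levels: Sequence[int]) -> List[int]:
    """Invert the packing using the stored list of levels."""
    return [levels[i] for i in packed]


pixels = [0, 64, 64, 255, 128, 0]                     # sparse histogram: 4 of 256 levels used
packed, levels = pack_histogram(pixels)
print(packed, levels)
```

The packed image uses a dense range of values, which a standard lossless coder can usually model better than a sparse one. The transform is lossless as long as `levels` is stored with the compressed data.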

    Lossless compression algorithms for microarray images and whole genome alignments (Algoritmos de compressão sem perdas para imagens de microarrays e alinhamento de genomas completos)

    Doctorate in Informatics. In the 21st century, the never-ending expansion of information is a major global concern. The pace at which storage and communication resources are evolving is not fast enough to compensate for this tendency. To overcome this issue, sophisticated and efficient compression tools are required. The goal of compression is to represent information with as few bits as possible. There are two kinds of compression, lossy and lossless. In lossless compression, information loss is not tolerated, so the decoded information is exactly the same as the encoded one. In lossy compression, on the other hand, some loss is acceptable. In this work we focused on lossless methods. The goal of this thesis was to create lossless compression tools that can be used on two types of data. The first type is known in the literature as microarray images. These images have 16 bits per pixel and a high spatial resolution. The other data type is commonly called Whole Genome Alignments (WGA), in particular as stored in MAF files. Regarding the microarray images, we improved existing microarray-specific methods by using some pre-processing techniques (segmentation and bitplane reduction). Moreover, we developed a compression method based on pixel value estimates and a mixture of finite-context models. An approach based on binary-tree decomposition was also considered. Two compression tools were developed to compress MAF files. The first is based on a mixture of finite-context models and arithmetic coding, where only the DNA bases and alignment gaps were considered. The second tool, designated MAFCO, is a complete compression tool that can handle all the information that can be found in MAF files. MAFCO relies on several finite-context models and allows parallel compression/decompression of MAF files.
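
A finite-context model predicts each symbol from the few symbols that precede it. The sketch below is an assumption-laden illustration, not MAFCO or the thesis's models: it uses a single adaptive order-k model with additive smoothing over a four-letter DNA alphabet and reports the ideal code length that an arithmetic coder driven by those probabilities would approach. The function name, the `alpha` smoothing parameter and the sample sequence are hypothetical.

```python
import math
from collections import defaultdict


def fcm_code_length(sequence: str, order: int = 2,
                    alphabet: str = "ACGT", alpha: float = 1.0) -> float:
    """Ideal code length in bits of `sequence` under an adaptive order-k model."""
    counts = defaultdict(lambda: defaultdict(int))   # context -> symbol -> count
    total_bits = 0.0
    for i, symbol in enumerate(sequence):
        context = sequence[max(0, i - order):i]
        ctx_counts = counts[context]
        seen = sum(ctx_counts.values())
        # Smoothed probability estimate from the counts seen so far in this context.
        p = (ctx_counts[symbol] + alpha) / (seen + alpha * len(alphabet))
        total_bits -= math.log2(p)
        ctx_counts[symbol] += 1                      # update after coding, as a decoder could
    return total_bits


sequence = "ACGT" * 50
print(f"{fcm_code_length(sequence):.1f} bits vs {2 * len(sequence)} bits at 2 bits/base")
```

Because the counts are updated only after each symbol is coded, a decoder can rebuild the same probabilities symbol by symbol, which is what makes such a model usable for lossless coding.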