
    Region-based representations of image and video: segmentation tools for multimedia services

    This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed, focusing mainly on tools useful in the context of the MPEG-4 and MPEG-7 standards. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application-dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval.
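    A rough sketch of the partition-tree idea, assuming a grayscale image, a coarse block grid as the initial (leaf) partition, and mean intensity as the deliberately simple similarity feature; the merge order and the absence of an explicit spatial-adjacency constraint are simplifications for illustration, not the paper's exact algorithm:

```python
# Builds a binary partition tree by iteratively merging the two regions with
# the closest mean intensity. Each merge creates a new parent node, so cutting
# the tree at different depths yields partitions of different granularity.
import numpy as np
from itertools import count

def build_partition_tree(image, n_leaves_side=4):
    # Start from a coarse grid of square regions as the leaves.
    h, w = image.shape
    bh, bw = h // n_leaves_side, w // n_leaves_side
    regions = {}            # region id -> (pixel mask, mean intensity)
    node_id = count()
    for i in range(n_leaves_side):
        for j in range(n_leaves_side):
            mask = np.zeros_like(image, dtype=bool)
            mask[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw] = True
            regions[next(node_id)] = (mask, image[mask].mean())
    tree = []               # list of (parent, left child, right child)
    # Merge the two most similar regions until a single region remains.
    while len(regions) > 1:
        ids = list(regions)
        a, b = min(
            ((p, q) for i, p in enumerate(ids) for q in ids[i + 1:]),
            key=lambda pair: abs(regions[pair[0]][1] - regions[pair[1]][1]),
        )
        mask = regions[a][0] | regions[b][0]
        parent = next(node_id)
        regions[parent] = (mask, image[mask].mean())
        tree.append((parent, a, b))
        del regions[a], regions[b]
    return tree

if __name__ == "__main__":
    img = np.random.rand(64, 64)
    print(len(build_partition_tree(img)), "merge nodes")
```

    Different applications then correspond to different ways of cutting or analysing the same tree, which is how a single, simply built tree can serve the range of tasks listed above.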

    Graph-based transforms for the compression of new image modalities

    Due to the large availability of new camera types capturing extra geometrical information, as well as the emergence of new image modalities such as light fields and omni-directional images, a huge amount of high-dimensional data has to be stored and delivered. The ever-growing streaming and storage requirements of these new image modalities call for novel image coding tools that exploit the complex structure of those data. This thesis aims at exploring novel graph-based approaches for adapting traditional image transform coding techniques to the emerging data types, where the sampled information lies on irregular structures. In a first contribution, novel local graph-based transforms are designed for compact light field representations. By leveraging a careful design of local transform supports and a local basis function optimization procedure, significant improvements in terms of energy compaction can be obtained. Nevertheless, the locality of the supports does not allow long-term dependencies of the signal to be exploited. This led to a second contribution in which different sampling strategies are investigated; coupled with novel prediction methods, they yield strong results for quasi-lossless compression of light fields. The third part of the thesis focuses on the definition of rate-distortion optimized sub-graphs for the coding of omni-directional content. If we move further and give more degrees of freedom to the graphs we wish to use, we can learn or define a model (a set of weights on the edges) that might not be entirely reliable for transform design. The last part of the thesis is therefore dedicated to a theoretical analysis of the effect of this uncertainty on the efficiency of graph transforms.
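    As background for the graph-based transforms the thesis builds on, the following sketch shows a basic graph Fourier transform: a signal living on an irregular support is projected onto the eigenvectors of the graph Laplacian built from the edge weights. The toy path graph and unit weights are illustrative only; the thesis' contributions concern how the supports and weights are chosen and optimized.

```python
# Graph Fourier transform: the Laplacian eigenvectors form the transform
# basis, and the eigenvalues act as graph frequencies, so a smooth signal
# should compact its energy into the low-index coefficients.
import numpy as np

def graph_fourier_transform(weights, signal):
    # weights: symmetric adjacency matrix; signal: one value per node.
    degree = np.diag(weights.sum(axis=1))
    laplacian = degree - weights
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    coeffs = eigvecs.T @ signal
    return eigvals, coeffs

if __name__ == "__main__":
    # 4-node path graph with unit edge weights and a smooth signal.
    W = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    x = np.array([1.0, 1.1, 1.2, 1.3])
    freqs, coeffs = graph_fourier_transform(W, x)
    print(np.round(coeffs, 3))  # most energy in the first (DC-like) coefficient
```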

    Competitive Segmentation Performance on Near-lossless and Lossy Compressed Remote Sensing Images

    Image segmentation lies at the heart of multiple image processing chains, and achieving accurate segmentation is of utmost importance as it impacts later processing. Image segmentation has recently gained interest in the field of remote sensing, mostly due to the widespread availability of remote sensing data. This increased availability poses the problem of transmitting and storing large volumes of data. Compression is a common strategy to alleviate this problem. However, lossy or near-lossless compression prevents a perfect reconstruction of the recovered data. This letter investigates image segmentation performance on data reconstructed after near-lossless or lossy compression. Two image segmentation algorithms and two compression standards are evaluated on data from several instruments. Experimental results reveal that segmentation performance on previously near-lossless or lossy compressed images is not markedly reduced at low and moderate compression ratios. In some scenarios, accurate segmentation performance can be achieved even for high compression ratios.
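    The experimental idea can be summarised with a toy pipeline: segment a reference image, segment reconstructions degraded by increasingly strong compression, and measure how much the segmentation changes. In this sketch, uniform quantization stands in for the actual compression standards and simple thresholding stands in for the evaluated segmentation algorithms; both are placeholders, not the letter's methods.

```python
# Compare a reference segmentation against segmentations of reconstructions
# obtained after increasingly coarse quantization (a proxy for compression).
import numpy as np

def segment(img):
    return img > img.mean()             # toy segmentation

def quantize(img, step):
    return np.round(img / step) * step  # stand-in for near-lossless/lossy coding

def agreement(seg_a, seg_b):
    return (seg_a == seg_b).mean()      # fraction of identically labeled pixels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((128, 128))
    ref_seg = segment(reference)
    for step in (0.01, 0.05, 0.1, 0.3):  # larger step ~ higher compression ratio
        rec_seg = segment(quantize(reference, step))
        print(f"step={step:.2f}  segmentation agreement={agreement(ref_seg, rec_seg):.3f}")
```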

    Object-based video representations: shape compression and object segmentation

    Object-based video representations are considered to be useful for easing the process of multimedia content production and enhancing user interactivity in multimedia productions. Object-based video presents several new technical challenges, however. Firstly, as with conventional video representations, compression of the video data is a requirement. For object-based representations, it is necessary to compress the shape of each video object as it moves in time. This amounts to the compression of moving binary images. It is achieved by the use of a technique called context-based arithmetic encoding. The technique is applied to rectangular pixel blocks and as such is consistent with the standard tools of video compression. The block-based application also facilitates the exploitation of temporal redundancy in the sequence of binary shapes. For the first time, context-based arithmetic encoding is used in conjunction with motion compensation to provide inter-frame compression. The method, described in this thesis, has been thoroughly tested throughout the MPEG-4 core experiment process and, owing to favourable results, has been adopted as part of the MPEG-4 video standard. The second challenge lies in the acquisition of the video objects. Under normal conditions, a video sequence is captured as a sequence of frames and there is no inherent information about what objects are in the sequence, let alone information relating to the shape of each object. Some means of segmenting semantic objects from general video sequences is required. For this purpose, several image analysis tools may be of help and, in particular, it is believed that video object tracking algorithms will be important. A new tracking algorithm is developed based on piecewise polynomial motion representations and statistical estimation tools, e.g. the expectation-maximisation method and the minimum description length principle.
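    The context-based coding idea can be illustrated with a simplified sketch: each shape pixel selects an adaptive probability model through a context formed from already-coded neighbours, and that model drives an (ideal) arithmetic coder. The 4-pixel causal template and the bit-cost estimate below are illustrative only; the MPEG-4 CAE tool uses larger intra/inter templates, block structure, and motion compensation.

```python
# Estimate the coding cost of a binary shape block with context-adaptive
# probability models and an idealised arithmetic coder (-log2(p) per symbol).
import numpy as np
from collections import defaultdict
from math import log2

def estimate_bits(block):
    counts = defaultdict(lambda: [1, 1])   # per-context counts of 0s and 1s
    h, w = block.shape
    pad = np.zeros((h + 1, w + 2), dtype=int)
    pad[1:, 1:-1] = block                  # zero border for out-of-block neighbours
    bits = 0.0
    for y in range(h):
        for x in range(w):
            # Causal template: left, upper-left, up, upper-right neighbours.
            ctx = (pad[y + 1, x], pad[y, x], pad[y, x + 1], pad[y, x + 2])
            c0, c1 = counts[ctx]
            bit = block[y, x]
            p = (c1 if bit else c0) / (c0 + c1)
            bits += -log2(p)               # ideal arithmetic-coding cost
            counts[ctx][bit] += 1          # adapt the model for this context
    return bits

if __name__ == "__main__":
    shape = np.zeros((16, 16), dtype=int)
    shape[4:12, 4:12] = 1                  # a simple square object mask
    print(f"estimated cost: {estimate_bits(shape):.1f} bits for 256 pixels")
```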

    Image Segmentation using Human Visual System Properties with Applications in Image Compression

    In order to represent a digital image, a very large number of bits is required. For example, a 512 × 512 pixel, 256 gray level image requires over two million bits. This large number of bits is a substantial drawback when it is necessary to store or transmit a digital image. Image compression, often referred to as image coding, attempts to reduce the number of bits used to represent an image, while keeping the degradation in the decoded image to a minimum. One approach to image compression is segmentation-based image compression. The image to be compressed is segmented, i.e. the pixels in the image are divided into mutually exclusive spatial regions based on some criteria. Once the image has been segmented, information is extracted describing the shapes and interiors of the image segments. Compression is achieved by efficiently representing the image segments. In this thesis we propose an image segmentation technique which is based on centroid-linkage region growing and takes advantage of human visual system (HVS) properties. We systematically determine, through subjective experiments, the parameters of our segmentation algorithm which produce the most visually pleasing segmented images, and demonstrate the effectiveness of our method. We also propose a method for the quantization of segmented images based on HVS contrast sensitivity, and investigate the effect of quantization on segmented images.
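    A minimal sketch of centroid-linkage region growing, the technique this segmentation builds on: in a raster scan, a pixel joins the region of a previously visited neighbour if its intensity is within a threshold of that region's running mean (centroid), and otherwise starts a new region. The fixed threshold below is a placeholder for the HVS-derived parameters determined through the thesis' subjective experiments.

```python
# Centroid-linkage region growing on a grayscale image.
import numpy as np

def centroid_linkage_segment(image, threshold=12.0):
    h, w = image.shape
    labels = -np.ones((h, w), dtype=int)
    sums, counts = [], []                  # per-region running statistics
    for y in range(h):
        for x in range(w):
            v = float(image[y, x])
            best = -1
            # Candidate regions: those of the already-visited neighbours.
            for ny, nx in ((y, x - 1), (y - 1, x)):
                if 0 <= ny and 0 <= nx:
                    r = labels[ny, nx]
                    centroid = sums[r] / counts[r]
                    if abs(v - centroid) <= threshold:
                        best = r
                        break
            if best < 0:                   # no compatible region: open a new one
                best = len(sums)
                sums.append(0.0)
                counts.append(0)
            labels[y, x] = best
            sums[best] += v                # update the region centroid
            counts[best] += 1
    return labels

if __name__ == "__main__":
    img = np.zeros((32, 32)) + 50
    img[8:24, 8:24] = 200                  # bright square on a dark background
    print("regions found:", len(np.unique(centroid_linkage_segment(img))))
```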

    Neural-based Compression Scheme for Solar Image Data

    Studying the solar system, and especially the Sun, relies on the data gathered daily from space missions. These missions are data-intensive, and compressing the data so that it can be transferred efficiently to the ground station involves a trade-off. Stronger compression methods, by distorting the data, can increase data throughput at the cost of accuracy, which could affect scientific analysis of the data. On the other hand, preserving subtle details in the compressed data requires a high amount of data to be transferred, reducing the desired gains from compression. In this work, we propose a neural network-based lossy compression method to be used in NASA's data-intensive imagery missions. We chose NASA's SDO mission, which transmits 1.4 terabytes of data each day, as a proof of concept for the proposed algorithm. Specifically, we propose an adversarially trained neural network, equipped with local and non-local attention modules to capture both the local and global structure of the image, resulting in a better rate-distortion (RD) trade-off compared to conventional hand-engineered codecs. The RD variational autoencoder used in this work is jointly trained with a channel-dependent entropy model as a shared prior between the analysis and synthesis transforms to make the entropy coding of the latent code more effective. Our neural image compression algorithm outperforms currently-in-use and state-of-the-art codecs such as JPEG and JPEG 2000 in terms of RD performance when compressing extreme-ultraviolet (EUV) data. As a proof of concept for the use of this algorithm in SDO data analysis, we have performed coronal hole (CH) detection using our compressed images and generated consistent segmentations, even at a compression rate of ∌0.1 bits per pixel (compared to 8 bits per pixel for the original data), using EUV data from SDO.
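    The rate-distortion objective such learned codecs optimise can be sketched in a few lines: the latent produced by an analysis transform is quantized, its rate is estimated from an entropy model, and the loss trades that rate against reconstruction distortion through a Lagrange multiplier. The linear transforms and the empirical entropy estimate below are toy stand-ins for the paper's attention-based autoencoder and channel-dependent entropy model.

```python
# Toy rate-distortion loss for a transform-coding pipeline.
import numpy as np

def empirical_bits(symbols):
    # Ideal code length from the empirical distribution of quantized symbols
    # (a stand-in for a learned entropy model).
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(counts * np.log2(p)).sum())

def rd_loss(x, analysis, synthesis, lam=10.0):
    y = analysis @ x                       # analysis transform
    y_hat = np.round(y)                    # scalar quantization
    x_hat = synthesis @ y_hat              # synthesis transform
    rate = empirical_bits(y_hat)
    distortion = float(np.mean((x - x_hat) ** 2))
    return rate + lam * distortion, rate, distortion   # R + lambda * D

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=64)
    A = rng.normal(scale=0.2, size=(16, 64))   # toy analysis transform
    B = np.linalg.pinv(A)                      # toy synthesis transform
    loss, rate, dist = rd_loss(x, A, B)
    print(f"rate={rate:.1f} bits  distortion={dist:.4f}  loss={loss:.3f}")
```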

    The density connectivity information bottleneck

    Clustering with the agglomerative Information Bottleneck (aIB) algorithm suffers from a sub-optimality problem: it cannot guarantee to preserve as much relevant information as possible. To handle this problem, we introduce a density connectivity chain, by which we consider not only the information between two data elements, but also the information among the neighbors of a data element. Based on this idea, we propose DCIB, a Density Connectivity Information Bottleneck algorithm that applies the Information Bottleneck method to quantify the relevant information during the clustering procedure. As a hierarchical algorithm, the DCIB algorithm produces a pruned clustering tree structure and obtains clustering results at different sizes in a single execution. Experimental results on document clustering indicate that the DCIB algorithm preserves more relevant information and achieves higher precision than the aIB algorithm.
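    The agglomerative Information Bottleneck step that DCIB modifies can be sketched as follows: at each iteration, the pair of clusters whose merge loses the least relevant information is merged, with the loss measured by a weighted Jensen-Shannon divergence between the clusters' conditional distributions p(Y|c). The density-connectivity restriction on candidate pairs is not modelled in this toy example.

```python
# Compute the aIB merge cost for every cluster pair and pick the cheapest one.
import numpy as np

def kl(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def merge_cost(w1, w2, py1, py2):
    # (p(c1)+p(c2)) * JS(p(Y|c1), p(Y|c2)): the relevant information lost by merging.
    pi1, pi2 = w1 / (w1 + w2), w2 / (w1 + w2)
    m = pi1 * py1 + pi2 * py2
    return (w1 + w2) * (pi1 * kl(py1, m) + pi2 * kl(py2, m))

if __name__ == "__main__":
    # Three documents (singleton clusters) over a 3-word relevance variable Y.
    weights = [1 / 3, 1 / 3, 1 / 3]
    p_y_given_c = [np.array([0.8, 0.1, 0.1]),
                   np.array([0.7, 0.2, 0.1]),
                   np.array([0.1, 0.1, 0.8])]
    costs = {(i, j): merge_cost(weights[i], weights[j], p_y_given_c[i], p_y_given_c[j])
             for i in range(3) for j in range(i + 1, 3)}
    print(min(costs, key=costs.get), "is the cheapest merge")  # the two similar documents
```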

    High compression image and image sequence coding

    The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number as much as possible and to reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanisms of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway, combined with the separate processing of contours and textures, has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second-generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics, and scene analysis.
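    A toy illustration of the contour-texture separation underlying these second-generation methods: contours are detected and kept at relatively high fidelity, while textures are represented very coarsely (here, one mean per block), which is where the large compression ratios come from. The gradient threshold and block size below are arbitrary choices, not taken from any specific method in the paper.

```python
# Split an image into a contour map and a coarse block-mean texture model.
import numpy as np

def contour_texture_split(image, edge_thresh=20.0, block=8):
    gy, gx = np.gradient(image.astype(float))
    contours = np.hypot(gx, gy) > edge_thresh          # cheap contour map
    h, w = image.shape
    texture = np.empty((h, w), dtype=float)
    for y in range(0, h, block):                       # coarse texture model:
        for x in range(0, w, block):                   # one mean per block
            texture[y:y + block, x:x + block] = image[y:y + block, x:x + block].mean()
    return contours, texture

if __name__ == "__main__":
    img = np.zeros((64, 64)) + 60
    img[16:48, 16:48] = 180                            # object on a background
    contours, texture = contour_texture_split(img)
    print("contour pixels:", int(contours.sum()), " texture blocks:", (64 // 8) ** 2)
```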