2,605 research outputs found

    Coding of details in very low bit-rate video systems

    Get PDF
    In this paper, the importance of including small image features at the initial levels of a progressive second generation video coding scheme is presented. It is shown that a number of meaningful small features called details should be coded, even at very low data bit-rates, in order to match their perceptual significance to the human visual system. We propose a method for extracting, perceptually selecting and coding of visual details in a video sequence using morphological techniques. Its application in the framework of a multiresolution segmentation-based coding algorithm yields better results than pure segmentation techniques at higher compression ratios, if the selection step fits some main subjective requirements. Details are extracted and coded separately from the region structure and included in the reconstructed images in a later stage. The bet of considering the local background of a given detail for its perceptual selection breaks the concept ofPeer ReviewedPostprint (published version

    Prediction of image partitions using Fourier descriptors: application to segmentation-based coding schemes

    Get PDF
    This paper presents a prediction technique for partition sequences. It uses a region-by-region approach that consists of four steps: region parameterization, region prediction, region ordering, and partition creation. The time evolution of each region is divided into two types: regular motion and shape deformation. Both types of evolution are parameterized by means of the Fourier descriptors and they are separately predicted in the Fourier domain. The final predicted partition is built from the ordered combination of the predicted regions, using morphological tools. With this prediction technique, two different applications are addressed in the context of segmentation-based coding approaches. Noncausal partition prediction is applied to partition interpolation, and examples using complete partitions are presented. In turn, causal partition prediction is applied to partition extrapolation for coding purposes, and examples using complete partitions as well as sequences of binary images-shape information in video object planes (VOPs)-are presented.Peer ReviewedPostprint (published version

    Exploiting surroundedness for saliency detection: a boolean map approach

    Full text link
    We demonstrate the usefulness of surroundedness for eye fixation prediction by proposing a Boolean Map based Saliency model (BMS). In our formulation, an image is characterized by a set of binary images, which are generated by randomly thresholding the image's feature maps in a whitened feature space. Based on a Gestalt principle of figure-ground segregation, BMS computes a saliency map by discovering surrounded regions via topological analysis of Boolean maps. Furthermore, we draw a connection between BMS and the Minimum Barrier Distance to provide insight into why and how BMS can properly captures the surroundedness cue via Boolean maps. The strength of BMS is verified by its simplicity, efficiency and superior performance compared with 10 state-of-the-art methods on seven eye tracking benchmark datasets.US National Science Foundation; 1059218; 1029430http://cs-people.bu.edu/jmzhang/BMS/BMS_iccv13_preprint.pdfAccepted manuscrip

    Significance linked connected component analysis plus

    Get PDF
    Dr. Xinhua Zhuang, Dissertation Supervisor.Field of Study: Computer Science."May 2018."An image coding algorithm, SLCCA Plus, is introduced in this dissertation. SLCCA Plus is a wavelet-based subband coding method. In wavelet-based subband coding, the input images will go through a wavelet transform and be decomposed into wavelet subband pyramids. Then the characteristics of the wavelet coefficients within and among subbands will be utilized to removing the redundancy. The rest information will be organized and go through entropy encoding. SLCCA Plus contains a series improvement method to the SLCCA. Before SLCCA, there are three top-ranked wavelet image coders. Namely, Embedded Zerotree Wavelet coder (EZW), Morphological Representation of Wavelet Date (MEWD), and Set Partitioning in Hierarchical Trees (SPIHT). They exploit either inter-subband relation among zero wavelet coefficients or within-subband clustering. SLCCA, on the other hand, outperforms these three coders by exploring both the inter- subband coefficients relations and within-subband clustering of significant wavelet coefficients. SLCCA Plus strengthens SLCCA in the following aspects: Intelligence quantization, enhanced cluster filter, potential-significant shared-zero, and improved context models. The purpose of the first three improvements is to remove redundancy information further while keeping the image error as low as possible. As a result, they achieve a better trade-off between bit cost and image quality. Moreover, the improved context lowers the entropy by refining the classification of symbols in cluster sequence and magnitude bit-planes. Lower entropy means the adaptive arithmetic coding can achieve a better coding gain. For performance evaluation, SLCCA Plus is compared to SLCCA and JPEG2000. On average, SLCCA Plus achieves 7% bit saving over JPEG 2000 and 4% over SLCCA. The results comparison shows that SLCCA Plus shows more texture and edge details at a lower bitrate.Includes bibliographical references (pages 88-92)

    Learning to Sieve: Prediction of Grading Curves from Images of Concrete Aggregate

    Get PDF
    A large component of the building material concrete consists of aggregate with varying particle sizes between 0.125 and 32 mm. Its actual size distribution significantly affects the quality characteristics of the final concrete in both, the fresh and hardened states. The usually unknown variations in the size distribution of the aggregate particles, which can be large especially when using recycled aggregate materials, are typically compensated by an increased usage of cement which, however, has severe negative impacts on economical and ecological aspects of the concrete production. In order to allow a precise control of the target properties of the concrete, unknown variations in the size distribution have to be quantified to enable a proper adaptation of the concrete's mixture design in real time. To this end, this paper proposes a deep learning based method for the determination of concrete aggregate grading curves. In this context, we propose a network architecture applying multi-scale feature extraction modules in order to handle the strongly diverse object sizes of the particles. Furthermore, we propose and publish a novel dataset of concrete aggregate used for the quantitative evaluation of our method

    Segmentation-based mesh design for motion estimation

    Get PDF
    Dans la plupart des codec vidéo standard, l'estimation des mouvements entre deux images se fait généralement par l'algorithme de concordance des blocs ou encore BMA pour « Block Matching Algorithm ». BMA permet de représenter l'évolution du contenu des images en décomposant normalement une image par blocs 2D en mouvement translationnel. Cette technique de prédiction conduit habituellement à de sévères distorsions de 1'artefact de bloc lorsque Ie mouvement est important. De plus, la décomposition systématique en blocs réguliers ne dent pas compte nullement du contenu de l'image. Certains paramètres associes aux blocs, mais inutiles, doivent être transmis; ce qui résulte d'une augmentation de débit de transmission. Pour paillier a ces défauts de BMA, on considère les deux objectifs importants dans Ie codage vidéo, qui sont de recevoir une bonne qualité d'une part et de réduire la transmission a très bas débit d'autre part. Dans Ie but de combiner les deux exigences quasi contradictoires, il est nécessaire d'utiliser une technique de compensation de mouvement qui donne, comme transformation, de bonnes caractéristiques subjectives et requiert uniquement, pour la transmission, l'information de mouvement. Ce mémoire propose une technique de compensation de mouvement en concevant des mailles 2D triangulaires a partir d'une segmentation de l'image. La décomposition des mailles est construite a partir des nœuds repartis irrégulièrement Ie long des contours dans l'image. La décomposition résultant est ainsi basée sur Ie contenu de l'image. De plus, étant donné la même méthode de sélection des nœuds appliquée à l'encodage et au décodage, la seule information requise est leurs vecteurs de mouvement et un très bas débit de transmission peut ainsi être réalise. Notre approche, comparée avec BMA, améliore à la fois la qualité subjective et objective avec beaucoup moins d'informations de mouvement. Dans la premier chapitre, une introduction au projet sera présentée. Dans Ie deuxième chapitre, on analysera quelques techniques de compression dans les codec standard et, surtout, la populaire BMA et ses défauts. Dans Ie troisième chapitre, notre algorithme propose et appelé la conception active des mailles a base de segmentation, sera discute en détail. Ensuite, les estimation et compensation de mouvement seront décrites dans Ie chapitre 4. Finalement, au chapitre 5, les résultats de simulation et la conclusion seront présentés.Abstract: In most video compression standards today, the generally accepted method for temporal prediction is motion compensation using block matching algorithm (BMA). BMA represents the scene content evolution with 2-D rigid translational moving blocks. This kind of predictive scheme usually leads to distortions such as block artefacts especially when the motion is important. The two most important aims in video coding are to receive a good quality on one hand and a low bit-rate on the other. This thesis proposes a motion compensation scheme using segmentation-based 2-D triangular mesh design method. The mesh is constructed by irregularly spread nodal points selected along image contour. Based on this, the generated mesh is, to a great extent, image content based. Moreover, the nodes are selected with the same method on the encoder and decoder sides, so that the only information that has to be transmitted are their motion vectors, and thus very low bit-rate can be achieved. Compared with BMA, our approach could improve subjective and objective quality with much less motion information."--Résumé abrégé par UM
    • …
    corecore