
    Compression of 3D models with NURBS

    With recent progress in computing, algorithmics and telecommunications, 3D models are increasingly used in various multimedia applications, such as visualization, gaming, entertainment and virtual reality. In the multimedia domain, 3D models have traditionally been represented as polygonal meshes. This piecewise planar representation can be thought of as the analogue of bitmap images for 3D surfaces. Like bitmap images, meshes enjoy great flexibility and are particularly well suited to describing information captured from the real world, for instance through scanning processes. They suffer, however, from the same shortcomings: limited resolution and large storage size. The compression of polygonal meshes has been a very active field of research over the last decade, and rather efficient compression algorithms have been proposed in the literature that greatly mitigate the high storage cost. However, such a low-level description of a 3D shape has bounded performance; more efficient compression should be reachable through the use of higher-level primitives.

    This idea has been explored to a great extent in the context of model-based coding of visual information. In such an approach, a higher-level representation (e.g., a 3D model of a talking head) is obtained through analysis methods, which can be seen as an inverse projection problem. Once this task is fulfilled, the resulting model parameters are coded instead of the original information. It is believed that if the analysis module is efficient enough, the total cost of coding (in a rate-distortion sense) will be greatly reduced. The relatively poor performance and high complexity of currently available analysis methods (except for specific cases where a priori knowledge about the nature of the objects is available) have held back wide deployment of coding techniques based on this approach. Progress in computer graphics has, however, changed this situation: nowadays an increasing amount of image, video and 3D content is generated by synthesis processing rather than captured by a device such as a camera or a scanner. The underlying model in the synthesis stage can thus be used for efficient coding without the need for a complex analysis module. In other words, it would be a mistake to compress a low-level description (e.g., a polygonal mesh) when a higher-level one (e.g., a parametric surface) is available from the synthesis process. This is, however, what is usually done in the multimedia domain, where higher-level 3D model descriptions are converted to polygonal meshes, if only for the lack of standard coded formats for the former.

    On a parallel but related path, the way we consume audio-visual information is changing: as opposed to the recent past and a large part of today's applications, interactivity is becoming a key element. In the context of interest in this dissertation, this means that when coding visual information (an image or a video, for instance), previously obvious choices such as the sampling parameters are no longer so obvious. In an interactive environment the effective display resolution can be controlled by the user through zooming, so there is no clear optimal setting for the sampling period. Because of interactivity, the representation used to code the scene should allow the display of objects at a variety of resolutions, ideally arbitrarily fine ones. Extensive over-sampling could resolve this, but that approach is unrealistic and too expensive in many situations; the alternative is a resolution-independent representation. In the realm of 3D modeling, such representations are usually available when the models are created by an artist on a computer.

    The scope of this dissertation is precisely the compression of 3D models in such higher-level forms. Direct coding in these forms should yield improved rate-distortion performance while providing a large degree of resolution independence. So far there has been no major attempt to efficiently compress such representations, in particular parametric surfaces; this thesis proposes a solution to fill this gap. Among the variety of higher-level 3D representations, parametric surfaces are a popular choice among designers, and within parametric surfaces, Non-Uniform Rational B-Splines (NURBS) enjoy great popularity, as a wide range of NURBS-based modeling tools is readily available. NURBS have recently been included in the Virtual Reality Modeling Language (VRML) and its next-generation descendant, eXtensible 3D (X3D). Their attractive properties and widespread use have led us to choose them as the form used for the coded representation.

    The primary goal of this dissertation is the definition of a system for coding 3D NURBS models with guaranteed distortion. The basis of the system is entropy-coded differential pulse code modulation (DPCM). In the case of NURBS, guaranteeing the distortion is not trivial, as some of the parameters (e.g., the knots) have a complicated influence on the overall surface distortion. To this end, a detailed distortion analysis is performed; in particular, previously unknown relations between the distortion of the knots and the resulting surface distortion are demonstrated. Compression efficiency is pursued at every stage, and simple yet efficient entropy coder realizations are defined. The special case of degenerate and closed surfaces with duplicate control points is addressed, and a simple yet efficient scheme is proposed to compress the duplicate relationships. Encoder aspects are also analyzed: optimal predictors are found that perform well across a wide class of models, and simplification techniques are considered for improved compression efficiency at negligible distortion cost. Finally, transmission over error-prone channels is considered and an error-resilient extension is defined: the data stream is partitioned by independently coding small groups of surfaces and inserting the necessary resynchronization markers, and simple strategies for achieving the desired level of protection are proposed. The same extension also serves the purposes of random access and on-the-fly reordering of the data stream.
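    As a concrete illustration of the entropy-coded DPCM backbone mentioned above, the following is a minimal sketch, not the dissertation's actual coder: control points are predicted from the previously reconstructed point and only the uniformly quantized residuals are kept (these would then feed an entropy coder). The function names and the fixed previous-point predictor are assumptions made for the example.

```python
import numpy as np

def dpcm_encode(points, step):
    """DPCM-encode a sequence of control points: predict each point from
    the previously reconstructed one and uniformly quantize the residual.
    The integer residuals are what an entropy coder would then compress."""
    residuals = np.empty(points.shape, dtype=np.int64)
    prev = np.zeros(points.shape[1])            # predictor state (reconstructed)
    for i, p in enumerate(points):
        residuals[i] = np.round((p - prev) / step)
        prev = prev + residuals[i] * step       # mirror the decoder's reconstruction
    return residuals

def dpcm_decode(residuals, step):
    # Reconstruction is the running sum of dequantized residuals.
    return np.cumsum(residuals * step, axis=0)

# A uniform quantizer with step q bounds each coordinate error by q/2,
# which is the mechanism behind a per-parameter distortion guarantee.
ctrl = np.random.rand(16, 3)                    # toy 3D control points
step = 0.01
rec = dpcm_decode(dpcm_encode(ctrl, step), step)
assert np.max(np.abs(rec - ctrl)) <= step / 2 + 1e-12
```

    A real coder would replace the fixed previous-point predictor with the optimized predictors studied in the dissertation, and such a per-coordinate bound does not carry over directly to parameters like knots, which is precisely why a dedicated distortion analysis is needed there.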

    Codage d'images avec et sans pertes à basse complexité et basé contenu (Low-complexity, content-based lossy and lossless image coding)

    This doctoral research project aims at designing an improved version of the still-image codec LAR (Locally Adaptive Resolution), in terms of both compression performance and complexity. Several image compression standards have been proposed and are used in multimedia applications, but research continues toward higher coding quality and/or lower coding cost. JPEG was standardized twenty years ago, yet it is still a widely used compression format today. Despite its better coding efficiency, the adoption of JPEG 2000 has been limited by its higher computational cost compared to JPEG. In 2008, the JPEG committee announced a Call for Advanced Image Coding (AIC), aiming to standardize technologies going beyond the existing JPEG standards; the LAR codec was proposed as one response to this call. The LAR framework seeks to combine compression efficiency with a content-based representation, and it supports both lossy and lossless coding within the same structure. However, at the beginning of this study the LAR codec did not implement rate-distortion optimization (RDO), a shortcoming that was detrimental to LAR during the AIC evaluation step. Thus, this work first characterizes the impact of the main parameters of the codec on compression efficiency, and then constructs RDO models to configure the LAR parameters so as to achieve optimal or near-optimal coding efficiency. Further, based on the RDO models, a "quality constraint" method is introduced to encode an image at a given target MSE/PSNR; the accuracy of the proposed technique, estimated as the ratio between the error variance and the setpoint, is about 10%. In addition, subjective quality is taken into consideration and the RDO models are applied locally in the image rather than globally, improving perceptual quality with a significant gain as measured by the objective quality metric SSIM (structural similarity). Finally, aiming at a low-complexity and efficient image codec, a new coding scheme is proposed for the lossless mode of the LAR framework. In this context, all the coding steps are redesigned for a better final compression ratio, and a new classification module is introduced to decrease the entropy of the prediction errors. Experiments show that this lossless codec achieves a compression ratio equivalent to that of JPEG 2000 while saving 76% of the encoding and decoding time on average.
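    To make the quality-constraint idea concrete, here is a toy sketch that inverts a distortion model to find a coding parameter hitting a target PSNR. It is an assumption for illustration only: the actual LAR RDO models are codec-specific, whereas this uses the classical high-rate approximation MSE ≈ q²/12 for a uniform quantizer with step q, and the function name is hypothetical.

```python
import math

def step_for_target_psnr(target_psnr_db, peak=255.0):
    # PSNR = 10*log10(peak^2 / MSE)  =>  MSE = peak^2 / 10^(PSNR/10)
    target_mse = peak ** 2 / 10 ** (target_psnr_db / 10)
    # High-rate model for a uniform quantizer: MSE ~= q^2 / 12  =>  solve for q
    return math.sqrt(12.0 * target_mse)

print(step_for_target_psnr(40.0))   # ~8.8: quantization step aiming at ~40 dB PSNR
```

    The thesis's reported ~10% accuracy between the error variance and the setpoint reflects exactly this kind of model mismatch: the model predicts the parameter, and the achieved distortion deviates from the target by a bounded amount.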

    Compression d'images satellite par post-transformées dans le domaine ondelettes (Satellite image compression by post-transforms in the wavelet domain)

    The French Space Agency, CNES, is interested in transforms derived from wavelets in order to increase image compression efficiency on board Earth observation satellites. This thesis studies post-transforms, which are applied after the wavelet transform: each block of wavelet coefficients is further transformed in a basis selected from a dictionary by minimizing a rate-distortion criterion. First, we highlight dependencies between wavelet coefficients that limit compression efficiency. We then study the block bandelet transform, from which the post-transforms derive, and optimize its parameters for the compression of satellite images. In particular, we adapt the Shoham and Gersho optimization method to the selection of the best bandelet basis, and derive from these results an expression for the optimal Lagrange multiplier used in the rate-distortion criterion. Next, we analyze dependencies between wavelet coefficients that are not exploited by the bandelet transform and define new post-transform bases. Bases built by PCA minimize the correlations between post-transformed coefficients and compact the energy of each block onto a small number of coefficients, a property exploited during entropy coding. Finally, we modify the basis selection criterion to adapt the post-transform to progressive compression schemes, and employ the Hadamard post-transform within the CCSDS image encoder to obtain an efficient compression scheme of low computational complexity.
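    The basis selection step lends itself to a short sketch. The following illustrative Python is an assumption, not the thesis code (the two-basis toy dictionary, the nonzero-count rate proxy and the function name are made up for the example): for each block it picks the basis minimizing the Lagrangian cost J = D + λR, in the spirit of the Shoham and Gersho approach described above.

```python
import numpy as np

def select_basis(block, bases, lam, step):
    """Pick the basis in the dictionary minimizing J = D + lam * R for one
    block of wavelet coefficients, with uniform quantization of step `step`
    and the count of nonzero quantized coefficients as a crude rate proxy."""
    best = None
    for idx, B in enumerate(bases):            # B: orthonormal basis (columns)
        coeffs = B.T @ block                   # analysis transform
        q = np.round(coeffs / step)            # uniform quantization
        rec = B @ (q * step)                   # synthesis from dequantized coeffs
        dist = float(np.sum((block - rec) ** 2))
        rate = float(np.count_nonzero(q))
        cost = dist + lam * rate
        if best is None or cost < best[0]:
            best = (cost, idx, q)
    return best                                # (cost, basis index, quantized coeffs)

# Toy dictionary: identity and a 2-point Hadamard basis.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
bases = [np.eye(2), H]
block = np.array([3.0, 2.9])                   # strongly correlated samples
print(select_basis(block, bases, lam=0.1, step=0.5)[1])  # -> 1 (Hadamard wins)
```

    With correlated samples the Hadamard basis compacts the energy onto one coefficient, so its lower rate outweighs its slightly higher distortion; the thesis's contribution of an optimal Lagrange multiplier amounts to choosing λ so that this trade-off lands on the target operating point.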