1,527 research outputs found

    Towards visualization and searching :a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, that is instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution in achieving different optimization trade-offs, notably competitive performance with respect to the state-of-the-art HEVC standard in terms of both visualization and searching.
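The joint Lagrangian mode decision described in this abstract can be sketched as follows. Everything here is an illustrative assumption rather than the thesis's actual formulation: the function name, the cost terms (pixel distortion, rate, and a searching-performance penalty), the weights, and the candidate values are all hypothetical.

```python
# Hypothetical sketch of a joint Lagrangian mode decision combining
# visualization (distortion, rate) and searching (feature-loss) costs.
def lagrangian_mode_decision(candidates, lam_rate, lam_search):
    """Pick the coding mode minimizing D + lam_rate * R + lam_search * S.

    Each candidate is a dict with:
      'distortion'  - pixel-domain distortion (visualization penalty)
      'rate'        - bits needed to code the frame
      'search_loss' - penalty for degraded feature-matching performance
    """
    return min(
        candidates,
        key=lambda c: c["distortion"]
        + lam_rate * c["rate"]
        + lam_search * c["search_loss"],
    )

# Two toy candidates: a pixel-based mode and a feature-based mode.
modes = [
    {"name": "pixel", "distortion": 2.0, "rate": 100.0, "search_loss": 0.8},
    {"name": "feature", "distortion": 5.0, "rate": 40.0, "search_loss": 0.1},
]
best = lagrangian_mode_decision(modes, lam_rate=0.05, lam_search=10.0)
print(best["name"])  # with these weights the feature-based mode wins
```

Raising `lam_search` biases the decision towards the feature-based coding mode, which is the kind of trade-off knob the abstract's optimization framework exposes.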

    Transformées basées graphes pour la compression de nouvelles modalités d’image

    Due to the large availability of new camera types capturing extra geometrical information, as well as the emergence of new image modalities such as light fields and omni-directional images, a huge amount of high-dimensional data has to be stored and delivered. The ever-growing streaming and storage requirements of these new image modalities require novel image coding tools that exploit the complex structure of those data. This thesis aims at exploring novel graph-based approaches for adapting traditional image transform coding techniques to emerging data types where the sampled information lies on irregular structures. In a first contribution, novel local graph-based transforms are designed for compact light field representations. By leveraging a careful design of local transform supports and a local basis function optimization procedure, significant improvements in terms of energy compaction can be obtained. Nevertheless, the locality of the supports did not permit exploiting long-term dependencies of the signal. This led to a second contribution where different sampling strategies are investigated. Coupled with novel prediction methods, they led to very prominent results for quasi-lossless compression of static light fields. The third part of the thesis focuses on the definition of rate-distortion-optimized sub-graphs for the coding of omni-directional content. If we move further and give more degrees of freedom to the graphs we wish to use, we can learn or define a model (a set of weights on the edges) that might not be entirely reliable for transform design.
The last part of the thesis is dedicated to a theoretical analysis of the effect of this uncertainty on the efficiency of graph transforms.
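As background for the graph-based transforms this abstract relies on, a minimal graph Fourier transform (GFT) can be sketched on a toy path graph: the transform basis is the eigenvector set of the graph Laplacian, and a smooth graph signal compacts almost all of its energy into the low-frequency coefficients. The graph, the signal, and the energy-compaction check below are illustrative assumptions.

```python
import numpy as np

# Path graph 0-1-2-3 with unit edge weights: adjacency and combinatorial Laplacian.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W

# GFT basis: eigenvectors of L, ordered by increasing graph frequency (eigenvalue).
eigvals, U = np.linalg.eigh(L)

x = np.array([1.0, 1.1, 1.2, 1.3])   # a smooth signal living on the graph nodes
coeffs = U.T @ x                     # forward GFT
x_rec = U @ coeffs                   # inverse GFT (U is orthonormal)

# Smooth signals compact almost all energy into the first (DC) coefficient.
energy_ratio = coeffs[0] ** 2 / (coeffs ** 2).sum()
```

Energy compaction of exactly this kind is what the local transform supports and basis optimization in the first contribution aim to maximize.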

    High dynamic range video compression exploiting luminance masking


    Low bit-rate image sequence coding


    Discontinuity-Aware Base-Mesh Modeling of Depth for Scalable Multiview Image Synthesis and Compression

    This thesis is concerned with the challenge of deriving disparity from sparsely communicated depth for performing disparity-compensated view synthesis for compression and rendering of multiview images. The modeling of depth is essential for deducing disparity at view locations where depth is not available and is also critical for visibility reasoning and occlusion handling. This thesis first explores disparity derivation methods and disparity-compensated view synthesis approaches. Investigations reveal the merits of adopting a piece-wise continuous mesh description of depth for deriving disparity at target view locations to enable disparity-compensated backward warping of texture. Visibility information can be reasoned due to the correspondence relationship between views that a mesh model provides, while the connectivity of a mesh model assists in resolving depth occlusion. The recent JPEG 2000 Part-17 extension defines tools for scalable coding of discontinuous media using breakpoint-dependent DWT, where breakpoints describe discontinuity boundary geometry. This thesis proposes a method to efficiently reconstruct depth coded using JPEG 2000 Part-17 as a piece-wise continuous mesh, where discontinuities are driven by the encoded breakpoints. Results show that the proposed mesh can accurately represent decoded depth while its complexity scales along with decoded depth quality. The piece-wise continuous mesh model anchored at a single viewpoint or base-view can be augmented to form a multi-layered structure where the underlying layers carry depth information of regions that are occluded at the base-view. Such a consolidated mesh representation is termed a base-mesh model and can be projected to many viewpoints, to deduce complete disparity fields between any pair of views that are inherently consistent. 
Experimental results demonstrate the superior performance of the base-mesh model in multiview synthesis and compression compared to other state-of-the-art methods, including the JPEG Pleno light field codec. The proposed base-mesh model departs greatly from the conventional pixel-wise or block-wise depth models, and their forward depth mapping for deriving disparity, ingrained in existing multiview processing systems. When performing disparity-compensated view synthesis, there can be regions for which reference texture is unavailable, and inpainting is required. A new depth-guided texture inpainting algorithm is proposed to restore occluded texture in regions where depth information is either available or can be inferred using the base-mesh model.
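The disparity-compensated backward warping this abstract builds on can be illustrated on a toy rectified 1-D scanline: for a rectified pair, disparity follows from depth as d = f·B/Z, and backward warping samples the reference texture at x − d. The focal length, baseline, and depth values below are assumed, and the real base-mesh model operates on a mesh description of depth rather than the per-pixel depth used in this sketch.

```python
import numpy as np

f, B = 500.0, 0.1                      # focal length (px) and baseline (m), assumed
ref_tex = np.arange(16, dtype=float)   # 1-D reference texture scanline
depth = np.full(16, 5.0)               # a constant frontal plane at 5 m

disparity = f * B / depth              # d = f * B / Z  -> 10 px everywhere
x = np.arange(16)
src = x - disparity                    # backward-warp sample positions in the reference

# Positions that fall outside the reference view are disoccluded: no texture is
# available there and, as the abstract notes, inpainting would be required.
valid = (src >= 0) & (src <= 15)
synth = np.where(valid, np.interp(src, x, ref_tex), np.nan)
```

Backward warping of this kind needs disparity at the *target* view, which is exactly what projecting the base-mesh model to that viewpoint provides.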

    Towards Computational Efficiency of Next Generation Multimedia Systems

    To address the throughput demands of complex applications (like multimedia), a next-generation system designer needs to co-design and co-optimize the hardware and software layers. Hardware/software knobs must be tuned in synergy to increase throughput efficiency. This thesis provides such algorithmic and architectural solutions while considering new technology challenges (power caps and memory aging). The goal is to maximize throughput efficiency under timing and hardware constraints.

    Study of phase noise in optical coherent systems

    Phase noise is an important issue in designing today's optical coherent systems. Although phase noise has been studied extensively in wireless communications, some aspects of it are novel in optical coherent systems. In this thesis we explore phase noise statistics in optical coherent systems and propose a novel technique to increase system robustness toward phase noise. Our first contribution deals with the study of phase noise statistics in the presence of electronic chromatic dispersion (CD) compensation in coherent systems. We show that the previously proposed model for phase noise and CD interaction must be modified due to its overly simple model of carrier phase recovery. We derive a more accurate expression for the estimated phase noise of decision-directed (DD) carrier phase recovery, and use this expression to modify the decision statistics of received symbols. We calculate the bit error rate (BER) of a differential quadrature phase shift keying (DQPSK) system semi-analytically using our modified decision statistics and show that for ideal DD carrier phase recovery the semi-analytical BER matches the BER simulated via the Monte Carlo (MC) technique. We also show that the semi-analytical BER is a lower bound on the BER simulated with Viterbi-Viterbi (VV) carrier phase recovery for a wide range of practical system parameters.
    Our second contribution is concerned with adapting a multi-level coded modulation (MLCM) technique for phase-noise- and additive-white-Gaussian-noise (AWGN)-limited coherent systems. We show that the combination of a phase-noise-optimized constellation with MLCM offers a phase-noise-robust system at moderate complexity. We propose a numerical method for designing the set partitioning (mapping bits to symbols) and optimizing code rates for minimum block error rate (BLER). We verify MLCM performance in coherent systems with 16-ary constellations impaired by nonlinear and Wiener phase noise. For nonlinear phase noise, the superior performance of our MLCM design over a previously designed MLCM system is demonstrated in terms of BLER. For Wiener phase noise, we compare optimized and square 16-QAM constellations assuming either MLCM or uniform-rate coding. We compare post forward error correction (FEC) BER in addition to BLER by both simulation and experiment, and show that superior BLER performance translates into superior post-FEC BER. Our experimental post-FEC BER results follow the same trends as simulated BER, validating our design.
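The Wiener phase noise and decision-directed recovery discussed in this abstract can be sketched with a toy noiseless QPSK simulation. The linewidth-symbol-time product, the first-order loop with a fixed gain, and the absence of AWGN are all simplifying assumptions; the thesis's actual DD recovery and decision statistics are more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
lw_times_T = 1e-4                       # linewidth x symbol-period product (assumed)
sigma = np.sqrt(2 * np.pi * lw_times_T) # per-symbol phase-increment std
phase = np.cumsum(rng.normal(0.0, sigma, n))  # Wiener (random-walk) phase noise

bits = rng.integers(0, 4, n)
syms = np.exp(1j * (np.pi / 4 + np.pi / 2 * bits))  # Gray-free QPSK constellation
rx = syms * np.exp(1j * phase)                      # phase-noise-impaired symbols

# Naive decision-directed recovery: rotate by the running phase estimate,
# decide the nearest QPSK symbol, then nudge the estimate toward the residual.
est = 0.0
errors = 0
for k in range(n):
    r = rx[k] * np.exp(-1j * est)
    dec = int(np.round((np.angle(r) - np.pi / 4) / (np.pi / 2))) % 4
    errors += dec != bits[k]
    ideal = np.exp(1j * (np.pi / 4 + np.pi / 2 * dec))
    est += 0.1 * np.angle(r * np.conj(ideal))       # first-order loop, gain 0.1

symbol_error_rate = errors / n
```

With no additive noise the loop tracks the slow phase walk easily; adding AWGN and shrinking the loop gain is where the interplay between recovery quality and decision statistics, which the first contribution analyzes, starts to matter.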