1,974 research outputs found

    Transparent encryption with scalable video communication: Lower-latency, CABAC-based schemes

    Get PDF
    Selective encryption masks all of the content without completely hiding it, as full encryption would do at a cost in encryption delay and increased bandwidth. Many commercial applications of video encryption do not even require selective encryption, because greater utility can be gained from transparent encryption, i.e. allowing prospective viewers to glimpse a reduced quality version of the content as a taster. Our lightweight selective encryption scheme when applied to scalable video coding is well suited to transparent encryption. The paper illustrates the gains in reducing delay and increased distortion arising from a transparent encryption that leaves reduced quality base layer in the clear. Reduced encryption of B-frames is a further step beyond transparent encryption in which the computational overhead reduction is traded against content security and limited distortion. This spectrum of video encryption possibilities is analyzed in this paper, though all of the schemes maintain decoder compatibility and add no bitrate overhead as a result of jointly encoding and encrypting the input video by virtue of carefully selecting the entropy coding parameters that are encrypted. The schemes are suitable both for H.264 and HEVC codecs, though demonstrated in the paper for H.264. Selected Content Adaptive Binary Arithmetic Coding (CABAC) parameters are encrypted by a lightweight Exclusive OR technique, which is chosen for practicality

    Highly efficient low-level feature extraction for video representation and retrieval.

    Get PDF
    PhDWitnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and the robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm that runs in real time maintaining the high precision and recall of the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords

    Contributions in image and video coding

    Get PDF
    Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vídeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vídeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vídeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vídeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saída totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vídeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vídeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuídos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte típicas de vídeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e híbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os níveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os níveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric

    Digital rights management techniques for H.264 video

    Get PDF
    This work aims to present a number of low-complexity digital rights management (DRM) methodologies for the H.264 standard. Initially, requirements to enforce DRM are analyzed and understood. Based on these requirements, a framework is constructed which puts forth different possibilities that can be explored to satisfy the objective. To implement computationally efficient DRM methods, watermarking and content based copy detection are then chosen as the preferred methodologies. The first approach is based on robust watermarking which modifies the DC residuals of 4×4 macroblocks within I-frames. Robust watermarks are appropriate for content protection and proving ownership. Experimental results show that the technique exhibits encouraging rate-distortion (R-D) characteristics while at the same time being computationally efficient. The problem of content authentication is addressed with the help of two methodologies: irreversible and reversible watermarks. The first approach utilizes the highest frequency coefficient within 4×4 blocks of the I-frames after CAVLC en- tropy encoding to embed a watermark. The technique was found to be very effect- ive in detecting tampering. The second approach applies the difference expansion (DE) method on IPCM macroblocks within P-frames to embed a high-capacity reversible watermark. Experiments prove the technique to be not only fragile and reversible but also exhibiting minimal variation in its R-D characteristics. The final methodology adopted to enforce DRM for H.264 video is based on the concept of signature generation and matching. Specific types of macroblocks within each predefined region of an I-, B- and P-frame are counted at regular intervals in a video clip and an ordinal matrix is constructed based on their count. The matrix is considered to be the signature of that video clip and is matched with longer video sequences to detect copies within them. Simulation results show that the matching methodology is capable of not only detecting copies but also its location within a longer video sequence. Performance analysis depict acceptable false positive and false negative rates and encouraging receiver operating charac- teristics. Finally, the time taken to match and locate copies is significantly low which makes it ideal for use in broadcast and streaming applications

    Platforms for handling and development of audiovisual data

    Get PDF
    Estágio realizado na MOG Solutions e orientado por Vítor TeixeiraTese de mestrado integrado. Engenharia Informátca e Computação. Faculdade de Engenharia. Universidade do Porto. 200

    Analysis and Comparison of Modern Video Compression Standards for Random-access Light-field Compression

    Get PDF
    Light-field (LF) 3D displays are anticipated to be the next-generation 3D displays by providing smooth motion parallax, wide field of view (FOV), and higher depth range than the current autostereoscopic displays. The projection-based multi-view LF 3D displays bring the desired new functionalities through a set of projection engines creating light sources for the continuous light field to be created. Such displays require a high number of perspective views as an input to fully exploit the visualization capabilities and viewing angle provided by the LF technology. Delivering, processing and de/compressing this amount of views pose big technical challenges. However, when processing light fields in a distributed system, access patterns in ray space are quite regular, some processing nodes do not need all views, moreover the necessary views are used only partially. This trait could be exploited by partial decoding of pictures to help providing less complex and thus real-time operation. However, none of the recent video coding standards (e.g., Advanced Video Coding (AVC)/H.264 and High Efficiency Video Coding (HEVC)/H.265 standards) provides partial decoding of video pictures. Such feature can be achieved by partitioning video pictures into partitions that can be processed independently at the cost of lowering the compression efficiency. Examples of such partitioning features introduced by the modern video coding standards include slices and tiles, which enable random access into the video bitstreams with a specific granularity. In addition, some extra requirements have to be imposed on the standard partitioning tools in order to be applicable in the context of partial decoding. This leads to partitions called self-contained which refers to isolated or independently decodable regions in the video pictures. This work studies the problem of creating self-contained partitions in the conventional AVC/H.264 and HEVC/H.265 standards, and HEVC 3D extensions including multi-view (i.e., MV-HEVC) and 3D (i.e., 3D-HEVC) extensions using slices and tiles, respectively. The requirements that need to be fulfilled in order to build self-contained partitions are described, and an encoder-side solution is proposed. Further, the work examines how slicing/tiling can be used to facilitate random access into the video bitstreams, how the number of slices/tiles affects the compression ratio considering different prediction structures, and how much effect partial decoding has on decoding time. Overall, the experimental results indicate that the finer the partitioning is, the higher the compression loss occurs. The usage of self-contained partitions makes the decoding operation very efficient and less complex

    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission

    Get PDF
    corecore