    A Family of Hierarchical Encoding Techniques for Image and Video Communications

    As the demand for image and video transmission and interactive multimedia applications continues to grow, scalable image and video compression that behaves robustly over unreliable channels is of increasing interest. These desktop applications require scalability as a main feature because of their heterogeneous nature: participants in an interactive multimedia application have different needs and processing power. Also, the complexity of the encoding and decoding algorithms must be low, given the practical constraints of low-cost, low-power receiver terminals. This requires image and video encoding techniques that jointly consider compression, scalability, robustness, and simplicity. In this dissertation, we present a family of image and video encoding techniques developed to support conferencing applications. We achieve scalability, robustness, and low computational complexity by building our encoding techniques on the quadtree and octree representation methods. First, we developed an image encoding technique using the quadtree representation of images and vector quantization. We use a mean-removal technique to separate the means image and the difference image. The difference image is then encoded as a breadth-first traversal of the quadtree corresponding to the image. Vector quantization is then used to compress the quadtree nodes based on the spatial locality of the quadtree data. Our next step was to use the quadtree-based image encoding technique as a base for developing a differential video encoding technique. We extended it to encode video by applying the well-known IPB technique to the image encoding system. We then explored another method of extending our image encoding technique to encode video streams. The basic idea was to use exactly the same three steps used in our image encoding technique (mean removal, conversion to a tree structure, and vector quantization) and to replace the quadtree structure with an octree structure. The octree is the three-dimensional equivalent of the quadtree. We divide the sequence of frames into groups and view each group as a three-dimensional object. By encoding frames together, we can obtain substantial savings in encoding time and better compression results. Finally, we combined the differential quadtree and octree approaches to generate a new hybrid encoding technique. We encode one frame using the quadtree-based image encoding technique, and then encode the following group of frames as a differential octree based upon the first frame. In a set of experiments, the quadtree-based image encoding and differential video encoding techniques were shown to provide reasonable compression in comparison with similar techniques, while the octree and hybrid video encoding techniques gave impressive compression results. Furthermore, we demonstrated that our encoding techniques are time efficient compared to the more common frequency-based techniques. Their scalability also compares favorably with that of other well-known scalable techniques. Moreover, we demonstrated their ability to tolerate and conceal errors. The new encoding techniques proved to be efficient methods of encoding for interactive multimedia applications.
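
    The first two steps named in the abstract above (mean removal, then a breadth-first quadtree traversal of the difference image) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the single global mean standing in for the means image, the `tol` uniformity test, and the node tuple layout are all assumptions made for illustration, and the vector quantization step is omitted.

```python
from collections import deque

def block_mean(img, x, y, size):
    """Average of the size x size block whose top-left corner is (x, y)."""
    total = sum(img[r][c] for r in range(y, y + size) for c in range(x, x + size))
    return total / (size * size)

def quadtree_bfs(img, min_size=1, tol=0.0):
    """Mean-remove a square image (side a power of two), then list the
    quadtree nodes of the difference image in breadth-first order.

    Returns (mean, nodes); each node is (x, y, size, avg_residual, is_leaf).
    Assumption: one global mean is removed, a simplification of the
    'means image' mentioned in the abstract.
    """
    n = len(img)
    mean = block_mean(img, 0, 0, n)
    diff = [[p - mean for p in row] for row in img]

    nodes = []
    queue = deque([(0, 0, n)])  # FIFO queue gives breadth-first order
    while queue:
        x, y, size = queue.popleft()
        vals = [diff[r][c] for r in range(y, y + size) for c in range(x, x + size)]
        avg = sum(vals) / len(vals)
        # A block is a leaf when it is uniform enough or cannot be split further.
        is_leaf = size <= min_size or max(vals) - min(vals) <= tol
        nodes.append((x, y, size, avg, is_leaf))
        if not is_leaf:
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    queue.append((x + dx, y + dy, half))
    return mean, nodes
```

    Because the nodes are emitted level by level, a receiver that stops reading early still holds a coarse version of the whole image, which is one plausible reading of why a breadth-first order suits scalable transmission.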

    In-Band Disparity Compensation for Multiview Image Compression and View Synthesis


    Optimized mobile thin clients through a MPEG-4 BiFS semantic remote display framework

    According to the thin client computing principle, the user interface is physically separated from the application logic. In practice, only a viewer component is executed on the client device, rendering the display updates received from the distant application server and capturing the user interaction. Existing remote display frameworks are not optimized to encode the complex scenes of modern applications, which are composed of objects with very diverse graphical characteristics. To tackle this challenge, we propose to transfer to the client, in addition to the binary encoded objects, semantic information about the characteristics of each object. With this semantic knowledge, the client can react autonomously to user input and does not have to wait for the display update from the server. Because it reduces the interaction latency and mitigates the bursty remote display traffic pattern, the presented framework is of particular interest in a wireless context, where bandwidth is limited and expensive. In this paper, we describe a generic architecture of a semantic remote display framework. Furthermore, we have developed a prototype using the MPEG-4 Binary Format for Scenes to convey the semantic information to the client. We experimentally compare the bandwidth consumption of MPEG-4 BiFS with existing, non-semantic, remote display frameworks. In a text editing scenario, we realize an average reduction of 23% of the data peaks that are observed in remote display protocol traffic.

    Towards a multimedia remote viewer for mobile thin clients

    Consider a traditional mobile user who wants to connect to a remote multimedia server. To give such users the same experience remotely (play, interact, edit, store and share capabilities) as in a traditional fixed LAN environment, several obstacles must be overcome: (1) heavy and heterogeneous content must be sent through a bandwidth-constrained network; (2) the displayed content should be of good quality; (3) user interaction should be processed in real time; and (4) the complexity of the practical solution should not exceed the capabilities of the mobile client in terms of CPU, memory and battery. The present paper takes up this challenge and presents a fully operational MPEG-4 BiFS solution.

    Motion and disparity estimation with self adapted evolutionary strategy in 3D video coding

    Real-world information obtained by humans is three-dimensional (3-D). In experimental user trials, subjective assessments have clearly demonstrated the increased impact of 3-D pictures compared to conventional flat-picture techniques. It is reasonable, therefore, that we humans want an imaging system that produces pictures that are as natural and real as the things we see and experience every day. Three-dimensional imaging, and hence 3-D television (3DTV), are very promising approaches expected to satisfy these desires. Integral imaging, which can capture true 3D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. In this paper, we propose a novel approach that uses an Evolutionary Strategy (ES) for joint motion and disparity estimation to compress 3D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and to jointly exploit motion and disparity redundancies to maximize compression using a self-adapted ES. A half-pixel refinement algorithm is then applied by interpolating macroblocks in the previous frame to further improve the video quality. Experimental results demonstrate that the proposed adaptable ES with Half Pixel Joint Motion and Disparity Estimation can achieve an objective quality gain of up to 1.5 dB without any additional computational cost over our previous algorithm. Furthermore, the proposed technique achieves objective quality similar to that of the full search algorithm while reducing the computational cost by up to 90%.
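
    For context, the full search baseline that the abstract compares against can be sketched as exhaustive block matching. This is a generic illustration, not the paper's evolutionary strategy and not its half-pixel refinement; the function names, the sum-of-absolute-differences (SAD) cost, and the integer-pixel search window are assumptions chosen for the example.

```python
def sad(cur, ref, bx, by, dx, dy, bsize):
    """Sum of absolute differences between the block at (bx, by) in the
    current frame and the block displaced by (dx, dy) in the reference frame."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(bsize)
        for c in range(bsize)
    )

def full_search(cur, ref, bx, by, bsize, radius):
    """Try every integer displacement within +/- radius and return
    (best_cost, dx, dy). Displacements leaving the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside = (
                0 <= by + dy and by + dy + bsize <= height
                and 0 <= bx + dx and bx + dx + bsize <= width
            )
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bsize)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

    The search evaluates up to (2 * radius + 1)^2 candidates per block, which is the cost that faster search strategies such as the one proposed in the abstract aim to reduce.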

    ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

    We present ARCHANGEL, a novel distributed-ledger-based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format-shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated. Comment: Accepted to CVPR Blockchain Workshop 201