
    Implementation of BMA based motion estimation hardware accelerator in HDL

    Motion estimation in MPEG (Motion Pictures Experts Group) video is a temporal prediction technique. The basic principle of motion estimation is that, in most cases, consecutive video frames will be similar except for changes induced by objects moving within the frames. Motion estimation performs a comprehensive 2-dimensional spatial search for each luminance macroblock (16x16 pixel block). MPEG does not define how this search should be performed; this is a detail the system designer can implement in one of many possible ways. It is well known that a full, exhaustive search over a wide 2-dimensional area yields the best matching results in most cases, but this performance comes at an extreme computational cost to the encoder. Lower-cost encoders may limit the pixel search range or use other techniques, usually at some cost to video quality, which gives rise to a trade-off. Such image processing algorithms are generally computationally expensive. FPGAs are capable of running graphics algorithms at speeds comparable to dedicated graphics chips, while remaining configurable through hardware description languages such as Verilog and VHDL. The work presented here focuses on a hardware accelerator for motion estimation based on the Block Matching Algorithm (BMA). The SAD (Sum of Absolute Differences) based full-search motion estimator, coded in Verilog HDL, uses a 32x32-pixel search area to find the best match for a single 16x16 macroblock.
    Keywords: Motion Estimation, MPEG, macroblock, FPGA, SAD, Verilog, VHDL
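
    The abstract stops at the algorithm description, so as a purely illustrative aid the C sketch below pins down the arithmetic a SAD-based full search performs: every placement of a 16x16 macroblock inside a 32x32 search window is scored by its Sum of Absolute Differences, and the minimum is kept. Function and variable names are ours, not the paper's; a hardware implementation in Verilog would unroll and pipeline these loops rather than run them serially.

```c
#include <stdlib.h>
#include <limits.h>

#define MB     16   /* macroblock is 16x16 luminance pixels */
#define SEARCH 32   /* search window is 32x32 pixels        */

/* Full-search block matching: scan every (dx, dy) placement of a
 * 16x16 block inside a 32x32 search window and return the offset
 * with the minimum Sum of Absolute Differences (SAD). */
static unsigned full_search_sad(const unsigned char cur[MB][MB],
                                const unsigned char win[SEARCH][SEARCH],
                                int *best_dx, int *best_dy)
{
    unsigned best = UINT_MAX;
    for (int dy = 0; dy <= SEARCH - MB; dy++) {
        for (int dx = 0; dx <= SEARCH - MB; dx++) {
            unsigned sad = 0;
            for (int y = 0; y < MB; y++)
                for (int x = 0; x < MB; x++)
                    sad += abs(cur[y][x] - win[dy + y][dx + x]);
            if (sad < best) {
                best = sad;
                *best_dx = dx;
                *best_dy = dy;
            }
        }
    }
    return best;
}
```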

    A framework for multimedia playback and analysis of MPEG-2 videos with FFmpeg

    Fast Forward Motion Pictures Expert Group (FFmpeg) is a well-known, high-performance, cross-platform open source library for recording, streaming, and playback of video and audio in various formats, including Motion Pictures Expert Group (MPEG), H.264, and Audio Video Interleave (AVI). With FFmpeg's current licensing options, it is also suitable for both open source and commercial software development. FFmpeg contains over 100 open source codecs for video encoding and decoding. Given the complexities of the MPEG standards, FFmpeg still lacks a framework for (1) seeking to a particular image frame in a video, which is needed for accurate frame-level annotation in fields such as medicine, digital communications, and commercial video broadcasting, and (2) motion vector extraction for analysis of motion patterns in video content. Most importantly, the FFmpeg code base is not well documented, which makes developing extensions significantly harder. As our contributions, we extended the FFmpeg code base with new APIs and libraries that support accurate frame-level seeking, motion vector extraction, and MPEG-2 video encoding/decoding. We documented the FFmpeg MPEG-2 codec to facilitate future software development. We evaluated the performance of our implementation against a high-performance third-party commercial software development kit on videos captured from television broadcasts and from endoscopy procedures. To evaluate the usability of our libraries, we integrated them with several commercial applications. In the following sections, we discuss our software architecture, important implementation details, performance evaluation results, and lessons learned.
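
    The paper's extended APIs are not shown in the abstract, but mainline FFmpeg already exposes decoder-side motion vectors as per-frame side data when the `export_mvs` codec flag is set. The fragment below is a minimal sketch of that stock mechanism only (it assumes the decoder context and decoded AVFrame have been set up in the usual way) and should not be read as the authors' library.

```c
#include <libavcodec/avcodec.h>
#include <libavutil/motion_vector.h>
#include <stdio.h>

/* Ask the decoder to export motion vectors before opening it:
 *
 *     AVDictionary *opts = NULL;
 *     av_dict_set(&opts, "flags2", "+export_mvs", 0);
 *     avcodec_open2(dec_ctx, decoder, &opts);
 *
 * After that, each decoded frame may carry an array of AVMotionVector
 * records as side data. */
static void print_motion_vectors(const AVFrame *frame)
{
    AVFrameSideData *sd =
        av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS);
    if (!sd)
        return; /* e.g. intra-coded frames carry no vectors */

    const AVMotionVector *mvs = (const AVMotionVector *)sd->data;
    size_t n = sd->size / sizeof(*mvs);
    for (size_t i = 0; i < n; i++)
        printf("%dx%d block at (%d,%d) predicted from (%d,%d)\n",
               mvs[i].w, mvs[i].h,
               mvs[i].dst_x, mvs[i].dst_y,
               mvs[i].src_x, mvs[i].src_y);
}
```

    On the seeking side, stock av_seek_frame() lands on a keyframe, so frame-accurate positioning additionally requires decoding forward from that keyframe to the target frame; that forward-decode bookkeeping is the kind of gap the paper's frame-level seek library addresses.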

    Multiplexing video traffic using frame-skipping aggregation technique.

    by Alan Yeung. Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. Includes bibliographical references (leaves 53-[56]). Abstract also in Chinese.
    Chapter 1 --- Introduction --- p.1
    Chapter 2 --- MPEG Overview --- p.5
    Chapter 3 --- Framework of Frame-Skipping Lossy Aggregation --- p.10
    Chapter 3.1 --- Video Frames Delivery using Round-Robin Scheduling --- p.10
    Chapter 3.2 --- Underflow Safety Margin on Receiver Buffers --- p.12
    Chapter 3.3 --- Algorithm in Frame-Skipping Aggregation Controller --- p.13
    Chapter 4 --- Replacement of Skipped Frames in MPEG Sequence --- p.17
    Chapter 5 --- Subjective Assessment Test on Frame-Skipped Video --- p.21
    Chapter 5.1 --- Test Settings and Material --- p.22
    Chapter 5.2 --- Choice of Test Methods --- p.23
    Chapter 5.3 --- Test Procedures --- p.25
    Chapter 5.4 --- Test Results --- p.26
    Chapter 6 --- Performance Study --- p.29
    Chapter 6.1 --- Experiment 1: Number of Supportable Streams --- p.31
    Chapter 6.2 --- Experiment 2: Frame-Skipping Rate When Multiplexing on a Leased T3 Link --- p.33
    Chapter 6.3 --- Experiment 3: Bandwidth Usage --- p.35
    Chapter 6.4 --- Experiment 4: Optimal USMT --- p.38
    Chapter 7 --- Implementation Considerations --- p.41
    Chapter 8 --- Conclusions --- p.45
    Chapter A --- The Construction of Stuffed Artificial B Frame --- p.48
    Bibliography --- p.53
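
    A table of contents does not reveal the controller's actual algorithm, so the C sketch below is only a hedged guess at the general shape of such a scheme, with every name and the policy itself invented for illustration: a round-robin multiplexer that skips a B frame (the least critical MPEG picture type) when the round's byte budget is exhausted and the receiver buffer still holds an underflow safety margin. Per Chapter A, a stuffed artificial B frame would stand in for the skipped one.

```c
#include <stdio.h>

#define STREAMS       4
#define SAFETY_MARGIN 2 /* underflow safety margin, in frames */

/* One multiplexed MPEG stream: the size and type of its next frame,
 * and how many frames of slack its receiver buffer still holds. */
struct stream {
    int  next_frame_bytes;
    char next_frame_type;  /* 'I', 'P' or 'B' */
    int  buffered_frames;  /* receiver-side buffer occupancy */
};

/* One round-robin scheduling round: deliver each stream's next frame
 * in turn, but skip a B frame when the round's byte budget is spent
 * and the receiver can ride out the gap on its buffer. */
static void schedule_round(struct stream s[STREAMS], int budget_bytes)
{
    for (int i = 0; i < STREAMS; i++) {
        int over = s[i].next_frame_bytes > budget_bytes;
        if (over && s[i].next_frame_type == 'B' &&
            s[i].buffered_frames > SAFETY_MARGIN) {
            printf("stream %d: skip B frame\n", i);
            continue; /* a stuffed artificial B frame stands in */
        }
        budget_bytes -= s[i].next_frame_bytes;
        printf("stream %d: send %c frame (%d bytes)\n",
               i, s[i].next_frame_type, s[i].next_frame_bytes);
    }
}
```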

    Network streaming and compression for mixed reality tele-immersion

    Bulterman, D.C.A. [Promotor]; Cesar, P.S. [Copromotor]

    Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain

    Human action recognition has become one of the most active fields of research in computer vision due to its wide range of applications, such as surveillance, medical and industrial environments, and smart homes. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most existing deep learning approaches have been designed to process video information as RGB image sequences. For this reason, a preliminary decoding step is required, since video data are usually stored in a compressed format; however, decoding a video demands a high computational load and memory usage. To overcome this problem, we propose a deep neural network capable of learning straight from compressed video. Our approach was evaluated on two public benchmarks, the UCF-101 and HMDB-51 datasets, demonstrating recognition performance comparable to state-of-the-art methods, with the advantage of running up to 2 times faster in terms of inference speed.

    Algorithms for compression of high dynamic range images and video

    Recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple-exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1. Despite these advances, image/video compression algorithms and the associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8-bit gamma-corrected images. Furthermore, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment. Current solutions include tone mapping the HDR content to fit SDR, but this approach degrades image quality when strong dynamic range compression is applied. Although some HDR-only solutions have been proposed in the literature, they are not interoperable with the current SDR infrastructure and are thus typically used in closed systems. Given these observations, a research gap was identified: the need for efficient compression algorithms for still images and video that can store the full dynamic range and colour gamut of HDR images while remaining backward compatible with the existing SDR infrastructure. To improve the usability of the SDR content, any such algorithm should accommodate different tone mapping operators, including spatially non-uniform ones. In the course of the research presented in this thesis, a novel two-layer CODEC architecture is introduced for both HDR image and video coding, and a universal, computationally efficient approximation of the tone mapping operator is developed and presented. It is shown that using perceptually uniform colourspaces for the internal representation of pixel data improves the compression efficiency of the algorithms. Proposed novel approaches to compressing the tone mapping operator's metadata are also shown to improve compression performance for low-bitrate video content. Multiple compression algorithms are designed, implemented, and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high-level systems design framework with domain-specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented.
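
    The thesis' actual tone mapping approximation is not reproduced in the abstract, so the following C sketch only illustrates the two-layer idea it describes: a base layer holding an SDR tone-mapped image that legacy codecs and displays can use directly, plus an enhancement layer holding the residual between the original HDR pixels and the inverse-tone-mapped base layer. The power-law operator and all names are illustrative placeholders.

```c
#include <math.h>

/* Illustrative global tone mapping operator: a simple power-law curve
 * mapping linear HDR luminance in [0, peak] to an 8-bit SDR code value.
 * A real system would use (and transmit metadata for) the encoder's
 * actual, possibly spatially varying, operator. */
static unsigned char tonemap(double hdr, double peak)
{
    double v = pow(hdr / peak, 1.0 / 2.2); /* gamma-like curve */
    return (unsigned char)(255.0 * v + 0.5);
}

static double inverse_tonemap(unsigned char sdr, double peak)
{
    return peak * pow(sdr / 255.0, 2.2);
}

/* Two-layer split: base = backward-compatible SDR image; residual =
 * what the enhancement layer must carry so an HDR decoder can
 * reconstruct base + residual = original HDR pixel. */
static void split_layers(const double *hdr, int n, double peak,
                         unsigned char *base, double *residual)
{
    for (int i = 0; i < n; i++) {
        base[i] = tonemap(hdr[i], peak);
        residual[i] = hdr[i] - inverse_tonemap(base[i], peak);
    }
}
```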

    3D coding tools final report

    Deliverable D4.3 of the ANR PERSEE project. This report was produced within the ANR PERSEE project (no. ANR-09-BLAN-0170); it corresponds to deliverable D4.3 of the project. Its title: 3D coding tools final report.

    A multi-objective performance optimisation framework for video coding

    Digital video technologies have become an essential part of the way visual information is created, consumed, and communicated. However, due to the unprecedented growth of digital video technologies, competition for bandwidth resources has become fierce, highlighting a critical need for optimising the performance of video encoders. This is a dual optimisation problem, in which the objective is to reduce buffer and memory requirements while maintaining the quality of the encoded video. Additionally, analysis of existing video compression techniques showed that operating a video encoder requires the optimisation of numerous decision parameters to achieve the best trade-offs between the factors that affect visual quality, given the resource limitations arising from operational constraints such as memory and complexity. The research in this thesis has focused on optimising the performance of the H.264/AVC video encoder, a process that involved finding solutions to multiple conflicting objectives. As part of this research, an automated tool was developed for optimising video compression to achieve an optimal trade-off between bit rate and visual quality, given maximum allowed memory and computational complexity constraints, within a diverse range of scene environments. The evaluation of this optimisation framework has highlighted the effectiveness of the developed solution.
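
    The abstract does not disclose the optimisation method itself, so the C fragment below is only a generic sketch of the kind of constrained trade-off search it describes: candidate encoder parameter settings that violate the memory budget are discarded, and the Pareto-optimal (bit rate, quality) pairs among the rest are reported. The structures and figures are invented for illustration.

```c
#include <stdio.h>

/* One candidate encoder configuration and its measured outcome.
 * In a real framework these would come from trial encodes. */
struct setting {
    int    search_range;  /* decision parameter, e.g. ME search range */
    double bitrate_kbps;  /* resulting rate     */
    double psnr_db;       /* resulting quality  */
    double memory_mb;     /* resource cost      */
};

/* b dominates a if it is at least as good on both objectives
 * (lower rate, higher quality) and strictly better on one. */
static int dominated(const struct setting *a, const struct setting *b)
{
    return b->bitrate_kbps <= a->bitrate_kbps &&
           b->psnr_db >= a->psnr_db &&
           (b->bitrate_kbps < a->bitrate_kbps || b->psnr_db > a->psnr_db);
}

/* Print the Pareto front among settings that satisfy the memory budget. */
static void pareto_front(const struct setting *s, int n, double mem_budget)
{
    for (int i = 0; i < n; i++) {
        if (s[i].memory_mb > mem_budget)
            continue; /* violates the resource constraint */
        int dom = 0;
        for (int j = 0; j < n && !dom; j++)
            dom = j != i && s[j].memory_mb <= mem_budget &&
                  dominated(&s[i], &s[j]);
        if (!dom)
            printf("range %d: %.0f kbps, %.1f dB\n",
                   s[i].search_range, s[i].bitrate_kbps, s[i].psnr_db);
    }
}
```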

    Challenges and solutions in H.265/HEVC for integrating consumer electronics in professional video systems

    Get PDF

    Semi-automatic video object segmentation for multimedia applications

    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach combines automatic segmentation algorithms with manual user interaction. The still-image segmentation component comprises a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. Semantic objects are then built manually by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation; this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique determines which regions in the current partition belong to the projected object. User interaction allows object re-initialisation when the segmentation results become inaccurate. The combination of these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia.
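
    The abstract names the building blocks (RSST for the initial partition, BPT for the merge hierarchy) but not their internals. As a rough illustration of the BPT idea only, the C sketch below repeatedly merges the pair of most similar regions and records each merge as a new parent node, yielding a binary tree over the initial regions; a faithful implementation would restrict merging to spatially adjacent regions and use a richer similarity measure.

```c
#include <math.h>

#define MAXN 64

/* A BPT node: either an initial RSST region (leaf) or the merge of two
 * children. Regions are summarised by pixel count and mean grey level. */
struct node {
    int    left, right; /* child indices, -1 for leaves       */
    int    pixels;
    double mean;
    int    active;      /* still an unmerged tree root?       */
};

/* Build a binary partition tree bottom-up over n leaves stored at
 * t[0..n-1]: repeatedly merge the two most similar active regions
 * (a real BPT restricts this to spatially adjacent ones) until a
 * single root remains; returns the root's index. */
static int build_bpt(struct node t[2 * MAXN], int n)
{
    for (int m = n; m < 2 * n - 1; m++) {
        int bi = -1, bj = -1;
        double best = INFINITY;
        for (int i = 0; i < m; i++)
            for (int j = i + 1; j < m; j++)
                if (t[i].active && t[j].active &&
                    fabs(t[i].mean - t[j].mean) < best) {
                    best = fabs(t[i].mean - t[j].mean);
                    bi = i; bj = j;
                }
        t[m].left = bi; t[m].right = bj;
        t[m].pixels = t[bi].pixels + t[bj].pixels;
        t[m].mean = (t[bi].mean * t[bi].pixels +
                     t[bj].mean * t[bj].pixels) / t[m].pixels;
        t[m].active = 1;
        t[bi].active = t[bj].active = 0;
    }
    return 2 * n - 2; /* index of the root node */
}
```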