
    Hardware acceleration architectures for MPEG-Based mobile video platforms: a brief overview

    This paper presents a brief overview of past and current hardware acceleration (HwA) approaches that have been proposed for the most computationally intensive compression tools of the MPEG-4 standard. These approaches are classified by their historical evolution and by their architectural approach. Both the evolutionary and the functional classifications are analysed in order to speculate on the likely trends in HwA architectures for mobile video platforms.

    Energy-efficient acceleration of MPEG-4 compression tools

    We propose novel hardware accelerator architectures for the most computationally demanding algorithms of the MPEG-4 video compression standard: motion estimation, binary motion estimation (for shape coding), and the forward/inverse discrete cosine transforms (incorporating shape adaptive modes). These accelerators have been designed using general low-energy design philosophies at the algorithmic/architectural abstraction levels. The themes of these philosophies are avoiding waste and trading area/performance for power and energy gains. Each core has been synthesised targeting TSMC 0.09 μm TCBN90LP technology, and the experimental results presented in this paper show that the proposed cores improve upon the prior art.
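
    To give a sense of why motion estimation is among the most computationally demanding tools named above, here is a minimal software sketch of full-search block matching with the sum of absolute differences (SAD). It is an illustration only and not the architecture proposed in the paper: the function name `sad_block_match`, the 8x8 block size, the ±4 search range, and the synthetic test frame are all assumptions made for this example.

```python
import numpy as np

def sad_block_match(ref, cur, top, left, block=8, search=4):
    """Full-search block matching (hypothetical helper, not from the paper).

    Finds the displacement (dy, dx) into the reference frame `ref` that
    minimises the SAD against the block of `cur` whose top-left corner
    is (top, left).
    """
    target = cur[top:top + block, left:left + block].astype(np.int64)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cand = ref[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad

# Synthetic example: the current frame is the reference frame with its
# content moved 2 rows down and 1 column left.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -1), axis=(0, 1))
mv, sad = sad_block_match(ref, cur, top=8, left=8)
```

    Each block evaluates (2 × search + 1)² candidates of block² pixels each, which is the kind of regular, repetitive workload that dedicated accelerators target.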

    Depth-based Multi-View 3D Video Coding


    FlowNet: Learning Optical Flow with Convolutional Networks

    Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially on those linked to recognition. Optical flow estimation has not been among the tasks where CNNs were successful. In this paper we construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task. We propose and compare two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations. Since existing ground truth data sets are not sufficiently large to train a CNN, we generate a synthetic Flying Chairs dataset. We show that networks trained on this unrealistic data still generalize very well to existing datasets such as Sintel and KITTI, achieving competitive accuracy at frame rates of 5 to 10 fps.
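
    The idea of a layer that correlates feature vectors at different image locations can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it compares single feature vectors (patch size 1) by dot product, uses zero padding at the borders, and the name `correlation_volume` and the parameter `max_disp` are hypothetical.

```python
import numpy as np

def correlation_volume(f1, f2, max_disp=2):
    """Dot-product correlation between two feature maps (illustrative sketch).

    f1, f2: arrays of shape (H, W, C).
    Returns an array of shape (H, W, D, D) with D = 2 * max_disp + 1, where
    entry [y, x, i, j] is the dot product of f1[y, x] with
    f2[y + i - max_disp, x + j - max_disp], or 0 if that location is
    outside the map.
    """
    h, w, c = f1.shape
    d = 2 * max_disp + 1
    # Zero-pad f2 so every displacement can be read with a plain slice.
    padded = np.zeros((h + 2 * max_disp, w + 2 * max_disp, c), dtype=f2.dtype)
    padded[max_disp:max_disp + h, max_disp:max_disp + w] = f2
    out = np.empty((h, w, d, d), dtype=np.float64)
    for i in range(d):
        for j in range(d):
            shifted = padded[i:i + h, j:j + w]
            out[:, :, i, j] = (f1 * shifted).sum(axis=-1)
    return out

# Example with random feature maps (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
f1 = rng.standard_normal((6, 7, 4))
f2 = rng.standard_normal((6, 7, 4))
vol = correlation_volume(f1, f2)
```

    Limiting the comparison to a window of displacements keeps the output size linear in the number of pixels, rather than quadratic as it would be if every location were compared with every other.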

    Efficient Video Transport over Lossy Networks

    Nowadays, packet video is an important application of the Internet. Unfortunately, the capacity of the Internet is still very heterogeneous because it connects high-bandwidth ATM networks as well as low-bandwidth ISDN dial-in lines. The MPEG-2 and MPEG-4 video compression standards provide efficient video encoding for high- and low-bandwidth media streams. In particular, they include two paradigms which make those standards suitable for the transmission of video via heterogeneous networks. Both support layered video streams, and MPEG-4 additionally allows the independent coding of video objects. In this paper we discuss those two paradigms, give an overview of the MPEG video compression standards and describe transport protocols for real-time media transport over lossy networks. Furthermore, we propose a real-time segmentation approach for extracting video objects in teleteaching scenarios.

    Joint source-channel multistream coding and optical network adapter design for video over IP


    Video object segmentation for future multimedia applications

    An efficient representation of two-dimensional visual objects is specified by an emerging audiovisual compression standard known as MPEG-4. It incorporates the advantages of segmentation-based video compression (whereby objects are encoded independently, facilitating content-based functionalities), and also the advantages of more traditional block-based approaches (such as low delay and compression efficiency). What is not specified, however, is the method of extracting semantic objects from a scene corresponding to a video segmentation task. An accurate, robust and flexible solution to this is essential to enable the future multimedia applications possible with MPEG-4. Two categories of video segmentation approaches can be identified: supervised and unsupervised. A representative set of unsupervised approaches is discussed. These approaches are found to be suitable for real-time MPEG-4 applications. However, they are not suitable for off-line applications which require very accurate segmentations of entire semantic objects. This is because an automatic segmentation process cannot solve the ill-posed problem of extracting semantic meaning from a scene. Supervised segmentation incorporates user interaction so that semantic objects in a scene can be defined. A representative set of supervised approaches with greater or lesser degrees of interaction is discussed. Three new approaches to the problem, each more sophisticated than the last, are presented by the author. The most sophisticated is an object-based approach in which an automatic segmentation and tracking algorithm is used to perform a segmentation of a scene in terms of the semantic objects defined by the user. The approach relies on maximum likelihood estimation of the parameters of mixtures of multimodal multivariate probability distribution functions. The approach is an enhanced and modified version of an existing approach yielding more sophisticated object modelling. The segmentation results obtained are comparable to those of existing approaches and in many cases better. It is concluded that the author’s approach is ideal as a content extraction tool for future off-line MPEG-4 applications.