
    Statistical framework for video decoding complexity modeling and prediction

    Video decoding complexity modeling and prediction is an increasingly important issue for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper, we present a novel view of this problem based on a statistical framework perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). For this purpose, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint execution-time/feature probability density function (PDF). A training set of typical video sequences is used for this purpose in an offline estimation process. The obtained GMM representation is used in conjunction with the complexity features of new video sequences to predict the execution time required for the decoding of these sequences. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison is performed to evaluate the different approaches and compare the proposed prediction scheme with related resource prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding.
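    As a rough illustration of this kind of conditional prediction, the sketch below (assuming scikit-learn and SciPy, with two made-up complexity features and synthetic timing data) fits a GMM to the joint feature/execution-time samples offline and predicts the decoding time of new content as the conditional mean of the fitted mixture; it is not the paper's exact estimator.

```python
# Minimal sketch (not the paper's estimator): fit a GMM to the joint
# distribution of encoder-side complexity features and measured decoding
# time, then predict decoding time for new content as the conditional mean
# E[time | features]. Feature names and dimensions are illustrative.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(features, exec_time, n_components=4, seed=0):
    """Fit a GMM to the joint (features, execution-time) samples."""
    joint = np.column_stack([features, exec_time])
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=seed).fit(joint)

def predict_exec_time(gmm, x):
    """Conditional mean of the last (time) dimension given feature vector x."""
    d = x.shape[0]                      # number of feature dimensions
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    # posterior component weights given the observed features
    resp = np.array([w * multivariate_normal.pdf(x, m[:d], C[:d, :d])
                     for w, m, C in zip(weights, means, covs)])
    resp /= resp.sum()
    # per-component conditional mean of the time dimension
    cond = [m[d] + C[d, :d] @ np.linalg.solve(C[:d, :d], x - m[:d])
            for m, C in zip(means, covs)]
    return float(resp @ np.array(cond))

# Offline training on a representative set, then prediction for new content.
rng = np.random.default_rng(0)
feats = rng.random((500, 2))            # e.g. bits per MB, nonzero coefficients
times = 2.0 * feats[:, 0] + 0.5 * feats[:, 1] + 0.05 * rng.standard_normal(500)
gmm = fit_joint_gmm(feats, times)
print(predict_exec_time(gmm, np.array([0.4, 0.7])))
```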

    Description-driven Adaptation of Media Resources

    The current multimedia landscape is characterized by a significant diversity in terms of available media formats, network technologies, and device properties. This heterogeneity has resulted in a number of new challenges, such as providing universal access to multimedia content. A solution for this diversity is the use of scalable bit streams, as well as the deployment of a complementary system that is capable of adapting scalable bit streams to the constraints imposed by a particular usage environment (e.g., the limited screen resolution of a mobile device). This dissertation investigates the use of an XML-driven (Extensible Markup Language) framework for the format-independent adaptation of scalable bit streams. Using this approach, the structure of a bit stream is first translated into an XML description. In the next step, the resulting XML description is transformed to reflect a desired adaptation of the bit stream. Finally, the transformed XML description is used to create an adapted bit stream that is suited for playback in the targeted usage environment. The main contribution of this dissertation is BFlavor, a new tool for exposing the syntax of binary media resources as an XML description. Its development was inspired by two other technologies, i.e., MPEG-21 BSDL (Bitstream Syntax Description Language) and XFlavor (Formal Language for Audio-Visual Object Representation, extended with XML features). Although created from different points of view, both languages offer solutions for translating the syntax of a media resource into an XML representation for further processing. BFlavor (BSDL+XFlavor) harmonizes the two technologies by combining their strengths and eliminating their weaknesses. The expressive power and performance of a BFlavor-based content adaptation chain, compared to tool chains entirely based on either BSDL or XFlavor, were investigated by several experiments. One series of experiments targeted the exploitation of multi-layered temporal scalability in H.264/AVC, paying particular attention to the use of sub-sequences and hierarchical coding patterns, as well as to the use of metadata messages to communicate the bit stream structure to the adaptation logic. BFlavor was the only tool to offer an elegant and practical solution for XML-driven adaptation of H.264/AVC bit streams in the temporal domain.
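    The sketch below illustrates the general shape of such an XML-driven adaptation step. The element and attribute names (nalu, temporal_id, offset, length) are hypothetical placeholders rather than the actual BSDL/BFlavor vocabulary: the description records where each unit's payload lives in the original bit stream, the description is pruned to the desired temporal layers, and the adapted bit stream is rebuilt from the remaining byte ranges.

```python
# Illustrative sketch of an XML-driven adaptation step, in the spirit of the
# description-driven chain above. Element/attribute names are hypothetical,
# not the real BFlavor/BSDL vocabulary.
import xml.etree.ElementTree as ET

def adapt_temporal_layers(desc_xml, bitstream, max_temporal_id):
    """Drop units above max_temporal_id and regenerate the bit stream."""
    root = ET.fromstring(desc_xml)
    kept, adapted = [], bytearray()
    for nalu in root.findall("nalu"):
        if int(nalu.get("temporal_id")) <= max_temporal_id:
            off, length = int(nalu.get("offset")), int(nalu.get("length"))
            adapted += bitstream[off:off + length]   # copy the payload bytes
            kept.append(nalu)
    root[:] = kept                                   # transformed description
    return ET.tostring(root), bytes(adapted)

description = b"""<stream>
  <nalu temporal_id="0" offset="0" length="4"/>
  <nalu temporal_id="2" offset="4" length="4"/>
  <nalu temporal_id="1" offset="8" length="4"/>
</stream>"""
xml_out, bs_out = adapt_temporal_layers(description, bytes(range(12)), 1)
print(bs_out)   # payloads of the temporal_id 0 and 1 units only
```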

    New Trends in Multimedia Standards: MPEG4 and JPEG2000


    Robust and efficient video/image transmission

    The Internet has become a primary medium for information transmission. The unreliability of channel conditions, limited channel bandwidth, and the explosive growth of transmission requests, however, hinder its further development. Hence, research on robust and efficient delivery of video/image content is in strong demand. Three aspects of this task are investigated in this dissertation: error-burst correction, efficient rate allocation, and random-error protection. A novel technique, called successive packing, is proposed for combating multi-dimensional (M-D) bursts of errors. A new concept of basis interleaving array is introduced. By combining different basis arrays, effective M-D interleaving can be realized. It has been shown that this algorithm needs to be implemented only once and is nonetheless optimal for a set of error bursts of different sizes in a given two-dimensional (2-D) array. To adapt to variable channel conditions, a novel rate allocation technique is proposed for Fine Granular Scalability (FGS) coded video, in which rate-distortion modeling based on real data is developed, a constant-quality constraint is adopted, and a sliding-window approach is used to track the varying channel. With the proposed technique, constant quality is achieved across frames by solving a set of linear equations, yielding a significant computational simplification compared with state-of-the-art techniques while also reducing the overall distortion. To combat random errors during transmission, an unequal error protection (UEP) method and a robust error-concealment strategy are proposed for scalable coded video bitstreams.
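    To make the constant-quality idea concrete, the sketch below assumes a simple linear R-D model per frame (a placeholder, not the dissertation's real-data-based model) and shows how equating the frame distortions under a total rate budget reduces to solving one small linear system.

```python
# Hedged sketch of constant-quality rate allocation across the frames of a
# group of pictures. The per-frame linear model D_i = a_i - b_i * R_i is an
# assumed stand-in for the dissertation's real-data-based R-D model; the point
# is only that enforcing equal distortion under a rate budget is a linear solve.
import numpy as np

def allocate_rates(a, b, r_total):
    """Solve a_i - b_i*R_i = D for all frames i, with sum(R_i) = r_total."""
    n = len(a)
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    for i in range(n):
        A[i, i] = -b[i]          # coefficient of R_i
        A[i, n] = -1.0           # coefficient of the common distortion D
        rhs[i] = -a[i]
    A[n, :n] = 1.0               # rate budget: R_1 + ... + R_N = r_total
    rhs[n] = r_total
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n]       # per-frame rates, achieved common distortion

rates, dist = allocate_rates(a=[40.0, 35.0, 30.0], b=[0.8, 0.6, 0.5],
                             r_total=120.0)
print(rates, dist)
```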

    A Parallel Implementation of the H.265 Video Coding Standard

    The objective of this study was to research the scalability of the parallel features in the new H.265 video compression standard, also known as High Efficiency Video Coding (HEVC). Compared to its predecessor, the H.264 standard, H.265 typically achieves around 50% bitrate reduction for the same subjective video quality. Especially videos with higher resolutions (Full HD and beyond) achieve better compression ratios. The standard also provides better utilization of parallel computing resources. H.265 introduces two novel parallelization features: Tiles and Wavefront Parallel Processing (WPP). In Tiles, each video frame is divided into areas that can be decoded without referencing other areas in the same frame. In WPP, the relations between code blocks in a frame are encoded so that the decoding process can progress through the frame as a front using multiple threads. In this study, the reference implementation of the H.265 decoder was augmented to support both of these parallelization features. The performance of the parallel implementations was measured using three different setups. From the measurement results it could be seen that the introduction of more CPU cores reduced the total decode time of the video frames up to a certain point. When using the Tiles feature, it was observed that the encoding geometry, i.e. how each frame was divided into individually decodable areas, had a noticeable effect on the decode times at certain thread counts. When using WPP, it was observed that synchronization overhead sometimes had a negative effect on the decode times with larger thread counts (4-12).
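    The wavefront dependency behind WPP can be sketched as follows: a block at (row, col) may be decoded only once the block one position ahead in the row above has finished, so each row can be handled by its own thread trailing the one above it diagonally. The toy sketch below illustrates only that scheduling constraint (actual CTU decoding is replaced by a sleep) and is not part of the measured reference decoder.

```python
# Toy illustration of the WPP wavefront constraint: row r may process column c
# only after row r-1 has finished the column to its upper right. Real HEVC
# decoding state (CABAC contexts, pixels) is replaced by a sleep; the grid
# size is arbitrary.
import threading, time

ROWS, COLS = 4, 8
done = [[False] * COLS for _ in range(ROWS)]
cv = threading.Condition()

def decode_ctu(row, col):
    time.sleep(0.001)              # stand-in for actual CTU decoding work

def decode_row(row):
    for col in range(COLS):
        with cv:
            # wait until the reference CTU in the row above is available
            while row > 0 and not done[row - 1][min(col + 1, COLS - 1)]:
                cv.wait()
        decode_ctu(row, col)
        with cv:
            done[row][col] = True
            cv.notify_all()

threads = [threading.Thread(target=decode_row, args=(r,)) for r in range(ROWS)]
for t in threads: t.start()
for t in threads: t.join()
print("all CTU rows decoded")
```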

    A Parallel Implementation of an MPEG-2 Encoder Using Message-Passing

    The days of film are waning as digital cameras and digital video cameras become commonplace. Uncompressed digital video can consume large amounts of space, making it cumbersome to store efficiently. A method of video compression was developed by the Moving Picture Experts Group (MPEG) and is now an international standard with the International Organization for Standardization (ISO). This thesis deals with the MPEG-2 Video standard, ISO/IEC 13818-2 [2]. The goal of this thesis is to explore the applications of MPEG-2 encoding in a parallel processing paradigm. To achieve this, a sequential MPEG-2 software encoder was obtained from the MPEG Software Simulation Group (MSSG) [18] and modified to run in parallel on a cluster of single-processor Linux workstations using the Message Passing Interface (MPI) [11, 10, 3]. A multi-threaded pipeline of the encoding process was created using Pthreads [6]. The resulting pipelined parallel encoder has been shown to produce compliant elementary MPEG-2 bitstreams for progressive video sequences. Simulation results showed that the parallel encoder always performed better than the sequential version as the number of processors scaled. However, it did not exhibit the ideal linear speedup that all parallel programs aim to achieve, because the program executed on a set of resources that was not ideal for the multi-threaded pipeline. The ensuing chapters provide the motivation for this work and an overview of MPEG, parallel processing, and parallel programming, describe how the parallel encoder was realized, present the results produced, and discuss supplementary applications of this work.
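    A minimal sketch, assuming mpi4py and a placeholder encode_gop() function, of the kind of coarse GOP-level work split such an MPI encoder might use; it is not the thesis code, and the real encoder additionally pipelines the stages of each GOP with Pthreads.

```python
# Hedged sketch of a coarse GOP-level work split over MPI ranks: a static
# round-robin assignment of GOP indices, per-rank "encoding", and a gather
# at the root that splices the chunks back in GOP order. encode_gop() is a
# placeholder, not the MSSG encoder. Run with e.g. `mpirun -n 4 python ...`.
from mpi4py import MPI

def encode_gop(gop_index):
    # placeholder for running the MPEG-2 encoding pipeline on one GOP
    return f"bitstream-chunk-{gop_index}".encode()

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
NUM_GOPS = 16

# Static round-robin assignment of GOPs to ranks.
my_gops = range(rank, NUM_GOPS, size)
my_chunks = {g: encode_gop(g) for g in my_gops}

# Gather all per-rank results at the root and splice them in GOP order.
all_chunks = comm.gather(my_chunks, root=0)
if rank == 0:
    merged = {}
    for d in all_chunks:
        merged.update(d)
    bitstream = b"".join(merged[g] for g in range(NUM_GOPS))
    print(len(bitstream), "bytes of (placeholder) elementary stream")
```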

    Resource Allocation and Performance Analysis for Multiuser Video Transmission over Doubly Selective Channels

    We consider an uplink multicarrier system with multiple video users who want to send compressed video data to the base station. In the time domain, we model the time-varying channel using Jakes' model, and in the frequency domain, each subcarrier is assumed to fade independently. The video is scalably coded in units of groups of pictures (GOPs), and users have different video rate-distortion (RD) functions. At the beginning of each GOP, the base station collects both the RD information and the instantaneous channel state information (CSI) for subcarrier allocation purposes. We design a cross-layer resource allocation algorithm that assigns subcarriers to the users based on both the demand of the video and the quality of the channel. Once the resource allocation decision is made, the users periodically adapt the modulation format of the allocated subcarriers according to the evolution of the CSI for the duration of the GOP. We show that our cross-layer resource allocation robustly outperforms two baseline algorithms, each of which uses only one layer of information for resource allocation.
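    For illustration, the sketch below shows a simple greedy cross-layer allocator in the spirit described above (not the paper's algorithm): each subcarrier is given to the user with the largest estimated distortion reduction, derived from a hypothetical per-user R-D slope and the rate the subcarrier can carry at its instantaneous channel gain.

```python
# Illustrative greedy cross-layer allocator (not the paper's algorithm): each
# subcarrier goes to the user whose estimated distortion reduction is largest,
# combining video demand (R-D slope) with channel quality. All parameters are
# hypothetical placeholders.
import numpy as np

def allocate_subcarriers(channel_gain, rd_slope, noise=1.0):
    """channel_gain: (users, subcarriers) |h|^2; rd_slope: distortion per bit."""
    n_users, n_sub = channel_gain.shape
    assignment = np.empty(n_sub, dtype=int)
    allocated_rate = np.zeros(n_users)
    for s in range(n_sub):
        # achievable rate per user on this subcarrier (Shannon-style estimate)
        rate = np.log2(1.0 + channel_gain[:, s] / noise)
        # marginal distortion reduction, discounted by rate already granted
        gain = rd_slope * rate / (1.0 + allocated_rate)
        u = int(np.argmax(gain))
        assignment[s] = u
        allocated_rate[u] += rate[u]
    return assignment, allocated_rate

rng = np.random.default_rng(1)
h = rng.exponential(1.0, size=(3, 8))          # Rayleigh-faded subcarrier gains
slopes = np.array([2.0, 1.0, 1.5])             # users' R-D steepness (demand)
print(allocate_subcarriers(h, slopes))
```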

    Compressed-domain transcoding of H.264/AVC and SVC video streams
