43 research outputs found

    Video adaptation for mobile digital television

    Get PDF
    Mobile digital television is one of the new services introduced recently by telecommunications operators in the market. Due to the possibilities of personalization and interaction provided, together with the increasing demand of this type of portable services, it would be expected to be a successful technology in near future. Video contents stored and transmitted over the networks deployed to provide mobile digital television need to be compressed to reduce the resources required. The compression scheme chosen by the great majority of these networks is H.264/AVC. Compressed video bitstreams have to be adapted to heterogeneous networks and a wide range of terminals. To deal with this problem scalable video coding schemes were proposed and standardized providing temporal, spatial and quality scalability using layers within the encoded bitstream. Because existing H.264/AVC contents cannot benefit from scalability tools, efficient techniques for migration of single-layer to scalable contents are desirable for supporting these mobile digital television systems. This paper proposes a technique to convert from single-layer H.264/AVC bitstream to a scalable bitstream with temporal scalability. Applying this approach, a reduction of 60% of coding complexity is achieved while maintaining the coding efficiency

    Efficient HEVC-based video adaptation using transcoding

    Get PDF
    In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints. These diversities of the network and devices capacity lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without the change of the coding format, has been well-known as an efficient adaptation solution. However, this approach comes along with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We proposed several techniques to speed-up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy have been proposed. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications

    On transcoding a B-frame to a P-frame in the compressed domain

    Get PDF
    2007-2008 > Academic research: refereed > Publication in refereed journalVersion of RecordPublishe

    Compressed-domain transcoding of H.264/AVC and SVC video streams

    Get PDF

    Guided Transcoding for Next-Generation Video Coding (HEVC)

    Get PDF
    Video content is the dominant traffic type on mobile networks today and this portion is only expected to increase in the future. In this thesis we investigate ways of reducing bit rates for adaptive streaming applications in the latest video coding standard, H.265 / High Efficiency Video Coding (HEVC). The current models for offering different-resolution versions of video content in a dynamic way, so called adaptive streaming, require either large amounts of storage capacity where full encodings of the material is kept at all times, or extremely high computational power in order to regenerate content on-demand. Guided transcoding aims at finding a middle-ground were we can store and transmit less data, at full or near-full quality, while still keeping computational complexity low. This is achieved by shifting the computationally heavy operations to a preprocessing step where so called side-information is generated. The side-information can then be used to quickly reconstruct sequences on-demand -- even when running on generic, non-specialized, hardware. Two method for generating side-information, pruning and deflation, are compared on a varying set of standardized HEVC test sequences and the respective upsides and downsides of each method are discussed.Genom att slänga bort viss information från en komprimerad video och sedan återskapa sekvensen i realtid kan vi minska behovet av lagringsutrymme för adaptiv videostreaming med 20–30%. Detta med helt bibehållen bildkvalité eller endast små försämringar. ==================== Adaptiv streaming Streaming är ett populärt sätt att skicka video över internet där en sekvens delas upp i korta segment som skickas kontinuerligt till användaren. Dessa segment kan skickas med varierande kvalité, och en modell där vi automatiskt känner av nätverkets belastning och dynamiskt anpassar kvalitén kallas för adaptiv streaming. Detta är ett system som används av SVT Play, TV4 Play och YouTube. HD- eller UltraHD-video måste komprimeras för att kunna skickas över ett nätverk – den tar helt enkelt för stor plats annars. Video som kodas med den senaste komprimeringsstandarden, HEVC/H.265, blir upp emot 700 gånger mindre med minimala försämringar av bildkvalitén. Ett segment på tio sekunder som tar 1,5 GB att skicka i rå form kan då komprimeras till strax över 2 MB. För att kunna erbjuda tittaren en videosekvens – en film eller ett TV-program – i varierande kvalité, skapar man olika kodningar av materialet. Generellt har vi inte möjlighet att förändra kvalitén på en sekvens i efterhand – omkodning av även en kort HD-video tar timmar att genomföra – så för att adaptiv streaming ska kunna fungera i praktiken genereras alla versioner på förhand och sparas undan. Men detta kräver stort lagringsutrymme. Guided transcoding Guided transcoding (”guidad omkodning”) erbjuder ett sätt att minska behovet av lagringsutrymme genom att slänga bort viss information och sedan återskapa den vid behov i ett senare skede. Vi gör detta för varje sekvens av lägre kvalité, men behåller högsta kvalitén som den är. En stympad lågkvalité-video tillsammans med videon av högsta kvalitén kan sedan användas för att exakt återskapa sekvensen. Denna process är mycket snabb i jämförelse med vanlig omkodning, så vi kan med kort varsel generera videokodningar av varierande kvalité. Vi har undersökt två metoder för plocka bort och återskapa videoinformation: pruning och deflation. Den första ger små försämringar i bildkvalitén och sparar närmare 30% lagringsutrymme. Den senare har ingen påverkan på bildkvalitén men sparar bara drygt 20% i utrymme

    An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion

    Get PDF
    As wireless and wired network connectivity is rapidly expanding and the number of network users is steadily increasing, it has become more and more important to support universal access of multimedia content over the whole network. A big challenge, however, is the great diversity of network devices from full screen computers to small smart phones. This leads to research on transcoding, which involves in efficiently reformatting compressed data from its original high resolution to a desired spatial resolution supported by the displaying device. Particularly, there is a great momentum in the multimedia industry for H.264-based transcoding as H.264 has been widely employed as a mandatory player feature in applications ranging from television broadcast to video for mobile devices. While H.264 contains many new features for effective video coding with excellent rate distortion (RD) performance, a major issue for transcoding H.264 compressed video from one spatial resolution to another is the computational complexity. Specifically, it is the motion compensated prediction (MCP) part. MCP is the main contributor to the excellent RD performance of H.264 video compression, yet it is very time consuming. In general, a brute-force search is used to find the best motion vectors for MCP. In the scenario of transcoding, however, an immediate idea for improving the MCP efficiency for the re-encoding procedure is to utilize the motion vectors in the original compressed stream. Intuitively, motion in the high resolution scene is highly related to that in the down-scaled scene. In this thesis, we study homogeneous video transcoding from H.264 to H.264. Specifically, for the video transcoding with arbitrary spatial resolution conversion, we propose a motion vector estimation algorithm based on a multiple linear regression model, which systematically utilizes the motion information in the original scenes. We also propose a practical solution for efficiently determining a reference frame to take the advantage of the new feature of multiple references in H.264. The performance of the algorithm was assessed in an H.264 transcoder. Experimental results show that, as compared with a benchmark solution, the proposed method significantly reduces the transcoding complexity without degrading much the video quality

    On Transcoding a B-Frame to a P-Frame in the Compressed Domain

    Full text link

    Efficient Support for Application-Specific Video Adaptation

    Get PDF
    As video applications become more diverse, video must be adapted in different ways to meet the requirements of different applications when there are insufficient resources. In this dissertation, we address two sorts of requirements that cannot be addressed by existing video adaptation technologies: (i) accommodating large variations in resolution and (ii) collecting video effectively in a multi-hop sensor network. In addition, we also address requirements for implementing video adaptation in a sensor network. Accommodating large variation in resolution is required by the existence of display devices with widely disparate screen sizes. Existing resolution adaptation technologies usually aim at adapting video between two resolutions. We examine the limitations of these technologies that prevent them from supporting a large number of resolutions efficiently. We propose several hybrid schemes and study their performance. Among these hybrid schemes, Bonneville, a framework that combines multiple encodings with limited scalability, can make good trade-offs when organizing compressed video to support a wide range of resolutions. Video collection in a sensor network requires adapting video in a multi-hop storeand- forward network and with multiple video sources. This task cannot be supported effectively by existing adaptation technologies, which are designed for real-time streaming applications from a single source over IP-style end-to-end connections. We propose to adapt video in the network instead of at the network edge. We also propose a framework, Steens, to compose adaptation mechanisms on multiple nodes. We design two signaling protocols in Steens to coordinate multiple nodes. Our simulations show that in-network adaptation can use buffer space on intermediate nodes for adaptation and achieve better video quality than conventional network-edge adaptation. Our simulations also show that explicit collaboration among multiple nodes through signaling can improve video quality, waste less bandwidth, and maintain bandwidth-sharing fairness. The implementation of video adaptation in a sensor network requires system support for programmability, retaskability, and high performance. We propose Cascades, a component-based framework, to provide the required support. A prototype implementation of Steens in this framework shows that the performance overhead is less than 5% compared to a hard-coded C implementation
    corecore