8 research outputs found

    Layer Selection in Progressive Transmission of Motion-Compensated JPEG2000 Video

    MCJ2K (Motion-Compensated JPEG2000) is a video codec based on MCTF (Motion-Compensated Temporal Filtering) and J2K (JPEG2000). MCTF analyzes a sequence of images and generates a collection of temporal sub-bands, which are compressed with J2K. The R/D (Rate-Distortion) performance of MCJ2K is better than that of the MJ2K (Motion JPEG2000) extension, especially when there is a high level of temporal redundancy. MCJ2K codestreams can be served by standard JPIP (J2K Interactive Protocol) servers because they use only standard J2K file formats. In bandwidth-constrained scenarios, an important issue in MCJ2K is determining how much data from each temporal sub-band must be transmitted to maximize the quality of the reconstructions at the client side. To solve this problem, we propose two rate-allocation algorithms that provide reconstructions that are progressive in quality. The first, OSLA (Optimized Sub-band Layers Allocation), determines the best progression of quality layers but is computationally expensive. The second, ESLA (Estimated-Slope sub-band Layers Allocation), is sub-optimal in most cases but much faster and better suited to real-time streaming scenarios. An experimental comparison shows that, even when a straightforward motion compensation scheme is used, the R/D performance of MCJ2K is competitive not only with MJ2K but also with other standard scalable video codecs.
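The rate-allocation problem described in this abstract can be illustrated with a generic greedy ordering by rate-distortion slope. This is only a minimal sketch under stated assumptions: it is neither OSLA nor ESLA as defined by the authors, the function name `progression_by_slope` is hypothetical, and the per-layer sizes and distortion gains are assumed to be known in advance.

```python
from typing import List, Tuple

# One quality layer of a temporal sub-band: (bytes_added, distortion_removed).
# Both values are assumed inputs; the abstract does not say how they are measured.
Layer = Tuple[int, float]


def progression_by_slope(subbands: List[List[Layer]]) -> List[Tuple[int, int]]:
    """Return (subband_index, layer_index) pairs in transmission order.

    At each step, send the next pending layer whose distortion reduction
    per byte is largest. Layers within a sub-band are always sent in order.
    """
    next_layer = [0] * len(subbands)
    order: List[Tuple[int, int]] = []
    while True:
        best, best_slope = None, -1.0
        for s, layers in enumerate(subbands):
            i = next_layer[s]
            if i == len(layers):
                continue  # this sub-band has been sent completely
            size, gain = layers[i]
            slope = gain / size
            if slope > best_slope:
                best, best_slope = s, slope
        if best is None:
            return order  # nothing left to send
        order.append((best, next_layer[best]))
        next_layer[best] += 1
```

Truncating the returned list at any point yields a prefix that could be sent within a smaller byte budget, which is the sense in which such an ordering is progressive in quality.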

    JPIP proxy server with prefetching strategies based on user-navigation model and semantic map

    The efficient transmission of high-resolution images and, in particular, the interactive transmission of images in a client-server scenario, is important for many applications. Among current image compression standards, JPEG2000 stands out for its interactive transmission capabilities. In general, three mechanisms are employed to optimize the transmission of images when using the JPEG2000 Interactive Protocol (JPIP): 1) packet re-sequencing at the server; 2) prefetching at the client; and 3) proxy servers along the network infrastructure. To avoid network congestion, prefetching mechanisms are not commonly employed when many clients within a local area network (LAN) browse images from a remote server. Aiming to maximize the responsiveness of all clients within a LAN, this work proposes the use of prefetching strategies at the proxy server rather than at the clients. The main insight behind the proposed prefetching strategies is a user-navigation model and a semantic map that predict the future requests of the clients. Experimental results indicate that introducing these strategies into a JPIP proxy server notably enhances the browsing experience of the end users.
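The idea of predicting future client requests from a navigation model can be sketched with a first-order transition counter. This is an illustrative assumption, not the model used in the paper: the class name `NavigationModel`, its methods, and the move labels are hypothetical, and the semantic map is not modeled at all.

```python
from collections import Counter, defaultdict


class NavigationModel:
    """Counts observed transitions between navigation moves (hypothetical)."""

    def __init__(self):
        # transitions[previous][current] = number of times seen
        self.transitions = defaultdict(Counter)

    def observe(self, previous, current):
        """Record that a client issued `current` right after `previous`."""
        self.transitions[previous][current] += 1

    def predict(self, current, k=1):
        """Return up to k most frequently observed follow-up moves."""
        return [move for move, _ in self.transitions[current].most_common(k)]
```

A proxy could call `observe` for every pair of consecutive requests it forwards and use `predict` to decide which data to request ahead of time.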

    Advanced heterogeneous video transcoding

    Video transcoding is an essential tool to promote interoperability between different video communication systems. This thesis presents two novel video transcoders, both operating on bitstreams of the current H.264/AVC standard. The first transcoder converts H.264/AVC bitstreams to a Wavelet Scalable Video Codec (W-SVC), while the second targets the emerging High Efficiency Video Coding (HEVC). Scalable Video Coding (SVC) enables low-complexity adaptation of compressed video, providing an efficient solution for content delivery through heterogeneous networks. The transcoder proposed here aims to exploit the advantages offered by SVC technology when dealing with conventional coders and legacy video, efficiently reusing information found in the H.264/AVC bitstream to achieve high rate-distortion performance at a low complexity cost. Its main features include new mode mapping algorithms that exploit the larger macroblock sizes of W-SVC, and a new state-of-the-art motion vector composition algorithm that can handle different coding configurations in the H.264/AVC bitstream, including IPP or IBBP with multiple reference frames. The emerging video coding standard, HEVC, is currently approaching the final stage of development prior to standardization. This thesis proposes and evaluates several transcoding algorithms for the HEVC codec. In particular, it proposes a transcoder based on a new method that offers complexity scalability, trading off rate-distortion performance for complexity reduction. Furthermore, other transcoding solutions are explored, based on a novel content-based modeling approach in which the transcoder adapts its parameters to the contents of the sequence being encoded.
Finally, the application of this research is not constrained to these transcoders, as many of the techniques developed aim to advance research in this field and have the potential to be incorporated into different video transcoding architectures.

    Annotierte interaktive nichtlineare Videos - Software Suite, Download- und Cache-Management

    Modern Web technology makes the dream of fully interactive and enriched video come true. It is now possible to organize videos in a non-linear way so that they play in a sequence unknown in advance. Furthermore, additional information can be added to the video, ranging from short descriptions to animated images and further videos. This requires an easy-to-use and efficient authoring tool that can manage the individual media objects and clearly present the links between the parts. Tools of this kind are rare and mostly do not provide the full range of needed functions. While the Web player provides an interactive experience to the viewer, parallel plot sequences and additional information lead to an increased download volume. This may cause pauses during playback while elements that are displayed with the video are downloaded. A good quality of experience for these videos, with short waiting times and playback without interruptions, is desired. This work presents the SIVA Suite for creating the annotated interactive non-linear videos described above. We propose a video model for interactivity, non-linearity, and annotations, which is implemented in an XML format, an authoring tool, and a player. Video is the main medium, and different scenes are linked to form a scene graph. Time-controlled additional content called annotations, such as text, images, audio files, or videos, is added to the scenes. The user navigates the scene graph by selecting a button in a button panel. Other navigational elements, such as a table of contents and a keyword search, are also provided. Besides the SIVA Suite, this thesis presents algorithms and strategies for download and cache management that provide a good quality of experience while watching annotated interactive non-linear videos. To this end, we implemented a standard-independent player framework.
Integrated into a simulation environment, the framework makes it possible to evaluate algorithms and strategies for calculating start-up times and for selecting elements to prefetch into and delete from the cache. Their interaction during the playback of non-linear video content can be analyzed. The algorithms and strategies can be used to minimize interruptions in the video flow after user interactions. Our extensive evaluation showed that our techniques result in faster start-up times and fewer interruptions in the video flow than those of other players. Knowledge of the structure of an interactive non-linear video can be used to minimize the start-up time at the beginning of a video while keeping any increase in the overall download volume to a minimum.

    JPEG2000-based scalable interactive video (JSIV)

    Video is considered one of the main applications of the modern Internet. Despite its importance, the interactivity available from current implementations is limited to pause and random access to a set of predetermined access points. In this work, we propose a novel approach that provides considerably better interactivity, for which we coin the term JPEG2000-Based Scalable Interactive Video (JSIV). JSIV relies on three main concepts: storing the video sequence as independent JPEG2000 frames to provide quality and spatial resolution scalability, as well as temporal and spatial accessibility; prediction and conditional replenishment of precincts to exploit inter-frame redundancy; and loosely-coupled server and client policies. The concept of loosely-coupled client and server policies is central to JSIV. With these policies, the server optimally selects the number of quality layers for each precinct it transmits and decides on any side-information that needs to be transmitted, while the client attempts to make the most of the received (distorted) frames. In particular, the client decides which precincts are predicted and which are decoded from received data (or possibly filled with zeros in the absence of received data). Thus, in JSIV, a predicted frame typically has some of its precincts predicted from nearby frames while others are decoded from received intra-coded precincts; JSIV never uses frame differences or prediction residues. The philosophy behind these policies is that neither the server nor the client drives the video streaming interaction. Rather, the server dynamically selects and sends the pieces that it thinks best serve the client's needs and, in turn, the client makes the most of the pieces of information it has.
The JSIV paradigm postulates that if both the client and the server policies are intelligent enough and make reasonable decisions, then the decisions made by the server are likely to have the expected impact on the client's decisions. We solve the general JSIV optimization problem by employing Lagrange-style rate-distortion optimization in a two-pass iterative approach. We show that this approach converges under workable conditions, and we also show that the optimal solution for a given rate is not necessarily embedded in the optimal solution for a higher rate. The flexibility of the JSIV paradigm enables us to use it in a variety of frame prediction arrangements. In this work, we focus only on JSIV with a sequential prediction arrangement (similar to IPPP...) and a hierarchical B-frames prediction arrangement. We show that JSIV can provide the sought-after quality and spatial scalability in addition to temporal and spatial accessibility. We also demonstrate a novel way in which a JSIV client can use its cache to improve the quality of reconstructed video. In general, JSIV can serve a wide range of usage scenarios, but we expect that real-time and interactive applications, such as teleconferencing and surveillance, would benefit most from it. Experimental results show that JSIV's performance is slightly inferior to that of existing predictive coding standards in conventional streaming applications; however, JSIV produces significant improvements when its scalability and accessibility features, such as the region of interest, are employed.
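The Lagrange-style selection of quality layers mentioned in this abstract can be illustrated with a generic single-pass sketch. It is not the two-pass JSIV procedure and ignores prediction between frames. The function names are hypothetical, and each precinct is assumed to come with a known table of (cumulative bytes, distortion) points whose first entry costs zero bytes.

```python
from typing import List, Tuple

# points[q] = (cumulative_bytes, distortion) when q quality layers are sent.
# Assumption: points[0] is (0, distortion_with_no_data) for every precinct.
Point = Tuple[int, float]


def pick_layers(precincts: List[List[Point]], lam: float) -> List[int]:
    """For each precinct, pick the layer count minimizing D + lam * R."""
    return [
        min(range(len(points)), key=lambda q: points[q][1] + lam * points[q][0])
        for points in precincts
    ]


def total_rate(precincts: List[List[Point]], choice: List[int]) -> int:
    """Total bytes sent for a given per-precinct layer choice."""
    return sum(points[q][0] for points, q in zip(precincts, choice))


def allocate(precincts: List[List[Point]], budget: int, iterations: int = 50) -> List[int]:
    """Bisect on lam until the selected layers fit within `budget` bytes."""
    lo, hi = 0.0, 1.0
    # Grow hi until the selection it produces is affordable.
    while total_rate(precincts, pick_layers(precincts, hi)) > budget:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if total_rate(precincts, pick_layers(precincts, mid)) > budget:
            lo = mid  # too many bytes: penalize rate more strongly
        else:
            hi = mid  # affordable: try a weaker penalty
    return pick_layers(precincts, hi)
```

A larger multiplier `lam` penalizes rate more heavily, so fewer layers are selected; the bisection looks for the smallest penalty whose selection still fits the budget.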