
28 research outputs found

VOXEL: Cross-Layer Optimization for Video Streaming with Imperfect Transmission

Author: Appel M.
Chandrasekaran B.
Feldmann A.
Palmer M.
Sitaraman R.
Spiteri K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021
Field of study

Delivering videos under less-than-ideal network conditions without compromising end-users' quality of experience is a hard problem. Virtually all prior work follows a piecemeal approach: either "tweaking" the fully reliable transport layer or making the client "smarter." We propose VOXEL, a cross-layer optimization system for video streaming. We use VOXEL to demonstrate how to combine application-provided "insights" with a partially reliable protocol for optimizing video streaming. To this end, we present a novel ABR algorithm that explicitly trades off losses for improving end-users' video-watching experiences. VOXEL is fully compatible with DASH, and backward-compatible with VOXEL-unaware servers and clients. In our experiments emulating a wide range of network conditions, VOXEL outperforms the state of the art: at the 90th percentile, we stream videos with up to 97% less rebuffering than the state of the art, without sacrificing visual fidelity. We also demonstrate the benefits of VOXEL for small-buffer regimes such as the emerging use case of low-latency and live streaming. In a survey of 54 real users, 84% of the participants indicated that they prefer videos streamed using VOXEL over those streamed using the state of the art.

VU Research Portal

MPG.PuRe

VOXEL: Cross-layer optimization for video streaming with imperfect transmission

Author: Appel Malte
Chandrasekaran Balakrishnan
Feldmann Anja
Palmer Mirko
Sitaraman Ramesh K.
Spiteri Kevin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2021
Field of study

VU Research Portal

Data-driven visual quality estimation using machine learning

Author: Göring Steve
Publication venue
Publication date: 01/01/2022
Field of study

Today, a great deal of visual content is produced and accessible, owing to improvements in technology such as smartphones and the internet. This results in a need to assess the quality perceived by users in order to further improve the experience. However, only a few state-of-the-art quality models are specifically designed for higher resolutions, predict more than the mean opinion score, or use machine learning. One goal of the thesis is to train and evaluate such machine learning models for higher resolutions with several datasets. First, an objective evaluation of image quality at higher resolutions is performed. The images are compressed using video encoders, and it is shown that AV1 performs best in terms of quality and compression. This evaluation is followed by an analysis of a crowdsourcing test in comparison with a lab test investigating image quality. Afterward, deep learning-based models for image quality prediction, along with an extension for video quality, are proposed. However, the deep learning-based video quality model is not practically usable because of performance constraints. For this reason, pixel-based video quality models using well-motivated features covering image and motion aspects are proposed and evaluated. These models can be used to predict mean opinion scores for videos, or even to predict other video quality-related information, such as a rating distribution. The introduced model architecture can be applied to other video problems, such as video classification, gaming video quality prediction, gaming genre classification, or encoding parameter estimation. Another important aspect is the processing time of such models. Hence, a generic approach to speed up state-of-the-art video quality models is introduced, which shows that a significant amount of processing time can be saved while achieving similar prediction accuracy. The models have been made publicly available as open source so that the developed frameworks can be used for further research. Moreover, the presented approaches may be usable as building blocks for newer media formats.

Digitale Bibliothek Thüringen

Towards enabling cross-layer information sharing to improve today's content delivery systems

Author: Palmer Mirko Romano
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022
Field of study

Content is omnipresent, and without content the Internet would not be what it is today. End users consume content throughout the day, from checking the latest news on Twitter in the morning, to streaming music in the background while working, to streaming movies or playing online games in the evening, and even to using apps (e.g., sleep trackers) while they sleep at night. All of these different kinds of content place very specific and different requirements on the transport: at one end, online gaming often requires a low-latency connection but little throughput; at the other, video streaming requires high throughput but performs quite poorly under packet loss. Yet all content is transferred opaquely over the same transport, adhering to a strict separation of network layers. Even a modern transport protocol such as Multi-Path TCP, which is capable of utilizing multiple paths, cannot take the requirements or needs of that content into account for its path selection. In this work we challenge the layer separation and show that sharing information across the layers is beneficial for consuming web and video content. To this end, we created an event-based simulator for evaluating how applications can make informed decisions about which interfaces to use for delivering different content, based on a set of pre-defined policies that encode the (performance) requirements or needs of that content. Our policies achieve speedups of a factor of two in 20% of our cases, have benefits in more than 50%, and create no overhead in any of the cases. For video content we created a full streaming system that allows even finer-grained information sharing between the transport and the application. Our streaming system, called VOXEL, enables applications to select dynamically, and at frame granularity, which video data to transfer based on the current network conditions. VOXEL drastically reduces video stalls at the 90th percentile, by up to 97%, while not sacrificing the stream's visual fidelity. We confirmed our performance improvements in a real-user study in which 84% of the participants clearly preferred watching videos streamed with VOXEL over the state of the art.

Acronym

Bitstream-based video quality modeling and analysis of HTTP-based adaptive streaming

Author: Ramachandra Rao Rakesh Rao
Publication venue
Publication date: 01/01/2023
Field of study

The spread of affordable capture technology and increased internet bandwidth allows high-quality videos (resolutions > 1080p, framerates ≥ 60fps) to be streamed online. HTTP-based adaptive streaming is the preferred method for streaming videos, adjusting video quality based on available bandwidth. Although adaptive streaming reduces the occurrences of video playout being stopped (called “stalling”) due to limited network bandwidth, the automatic adaptation has an impact on the quality perceived by the user, which results in the need to systematically assess the perceived quality. Such an evaluation is usually done on a short-term basis (a few seconds) and on an overall session basis (up to several minutes). In this thesis, both of these aspects are assessed using subjective and instrumental methods. The subjective assessment of short-term video quality consists of a series of lab-based video quality tests that have resulted in publicly available datasets. The overall integral quality was subjectively assessed in lab tests with human viewers mimicking a real-life viewing scenario. In addition to the lab tests, the out-of-the-lab test method was investigated for both short-term video quality and overall session quality assessment to explore the possibility of alternative approaches for subjective quality assessment. The instrumental method of quality evaluation was addressed in terms of bitstream-based and hybrid pixel-based video quality models developed as part of this thesis. For this, a family of models, namely AVQBits, has been conceived using the results of the lab tests as ground truth. Based on the available input information, four different instances of AVQBits are presented: a Mode 3, a Mode 1, a Mode 0, and a Hybrid Mode 0 model. The model instances have been evaluated, and they perform better than or on par with other state-of-the-art models. These models have further been applied to 360° and gaming videos, HFR content, and images. In addition, a long-term integration model (1 to 5 minutes) based on the ITU-T P.1203.3 model is presented. In this work, the different instances of AVQBits with per-second score output are employed as the video quality component of the proposed long-term integration model. All AVQBits variants, as well as the long-term integration module and the subjective test data, are made publicly available for further research.

Digitale Bibliothek Thüringen

Network-aware video streaming for future media Internet

Author: Viola Roberto
Publication venue
Publication date: 26/10/2021
Field of study

272 p

Archivo Digital para la Docencia y la Investigación

Video QoE Estimation using Network Measurement Data

Author: Mangla Tarun
Publication venue: Georgia Institute of Technology
Publication date: 15/09/2021
Field of study

More than ever before, last-mile Internet Service Providers (ISPs) need to efficiently provision and manage their networks to meet the growing demand for Internet video (expected to be 82% of the global IP traffic in 2022). This network optimization requires ISPs to have an in-depth understanding of end-user video Quality of Experience (QoE). Understanding video QoE, however, is challenging for ISPs, as they generally do not have access to applications on end-user devices to observe the key objective metrics impacting QoE. Instead, they have to rely on measurements of network traffic to estimate objective QoE metrics and use them for troubleshooting QoE issues. However, this can be challenging for HTTP-based Adaptive Streaming (HAS) video, the de facto standard for streaming over the Internet, because of the complex relationship between the network-observable metrics and the video QoE metrics. This complexity largely results from the robustness of HAS to short-term variations in the underlying network conditions, due to its use of the video buffer and bitrate adaptation. In this thesis, we develop approaches that use network measurement to infer video QoE. In developing inference approaches, we provide a toolbox of techniques suitable for a diversity of streaming contexts as well as different types of network measurement data. We first develop two approaches for QoE estimation that model video sessions based on the network traffic dynamics of the HAS protocol under two different streaming contexts. Our first approach, MIMIC, estimates unencrypted video QoE using HTTP logs. We perform a large-scale validation of MIMIC using ground-truth QoE metrics from a popular video streaming service. We also deploy MIMIC in a real-world cellular network and demonstrate some preliminary use cases of QoE estimation for ISPs. Our second approach, eMIMIC, estimates QoE metrics for encrypted video using packet-level traces. We evaluate eMIMIC using an automated experimental framework under realistic network conditions and show that it outperforms state-of-the-art QoE estimation approaches. Finally, we develop an approach to address the scalability challenges of QoE inference. We leverage machine learning to infer QoE from coarse-grained but lightweight network data in the form of Transport Layer Security (TLS) transactions. We analyze the trade-off between scalability and accuracy in using such data for inference. Our evaluation shows that the TLS transaction data can be used for detecting video performance issues with reasonable accuracy and significantly lower computation overhead compared to packet-level traces.
Ph.D.

Scholarly Materials And Research @ Georgia Tech

Seamless multimedia delivery within a heterogeneous wireless networks environment: are we there yet?

Author: Comsa I.
Trestian R.
Tuysuz M.
Publication venue: IEEE
Publication date: 01/01/2018
Field of study

The increasing popularity of live video streaming from mobile devices, through services such as Facebook Live, Instagram Stories, and Snapchat, puts pressure on network operators to increase the capacity of their networks. However, a simple increase in system capacity will not be enough without considering the provisioning of Quality of Experience (QoE) as the basis for network control, customer loyalty, and retention rate, and thus for increasing network operators' revenue. As QoE is gaining strong momentum, especially with users' increasing quality expectations, the focus is now on proposing innovative solutions to enable QoE when delivering video content over heterogeneous wireless networks. In this context, this paper presents an overview of multimedia delivery solutions, identifies the problems, and provides a comprehensive classification of related state-of-the-art approaches following three key directions: adaptation, energy efficiency, and multipath content delivery. Discussions, challenges, and open issues on seamless multimedia provisioning faced by the current and next generation of wireless networks are also provided.

Middlesex University Research Repository