
    SpatioTemporal Feature Integration and Model Fusion for Full Reference Video Quality Assessment

    Perceptual video quality assessment models are either frame-based or video-based; the latter apply spatiotemporal filtering or motion estimation to capture temporal video distortions. Despite their good performance on video quality databases, video-based approaches are time-consuming and harder to deploy efficiently. To balance high performance and computational efficiency, Netflix developed the Video Multi-method Assessment Fusion (VMAF) framework, which integrates multiple quality-aware features to predict video quality. Nevertheless, this fusion framework does not fully exploit temporal video quality measurements, which are relevant to temporal video distortions. To this end, we propose two improvements to the VMAF framework: SpatioTemporal VMAF and Ensemble VMAF. Both algorithms exploit efficient temporal video features, which are fed into a single regression model or multiple regression models. To train our models, we designed a large subjective database and evaluated the proposed models against state-of-the-art approaches. The compared algorithms will be made available as part of the open-source package at https://github.com/Netflix/vmaf.
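    To make the fusion stage concrete, the following minimal sketch fuses pooled quality-aware features into a single predicted score with a support vector regressor, the model family VMAF itself uses for fusion. The feature names and the synthetic training data are placeholders for illustration; they are not the actual VMAF feature extractors or its trained model.

    ```python
    # Minimal sketch of VMAF-style feature fusion. The feature columns
    # (standing in for VIF, ADM/DLM and motion statistics) and the training
    # targets are synthetic placeholders, not real VMAF data.
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)

    # One row per video: pooled spatial/temporal features; targets are
    # mean opinion scores on a 0-100 scale.
    X_train = rng.uniform(0.0, 1.0, size=(200, 4))   # e.g. [vif, adm, motion_mean, motion_var]
    y_train = 100.0 * X_train.mean(axis=1) + rng.normal(0.0, 5.0, 200)

    # Nonlinear regressor fusing the features into one quality score
    # (kernel and hyperparameters are illustrative assumptions).
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
    model.fit(X_train, y_train)

    X_test = rng.uniform(0.0, 1.0, size=(3, 4))
    print(model.predict(X_test))   # predicted quality scores for unseen videos
    ```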

    Estimation of the QoE for video streaming services based on facial expressions and gaze direction

    As multimedia technologies evolve, the need to control their quality becomes ever more important, making Quality of Experience (QoE) measurement a key priority. Machine Learning (ML) can support this task by providing models that analyse the information extracted from the multimedia. ML model applications can be divided into the following categories: 1) QoE modelling: ML is used to define QoE models which provide an output (e.g., a perceived QoE score) for any given input (e.g., a QoE influence factor). 2) QoE monitoring in case of encrypted traffic: ML is used to analyse passively monitored traffic data to obtain insight into degradations perceived by end users. 3) Big data analytics: ML is used to extract meaningful and useful information from the collected data, which can further be converted into actionable knowledge and used to manage QoE. The QoE estimation task can be carried out using two approaches: the objective approach and the subjective one. As the names suggest, they refer to the kind of information the model analyses. The objective approach analyses objective features extracted from the network connection and from the media in use; the state of the art also includes approaches that use features extracted from human behaviour. The subjective approach, instead, stems from rating studies in which participants are asked to rate the perceived quality on different scales. This approach is time-consuming, and for this reason not all users agree to fill in the questionnaire. Its natural evolution is therefore the adoption of ML models: a model can replace the questionnaire and estimate QoE from the data it analyses. By modelling the human response to the perceived quality of multimedia, QoE researchers have found that various signals can be extracted from users, such as electroencephalogram (EEG) and electrocardiogram (ECG) recordings. The main problem with these techniques is the hardware: the user must wear electrodes for ECG and EEG, and even though the results obtained with these methods are relevant, their use in a real context may not be feasible. For this reason, my studies have focused on developing a completely unobtrusive Machine Learning framework based on facial reactions.
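    As an illustration of the unobtrusive, facial-reaction-based direction described above, the sketch below maps hypothetical facial features (action-unit intensities and gaze angles) to a perceived-quality score with a standard supervised regressor. All feature names, the data, and the choice of model are assumptions made for the example; they do not reproduce the thesis's actual framework.

    ```python
    # Illustrative QoE estimator from facial features. The feature columns
    # (hypothetical action-unit intensities and gaze angles) and the target
    # mean opinion scores are synthetic placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(1)

    # One row per viewing session:
    # [brow_lowerer_intensity, lip_corner_intensity, gaze_yaw_deg, gaze_pitch_deg]
    X = rng.uniform(0.0, 1.0, size=(300, 4))
    # Synthetic 1-5 opinion scores loosely tied to the first feature.
    y = np.clip(5.0 - 3.0 * X[:, 0] + rng.normal(0.0, 0.3, 300), 1.0, 5.0)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
    model = RandomForestRegressor(n_estimators=100, random_state=1).fit(X_tr, y_tr)
    print("MAE on held-out sessions:", mean_absolute_error(y_te, model.predict(X_te)))
    ```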

    QoE on media delivery in 5G environments

    231 p. 5G will expand mobile networks with greater bandwidth, lower latency, and the capacity to provide massive and failure-free connectivity. Users of multimedia services expect a smooth playback experience that adapts dynamically to their interests and to their mobility context. However, the network, adopting a neutral position, does not help strengthen the parameters that affect quality of experience. Consequently, solutions designed to deliver multimedia traffic dynamically and efficiently are of special interest. To improve the quality of experience of multimedia services in 5G environments, the research carried out in this thesis has designed a multi-part system based on four contributions. The first mechanism, SaW, creates an elastic farm of computing resources that execute multimedia analysis tasks; the results confirm the competitiveness of this approach with respect to server farms. The second mechanism, LAMB-DASH, selects the quality in the media player with a design that requires low processing complexity; the tests show its ability to improve the stability, consistency, and uniformity of the quality of experience among clients sharing a network cell. The third mechanism, MEC4FAIR, exploits 5G capabilities to analyse delivery metrics of the different flows; the results show how it enables the service to coordinate the different clients in the cell to improve the quality of service. The fourth mechanism, CogNet, provisions network resources and configures a topology able to switch an estimated demand and guarantee quality-of-service bounds; in this case, the results show greater accuracy when the demand for a service is higher.
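    The in-player quality selection performed by mechanisms such as LAMB-DASH can be illustrated, in a much simplified form, with a plain throughput-based rule. The sketch below is only an illustration of low-complexity bitrate selection; it is not the LAMB-DASH algorithm, and the bitrate ladder and safety factor are assumptions.

    ```python
    # Illustrative low-complexity bitrate selection for a DASH player.
    # This throughput-based rule is an assumption for the example, not the
    # LAMB-DASH mechanism described in the thesis.
    from typing import Sequence

    def select_bitrate(available_kbps: Sequence[int],
                       measured_throughput_kbps: float,
                       safety_factor: float = 0.8) -> int:
        """Pick the highest representation that fits within a safety margin
        of the measured throughput; fall back to the lowest one otherwise."""
        budget = measured_throughput_kbps * safety_factor
        feasible = [b for b in sorted(available_kbps) if b <= budget]
        return feasible[-1] if feasible else min(available_kbps)

    # With 2.4 Mbit/s of measured throughput, the 1500 kbit/s representation
    # is the highest that stays within the 80% safety budget.
    print(select_bitrate([300, 750, 1500, 3000, 6000], measured_throughput_kbps=2400))
    ```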

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, such as hybrid broadband broadcast (HBB) delivery and users' perception modelling (i.e., Quality of Experience, QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is receiving renewed attention to overcome the remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance, and the multiple disciplines it involves, a reference book on mediasync has become necessary; this book fills that gap. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space from different perspectives. MediaSync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners who want to acquire strong knowledge of this research area and to approach the challenges of ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute those experiences.

    A reduced reference video quality assessment method for provision as a service over SDN/NFV-enabled networks

    139 p. The proliferation of multimedia applications and services has generated a noteworthy upsurge in network traffic related to video content and has created the need for trustworthy service quality assessment methods. Currently, a predominant position among the technological trends in telecommunication networks is held by Network Function Virtualization (NFV), Software-Defined Networking (SDN), and 5G mobile networks equipped with small cells. Additionally, Video Quality Assessment (VQA) methods are a very useful tool for both content providers and network operators to understand how users perceive quality, to study the feasibility of potential services, and to adapt the available network resources to satisfy user requirements.
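    The reduced-reference idea behind such a method can be illustrated as follows: the sender extracts a compact per-frame signature from the reference video, and the receiver compares it with the signature of the decoded frames to estimate the degradation without needing the full reference. The features (per-frame luma mean and standard deviation) and the scoring function below are illustrative assumptions, not the metric proposed in the thesis.

    ```python
    # Illustrative reduced-reference comparison: only a small per-frame
    # signature (mean and standard deviation of luma) is exchanged and
    # compared, instead of the full reference video.
    import numpy as np

    def frame_signature(frames: np.ndarray) -> np.ndarray:
        """frames: (n, h, w) luma array -> (n, 2) array of per-frame [mean, std]."""
        return np.stack([frames.mean(axis=(1, 2)), frames.std(axis=(1, 2))], axis=1)

    def rr_quality_score(ref_sig: np.ndarray, dist_sig: np.ndarray) -> float:
        """Higher is better; 1.0 means the signatures are identical."""
        err = np.abs(ref_sig - dist_sig).mean()
        return float(1.0 / (1.0 + err))

    rng = np.random.default_rng(2)
    reference = rng.integers(0, 256, size=(30, 72, 128)).astype(np.float64)
    distorted = np.clip(reference + rng.normal(0.0, 8.0, reference.shape), 0.0, 255.0)

    print(rr_quality_score(frame_signature(reference), frame_signature(distorted)))
    ```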

    Towards enabling cross-layer information sharing to improve today's content delivery systems

    Content is omnipresent and without content the Internet would not be what it is today. End users consume content throughout the day, from checking the latest news on Twitter in the morning, to streaming music in the background while working, to streaming movies or playing online games in the evening, and to using apps (e.g., sleep trackers) even while we sleep at night. All of these different kinds of content have very specific and different requirements on the transport: online gaming often requires a low-latency connection but needs little throughput, while streaming a video requires high throughput but performs quite poorly under packet loss. Yet all content is transferred opaquely over the same transport, adhering to a strict separation of network layers. Even a modern transport protocol such as Multi-Path TCP, which is capable of utilizing multiple paths, cannot take the above requirements or needs of that content into account for its path selection. In this work we challenge the layer separation and show that sharing information across the layers is beneficial for consuming web and video content. To this end, we created an event-based simulator for evaluating how applications can make informed decisions about which interfaces to use for delivering different content, based on a set of pre-defined policies that encode the (performance) requirements or needs of that content. Our policies achieve speedups of a factor of two in 20% of our cases, provide benefits in more than 50%, and create no overhead in any of the cases. For video content we created a full streaming system that allows even finer-grained information sharing between the transport and the application. Our streaming system, called VOXEL, enables applications to select dynamically and at frame granularity which video data to transfer based on the current network conditions. VOXEL drastically reduces video stalls in the 90th percentile by up to 97% while not sacrificing the stream's visual fidelity. We confirmed our performance improvements in a real-user study in which 84% of the participants clearly preferred watching videos streamed with VOXEL over the state of the art.
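    The policy idea evaluated with the simulator can be illustrated with a small example: each kind of content declares its transport needs, and a policy picks the interface whose measured characteristics satisfy them. The content classes, interface figures, and selection rule below are assumptions made for illustration; they are not the policies or the simulator from this work.

    ```python
    # Illustrative content-aware interface selection. The requirement values
    # and the selection rule are assumptions for the example, not the
    # pre-defined policies evaluated in this work.
    from dataclasses import dataclass

    @dataclass
    class Interface:
        name: str
        rtt_ms: float
        throughput_mbps: float

    @dataclass
    class ContentNeeds:
        max_rtt_ms: float           # latency bound (e.g. online gaming)
        min_throughput_mbps: float  # bandwidth need (e.g. video streaming)

    def select_interface(needs: ContentNeeds, interfaces: list) -> Interface:
        """Prefer interfaces meeting both constraints; otherwise fall back
        to the highest-throughput interface."""
        feasible = [i for i in interfaces
                    if i.rtt_ms <= needs.max_rtt_ms
                    and i.throughput_mbps >= needs.min_throughput_mbps]
        pool = feasible or interfaces
        return max(pool, key=lambda i: i.throughput_mbps)

    ifaces = [Interface("wifi", rtt_ms=15.0, throughput_mbps=80.0),
              Interface("lte", rtt_ms=45.0, throughput_mbps=30.0)]
    gaming = ContentNeeds(max_rtt_ms=25.0, min_throughput_mbps=1.0)
    video = ContentNeeds(max_rtt_ms=200.0, min_throughput_mbps=10.0)
    print(select_interface(gaming, ifaces).name)  # -> wifi (meets the latency bound)
    print(select_interface(video, ifaces).name)   # -> wifi (highest feasible throughput)
    ```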