
    Critical analysis on the reproducibility of visual quality assessment using deep features

    Data used to train supervised machine learning models are commonly split into independent training, validation, and test sets. In this paper we illustrate that intricate cases of data leakage have occurred in the no-reference video and image quality assessment literature. We show that the performance results of several recently published journal papers, which lie well above the best performances in related work, cannot be reproduced. Our analysis shows that information from the test set was inappropriately used in the training process in different ways. When correcting for the data leakage, the performances of the approaches drop below the state of the art by a large margin. Additionally, we investigate end-to-end variations of the discussed approaches, which do not improve upon the originals
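
    The leakage the authors describe can take several forms; below is a minimal sketch of one common pattern (preprocessing fitted on the full dataset before splitting) and its fix. The data, model, and split here are placeholders, not the examined papers' actual pipelines.

        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVR

        X, y = np.random.rand(100, 8), np.random.rand(100)  # placeholder data

        # Leaky: the scaler sees test rows before the split, so test-set
        # statistics contaminate the training features.
        X_scaled = StandardScaler().fit_transform(X)
        X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, random_state=0)

        # Correct: split first, then fit all preprocessing on training data only.
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        scaler = StandardScaler().fit(X_tr)
        model = SVR().fit(scaler.transform(X_tr), y_tr)
        print(model.score(scaler.transform(X_te), y_te))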

    Estimation of the QoE for video streaming services based on facial expressions and gaze direction

    As multimedia technologies evolve, the need to control their quality becomes ever more important, making Quality of Experience (QoE) measurement a key priority. Machine Learning (ML) can support this task by providing models that analyse the information extracted from the multimedia content. ML model applications can be divided into the following categories: 1) QoE modelling: ML is used to define QoE models which provide an output (e.g., a perceived QoE score) for any given input (e.g., a QoE influence factor). 2) QoE monitoring in the case of encrypted traffic: ML is used to analyse passively monitored traffic data to gain insight into degradations perceived by end users. 3) Big data analytics: ML is used to extract meaningful and useful information from the collected data, which can then be converted into actionable knowledge and used to manage QoE. The QoE estimation task can be carried out using two approaches: an objective one and a subjective one. As the names indicate, they refer to the kind of information the model analyses. The objective approach analyses objective features extracted from the network connection and the media in use; the state of the art also includes approaches that use features extracted from human behaviour. The subjective approach, in contrast, stems from rating studies in which participants are asked to rate the perceived quality on various scales. This approach is time-consuming, and not all users agree to fill in the questionnaires. Its natural evolution is therefore the adoption of ML models: a model can replace the questionnaire and estimate QoE from the data it analyses. By modelling the human response to perceived multimedia quality, QoE researchers have found that various signals can be captured from users, such as the electroencephalogram (EEG), the electrocardiogram (ECG), and brain waves. The main problem with these techniques is the hardware: the user must wear electrodes for ECG and EEG, and even though the results obtained with these methods are relevant, their use in a real-world context may not be feasible. For this reason, my studies have focused on developing a completely unobtrusive Machine Learning framework based on facial reactions
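
    As a hedged illustration of the final point, the sketch below maps hypothetical facial-reaction features to a QoE score with an off-the-shelf regressor. The feature set, labels, and model are invented for illustration and are not the thesis's actual framework.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        # Rows: observation windows; columns: hypothetical unobtrusive cues,
        # e.g. brow-raise intensity, mouth-corner displacement,
        # gaze-off-screen ratio, blink rate.
        features = rng.random((200, 4))
        mos = rng.uniform(1.0, 5.0, size=200)  # placeholder QoE labels (1-5 MOS)

        model = RandomForestRegressor(n_estimators=100, random_state=0)
        # Cross-validated mean absolute error of the facial-feature regressor.
        print(cross_val_score(model, features, mos, cv=5,
                              scoring="neg_mean_absolute_error"))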

    Auditory Localization in Low-Bitrate Compressed Ambisonic Scenes

    The increasing popularity of Ambisonics as a spatial audio format for streaming services poses new challenges to existing audio coding techniques. Immersive audio delivered to mobile devices requires efficient bitrate compression that does not affect the spatial quality of the content. Good localizability of virtual sound sources is one of the key elements that must be preserved. This study investigates the localization precision of virtual sound sources within Ambisonic scenes encoded with Opus low-bitrate compression at different bitrates and Ambisonic orders (1st, 3rd, and 5th). The test stimuli were reproduced over a 50-channel spherical loudspeaker configuration and binaurally using individually measured and generic Head-Related Transfer Functions (HRTFs). Participants were asked to adjust the position of a virtual acoustic pointer to match the position of a virtual sound source within the bitrate-compressed Ambisonic scene. Results show that auditory localization in low-bitrate compressed Ambisonic scenes is not significantly affected by codec parameters; the key factors influencing localization are the rendering method and Ambisonic order truncation. This suggests that efficient perceptual coding might be successfully used for mobile spatial audio delivery
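
    For context on why bitrate budgets tighten with Ambisonic order: an order-N scene carries (N + 1)^2 channels, so a fixed total bitrate leaves less per channel at higher orders. The short sketch below works through this arithmetic; the total bitrates are illustrative, not the study's test conditions.

        # Channel count and per-channel budget for the orders tested (1st, 3rd, 5th).
        for order in (1, 3, 5):
            channels = (order + 1) ** 2
            for total_kbps in (128, 256, 512):  # illustrative totals
                print(f"order {order}: {channels:2d} ch, {total_kbps} kbps total "
                      f"-> {total_kbps / channels:.1f} kbps per channel")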

    An objective and subjective quality assessment for passive gaming video streaming

    Gaming video streaming has become increasingly popular in recent times. Along with the rise in popularity of cloud gaming services and e-sports, passive gaming video streaming services such as Twitch.tv and YouTube Gaming, where viewers watch the gameplay of other gamers, have seen increasing acceptance. Twitch.tv alone has over 2.2 million monthly streamers and 15 million daily active users with almost a million average concurrent users, making Twitch.tv the fourth biggest internet traffic generator, just after Netflix, YouTube and Apple. Despite the increasing importance and popularity of such live gaming video streaming services, they had until recently not caught the attention of the quality assessment research community. For the continued success of such services, it is imperative to maintain and satisfy the end-user Quality of Experience (QoE), which can be measured using various Video Quality Assessment (VQA) methods. Gaming videos are synthetic and artificial in nature and have different streaming requirements compared to traditional non-gaming content. While there exist many subjective and objective studies in the field of quality assessment of Video-on-Demand (VOD) streaming services, such as Netflix and YouTube, along with the design of many VQA metrics, no previous work had addressed the quality assessment of live passive gaming video streaming applications. The research work in this thesis addresses this gap through various subjective and objective quality assessment studies. A codec comparison using the three most popular and widely used compression standards is performed to determine their compression efficiency. Furthermore, a comparative subjective and objective study is carried out to find the difference between gaming and non-gaming videos in terms of the trade-off between quality and data rate after compression. This is followed by the creation of an open-source gaming video dataset, which is then used for a performance evaluation study of the eight most popular VQA metrics. Different temporal pooling strategies and content-based classification approaches are evaluated to assess their effect on the VQA metrics. Finally, due to the low performance of existing No-Reference (NR) VQA metrics on gaming video content, two machine-learning-based NR models are designed using NR features and existing NR metrics; these are shown to outperform existing NR metrics while performing on par with state-of-the-art Full-Reference (FR) VQA metrics
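
    A hedged sketch of the second modelling idea (fusing the outputs of existing NR metrics with a learned regressor) is given below. The metric names, data, and regressor are illustrative stand-ins, not the thesis's actual models.

        import numpy as np
        from sklearn.svm import SVR
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(1)
        # Columns: scores from existing NR metrics on each video,
        # e.g. BRISQUE, NIQE, PIQE (synthetic values here).
        nr_scores = rng.random((150, 3))
        mos = rng.uniform(1.0, 5.0, size=150)  # placeholder subjective ratings

        # Learn a mapping from the NR-metric feature vector to MOS.
        X_tr, X_te, y_tr, y_te = train_test_split(nr_scores, mos, random_state=0)
        fusion = SVR(kernel="rbf").fit(X_tr, y_tr)
        print("predicted MOS:", fusion.predict(X_te[:3]))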

    Saliency-Enabled Coding Unit Partitioning and Quantization Control for Versatile Video Coding

    The latest video coding standard, Versatile Video Coding (VVC), has greatly improved coding efficiency over its predecessor, High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In the context of perceptual video coding (PVC), visual saliency models that exploit the characteristics of the human visual system to improve coding efficiency have become reliable tools thanks to advances in computer performance and visual algorithms. In this paper, a novel VVC optimization scheme within a PVC framework is proposed, consisting of a fast coding unit (CU) partition algorithm and a quantization control algorithm. First, based on the visual saliency model, we propose a fast CU partitioning scheme, which re-determines the CU partition depth by computing the Scharr operator and the variance, and decides whether to execute intra sub-partitions (ISP), in order to reduce coding complexity. Second, a quantization control algorithm is proposed that adjusts the quantization parameter based on a multi-level classification of saliency values at the CU level to reduce the bitrate. Compared with the reference model, experimental results indicate that the proposed method reduces computational complexity by about 47.19% and achieves an average bitrate saving of 3.68%, while incurring reasonable peak signal-to-noise ratio losses and maintaining nearly the same subjective perceptual quality
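
    The sketch below illustrates the two ingredients in isolation: a Scharr-gradient-plus-variance texture measure of the kind used to re-decide CU partition depth, and a saliency-driven QP offset. The thresholds, offsets, and block handling are invented for illustration and do not reproduce the paper's algorithm.

        import cv2
        import numpy as np

        def block_texture(block: np.ndarray) -> float:
            # Scharr gradients in x and y plus pixel variance, as a rough
            # measure of how much detail a CU-sized block contains.
            gx = cv2.Scharr(block, cv2.CV_32F, 1, 0)
            gy = cv2.Scharr(block, cv2.CV_32F, 0, 1)
            return float(np.mean(np.abs(gx) + np.abs(gy)) + np.var(block))

        def qp_offset(saliency: float) -> int:
            # Multi-level saliency classification (hypothetical levels):
            # spend more bits where viewers look, fewer elsewhere.
            if saliency > 0.66:
                return -2   # protect highly salient CUs
            if saliency > 0.33:
                return 0
            return 3        # coarser quantization in low-saliency regions

        block = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
        print(block_texture(block), qp_offset(0.8))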

    A comparison of reinforcement learning algorithms in fairness-oriented OFDMA schedulers

    Due to large-scale control problems in 5G access networks, the complexity of radio resource management is expected to increase significantly. Reinforcement learning is seen as a promising solution that can enable intelligent decision-making and reduce the complexity of different optimization problems for radio resource management. The packet scheduler is an important entity of radio resource management that allocates users' data packets in the frequency domain according to the implemented scheduling rule. In this context, by making use of reinforcement learning, we could determine, in each state, the most suitable scheduling rule to employ in order to improve the quality of service provisioning. In this paper, we propose a reinforcement learning-based framework to solve scheduling problems with the main focus on meeting the user fairness requirements. This framework makes use of feed-forward neural networks to map momentary states to proper parameterization decisions for the proportional fair scheduler. The simulation results show that our reinforcement learning framework outperforms conventional adaptive schedulers oriented on the fairness objective. Discussions are also raised to determine the best reinforcement learning algorithm to implement in the proposed framework based on various scheduler settings
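
    To make the scheduler being parameterized concrete, below is a sketch of a generalized proportional-fair rule whose exponents an RL agent could select per state. The exponent grid and traffic values are illustrative assumptions, not the paper's configuration.

        import numpy as np

        def pf_schedule(inst_rate, avg_thr, alpha=1.0, beta=1.0):
            # Generalized proportional fair: pick the user maximizing
            # r^alpha / T^beta, where r is the achievable rate on the
            # resource and T the user's past average throughput.
            priority = inst_rate ** alpha / np.maximum(avg_thr, 1e-9) ** beta
            return int(np.argmax(priority))

        rng = np.random.default_rng(2)
        inst_rate = rng.uniform(1.0, 10.0, size=8)  # achievable rates (Mbps)
        avg_thr = rng.uniform(0.5, 5.0, size=8)     # past average throughput

        # Larger beta relative to alpha pushes the rule toward fairness;
        # smaller beta pushes it toward raw throughput.
        for alpha, beta in [(1.0, 0.0), (1.0, 1.0), (0.5, 1.5)]:
            print(alpha, beta, "-> user", pf_schedule(inst_rate, avg_thr, alpha, beta))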

    Smart Sensor Technologies for IoT

    The recent development in wireless networks and devices has led to novel services that will utilize wireless communication on a new level. Much effort and many resources have been dedicated to establishing new communication networks that will support machine-to-machine communication and the Internet of Things (IoT). In these systems, various smart and sensory devices are deployed and connected, enabling large amounts of data to be streamed. Smart services represent new trends in mobile services, i.e., a completely new spectrum of context-aware, personalized, and intelligent services and applications. A variety of existing services utilize information about the position of the user or mobile device. The position of a mobile device is often obtained using the Global Navigation Satellite System (GNSS) chips that are integrated into all modern mobile devices (smartphones). However, GNSS is not always a reliable source of position estimates due to multipath propagation and signal blockage. Moreover, integrating GNSS chips into all devices might have a negative impact on the battery life of future IoT applications. Therefore, alternative solutions to position estimation should be investigated and implemented in IoT applications. This Special Issue, "Smart Sensor Technologies for IoT", aims to report on some of the recent research efforts on this increasingly important topic. The twelve accepted papers in this issue cover various aspects of smart sensor technologies for IoT