27 research outputs found

    New visual coding exploration in MPEG: Super-MultiView and free navigation in free viewpoint TV

    ISO/IEC MPEG and ITU-T VCEG have recently jointly issued a new multiview video compression standard, called 3D-HEVC, which reaches unprecedented compression performance for linear, dense camera arrangements. To support future high-quality, auto-stereoscopic 3D displays and Free Navigation virtual/augmented reality applications with sparse, arbitrarily arranged camera setups, innovative depth estimation and virtual view synthesis techniques with global optimizations over all camera views should be developed. Preliminary studies in response to the MPEG-FTV (Free viewpoint TV) Call for Evidence suggest these targets are within reach, with at least 6% bitrate gains over 3D-HEVC technology.

    A bag of words description scheme for image quality assessment

    Every day millions of images are obtained, processed, compressed, saved, transmitted and reproduced. All these operations can cause distortions that affect their quality. The quality of these images can be measured subjectively, but that requires a considerable number of tests with many individuals to obtain a statistical analysis of an image's perceptual quality. Several objective metrics have been developed that try to model the human perception of quality. However, in most applications the representation of human quality perception given by these metrics falls short of what is desired. Therefore, this work proposes the use of machine learning models that allow for a better approximation. In this work, definitions for image and quality are given and some of the difficulties of the study of image quality are mentioned. Moreover, three metrics are initially explained. One uses the image's original quality as a reference (SSIM) while the other two are no-reference metrics (BRISQUE and QAC). A comparison is made, showing a large discrepancy of values between the two kinds of metrics. The database used for the tests is TID2013, chosen for its size and because it covers a large number of distortions. Each type of distortion in this database is studied, along with how it is simulated. Furthermore, some concepts of machine learning are introduced along with algorithms relevant in the context of this dissertation, notably K-means, KNN and SVM. Descriptor aggregation algorithms like "bag of words" and "Fisher vectors" are also mentioned. This dissertation studies a new model that combines machine learning and a quality metric for quality estimation. This model is based on the division of images into cells, in each of which a specific metric is computed. This division yields local quality descriptors that are aggregated using "bag of words".
    An SVM with an RBF kernel is trained and tested on the same database, and the results of the model are evaluated using cross-validation. The results are analysed using Pearson, Spearman and Kendall correlations and the RMSE to evaluate how closely the model matches the subjective results. The model improves on the results of the metric that was used and shows a new path for applying machine learning to quality evaluation.
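
    The cell-based "bag of words" aggregation described in this abstract could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the dissertation's implementation: the per-cell descriptor (mean and variance) is a stand-in for the unnamed quality metric, the function names are hypothetical, and the SVM training and correlation analysis are omitted.

```python
import random
from collections import Counter


def cell_descriptors(image, cell):
    """Split a 2-D grayscale image (list of rows) into cell x cell blocks and
    return one (mean, variance) descriptor per block. The descriptor is an
    assumed stand-in for the per-cell quality metric."""
    descs = []
    for r in range(0, len(image) - cell + 1, cell):
        for c in range(0, len(image[0]) - cell + 1, cell):
            vals = [image[r + i][c + j] for i in range(cell) for j in range(cell)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            descs.append((mean, var))
    return descs


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))


def kmeans(points, k, iters=20, seed=0):
    """Plain K-means used to learn a codebook of k 'visual words'."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center; keep the old one if its group is empty.
        centers = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def bow_histogram(descs, centers):
    """Normalized histogram of codebook assignments for one image."""
    counts = Counter(nearest(d, centers) for d in descs)
    return [counts[k] / len(descs) for k in range(len(centers))]
```

    In such a setup, the codebook would be learned from the descriptors of the training images, and each image's histogram would then serve as the feature vector passed to the regressor or classifier.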

    Exploiting Digital Surface Models for Inferring Super-Resolution for Remotely Sensed Images

    Despite the plethora of successful Super-Resolution Reconstruction (SRR) models applied to natural images, their application to remote sensing imagery tends to produce poor results. Remote sensing imagery is often more complicated than natural images and has its own peculiarities: it is of lower resolution, contains noise, and often depicts large textured surfaces. As a result, applying non-specialized SRR models to remote sensing imagery results in artifacts and poor reconstructions. To address these problems, this paper proposes an architecture inspired by previous research work, introducing a novel approach for forcing an SRR model to output realistic remote sensing images: instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized Digital Surface Model (nDSM) of the image. This strategy allows better-informed updates during training, since they derive from a task (elevation map inference) that is closely related to remote sensing. Nonetheless, the nDSM auxiliary information is not required during production, and thus the model infers a super-resolution image without any additional data besides the low-resolution input. We assess our model on two remotely sensed datasets of different spatial resolutions that also contain the DSM pairs of the images: the DFC2018 dataset and the dataset containing the national Lidar fly-by of Luxembourg. Based on visual inspection, the inferred super-resolution images exhibit superior quality. In particular, the results for the high-resolution DFC2018 dataset are realistic and almost indistinguishable from the ground truth images.

    A Review of Predictive Quality of Experience Management in Video Streaming Services

    Satisfying the requirements of devices and users of online video streaming services is a challenging task. It requires not only managing the network quality of service but also exerting real-time control to address the user's quality of experience (QoE) expectations. QoE management is an end-to-end process that, due to the ever-increasing variety of video services, has become too complex for conventional "reactive" techniques. Herein, we review the most significant "predictive" QoE management methods for video streaming services, showing how different machine learning approaches may be used to perform proactive control. We pinpoint a selection of the best-suited machine learning methods, highlighting advantages and limitations in specific service conditions. The review leads to lessons learned and guidelines to better address QoE requirements in complex video services.

    A Virtual Reality Application of the Rubber Hand Illusion Induced by Ultrasonic Mid-Air Haptic Stimulation

    Ultrasonic mid-air haptic technologies, which provide haptic feedback through airwaves produced using ultrasound, could be employed to investigate the sense of body ownership and immersion in virtual reality (VR) by inducing the virtual hand illusion (VHI). Ultrasonic mid-air haptic perception has solely been investigated for glabrous (hairless) skin, which has higher tactile sensitivity than hairy skin. In contrast, the VHI paradigm typically targets hairy skin without comparisons to glabrous skin. The aim of this article was to investigate illusory body ownership, the applicability of ultrasonic mid-air haptics, and perceived immersion in VR using the VHI. Fifty participants viewed a virtual hand being stroked by a feather synchronously and asynchronously with the ultrasonic stimulation applied to the glabrous skin on the palmar surface and the hairy skin on the dorsal surface of their hands. Questionnaire responses revealed that synchronous stimulation induced a stronger VHI than asynchronous stimulation. In synchronous conditions, the VHI was stronger for palmar stimulation than dorsal stimulation. The ultrasonic stimulation was also perceived as more intense on the palmar surface compared to the dorsal surface. Perceived immersion was not related to illusory body ownership per se but was enhanced by the provision of synchronous stimulation

    Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption

    Subtitles can help improve the understanding of media content. People enable subtitles based on individual characteristics (e.g., language or hearing ability), viewing environment, or media context (e.g., drama, quiz show). However, some people find that subtitles can be distracting and that they negatively impact their viewing experience. We explore the challenges and opportunities surrounding interaction with real-time personalisation of subtitled content. To understand how people currently interact with subtitles, we first conducted an online questionnaire with 102 participants. We used our findings to elicit requirements for a new approach called Adaptive Subtitles that allows the viewer to alter which speakers have subtitles displayed in real time. We evaluated our approach with 19 participants to understand the interaction trade-offs and challenges within real-time adaptations of subtitled media. Our evaluation findings suggest that granular controls and structured onboarding allow viewers to make informed trade-offs when adapting media content, leading to improved viewing experiences.