New visual coding exploration in MPEG: Super-MultiView and free navigation in free viewpoint TV
ISO/IEC MPEG and ITU-T VCEG have recently jointly issued a new multiview video compression standard, called 3D-HEVC, which reaches unprecedented compression performance for linear, dense camera arrangements. To support future high-quality auto-stereoscopic 3D displays and Free Navigation virtual/augmented reality applications with sparse, arbitrarily arranged camera setups, innovative depth estimation and virtual view synthesis techniques with global optimizations over all camera views should be developed. Preliminary studies in response to the MPEG-FTV (Free viewpoint TV) Call for Evidence suggest these targets are within reach, with at least 6% bitrate gains over 3D-HEVC technology.
A bag of words description scheme for image quality assessment
Every day millions of images are obtained, processed, compressed, saved, transmitted and reproduced. All these operations can cause distortions that affect their quality. The quality of these images can be measured subjectively, but this has the disadvantage of requiring a considerable number of tests with individuals in order to produce a statistical analysis of an image's perceptual quality. Several objective metrics have been developed that try to model the human perception of quality. However, in most applications the representation of human quality perception given by these metrics is far from the desired one. Therefore, this work proposes the use of machine learning models that allow for a better approximation.
In this work, definitions for image and quality are given and some of the difficulties of the study of image quality are discussed. Moreover, three metrics are initially explained: one uses the image's original quality as a reference (SSIM), while the other two are no-reference metrics (BRISQUE and QAC). A comparison between them shows a large discrepancy of values between the two kinds of metrics.
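The full-reference metric mentioned above, SSIM, compares luminance, contrast and structure between a distorted image and its pristine reference. As an illustrative sketch (not the dissertation's implementation), the single-window form of the SSIM formula can be written in a few lines; note that the published metric averages this quantity over local sliding Gaussian windows rather than computing it once over the whole image.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two grayscale images.

    Simplification: the published metric averages this quantity over
    local sliding windows; here it is computed once globally.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilising constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

Identical images score 1.0, and any distortion pulls the score below 1, which is the behaviour a full-reference metric needs.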
The database used for the tests is TID2013, chosen for its size and for the large number of distortions it considers. A study of each type of distortion in this database is presented.
Furthermore, some concepts of machine learning are introduced along with algorithms relevant in the context of this dissertation, notably K-means, KNN and SVM. Descriptor aggregation algorithms such as "bag of words" and "Fisher vectors" are also discussed.
This dissertation studies a new model that combines machine learning with a quality metric for quality estimation. The model is based on dividing images into cells, in each of which a specific metric is computed. This division makes it possible to obtain local quality descriptors that are then aggregated using "bag of words". An SVM with an RBF kernel is trained and tested on the same database and the results of the model are evaluated using cross-validation.
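The cell-based aggregation described above can be sketched as follows: compute a small descriptor per cell, assign each descriptor to its nearest codebook word, and build a normalised word histogram that becomes the fixed-length feature for the SVM. This is a minimal numpy sketch, not the dissertation's code: the per-cell descriptor (mean and standard deviation) stands in for the actual local quality score, the codebook is given explicitly rather than learned with K-means, and the SVM step is omitted.

```python
import numpy as np

def cell_descriptors(image, cell=8):
    """Split a grayscale image into cell x cell blocks and compute a
    tiny per-cell descriptor (mean, std) standing in for a local
    quality score."""
    h, w = image.shape
    descs = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            block = image[i:i + cell, j:j + cell].astype(np.float64)
            descs.append([block.mean(), block.std()])
    return np.array(descs)

def bow_histogram(descs, codebook):
    """Assign each descriptor to the nearest codebook word and return
    an L1-normalised word histogram, i.e. the fixed-length "bag of
    words" feature an SVM would consume."""
    d = np.linalg.norm(descs[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return hist / hist.sum()
```

The histogram length is fixed by the codebook size, so images of any dimension map to the same feature space, which is what allows a single SVM to be trained across the whole database.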
The results are analysed using the Pearson, Spearman and Kendall correlations and the RMSE to evaluate how well the model matches the subjective results. The model improves on the results of the underlying metric and shows a new path for applying machine
learning for quality evaluation.
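The four evaluation criteria used above (Pearson, Spearman and Kendall correlations plus the RMSE) are standard and can be sketched directly; this is an illustrative numpy-only implementation (the Spearman variant assumes no tied scores, and the Kendall variant is the simple O(n²) tau-a), not the dissertation's code.

```python
import numpy as np

def pearson(a, b):
    """Linear correlation between predicted and subjective scores."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())

def spearman(a, b):
    """Rank correlation: Pearson on ranks (assumes no ties)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(a), rank(b))

def kendall(a, b):
    """Tau-a: fraction of concordant minus discordant pairs, O(n^2)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = len(a)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
    return 2.0 * s / (n * (n - 1))

def rmse(a, b):
    """Root-mean-square error against the subjective scores."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.sqrt(((a - b) ** 2).mean())
```

The rank-based measures reward correct ordering of image qualities even when the metric's scale is nonlinear, while Pearson and RMSE additionally penalise miscalibration, which is why quality studies typically report all four.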
Exploiting Digital Surface Models for Inferring Super-Resolution for Remotely Sensed Images
Despite the plethora of successful Super-Resolution Reconstruction (SRR) models applied to natural images, their application to remote sensing imagery tends to produce poor results. Remote sensing imagery is often more complicated than natural images and has its own peculiarities: it is of lower resolution, it contains noise, and it often depicts large textured surfaces. As a result, applying non-specialized SRR models to remote sensing imagery results in artifacts and poor reconstructions. To address these problems, this paper proposes an architecture inspired by previous research work, introducing a novel approach for forcing an SRR model to output realistic remote sensing images: instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized Digital Surface Model (nDSM) of the image. This strategy allows better-informed updates during training, sourced from a task (elevation map inference) that is closely related to remote sensing. Nonetheless, the nDSM auxiliary information is not required in production, and thus the model infers a super-resolution image without any additional data besides its low-resolution input. We assess our model on two remotely sensed datasets of different spatial resolutions that also contain the DSM pairs of the images: the DFC2018 dataset and the dataset containing the national lidar fly-by of Luxembourg. Based on visual inspection, the inferred super-resolution images exhibit markedly superior quality. In particular, the results for the high-resolution DFC2018 dataset are realistic and almost indistinguishable from the ground truth images.
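The training objective described above mixes a pixel reconstruction term with an auxiliary elevation-inference term that is dropped at inference time. A minimal sketch, under the assumption that both terms are mean-squared errors mixed with a weight lambda; the function name, the MSE choice, and the weighting are illustrative and not taken from the paper.

```python
import numpy as np

def combined_srr_loss(sr_pred, hr_target, ndsm_pred, ndsm_target, lam=0.1):
    """Training loss mixing a pixel reconstruction term with an
    auxiliary elevation (nDSM) inference term. The nDSM branch is only
    needed during training; at inference the model consumes the
    low-resolution image alone."""
    pixel_loss = np.mean((sr_pred - hr_target) ** 2)
    ndsm_loss = np.mean((ndsm_pred - ndsm_target) ** 2)
    return pixel_loss + lam * ndsm_loss
```

Because the auxiliary term only shapes the gradients during training, the deployed network has the same inputs and outputs as a plain SRR model, which is the practical appeal of this design.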
A Review of Predictive Quality of Experience Management in Video Streaming Services
Satisfying the requirements of devices and users of online video streaming services is a challenging task. It requires not only managing the network quality of service but also exerting real-time control to address the user's quality of experience (QoE) expectations. QoE management is an end-to-end process that, due to the ever-increasing variety of video services, has become too complex for conventional "reactive" techniques. Herein, we review the most significant "predictive" QoE management methods for video streaming services, showing how different machine learning approaches may be used to perform proactive control. We pinpoint a selection of the best-suited machine learning methods, highlighting their advantages and limitations in specific service conditions. The review leads to lessons learned and guidelines to better address QoE requirements in complex video services.
A Virtual Reality Application of the Rubber Hand Illusion Induced by Ultrasonic Mid-Air Haptic Stimulation
Ultrasonic mid-air haptic technologies, which provide haptic feedback through airwaves produced using ultrasound, could be employed to investigate the sense of body ownership and immersion in virtual reality (VR) by inducing the virtual hand illusion (VHI). Ultrasonic mid-air haptic perception has so far been investigated only for glabrous (hairless) skin, which has higher tactile sensitivity than hairy skin. In contrast, the VHI paradigm typically targets hairy skin without comparisons to glabrous skin. The aim of this article was to investigate illusory body ownership, the applicability of ultrasonic mid-air haptics, and perceived immersion in VR using the VHI. Fifty participants viewed a virtual hand being stroked by a feather synchronously and asynchronously with ultrasonic stimulation applied to the glabrous skin on the palmar surface and the hairy skin on the dorsal surface of their hands. Questionnaire responses revealed that synchronous stimulation induced a stronger VHI than asynchronous stimulation. In synchronous conditions, the VHI was stronger for palmar stimulation than for dorsal stimulation. The ultrasonic stimulation was also perceived as more intense on the palmar surface than on the dorsal surface. Perceived immersion was not related to illusory body ownership per se but was enhanced by the provision of synchronous stimulation.
Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption
Subtitles can help improve the understanding of media content. People enable subtitles based on individual characteristics (e.g., language or hearing ability), viewing environment, or media context (e.g., drama, quiz show). However, some people find that subtitles can be distracting and that they negatively impact their viewing experience. We explore the challenges and opportunities surrounding interaction with real-time personalisation of subtitled content. To understand how people currently interact with subtitles, we first conducted an online questionnaire with 102 participants. We used our findings to elicit requirements for a new approach called Adaptive Subtitles that allows the viewer to alter, in real time, which speakers have subtitles displayed. We evaluated our approach with 19 participants to understand the interaction trade-offs and challenges within real-time adaptations of subtitled media. Our evaluation findings suggest that granular controls and structured onboarding allow viewers to make informed trade-offs when adapting media content, leading to improved viewing experiences.
Application of Quality of Experience in Networked Services: Review, Trend & Perspectives
Full text embargoed until 17.10.2019 (publisher's embargo period, 12 months).