
    Enhanced quality reconstruction of erroneous video streams using packet filtering based on non-desynchronizing bits and UDP checksum-filtered list decoding

    The latest video coding standards, such as H.264 and H.265, are extremely vulnerable in error-prone networks. Due to their sophisticated spatial and temporal prediction tools, the effect of an error is not limited to the erroneous area but can easily propagate spatially to neighboring blocks and temporally to the following frames. Thus, reconstructed video packets at the decoder side may exhibit significant visual quality degradation. Error concealment and error correction are two mechanisms that have been developed to improve the quality of reconstructed frames in the presence of errors. In most existing error concealment approaches, the corrupted packets are ignored and only the correctly received information of the surrounding areas (spatially and/or temporally) is used to recover the erroneous area. This is because there is no perfect error detection mechanism to identify correctly received blocks within a corrupted packet, and because of the desynchronization caused by transmission errors on variable-length codes (VLC). However, as many studies have shown, corrupted packets may contain valuable information that can be used to adequately reconstruct the lost area (e.g. when the error is located at the end of a slice). Error correction approaches, on the other hand, such as list decoding, exploit the corrupted packet to generate several candidate transmitted packets and then select, among these candidates, the one with the highest likelihood of being the transmitted packet, based on the available soft information (e.g. the log-likelihood ratio (LLR) of each bit). However, list decoding approaches suffer from a large solution space of candidate transmitted packets. This is worsened when no soft information is available at the application layer, which is the more realistic scenario in practice: since it is unknown which bits are more likely to have been modified during transmission, the candidate packets cannot be ranked by likelihood.

    In this thesis, we propose various strategies to improve the quality of reconstructed packets that have been lightly damaged during transmission (e.g. at most a single error per packet). We first propose a simple but efficient mechanism to filter damaged packets in order to retain those likely to lead to a very good reconstruction and discard the others. This method can be used as a complement to most existing concealment approaches to enhance their performance. The method is based on the novel concept of non-desynchronizing bits (NDBs), defined, in the context of an H.264 context-adaptive variable-length coding (CAVLC) coded sequence, as a bit whose inversion neither causes desynchronization at the bitstream level nor changes the number of decoded macroblocks. We establish that, on typical coded bitstreams, NDBs constitute about one-third (roughly 30%) of a bitstream, and that flipping one of them in a packet has a mostly insignificant effect on visual quality. In most cases (90%), the quality of the reconstructed packet when an individual NDB is modified is almost the same as that of the intact packet. We thus demonstrate that keeping a corrupted packet, under certain conditions, as a candidate for the lost area can provide better visual quality than concealment approaches. A sketch of this filtering idea is given below.
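    The following is a minimal, illustrative sketch of the packet-filtering rule described above, not the thesis's actual implementation. The helper decode_slice is hypothetical: it stands in for an H.264 CAVLC slice parser and is assumed to return whether parsing desynchronized and how many macroblocks were decoded.

    # Illustrative sketch of NDB-style packet filtering (assumptions noted above).
    # `decode_slice(payload) -> (desynchronized: bool, num_macroblocks: int)` is a
    # hypothetical helper standing in for a real H.264 CAVLC slice parser.

    def flip_bit(payload: bytes, bit_index: int) -> bytes:
        """Return a copy of `payload` with the bit at `bit_index` inverted (MSB-first)."""
        data = bytearray(payload)
        data[bit_index // 8] ^= 0x80 >> (bit_index % 8)
        return bytes(data)

    def is_non_desynchronizing_bit(payload: bytes, bit_index: int,
                                   decode_slice, expected_macroblocks: int) -> bool:
        """A bit is an NDB if inverting it neither desynchronizes the bitstream
        nor changes the number of decoded macroblocks."""
        desync, num_mb = decode_slice(flip_bit(payload, bit_index))
        return (not desync) and num_mb == expected_macroblocks

    def keep_corrupted_packet(payload: bytes, decode_slice,
                              expected_macroblocks: int) -> bool:
        """Filter rule: keep a (possibly corrupted) packet as a reconstruction
        candidate only if it still parses without desynchronization and yields
        the expected macroblock count; otherwise fall back to concealment."""
        desync, num_mb = decode_slice(payload)
        return (not desync) and num_mb == expected_macroblocks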
    Building on this, we propose a non-desync-based decoding framework that retains a corrupted packet under the condition that it does not cause desynchronization and does not alter the number of expected macroblocks. The framework can be combined with most current concealment approaches. In the case of a single bit in error, the proposed framework is compared to the frame copy (FC) concealment of the Joint Model (JM) software (JM-FC) and to a state-of-the-art concealment approach based on the spatiotemporal boundary matching algorithm (STBMA), and provides, on average, gains of 3.5 dB and 1.42 dB over them, respectively. We then propose a novel list decoding approach called checksum-filtered list decoding (CFLD), which can correct a packet at the bitstream level by exploiting the receiver-side user datagram protocol (UDP) checksum. The proposed approach identifies the possible locations of errors by analyzing the pattern of the UDP checksum recalculated over the corrupted packet. This makes it possible to considerably reduce the number of candidate transmitted packets in comparison to conventional list decoding approaches, especially when no soft information is available. When a packet composed of N bits contains a single bit in error, instead of considering N candidate packets, as is the case in conventional list decoding approaches, the proposed approach considers approximately N/32 candidate packets, a 97% reduction in the number of candidates. This reduction increases to 99.6% in the case of a two-bit error. The method's performance is evaluated using the H.264 and high efficiency video coding (HEVC) test model software. We show that, for H.264 coded sequences, the CFLD approach corrects the packet 66% of the time on average. It also offers a 2.74 dB gain over JM-FC, and 1.14 dB and 1.42 dB gains over STBMA and hard-output maximum likelihood decoding (HO-MLD), respectively. Additionally, in the case of HEVC, the CFLD approach corrects the corrupted packet 91% of the time, and offers 2.35 dB and 4.97 dB gains over our implementation of FC concealment in the HEVC test model software (HM-FC) on class B (1920×1080) and class C (832×480) sequences, respectively.
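    The intuition behind the N/32 figure can be illustrated with the standard UDP checksum, which is the one's complement of the one's complement sum of 16-bit words. A single bit flip changes that sum by +2^k or -2^k, so the difference between the recomputed and received checksums pins down the bit position k within a 16-bit word and the flip direction; only the roughly half of the N/16 words whose bit k currently has the wrong value remain as candidates. The sketch below is illustrative only, not the thesis's exact CFLD algorithm: it ignores the UDP pseudo-header, assumes the checksum field itself is intact, and assumes a single bit error in the payload.

    # Illustrative sketch of checksum-guided candidate generation (see assumptions above).

    def oc_sum(data: bytes) -> int:
        """16-bit one's complement sum of `data` (zero-padded to an even length)."""
        if len(data) % 2:
            data += b"\x00"
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xFFFF) + (total >> 16)   # end-around carry
        return total

    def oc_add(a: int, b: int) -> int:
        s = a + b
        return (s & 0xFFFF) + (s >> 16)

    def candidate_corrections(payload: bytes, received_checksum: int):
        """Yield single-bit-flip candidates consistent with the received checksum."""
        s_received = (~received_checksum) & 0xFFFF            # sum implied by sender's checksum
        s_corrupted = oc_sum(payload)
        delta = oc_add(s_corrupted, (~s_received) & 0xFFFF)   # s_corrupted -' s_received
        if delta in (0x0000, 0xFFFF):
            yield payload                                     # checksum already consistent
            return
        comp = (~delta) & 0xFFFF
        if delta & (delta - 1) == 0:                          # delta == +2^k: a 0 flipped to 1
            k, want = delta.bit_length() - 1, 1               # correct by flipping a current 1 to 0
        elif comp & (comp - 1) == 0:                          # delta == -2^k: a 1 flipped to 0
            k, want = comp.bit_length() - 1, 0                # correct by flipping a current 0 to 1
        else:
            return                                            # not explainable by a single bit error
        for i in range(0, len(payload) - 1, 2):
            word = (payload[i] << 8) | payload[i + 1]
            if (word >> k) & 1 == want:                       # roughly half the words qualify
                fixed = word ^ (1 << k)
                cand = bytearray(payload)
                cand[i], cand[i + 1] = fixed >> 8, fixed & 0xFF
                yield bytes(cand)                             # ~N/32 candidates in total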

    No-reference image and video quality assessment: a classification and review of recent approaches


    Video Content-Based QoE Prediction for HEVC Encoded Videos Delivered over IP Networks

    The recently released High Efficiency Video Coding (HEVC) standard, which halves the transmission bandwidth required for encoded video at almost the same quality compared to H.264/AVC, and the availability of increased network bandwidth (e.g. from 2 Mbps for 3G networks to almost 100 Mbps for 4G/LTE) have led to the proliferation of video streaming services. Based on these major innovations, the prevalence and diversity of video applications are set to increase over the coming years. However, the popularity and success of current and future video applications will depend on the perceived quality of experience (QoE) of end users. Measuring or predicting the QoE of delivered services has therefore become an important and unavoidable task for both service and network providers. Video quality can be measured either subjectively or objectively. Subjective quality measurement is the most reliable method of determining the quality of multimedia applications because of its direct link to users' experience; however, it is time consuming and expensive, hence the need for objective methods that produce results comparable with those of subjective testing. In general, video quality is impacted by impairments caused by the encoder and by the transmission network. However, videos encoded and transmitted over an error-prone network can have different quality measurements even under the same encoder settings and network quality of service (NQoS). This indicates that, in addition to encoder settings and network impairments, there may be other key parameters that impact video quality. In this project, it is hypothesised that video content type is one of the key parameters that may impact the quality of streamed videos. Based on this assertion, parameters related to video content type are extracted and used to develop a single metric that quantifies the content type of different video sequences. The proposed content type metric is then used together with encoding parameter settings and NQoS to develop content-based video quality models that estimate the quality of different video sequences delivered over IP-based networks. This project led to the following main contributions: (1) a new metric for quantifying video content type based on the spatiotemporal features extracted from the encoded bitstream; (2) a novel subjective test approach for video streaming services; (3) new content-based video quality prediction models for predicting the QoE of video sequences delivered over IP-based networks. The models have been evaluated using subjective and objective methods.
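    The thesis derives its content-type features from the encoded bitstream. As a rough pixel-domain analogue only (not the proposed metric), the familiar spatial information (SI) and temporal information (TI) measures on decoded luma frames can be folded into a single scalar; the scale factors below are arbitrary normalisation constants chosen for illustration.

    # Rough pixel-domain analogue of a content-type metric (illustrative only).

    import numpy as np
    from scipy import ndimage

    def spatial_information(frame: np.ndarray) -> float:
        """SI: standard deviation of the Sobel gradient magnitude of a luma frame."""
        gx = ndimage.sobel(frame.astype(np.float64), axis=1)
        gy = ndimage.sobel(frame.astype(np.float64), axis=0)
        return float(np.std(np.hypot(gx, gy)))

    def temporal_information(prev: np.ndarray, cur: np.ndarray) -> float:
        """TI: standard deviation of the luma frame difference."""
        return float(np.std(cur.astype(np.float64) - prev.astype(np.float64)))

    def content_type_metric(luma_frames: list,
                            si_scale: float = 100.0, ti_scale: float = 50.0) -> float:
        """Single scalar (higher = more spatially/temporally complex content)."""
        si = max(spatial_information(f) for f in luma_frames)
        ti = max(temporal_information(a, b)
                 for a, b in zip(luma_frames, luma_frames[1:]))
        return 0.5 * min(si / si_scale, 1.0) + 0.5 * min(ti / ti_scale, 1.0)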

    Privacy aware human action recognition: an exploration of temporal salience modelling and neuromorphic vision sensing

    Preserving privacy in vision-based home monitoring has emerged as a significant demand. State-of-the-art studies provide privacy protection by filtering or covering the most sensitive content, which in this scenario is the person's identity. Beyond privacy, however, it remains a challenge for the machine to extract useful information (utility) from the obfuscated data. Insights from the human visual system are helpful here: a high level of visual abstraction can be obtained from a scene by constructing saliency maps that highlight the most useful content and attenuate the rest. One way of maintaining privacy while keeping useful information about the action is therefore to discover the most significant regions and remove the redundancy. Another solution is motivated by a new visual sensor technology, the neuromorphic vision sensor. In this thesis, we first introduce a novel method for vision-based privacy preservation. In particular, we propose a new temporal salience-based anonymisation method that preserves privacy while maintaining the usefulness of the data in the anonymised domain. This anonymisation method achieves a higher level of privacy than existing work. The second contribution is the development of a new descriptor for human action recognition (HAR) based on exploring the anonymised domain produced by the temporal salience method. The proposed descriptor tests the utility of the anonymised data without referring to the RGB intensities of the original data. Features extracted with the proposed descriptor improve action recognition accuracy, outperforming existing methods: the proposed method shows improvements of 3.04%, 3.14%, 0.83%, 3.67%, and 16.71% on the DHA, KTH, UIUC1, UCF sports, and HMDB51 datasets, respectively, compared to state-of-the-art methods. The third contribution is a new method for the neuromorphic vision domain, where the privacy issue is already addressed by the sensor itself. The output of this domain is exploited by exploring the local and global details of the log-intensity changes. The empirical evaluation shows that exploring the neuromorphic domain provides useful details, increasing accuracy rates on E-KTH, E-UCF11 and E-HMDB5 by 0.54%, 19.42% and 25.61%, respectively.
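    As a minimal sketch of the general idea only (the thesis uses a more elaborate temporal salience model), a frame can be replaced by a temporal salience map computed from luma change: static appearance details, and hence identity, are discarded while the motion of the action is retained. The sigma parameter below is an assumed illustrative constant.

    # Minimal sketch of temporal-salience-based anonymisation (illustrative only).

    import numpy as np

    def temporal_salience(prev: np.ndarray, cur: np.ndarray,
                          sigma: float = 10.0) -> np.ndarray:
        """Per-pixel salience in [0, 1] derived from the magnitude of luma change."""
        diff = np.abs(cur.astype(np.float64) - prev.astype(np.float64))
        return 1.0 - np.exp(-diff / sigma)          # large change -> salience near 1

    def anonymise_frame(prev: np.ndarray, cur: np.ndarray,
                        sigma: float = 10.0) -> np.ndarray:
        """Replace the frame by its temporal salience map: appearance (and hence
        identity) is suppressed, while motion cues useful for action recognition remain."""
        return (255 * temporal_salience(prev, cur, sigma)).astype(np.uint8)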

    Video transport optimization techniques design and evaluation for next generation cellular networks

    Video is foreseen to be the dominant type of data traffic on the Internet. This vision is supported by a number of studies forecasting that video traffic will drastically increase in the following years, surpassing peer-to-peer traffic in volume already in the current year. Current infrastructures are not prepared to deal with this traffic increase. The current Internet, and in particular the mobile Internet, was not designed with video requirements in mind and, as a consequence, its architecture is very inefficient for handling this volume of video traffic. When a large part of the traffic is associated with multimedia entertainment, most of the mobile infrastructure is used in a very inefficient way to provide such a simple service, saturating the whole cellular network and leading to perceived quality levels that are not adequate to support widespread end-user acceptance. The main goal of the research in this thesis is to evolve the mobile Internet architecture for efficient video traffic support. As video is expected to represent the majority of the traffic, the future architecture should efficiently support the requirements of this data type, and specific enhancements for video should be introduced at all layers of the protocol stack where needed. These enhancements need to cater for improved quality of experience, improved reliability in a mobile world (anywhere, anytime), lower operating cost, and increased flexibility. In this thesis a set of video delivery mechanisms is designed to optimize video transmission at different layers of the protocol stack and at different levels of the cellular network. On top of these architectural choices, resource allocation schemes are implemented to support a range of video applications, covering video broadcast/multicast streaming, video on demand, real-time streaming, video progressive download and video upstreaming. By means of simulation, the benefits of the designed mechanisms in terms of perceived video quality and network resource savings are shown and compared to existing solutions. Furthermore, selected modules are implemented in a real testbed and experimental results are provided to support the development of such transport mechanisms in practice.
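    As a toy sketch of the kind of trade-off such resource allocation schemes make (not one of the thesis's mechanisms), cell capacity can be split across video flows in proportion to per-flow weights while capping each flow at its maximum encoding rate, in the style of weighted proportional-fair allocation. The flow names, weights and rates in the example are hypothetical.

    # Toy sketch of weight-based bandwidth allocation across video flows (illustrative only).

    def allocate_bandwidth(capacity_kbps: float, flows: list) -> dict:
        """flows: [{'name': str, 'weight': float, 'max_rate_kbps': float}, ...]"""
        allocation = {f['name']: 0.0 for f in flows}
        active = list(flows)
        remaining = capacity_kbps
        while active and remaining > 1e-9:
            total_weight = sum(f['weight'] for f in active)
            saturated = []
            for f in active:
                share = remaining * f['weight'] / total_weight
                need = f['max_rate_kbps'] - allocation[f['name']]
                if share >= need:                     # flow hits its encoding-rate cap
                    allocation[f['name']] = f['max_rate_kbps']
                    saturated.append(f)
            if not saturated:                         # no cap hit: split what is left by weight
                for f in active:
                    allocation[f['name']] += remaining * f['weight'] / total_weight
                break
            remaining = capacity_kbps - sum(allocation.values())
            active = [f for f in active if f not in saturated]
        return allocation

    # Example: a 6 Mbps cell shared by a VoD stream, a live stream and an upload (hypothetical).
    print(allocate_bandwidth(6000, [
        {'name': 'hd_vod',   'weight': 2.0, 'max_rate_kbps': 4000},
        {'name': 'sd_live',  'weight': 1.0, 'max_rate_kbps': 1500},
        {'name': 'upstream', 'weight': 1.0, 'max_rate_kbps': 2000},
    ]))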

    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life: they improve particular activities and provide handy tools that are sometimes merely for entertainment but quite often significantly increase our safety. Indeed, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computing power has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for the development of novel approaches.

    Multimedia Forensics

    This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of the most innovative applications in the digital information ecosystem serving various sectors of society, from entertainment to journalism to politics. Undoubtedly, the advances in deep learning and computational imaging have contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensic capabilities for media attribution, integrity and authenticity verification, and counter-forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students with a holistic view of the field.

    Fehlerkaschierte Bildbasierte Darstellungsverfahren (Error-Concealing Image-Based Rendering Methods)

    Creating photo-realistic images has been one of the major goals in computer graphics since its early days. Instead of modeling the complexity of nature with standard modeling tools, image-based approaches aim at exploiting real-world footage directly, as it is photo-realistic by definition. A drawback of these approaches has always been that the composition or combination of different sources is a non-trivial task, often resulting in annoying visible artifacts. In this thesis we focus on different techniques to diminish visible artifacts when combining multiple images in a common image domain. The results are either novel images, when dealing with the composition of multiple images, or novel video sequences rendered in real time, when dealing with video footage from multiple cameras.