12 research outputs found

    Detecting Deepfake Videos in Data Scarcity Conditions by Means of Video Coding Features

    Get PDF
    The most powerful deepfake detection methods developed so far are based on deep learning, requiring that large amounts of training data representative of the specific task be available to the trainer. In this paper, we propose a feature-based method for video deepfake detection that can work in data scarcity conditions, that is, when only very few examples are available to the forensic analyst. The proposed method is based on video coding analysis and relies on a simple footprint obtained from the motion prediction modes in the video sequence. The footprint is extracted from video sequences and used to train a simple linear Support Vector Machine classifier. The effectiveness of the proposed method is validated experimentally on three different datasets, namely, a synthetic street video dataset and two datasets of deepfake face videos.
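    The pipeline described above (a footprint built from motion prediction modes, fed to a linear SVM) can be sketched as follows. The four-mode alphabet, the Pegasos-style sub-gradient trainer, and all hyper-parameters are illustrative assumptions, not the paper's actual implementation:

```python
def mode_histogram(modes, n_modes=4):
    """Normalized histogram of per-macroblock motion prediction modes.

    `modes` is a flat list of integer mode indices (e.g. skip=0,
    16x16=1, 16x8/8x16=2, sub-8x8=3); this mode alphabet is a
    hypothetical simplification of a real H.264/AVC mode set.
    """
    hist = [0.0] * n_modes
    for m in modes:
        hist[m] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Tiny linear SVM trained by sub-gradient descent on the hinge
    loss (Pegasos-style); labels in y are +1 (fake) / -1 (real)."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # shrink weights (L2 regularization step)
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:  # hinge-loss sub-gradient update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Sign of the decision function: +1 = fake, -1 = real."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

    In practice the footprint would be extracted from the bitstream of the video under analysis; here the mode lists are synthetic.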

    A new video quality metric for compressed video.

    Get PDF
    Video compression enables multimedia applications such as mobile video messaging and streaming, video conferencing and, more recently, online social video interactions. Since most multimedia applications are meant for the human observer, measuring perceived video quality during the design and testing of these applications is important. The performance of existing perceptual video quality measurement techniques is limited by poor correlation with subjective quality and by implementation complexity. Therefore, this thesis presents new techniques for measuring the perceived quality of compressed multimedia video using computationally simple and efficient algorithms. A new full-reference perceptual video quality metric, called the MOSp metric, for measuring the subjective quality of multimedia video sequences compressed using block-based video coding algorithms is developed. The metric predicts the subjective quality of compressed video using the mean squared error between the original and compressed sequences, and video content. Factors which influence the visibility of compression-induced distortion, such as spatial texture masking, temporal masking and cognition, are considered for quantifying video content. The MOSp metric is simple to implement and can be integrated into block-based video coding algorithms for real-time quality estimation. Performance results presented for a variety of multimedia content compressed to a large range of bitrates show that the metric has high correlation with subjective quality and performs better than popular video quality metrics. As an application of the MOSp metric to perceptual video coding, a new MOSp-based mode selection algorithm for an H.264/AVC video encoder is developed.
Results show that, by integrating the MOSp metric into the mode selection process, it is possible to make coding decisions based on estimated visual quality rather than mathematical error measures, and to achieve visual quality gains in content that is identified as visually important by the MOSp metric. The novel algorithms developed in this research are particularly useful for integration into block-based video encoders such as H.264/AVC for making real-time visual quality estimations and coding decisions based on estimated visual quality rather than the currently used mathematical error measures.
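    A minimal sketch of the idea of modulating an MSE-based score by content, in the spirit of the MOSp metric described above; the exact formula, the activity measure, and the constants `k0` and `alpha` are hypothetical stand-ins, not the thesis's model:

```python
def mse(orig, comp):
    """Mean squared error between original and compressed samples."""
    return sum((a - b) ** 2 for a, b in zip(orig, comp)) / len(orig)

def spatial_activity(block):
    """Crude texture measure: mean absolute difference between
    adjacent samples (a stand-in for texture-masking analysis)."""
    return sum(abs(a - b) for a, b in zip(block, block[1:])) / (len(block) - 1)

def mosp_score(orig, comp, k0=0.05, alpha=0.5):
    """Hypothetical MOSp-style score in [0, 1]: MSE-driven distortion
    attenuated by texture masking. Busier content hides compression
    noise better, so the sensitivity k shrinks as activity grows."""
    k = k0 / (1.0 + alpha * spatial_activity(orig))
    return max(0.0, 1.0 - k * mse(orig, comp))
```

    The key property, illustrated by the test below, is that two blocks with identical MSE receive different scores: distortion on a flat block is judged more visible than the same distortion on a textured one.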

    Complexity management of H.264/AVC video compression.

    Get PDF
    The H.264/AVC video coding standard offers significantly improved compression efficiency and flexibility compared to previous standards. However, the high computational complexity of H.264/AVC is a problem for codecs running on low-power handheld devices and general-purpose computers. This thesis presents new techniques to reduce, control and manage the computational complexity of an H.264/AVC codec. A new complexity reduction algorithm for H.264/AVC is developed. This algorithm predicts "skipped" macroblocks prior to motion estimation by estimating a Lagrange rate-distortion cost function. Complexity savings are achieved by not processing the macroblocks that are predicted as "skipped". The Lagrange multiplier is adaptively modelled as a function of the quantisation parameter and video sequence statistics. Simulation results show that this algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. The complexity reduction algorithm is further developed to achieve complexity-scalable control of the encoding process. The Lagrangian cost estimation is extended to incorporate computational complexity. A target level of complexity is maintained by using a feedback algorithm to update the Lagrange multiplier associated with complexity. Results indicate that scalable complexity control of the encoding process can be achieved whilst maintaining near-optimal complexity-rate-distortion performance. A complexity management framework is proposed for maximising the perceptual quality of coded video in a real-time, processing-power-constrained environment. A real-time frame-level control algorithm and a per-frame complexity control algorithm are combined in order to manage the encoding process such that a high frame rate is maintained without significantly losing frame quality.
Subjective evaluations show that the managed complexity approach results in higher perceptual quality compared to a reference encoder that drops frames in computationally constrained situations. These novel algorithms are likely to be useful in implementing real-time H.264/AVC encoders in computationally constrained environments such as low-power mobile devices and general-purpose computers.
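    The skip-prediction idea above can be illustrated with a toy Lagrangian cost test performed before motion estimation. The threshold rule, the one-bit SKIP rate estimate and the SAD distortion measure are simplifications; the common lambda model c * 2^((QP-12)/3) is used with a fixed c, whereas the thesis adapts the multiplier to sequence statistics:

```python
def lagrange_multiplier(qp, c=0.85):
    """Widely used H.264/AVC model: lambda = c * 2^((QP - 12) / 3).
    The thesis adapts c to sequence statistics; it is fixed here."""
    return c * 2 ** ((qp - 12) / 3.0)

def sad(cur, ref):
    """Sum of absolute differences between two sample blocks."""
    return sum(abs(a - b) for a, b in zip(cur, ref))

def predict_skip(cur_mb, colocated_mb, qp, bits_skip=1, threshold_scale=16):
    """Flag a macroblock as SKIP before motion estimation when its
    zero-motion Lagrangian cost J = SAD + lambda * R is already below
    a lambda-dependent threshold. This is a simplified stand-in for
    the thesis's estimated rate-distortion cost test; the threshold
    and bit count are hypothetical."""
    lam = lagrange_multiplier(qp)
    j_skip = sad(cur_mb, colocated_mb) + lam * bits_skip
    return j_skip < threshold_scale * lam
```

    Macroblocks flagged this way bypass motion estimation entirely, which is where the complexity savings come from.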

    Signal processing for improved MPEG-based communication systems

    Get PDF

    System Steganalysis: Implementation Vulnerabilities and Side-Channel Attacks Against Digital Steganography Systems

    Get PDF
    Steganography is the process of hiding information in plain sight; it is a technology that can be used to hide data and facilitate secret communications. Steganography is commonly seen in the digital domain, where the pervasive nature of media content (image, audio, video) provides an ideal avenue for hiding secret information. In recent years, video steganography has been shown to be a highly suitable alternative to image and audio steganography due to its potential advantages (capacity, flexibility, popularity). Increased interest in video steganography research has led to the development of video stego-systems that are now available to the public. Many of these stego-systems have not yet been subjected to analysis or evaluation, and their capabilities for performing secure, practical, and effective video steganography are unknown. This thesis presents a comprehensive analysis of the state of the art in practical video steganography. Video-based stego-systems are identified and examined using steganalytic techniques (system steganalysis) to determine the security practices of the relevant stego-systems. The research in this thesis is conducted through a series of case studies that aim to provide novel insights into the field of steganalysis and its capabilities against practical video steganography. The results of this work demonstrate the impact of system attacks on the practical state of the art in video steganography. Through this research, it is evident that video-based stego-systems are highly vulnerable and fail to follow many of the well-understood security practices in the field. Consequently, it is possible to confidently detect each stego-system with a high rate of accuracy. As a result of this research, it is clear that current work in practical video steganography fails to address key principles and best practices in the field. Continued efforts to address this will provide safe and secure steganographic technologies.

    Video Quality Measurement for 3G Handset

    Get PDF
    The quality of video has become a decisive factor for consumers of 3G video services when choosing a mobile operator. It is, therefore, critical for 3G network operators, equipment providers and service providers to measure, and hence maintain, the video quality of the services they offer. A project has been proposed at the University of Plymouth to develop a test platform for evaluating video quality on 3G handsets using an Asterisk PBX server. For this purpose, support for the 3G-324M protocol and all the audio and video codecs (i.e. H.263 baseline level 10 and MPEG-4 simple profile @ level 0) mandated and recommended by the 3G-324M standard should be added into Asterisk®. The purpose of this thesis, which is part of the above-mentioned project, is to identify a correct software implementation of the H.263 baseline level 10 and MPEG-4 simple profile @ level 0 video codecs so that they can be incorporated into Asterisk®. The open-source FFmpeg-libavcodec is believed to support both MPEG-4 and H.263 codecs; similarly, the Telenor H.263 codec is free to use. This project tests both the capabilities and the suitability of the above-mentioned software packages/codecs for adding into Asterisk to perform the required encoding and decoding. Experiments showed that FFmpeg-libavcodec can neither decode nor encode MPEG-4 simple profile @ level 0; it appears that FFmpeg would require major modifications to its source code to support this codec. Although FFmpeg can decode and encode H.263 baseline level 10, it does not offer fine control over the bitrate while encoding and reports very high muxing overhead while decoding. The Telenor H.263 codec can decode and encode H.263 baseline level 10 without any problem and is, therefore, more suitable than FFmpeg for incorporation into Asterisk® for decoding and encoding H.263 baseline level 10 bitstreams.

    Energy efficient hardware acceleration of multimedia processing tools

    Get PDF
    The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents on modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at the algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work in this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution-search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early-exit mechanism that achieves large search-space reductions. Results show an improvement over state-of-the-art algorithms, with future potential for even greater savings.
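    Since the SA-DCT/IDCT are instances of constant matrix multiplication, the basic multiplierless trick (replacing each constant multiply by shifts and adds) can be sketched in software. Real hardware designs additionally use canonical signed-digit recoding and shared subexpressions, which this toy version omits:

```python
def shift_add_terms(c):
    """Decompose a non-negative integer constant into its power-of-two
    terms (plain binary expansion), so that c*x = sum of (x << k).
    This is the basic multiplierless trick; real designs would use
    CSD recoding to minimise the number of terms."""
    terms, k = [], 0
    while c:
        if c & 1:
            terms.append(k)
        c >>= 1
        k += 1
    return terms

def cmm(matrix, vec):
    """Constant-matrix multiply computed using only shifts and adds,
    the structure a hardware CMM core would implement."""
    out = []
    for row in matrix:
        acc = 0
        for c, x in zip(row, vec):
            sign = -1 if c < 0 else 1
            for k in shift_add_terms(abs(c)):
                acc += sign * (x << k)
        out.append(acc)
    return out
```

    The genetic-programming search proposed in the thesis would, in effect, look for decompositions and sharing patterns cheaper than this naive binary expansion.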

    Scalable video compression with optimized visual performance and random accessibility

    Full text link
    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. 
This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video.
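    The distortion-scaling idea can be sketched as a greedy truncation-point selection: a per-block perceptual weight stands in for the "perceptual mappings", raising the distortion-length slope of significant blocks so their increments are funded first. The data layout, the convexity assumption on the truncation points, and the greedy (convex-hull-free) selection are all illustrative simplifications of actual PCRD optimization:

```python
def pcrd_select(blocks, budget):
    """blocks: list of (name, weight, points), where points is a list
    of (cumulative_rate, remaining_distortion) truncation points
    starting at rate 0 and assumed convex. `weight` scales distortion
    for perceptually significant blocks, steepening their
    distortion-length slope. Returns the rate allotted to each block
    within the byte budget."""
    incs = []
    for name, weight, pts in blocks:
        for (r0, d0), (r1, d1) in zip(pts, pts[1:]):
            dr, dd = r1 - r0, weight * (d0 - d1)
            incs.append((dd / dr, dr, name, r1))
    incs.sort(key=lambda t: -t[0])  # steepest slopes first
    spent, chosen = 0, {name: 0 for name, _, _ in blocks}
    for slope, dr, name, r1 in incs:
        # take an increment only if it fits and its predecessor
        # within the same block has already been taken
        if spent + dr <= budget and r1 == chosen[name] + dr:
            spent += dr
            chosen[name] = r1
    return chosen
```

    With two blocks sharing identical rate-distortion points, the one carrying the larger perceptual weight absorbs the budget first, which is precisely the reordering effect described above.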

    Detection of Double-Compressed H.264/AVC Video Incorporating the Features of the String of Data Bits and Skip Macroblocks

    No full text
    Today’s H.264/AVC coded videos offer high quality and a high data-compression ratio. They also have strong fault tolerance and good network adaptability, and have been widely deployed on the Internet. With the popularity of powerful and easy-to-use video editing software, digital videos can be tampered with in various ways. Detecting double compression in an H.264/AVC video can therefore be used as a first step in the study of video-tampering forensics. This paper proposes a simple but effective double-compression detection method that analyzes the periodic features of the string of data bits (SODBs) and the skip macroblocks (S-MBs) for all I-frames and P-frames in a double-compressed H.264/AVC video. For a given suspicious video, the SODBs and S-MBs are extracted for each frame. Both features are then incorporated to generate one enhanced feature that represents the periodic artifact of the double-compressed video. Finally, a time-domain analysis is conducted to detect the periodicity of the features, and the primary Group of Pictures (GOP) size is estimated using an exhaustive strategy. The experimental results demonstrate the efficacy of the proposed method.
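    The exhaustive GOP-size estimation can be illustrated with a toy time-domain search: frames that were I-frames in the first compression (multiples of the true GOP size) show elevated re-compression artifacts, so the true size maximises the mean feature value at those positions. The scalar per-frame feature and the mean-above-baseline score are hypothetical simplifications of the paper's combined SODB/S-MB feature:

```python
def estimate_gop(feature, g_min=2, g_max=16):
    """Exhaustively test candidate primary GOP sizes G: sample the
    per-frame feature at multiples of G and score each candidate by
    how far those samples sit above the overall mean. The candidate
    with the strongest periodic elevation is returned."""
    overall = sum(feature) / len(feature)
    best_g, best_score = None, float("-inf")
    for g in range(g_min, g_max + 1):
        peaks = feature[::g]  # candidate first-compression I-frames
        score = sum(peaks) / len(peaks) - overall
        if score > best_score:
            best_g, best_score = g, score
    return best_g
```

    Divisors of the true period pick up only a fraction of the spikes, so their mean is diluted and the true GOP size wins the comparison.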