
    Receiver-Driven Video Adaptation

    In the span of a single generation, video technology has made an incredible impact on daily life. Modern use cases for video are wildly diverse, including teleconferencing, live streaming, virtual reality, home entertainment, social networking, surveillance, body cameras, cloud gaming, and autonomous driving. As these applications continue to grow more sophisticated and heterogeneous, a single representation of video data can no longer satisfy all receivers. Instead, the initial encoding must be adapted to each receiver's unique needs. Existing adaptation strategies are fundamentally flawed, however, because they discard the video's initial representation and force the content to be re-encoded from scratch. This process is computationally expensive, does not scale well with the number of videos produced, and throws away important information embedded in the initial encoding. Therefore, a compelling need exists for the development of new strategies that can adapt video content without fully re-encoding it. To better support the unique needs of smart receivers, diverse displays, and advanced applications, general-use video systems should produce and offer receivers a more flexible compressed representation that supports top-down adaptation strategies from an original, compressed-domain ground truth. This dissertation proposes an alternative model for video adaptation that addresses these challenges. The key idea is to treat the initial compressed representation of a video as the ground truth and allow receivers to drive adaptation by dynamically selecting which subsets of the captured data to receive. In support of this model, three strategies for top-down, receiver-driven adaptation are proposed. First, a novel, content-agnostic entropy coding technique is implemented in which symbols are selectively dropped from an input abstract symbol stream, based on their estimated probability distributions, to hit a target bit rate. Receivers are able to guide the symbol dropping process by supplying the encoder with a rate control algorithm that fits their application needs and available bandwidth. Next, a domain-specific adaptation strategy is implemented for H.265/HEVC coded video in which the prediction data from the original source is reused directly in the adapted stream, but the residual data is recomputed as directed by the receiver. By tracking the changes made to the residual, the encoder can compensate for decoder drift and achieve near-optimal rate-distortion performance. Finally, a fully receiver-driven strategy is proposed in which the syntax elements of a pre-coded video are cataloged and exposed directly to clients through an HTTP API. Instead of requesting the entire stream at once, clients identify the exact syntax elements they wish to receive using a carefully designed query language. Although an implementation of this concept is not provided, an initial analysis shows that such a system could save bandwidth and computation when used by certain targeted applications.
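
    A minimal sketch of the probability-guided symbol dropping idea described above (the function and the greedy drop-the-costliest-symbols policy are illustrative assumptions, not the dissertation's actual rate controller):

        import math

        def drop_to_budget(symbols, probs, target_bits):
            """Drop symbols from an abstract symbol stream until the estimated
            entropy-coded size fits a receiver-supplied bit budget."""
            costs = [-math.log2(p) for p in probs]   # estimated bits to code each symbol
            total = sum(costs)
            keep = set(range(len(symbols)))
            # one possible receiver-supplied policy: drop the costliest symbols first
            for i in sorted(keep, key=lambda i: costs[i], reverse=True):
                if total <= target_bits:
                    break
                keep.remove(i)
                total -= costs[i]
            return [s for i, s in enumerate(symbols) if i in keep]

        # e.g. squeeze an 11-symbol stream into roughly 16 bits
        adapted = drop_to_budget(list("abracadabra"),
                                 [0.45, 0.18, 0.09, 0.45, 0.09, 0.45, 0.09, 0.45, 0.18, 0.09, 0.45],
                                 target_bits=16)

    A receiver with different needs, for instance one that must preserve syntax-critical symbols, would simply supply a different dropping order.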

    Enhanced applicability of loop transformations


    Cross-Layer Framework for Multiuser Real Time H.264/AVC Video Encoding and Transmission over Block Fading MIMO Channels Using Outage Probability

    We present a framework for cross-layer optimized, real-time multiuser encoding of video using single-layer H.264/AVC and transmission over MIMO wireless channels. In the proposed cross-layer adaptation, the channel of every user is characterized by the probability density function of its channel mutual information, and the performance of the H.264/AVC encoder is modeled by a rate-distortion model that takes channel errors into account. These models are used during the resource allocation of the available slots in a TDMA MIMO communication system with capacity-achieving channel codes. This framework allows adaptation to the statistics of the wireless channel and to the available resources in the system, and exploits the multiuser diversity of the transmitted video sequences. We show the effectiveness of the proposed framework for video transmission over Rayleigh MIMO block fading channels when channel distribution information is available at the transmitter.
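
    As a rough illustration of the outage-probability ingredient above, the following Monte-Carlo sketch estimates P(I < R) for an i.i.d. Rayleigh MIMO block-fading channel and picks the largest rate meeting a 5% outage target (the paper's actual channel model and slot-allocation algorithm may differ):

        import numpy as np

        def outage_probability(rate, snr, nt=2, nr=2, trials=20000, rng=None):
            """Estimate P(I < rate) with I = log2 det(I + (snr/nt) H H^H)."""
            rng = rng or np.random.default_rng(0)
            outages = 0
            for _ in range(trials):
                h = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
                mi = np.log2(np.linalg.det(np.eye(nr) + (snr / nt) * h @ h.conj().T).real)
                outages += mi < rate
            return outages / trials

        # largest rate (in bits/s/Hz) whose outage stays below 5% at 10 dB SNR
        snr = 10 ** (10 / 10)
        best_rate = max(r for r in np.arange(0.5, 8.0, 0.5) if outage_probability(r, snr) < 0.05)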

    Adapted Compressed Sensing: A Game Worth Playing

    Despite the universal nature of the compressed sensing mechanism, additional information on the class of sparse signals to acquire allows adjustments that yield substantial improvements. In fact, proper exploitation of these priors makes it possible to significantly increase compression for a given reconstruction quality. Since one of the most promising application areas of compressed sensing is that of IoT devices subject to extremely tight resource constraints, adaptation is especially interesting when it can cope with hardware-related constraints and allow low-complexity implementations. We here review and compare many algorithmic adaptation policies that focus either on the encoding part or on the recovery part of compressed sensing. We also review other, more hardware-oriented adaptation techniques that can make a real difference in real-world implementations. In all cases, adaptation proves to be a tool that should be mastered in practical applications to unleash the full potential of compressed sensing.
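
    One family of encoding-side adaptation policies shapes the sensing rows around the signals' second-order statistics; a minimal sketch, in which the blend parameter mix and the function name are assumptions made for illustration:

        import numpy as np

        def adapted_sensing_matrix(m, corr, mix=0.5, rng=None):
            """Draw m sensing rows whose covariance blends identity (classic, agnostic CS)
            with the signals' correlation matrix corr (prior-adapted CS)."""
            rng = rng or np.random.default_rng(0)
            n = corr.shape[0]
            cov = (1 - mix) * np.eye(n) + mix * (n / np.trace(corr)) * corr
            L = np.linalg.cholesky(cov + 1e-9 * np.eye(n))
            return rng.standard_normal((m, n)) @ L.T

        # example: 32 measurements of a 128-sample window with slowly decaying correlation
        n = 128
        corr = np.array([[0.95 ** abs(i - j) for j in range(n)] for i in range(n)])
        A = adapted_sensing_matrix(32, corr)

    Setting mix = 0 recovers a standard Gaussian sensing matrix, while larger values concentrate the measurement energy on the directions where the signal class actually lives.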

    Digital Watermarking for Verification of Perception-based Integrity of Audio Data

    In certain application fields, digital audio recordings contain sensitive content. Examples are historical archival material in public archives that preserve our cultural heritage, or digital evidence in the context of law enforcement and civil proceedings. Because of the powerful capabilities of modern editing tools for multimedia, such material is vulnerable to doctoring of the content and forgery of its origin with malicious intent. Inadvertent data modification and mistaken origin can also be caused by human error. Hence, the credibility and provenance, in terms of an unadulterated and genuine state of such audio content, and the confidence about its origin are critical factors. To address this issue, this PhD thesis proposes a mechanism for verifying the integrity and authenticity of digital sound recordings. It is designed and implemented to be insensitive to common post-processing operations of the audio data that influence the subjective acoustic perception only marginally (if at all). Examples of such operations include lossy compression that maintains a high sound quality of the audio media, or lossless format conversions. The objective is to avoid the de facto false alarms that would be expected from standard crypto-based authentication protocols in the presence of such legitimate post-processing. To achieve this, a feasible combination of the techniques of digital watermarking and audio-specific hashing is investigated. First, a suitable secret-key-dependent audio hashing algorithm is developed. It incorporates and enhances so-called audio fingerprinting technology from the state of the art in content-based audio identification. The presented algorithm (denoted as the "rMAC" message authentication code) allows "perception-based" verification of integrity. This means that integrity breaches are classified as such only once they become audible. As another objective, this rMAC is embedded and stored silently inside the audio media by means of audio watermarking technology. This approach allows the authentication code to be maintained across the above-mentioned admissible post-processing operations and made available for integrity verification at a later date. For this, an existing secret-key-dependent audio watermarking algorithm is used and enhanced in this thesis work. To some extent, the dependency of the rMAC and of the watermarking processing on a secret key also allows authenticating the origin of a protected audio recording. To elaborate on this security aspect, this work also estimates the brute-force effort of an adversary attacking this combined rMAC-watermarking approach. The experimental results show that the proposed method provides good distinction and classification performance of authentic versus doctored audio content. It also allows the temporal localization of audible data modification within a protected audio file. The experimental evaluation finally provides recommendations about technical configuration settings of the combined watermarking-hashing approach. Beyond the main topic of perception-based data integrity and data authenticity for audio, this PhD work provides new general findings in the fields of audio fingerprinting and digital watermarking. The main contributions of this PhD were published and presented mainly at conferences on multimedia security. These publications have been cited by a number of other authors and hence have had some impact on their work.
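
    A toy sketch of the overall verification logic, i.e. a keyed perceptual hash compared against the bits recovered from the watermark under a bit-error-rate threshold (this is not the thesis' rMAC or watermarking algorithm, only an illustration of the decision rule):

        import hashlib
        import numpy as np

        def keyed_perceptual_hash(audio, key, n_bands=16, n_frames=32):
            """One bit per frame/band from key-permuted spectral energy differences,
            so that inaudible post-processing leaves most bits intact."""
            frames = np.array_split(np.asarray(audio, float), n_frames)
            energy = np.array([[np.mean(np.abs(np.fft.rfft(f))[b::n_bands])
                                for b in range(n_bands)] for f in frames])
            seed = int(hashlib.sha256(key).hexdigest(), 16) % 2**32
            perm = np.random.default_rng(seed).permutation(n_bands)
            return (np.diff(energy[:, perm], axis=0) > 0).astype(int).ravel()

        def verify_integrity(embedded_bits, audio, key, max_ber=0.2):
            """Accept the recording if the bit error rate stays below the threshold."""
            ber = np.mean(embedded_bits != keyed_perceptual_hash(audio, key))
            return ber <= max_ber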

    Quantum simulation of confinement dynamics

    Quantum computers have generated much excitement over recent years due to their potential to outperform classical computers in many difficult problems. While a fully fault-tolerant quantum device is yet to be built, there has been much work pushing for noisy intermediate-scale quantum (NISQ) devices to achieve quantum advantage. One of the most promising fields in which to accomplish this is the quantum simulation of quantum many-body systems. A classical computer is able to simulate a general quantum system but suffers from memory requirements that grow exponentially with system size. Thus, for exact results, classical computers are limited to simulating just tens of particles, whereas realistic quantum systems comprise $\sim 10^{23}$ particles. Quantum computers are able to reduce this memory cost to polynomial growth, making them key to understanding the physics of many-body systems. One area that is notably difficult to simulate is confinement physics. Confinement is the phenomenon in which the energy of two particles grows indefinitely with their separation, most prominently between quarks in quantum chromodynamics (QCD). In this work we consider the application of quantum devices to simulate such phenomena. In particular, we consider simple condensed matter systems, namely variations of the Ising model, that exhibit confinement physics. In the first half of this work we perform an analytical and numerical study of confinement, and develop a Trotterization protocol to enable the quantum simulation of such physics on a digital quantum computer. We present results obtained directly on an IBM quantum computer showing the non-equilibrium effects of confinement in such systems. In order to achieve these results we developed state-of-the-art error mitigation methods to combat the large errors inherent in current NISQ devices. In the latter half, we propose physical phenomena that may act as a benchmark for quantum devices in the future. Collisions of mesons (bound states of two particles) with impurities are considered, in which a long-lived metastable state is found to form. Such collisions have the potential to be simulated on digital quantum computers in the near future. We then consider collisions of mesons in systems with long-range interactions. We show how collisions of interacting mesons can lead to the formation of hadrons (bound states of many constituent particles) in a fusion-type event. While these proposals are beyond current digital quantum computer capabilities, analogue quantum simulation devices such as trapped-ion setups or Rydberg atom experiments are well suited to realise this physics.
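
    For context, the kind of model the abstract alludes to is a transverse-field Ising chain with an additional longitudinal field, which confines pairs of domain walls; a standard first-order Trotter step for such a Hamiltonian reads (a textbook decomposition, not necessarily the thesis' exact protocol):

        H = -J \sum_j \sigma^z_j \sigma^z_{j+1} - h_x \sum_j \sigma^x_j - h_z \sum_j \sigma^z_j,
        e^{-iHt} \approx \left( e^{-i H_{zz}\,\delta t}\, e^{-i H_x\,\delta t}\, e^{-i H_z\,\delta t} \right)^{t/\delta t},

    with overall error $\mathcal{O}(t\,\delta t)$; each factor maps directly onto native two-qubit ZZ rotations and single-qubit rotations on a digital quantum computer.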

    Fundamental Limits in Multimedia Forensics and Anti-forensics

    As the use of multimedia editing tools increases, people have begun to question the authenticity of multimedia content. This is especially a concern for authorities, such as law enforcement, news reporters, and governments, who constantly use multimedia evidence to make critical decisions. To verify the authenticity of multimedia content, many forensic techniques have been proposed to identify the processing history of the multimedia content under question. However, as new technologies emerge and more complicated scenarios are considered, the limitations of multimedia forensics have been gradually realized by forensic researchers. Exploring these fundamental limits is an inevitable step for multimedia forensics. In this dissertation, we propose several theoretical frameworks to study the fundamental limits in various forensic problems. Specifically, we begin by developing empirical forensic techniques to deal with the limitations of existing techniques arising from the emergence of a new technology, compressive sensing. Then, we go one step further to explore the fundamental limits of forensic performance. Two types of forensic problems are examined. In operation forensics, we propose an information-theoretical framework and define forensicability as the maximum information that features contain about hypotheses of processing histories. Based on this framework, we have found the maximum number of JPEG compressions one can detect. In order forensics, an information-theoretical criterion is proposed to determine when we can and cannot detect the order of manipulation operations that have been applied to multimedia content. Additionally, we have examined the fundamental tradeoffs in multimedia anti-forensics, where attacking techniques are developed by forgers to conceal manipulation fingerprints and confuse forensic investigations. In this field, we have defined concealability as the effectiveness of anti-forensics in concealing manipulation fingerprints. Then, a tradeoff between concealability, rate, and distortion is proposed and characterized for compression anti-forensics, which provides valuable insight into how forgers may behave under their best strategy.
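
    A plausible formalization of the forensicability notion described above, consistent with the abstract's wording though not necessarily the dissertation's exact definition, is the mutual information between the extracted feature F and the processing-history hypothesis H:

        \mathcal{F} = I(F; H) = \sum_{f, h} p(f, h) \log \frac{p(f, h)}{p(f)\, p(h)},

    so that, for example, detecting the number of JPEG compressions becomes unreliable once an additional compression no longer increases I(F; H).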

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of object shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
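
    One plausible reading of the SPSNR mentioned above is a relevance-weighted PSNR; the formula below is an illustrative reconstruction, not necessarily the dissertation's exact definition:

        \mathrm{SPSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\frac{1}{\sum_i w_i} \sum_i w_i \left( x_i - \hat{x}_i \right)^2},

    where x_i and \hat{x}_i are the original and adapted pixel values, MAX is the peak pixel value, and the weights w_i emphasize pixels belonging to semantically relevant regions such as foreground objects.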