50 research outputs found

    Temporal and Spatial Alignment of Multimedia Signals

    With the increasing availability of cameras and other mobile devices, digital images and videos are becoming ubiquitous. Research efforts have been made to develop technologies that utilize multiple pieces of multimedia information simultaneously. This dissertation focuses on the temporal and spatial alignment of multimedia signals, a fundamental problem that must be solved to enable applications dealing with multiple pieces of multimedia data. The first part of the dissertation addresses the synchronization of multimedia signals. We propose a new modality for audio and video synchronization based on the electric network frequency (ENF) signal naturally embedded in multimedia recordings. Synchronization of audio and video is achieved by aligning the ENF signals. The proposed method departs significantly from existing approaches to the audio/video synchronization problem and has strong potential to address previously intractable scenarios. Estimation of the ENF signal from video is a challenging task. To address the insufficient sampling rate of video, we propose to exploit the rolling shutter mechanism commonly adopted in CMOS camera sensors. Several techniques are designed to alleviate the distortions that motion and brightness changes in videos introduce into ENF estimation. We also address several challenges that are unique to the synchronization of digitized analog audio recordings. Speed offsets often occur in digitized analog audio recordings due to inconsistency in the tape's rolling speed. We show that the ENF signal captured by the original analog audio recording can be retained in the digitized version. The ENF signal is approximated as a single-tone signal and used as a reference to detect and correct speed offsets automatically. A complete multimedia application system often needs to jointly consider both temporal synchronization and spatial alignment.
The last part of the dissertation examines the quality assessment of local image features for efficient and robust spatial alignment. We propose a scheme to evaluate the quality of SIFT features in terms of their robustness and discriminability. A quality score is assigned to every SIFT feature based on its contrast value, scale, and descriptor, using a quality metric kernel that is obtained in a one-time training phase. Feature selection is performed by retaining features with high quality scores. The proposed approach is also applicable to other local image features, such as the Speeded Up Robust Features (SURF).
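
As a rough illustration of "aligning the ENF signals", the sketch below slides one ENF trace against another and keeps the lag with the highest correlation. It is a minimal sketch only, not the dissertation's method: the function names, the Pearson-correlation criterion, the exhaustive lag search, and the random-walk test signal are all assumptions made for this example.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def estimate_lag(enf_a, enf_b, max_lag):
    """Find the lag (in ENF samples) at which enf_b[i] best matches enf_a[i + lag].

    Returns (best_lag, best_correlation). Hypothetical helper for illustration.
    """
    best_lag, best_score = 0, -2.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            seg_a, seg_b = enf_a[lag:], enf_b
        else:
            seg_a, seg_b = enf_a, enf_b[-lag:]
        n = min(len(seg_a), len(seg_b))
        if n < 2:
            continue  # not enough overlap to correlate
        score = pearson(seg_a[:n], seg_b[:n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


def synthetic_enf(length, nominal_hz=60.0, step_std=0.005, seed=0):
    """Random-walk stand-in for an ENF trace (an assumption, not real grid data)."""
    rng = random.Random(seed)
    value, trace = nominal_hz, []
    for _ in range(length):
        value += rng.gauss(0.0, step_std)
        trace.append(value)
    return trace


if __name__ == "__main__":
    grid = synthetic_enf(300)
    rng = random.Random(1)
    audio_enf = grid[0:200]
    # The "video" trace starts 25 samples later and carries a little noise.
    video_enf = [v + rng.gauss(0.0, 0.0005) for v in grid[25:225]]
    print(estimate_lag(audio_enf, video_enf, max_lag=40))
```

In practice the two traces would first have to be estimated from the audio and video tracks at a common rate; that step is outside this sketch.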

    Multimedia Forensics

    This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of the most innovative applications in the digital information ecosystem that serves various sectors of society, from entertainment to journalism to politics. Undoubtedly, the advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students with a holistic view of the field.

    TIME AND LOCATION FORENSICS FOR MULTIMEDIA

    In the modern era, vast quantities of digital information are available in the form of audio, image, video, and other sensor recordings. These recordings may contain metadata describing important information such as the time and the location of recording. As the stored information can be easily modified using readily available digital editing software, determining the authenticity of a recording is of utmost importance, especially for critical applications such as law enforcement, journalism, and national and business intelligence. In this dissertation, we study novel environmental signatures induced by power networks, which are known as Electrical Network Frequency (ENF) signals and become embedded in multimedia data at the time of recording. The ENF fluctuates slightly over time around its nominal value of 50 Hz or 60 Hz. The major trend of fluctuations in the ENF remains consistent across the entire power grid, including when measured at physically distant geographical locations. We investigate the use of ENF signals for a variety of applications such as estimation/verification of the time and location of a recording's creation, and develop a theoretical foundation to support ENF-based forensic analysis. In the first part of the dissertation, the presence of ENF signals in visual recordings captured in electrically powered lighting environments is demonstrated. The source of ENF signals in visual recordings is shown to be the invisible flickering of indoor lighting sources such as fluorescent and incandescent lamps. Using the techniques developed to extract ENF signals from recordings, a high correlation is demonstrated between the ENF fluctuations obtained from indoor lighting and those from the power mains supply recorded at the same time. Applications of ENF signal analysis to tampering detection in surveillance video recordings, and to forensic binding of the audio and visual tracks of a video, are also discussed.
In the following part, an analytical model is developed to gain an understanding of the behavior of ENF signals. It is demonstrated that ENF signals can be modeled using a time-varying autoregressive process. The performance of the proposed model is evaluated for a timestamp verification application. Based on this model, an improved algorithm for ENF matching between a reference signal and a query signal is provided. It is shown that the proposed approach improves matching performance compared with matching performed directly on ENF signals. Another application of the proposed model, learning the power grid characteristics, is also explicated. These characteristics are learned by using the modeling parameters as features to train a classifier that determines the creation location of a recording among candidate grid-regions. The last part of the dissertation demonstrates that differences exist between ENF signals recorded in the same grid-region at the same time. These differences can be extracted using a suitable filter mechanism and follow a relationship with the distance between different locations. Based on this observation, two localization protocols are developed to identify the location of a recording within the same grid-region, using ENF signals captured at anchor locations. The localization accuracy of the proposed protocols is then compared. Challenges in using the proposed technique to estimate the creation location of multimedia recordings within the same grid, along with efficient and resilient trilateration strategies in the presence of outliers and malicious anchors, are also discussed.
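
To make the idea of an autoregressive ENF model whose parameters serve as features more concrete, here is a small sketch that fits a first-order AR coefficient by least squares, window by window. The abstract does not state the model order, estimator, or window sizes, so AR(1), the least-squares fit, and every name and parameter below are assumptions for illustration only.

```python
import random


def fit_ar1(samples):
    """Least-squares AR(1) coefficient of a mean-removed sequence."""
    mean = sum(samples) / len(samples)
    x = [s - mean for s in samples]
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(v * v for v in x[:-1])
    return num / den if den > 0.0 else 0.0


def windowed_ar1(samples, window, hop):
    """AR(1) coefficient per sliding window: a crude stand-in for a
    time-varying AR model (hypothetical simplification)."""
    return [
        fit_ar1(samples[start:start + window])
        for start in range(0, len(samples) - window + 1, hop)
    ]


def simulate_ar1(coeff, length, noise_std=1.0, seed=0):
    """Generate x[t] = coeff * x[t-1] + noise, used here only as test data."""
    rng = random.Random(seed)
    value, out = 0.0, []
    for _ in range(length):
        value = coeff * value + rng.gauss(0.0, noise_std)
        out.append(value)
    return out


if __name__ == "__main__":
    fluctuations = simulate_ar1(0.9, 5000)
    print(fit_ar1(fluctuations))
    print(windowed_ar1(fluctuations, window=1000, hop=500))
```

The per-window coefficients could then be collected into a feature vector for a classifier; which classifier and which features the dissertation uses is not stated in this listing.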

    Intrinsically Embedded Signatures for Multimedia Forensics

    This dissertation examines the use of signatures that are intrinsically embedded in media recordings for studies and applications in multimedia forensics. These near-invisible signatures are fingerprints that are captured unintentionally in a recording due to influences from the environment in which it was made and the recording device that was used to make it. We focus on two types of such signatures: the Electric Network Frequency (ENF) signal and the flicker signal. The ENF is the frequency of power distribution networks and has a nominal value of 50 Hz or 60 Hz. The ENF fluctuates around its nominal value due to load changes in the grid. It is particularly relevant to multimedia forensics because ENF variations captured intrinsically in a media recording reflect the time- and location-related properties of the area in which it was made. This has led to a number of applications in information forensics and security, such as time-of-recording authentication/estimation and ENF-based detection of tampering in a recording. The first part of this dissertation considers the extraction and detection of the ENF signal. We discuss our proposed spectrum combining approach for ENF estimation, which exploits the presence of ENF traces at several harmonics within the same recording to produce more accurate and robust ENF signal estimates. We also explore possible factors that can promote or hinder the capture of ENF traces in recordings, which is important for a better understanding of the real-world applicability of ENF signals. Next, we discuss novel real-world ENF-based applications proposed through this dissertation research. We discuss using the embedded ENF signal to identify the region-of-recording of a media signal through a pattern analysis and learning framework that distinguishes between ENF signals coming from different power grids.
We also discuss the use of the ENF traces embedded in a video to characterize the video camera that originally produced the video, an application that was inspired by our work on flicker forensics. The last part of the dissertation considers the flicker signal and its use in forensics. We address problems in the entertainment industry pertaining to movie piracy investigations, where a pirated movie is formed by camcording media content shown on an LCD screen. The flicker signature can be inherently created in such a scenario due to the interplay between the backlight of an LCD screen and the recording mechanism of the video camera. We build an analytic model of the flicker, relating it to inner parameters of the video camera and the screen producing the video. We then demonstrate that analyzing such a pirated video alone can lead to the identification of the video camera and the screen that produced the video, which can be used as corroborating evidence in piracy investigations.
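
The spectrum combining idea mentioned in this abstract, using ENF traces at several harmonics together, can be sketched roughly as follows: for each candidate base frequency, sum the spectral power found at its first few harmonics and keep the candidate with the largest combined power. This is only a simplified sketch under stated assumptions; the equal default weights, the grid search, the direct DFT evaluation, and all function names are choices made for this example and are not taken from the dissertation.

```python
import cmath
import math
import random


def power_at(samples, freq_hz, sample_rate_hz):
    """Squared magnitude of the DFT of `samples` evaluated at one frequency."""
    step = -2j * math.pi * freq_hz / sample_rate_hz
    acc = sum(x * cmath.exp(step * n) for n, x in enumerate(samples))
    return abs(acc) ** 2


def estimate_enf(samples, sample_rate_hz, nominal_hz=60.0,
                 half_width_hz=0.5, grid_step_hz=0.01,
                 weights=(1.0, 1.0, 1.0)):
    """Grid-search the base frequency that maximizes weighted power summed
    over harmonics 1..len(weights). The weights are hypothetical inputs."""
    n_steps = int(round(2 * half_width_hz / grid_step_hz))
    best_freq, best_score = nominal_hz, -1.0
    for i in range(n_steps + 1):
        freq = nominal_hz - half_width_hz + i * grid_step_hz
        score = sum(
            w * power_at(samples, (k + 1) * freq, sample_rate_hz)
            for k, w in enumerate(weights)
        )
        if score > best_score:
            best_freq, best_score = freq, score
    return best_freq


def synthetic_frame(base_hz, length, sample_rate_hz,
                    amplitudes=(1.0, 0.5, 0.3), noise_std=0.2, seed=0):
    """One noisy frame containing a base tone and its harmonics (test data only)."""
    rng = random.Random(seed)
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * base_hz * n / sample_rate_hz)
            for k, a in enumerate(amplitudes)) + rng.gauss(0.0, noise_std)
        for n in range(length)
    ]


if __name__ == "__main__":
    frame = synthetic_frame(60.07, length=1000, sample_rate_hz=1000.0)
    print(estimate_enf(frame, sample_rate_hz=1000.0))
```

Running the estimator on consecutive frames would yield an ENF trace over time; a more faithful version would presumably choose the harmonic weights from the signal quality at each harmonic, which this sketch leaves to the caller.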