36 research outputs found
Recommended from our members
Characterizing Audio Events for Video Soundtrack Analysis
There is an entire emerging ecosystem of amateur video recordings on the internet today, in addition to the abundance of more professionally produced content. The ability to automatically scan and evaluate the content of these recordings would be very useful for search and indexing, especially as amateur content tends to be more poorly labeled and tagged than professional content. Although the visual content is often considered to be of primary importance, the audio modality contains rich information which may be very helpful in the context of video search and understanding. Any technology that could help to interpret video soundtrack data would also be applicable in a number of other scenarios, such as mobile device audio awareness, surveillance, and robotics. In this thesis we approach the problem of extracting information from these kinds of unconstrained audio recordings. Specifically we focus on techniques for characterizing discrete audio events within the soundtrack (e.g. a dog bark or door slam), since we expect events to be particularly informative about content. Our task is made more complicated by the extremely variable recording quality and noise present in this type of audio. Initially we explore the idea of using the matching pursuit algorithm to decompose and isolate components of audio events. Using these components we develop an approach for non-exact (approximate) fingerprinting as a way to search audio data for similar recurring events. We demonstrate a proof of concept for this idea. Subsequently we extend the use of matching pursuit to build an actual audio fingerprinting system, with the goal of identifying simultaneously recorded amateur videos (i.e. videos taken in the same place at the same time by different people, which contain overlapping audio). Automatic discovery of these simultaneous recordings is one particularly interesting facet of general video indexing. We evaluate this fingerprinting system on a database of 733 internet videos. Next we return to searching for features to directly characterize soundtrack events. We develop a system to detect transient sounds and represent audio clips as a histogram of the transients it contains. We use this representation for video classification over a database of 1873 internet videos. When we combine these features with a spectral feature baseline system we achieve a relative improvement of 7.5% in mean average precision over the baseline. In another attempt to devise features to better describe and compare events, we investigate decomposing audio using a convolutional form of non-negative matrix factorization, resulting in event-like spectro-temporal patches. We use the resulting representation to build an event detection system that is more robust to additive noise than a comparative baseline system. Lastly we investigate a promising feature representation that has been used by others previously to describe event-like sound effect clips. These features derive from an auditory model and are meant to capture fine time structure in sound events. We compare these features and a related but simpler feature set on the task of video classification over 9317 internet videos. We find that combinations of these features with baseline spectral features produce a significant improvement in mean average precision over the baseline
Digital Watermarking for Verification of Perception-based Integrity of Audio Data
In certain application fields digital audio recordings contain sensitive content. Examples are historical archival material in public archives that preserve our cultural heritage, or digital evidence in the context of law enforcement and civil proceedings. Because of the powerful capabilities of modern editing tools for multimedia such material is vulnerable to doctoring of the content and forgery of its origin with malicious intent. Also inadvertent data modification and mistaken origin can be caused by human error. Hence, the credibility and provenience in terms of an unadulterated and genuine state of such audio content and the confidence about its origin are critical factors.
To address this issue, this PhD thesis proposes a mechanism for verifying the integrity and authenticity of digital sound recordings. It is designed and implemented to be insensitive to common post-processing operations of the audio data that influence the subjective acoustic perception only marginally (if at all). Examples of such operations include lossy compression that maintains a high sound quality of the audio media, or lossless format conversions. It is the objective to avoid de facto false alarms that would be expectedly observable in standard crypto-based authentication protocols in the presence of these legitimate post-processing. For achieving this, a feasible combination of the techniques of digital watermarking and audio-specific hashing is investigated.
At first, a suitable secret-key dependent audio hashing algorithm is developed. It incorporates and enhances so-called audio fingerprinting technology from the state of the art in contentbased audio identification. The presented algorithm (denoted as ”rMAC” message authentication code) allows ”perception-based” verification of integrity. This means classifying integrity breaches as such not before they become audible. As another objective, this rMAC is embedded and stored silently inside the audio media by means of audio watermarking technology. This approach allows maintaining the authentication code across the above-mentioned admissible post-processing operations and making it available for integrity verification at a later date. For this, an existent secret-key ependent audio watermarking algorithm is used and enhanced in this thesis work.
To some extent, the dependency of the rMAC and of the watermarking processing from a secret key also allows authenticating the origin of a protected audio. To elaborate on this security aspect, this work also estimates the brute-force efforts of an adversary attacking this combined rMAC-watermarking approach. The experimental results show that the proposed method provides a good distinction and classification
performance of authentic versus doctored audio content. It also allows the temporal localization of audible data modification within a protected audio file. The experimental evaluation finally provides recommendations about technical configuration settings of the combined watermarking-hashing approach.
Beyond the main topic of perception-based data integrity and data authenticity for audio, this PhD work provides new general findings in the fields of audio fingerprinting and digital watermarking. The main contributions of this PhD were published and presented mainly at conferences about multimedia security. These publications were cited by a number of other authors and hence had some impact on their works
Context and communication profiling for IoT security and privacy: techniques and applications
During the last decade, two major technological changes have profoundly changed the way in which users consume and interact with on-line services and applications. The first of these has been the success of mobile computing, in particular that of smartphones, the primary end device used by many users for access to the Internet and various applications. The other change is the emergence of the so-called Internet-of-Things (IoT), denoting a technological transition in which everyday objects like household appliances that traditionally have been seen as stand-alone devices, are given network connectivity by introducing digital communication capabilities to those devices. The topic of this dissertation is related to a core challenge that the emergence of these technologies is introducing: how to effectively manage the security and privacy settings of users and devices in a user-friendly manner in an environment in which an ever-growing number of heterogeneous devices live and co-exist with each other?
In particular we study approaches for utilising profiling of contextual parameters and device communications
in order to make autonomous security decisions with the goal of striking a better balance between a system's security on one hand, and, its usability on the other. We introduce four distinct novel approaches utilising profiling for this end. First, we introduce ConXsense, a system demonstrating the use of user-specific longitudinal profiling of contextual information for modelling the usage context of mobile computing devices. Based on this ConXsense can probabilistically automate security policy decisions affecting security settings of the device.
Further we develop an approach utilising the similarity of contextual parameters observed with on-board sensors of co-located devices to construct proofs of presence that are resilient to context-guessing attacks by adversaries that seek to fool a device into believing the adversary is co-located with it, even though it is in reality not.
We then extend this approach to a context-based key evolution approach that allows IoT devices that are co-present in the same physical environment like the same room to use passively observed context measurements to iteratively authenticate their co-presence and thus gradually establish confidence in the other device being part of the same trust domain, e.g., the set of IoT devices in a user's home. We further analyse the relevant constraints that need to be taken into account to ensure security and usability of context-based authentication.
In the final part of this dissertation we extend the profiling approach to network communications of IoT devices and utilise it to realise the design of the IoTSentinel system for autonomous security policy adaptation in IoT device networks. We show that by monitoring the inherent network traffic of IoT devices during their initial set-up, we can automatically identify the type of device newly added to the network. The device-type information is then used by IoTSentinel to adapt traffic filtering rules automatically to provide isolation of devices that are potentially vulnerable to known attacks, thereby protecting the device itself and the rest of the network from threats arising from possible compromise of vulnerable devices
MediaSync: Handbook on Multimedia Synchronization
This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences
Digital watermarking and novel security devices
EThOS - Electronic Theses Online ServiceGBUnited Kingdo
Recent Application in Biometrics
In the recent years, a number of recognition and authentication systems based on biometric measurements have been proposed. Algorithms and sensors have been developed to acquire and process many different biometric traits. Moreover, the biometric technology is being used in novel ways, with potential commercial and practical implications to our daily activities. The key objective of the book is to provide a collection of comprehensive references on some recent theoretical development as well as novel applications in biometrics. The topics covered in this book reflect well both aspects of development. They include biometric sample quality, privacy preserving and cancellable biometrics, contactless biometrics, novel and unconventional biometrics, and the technical challenges in implementing the technology in portable devices. The book consists of 15 chapters. It is divided into four sections, namely, biometric applications on mobile platforms, cancelable biometrics, biometric encryption, and other applications. The book was reviewed by editors Dr. Jucheng Yang and Dr. Norman Poh. We deeply appreciate the efforts of our guest editors: Dr. Girija Chetty, Dr. Loris Nanni, Dr. Jianjiang Feng, Dr. Dongsun Park and Dr. Sook Yoon, as well as a number of anonymous reviewers
Image and Video Forensics
Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, determining a great amount of exchange data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity
Micro-architectural Threats to Modern Computing Systems
With the abundance of cheap computing power and high-speed internet, cloud and mobile computing replaced traditional computers. As computing models evolved, newer CPUs were fitted with additional cores and larger caches to accommodate run multiple processes concurrently. In direct relation to these changes, shared hardware resources emerged and became a source of side-channel leakage. Although side-channel attacks have been known for a long time, these changes made them practical on shared hardware systems. In addition to side-channels, concurrent execution also opened the door to practical quality of service attacks (QoS). The goal of this dissertation is to identify side-channel leakages and architectural bottlenecks on modern computing systems and introduce exploits. To that end, we introduce side-channel attacks on cloud systems to recover sensitive information such as code execution, software identity as well as cryptographic secrets. Moreover, we introduce a hard to detect QoS attack that can cause over 90+\% slowdown. We demonstrate our attack by designing an Android app that causes degradation via memory bus locking. While practical and quite powerful, mounting side-channel attacks is akin to listening on a private conversation in a crowded train station. Significant manual labor is required to de-noise and synchronizes the leakage trace and extract features. With this motivation, we apply machine learning (ML) to automate and scale the data analysis. We show that classical machine learning methods, as well as more complicated convolutional neural networks (CNN), can be trained to extract useful information from side-channel leakage trace. Finally, we propose the DeepCloak framework as a countermeasure against side-channel attacks. We argue that by exploiting adversarial learning (AL), an inherent weakness of ML, as a defensive tool against side-channel attacks, we can cloak side-channel trace of a process. With DeepCloak, we show that it is possible to trick highly accurate (99+\% accuracy) CNN classifiers. Moreover, we investigate defenses against AL to determine if an attacker can protect itself from DeepCloak by applying adversarial re-training and defensive distillation. We show that even in the presence of an intelligent adversary that employs such techniques, DeepCloak still succeeds
Multimedia Forensics
This book is open access. Media forensics has never been more relevant to societal life. Not only media content represents an ever-increasing share of the data traveling on the net and the preferred communications means for most users, it has also become integral part of most innovative applications in the digital information ecosystem that serves various sectors of society, from the entertainment, to journalism, to politics. Undoubtedly, the advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape powered by innovative imaging technologies and sophisticated tools, based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field