9 research outputs found
VenoMave: Targeted Poisoning Against Speech Recognition
The wide adoption of Automatic Speech Recognition (ASR) has remarkably enhanced human-machine interaction. Prior research has demonstrated that modern ASR
systems are susceptible to adversarial examples, i.e., malicious audio inputs
that lead to misclassification by the victim's model at run time. The research
question of whether ASR systems are also vulnerable to data-poisoning attacks
is still unanswered. In such an attack, a manipulation happens during the
training phase: an adversary injects malicious inputs into the training set to
compromise the neural network's integrity and performance. Prior work in the
image domain demonstrated several types of data-poisoning attacks, but these
results cannot directly be applied to the audio domain. In this paper, we
present the first data-poisoning attack against ASR, called VenoMave. We
evaluate our attack on an ASR system that detects sequences of digits. When
poisoning only 0.17% of the dataset on average, we achieve an attack success
rate of 86.67%. To demonstrate the practical feasibility of our attack, we also
evaluate whether the target audio waveform can be played over the air via simulated room transmissions. In this more realistic threat model, VenoMave still maintains a success rate of up to 73.33%. We further extend our evaluation to the
Speech Commands corpus and demonstrate the scalability of VenoMave to a larger
vocabulary. During a transcription test with human listeners, we verify that
more than 85% of the original text of poisons can be correctly transcribed. We
conclude that data-poisoning attacks against ASR represent a real threat, and
we are able to perform poisoning for arbitrary target input files while the crafted poison samples remain inconspicuous.
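The threat model above can be illustrated with a deliberately simplified sketch. The toy below poisons a nearest-centroid classifier over 1-D "features": injecting a handful of mislabeled points near a chosen target input flips its predicted class. This is only the high-level dirty-label principle, not VenoMave's actual clean-label attack against an ASR pipeline; all data and thresholds here are made up.

```python
# Two classes of 1-D "acoustic features" with centroids near 0.0 and 1.0.
clean_a = [i / 1000 - 0.1 for i in range(200)]   # training data labeled "zero"
clean_b = [0.9 + i / 1000 for i in range(200)]   # training data labeled "one"
target = 0.48                                    # input the attacker wants misclassified

def centroid(xs):
    return sum(xs) / len(xs)

def nearest_centroid(train_a, train_b, x):
    # Toy classifier: predict the class whose training centroid is closer.
    return "zero" if abs(x - centroid(train_a)) <= abs(x - centroid(train_b)) else "one"

print(nearest_centroid(clean_a, clean_b, target))   # "zero" on the clean training set

# The attacker injects 20 poisons (<5% of the data) labeled "one" near the
# target, dragging the "one" centroid just close enough to flip the target.
poisons = [target] * 20
poisoned_b = clean_b + poisons
print(nearest_centroid(clean_a, poisoned_b, target))  # now "one"
```

The key point the sketch preserves is that the attacker never touches the target input itself, only a small fraction of the training set.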
On the Limitations of Model Stealing with Uncertainty Quantification Models
Model stealing aims at inferring a victim model's functionality at a fraction
of the original training cost. While the goal is clear, in practice the model's
architecture, weight dimension, and original training data cannot be determined exactly, leading to mutual uncertainty during stealing. In this
work, we explicitly tackle this uncertainty by generating multiple possible
networks and combining their predictions to improve the quality of the stolen
model. For this, we compare five popular uncertainty quantification models in a
model stealing task. Surprisingly, our results indicate that the considered
models only lead to marginal improvements in terms of label agreement (i.e.,
fidelity) to the stolen model. To find the cause of this, we inspect the
diversity of the model's prediction by looking at the prediction variance as a
function of training iterations. We realize that during training, the models
tend to have similar predictions, indicating that the network diversity we
wanted to leverage using uncertainty quantification models is not high enough
for improvements on the model stealing task.
Comment: 6 pages, 1 figure, 2 tables; paper submitted to the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
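The two quantities the abstract evaluates, combining several candidate networks' predictions and measuring fidelity (label agreement), can be sketched as follows. The probability tables are hypothetical stand-ins, not outputs of the uncertainty quantification models studied in the paper.

```python
def fidelity(victim_labels, stolen_labels):
    # Label agreement rate between the victim's and the stolen model's predictions.
    agree = sum(v == s for v, s in zip(victim_labels, stolen_labels))
    return agree / len(victim_labels)

def ensemble_labels(prob_list):
    # Average the candidate networks' class probabilities per input,
    # then take the argmax as the combined ensemble's label.
    labels = []
    for per_input in zip(*prob_list):                    # one input across all networks
        avg = [sum(ps) / len(ps) for ps in zip(*per_input)]
        labels.append(max(range(len(avg)), key=avg.__getitem__))
    return labels

# Hypothetical softmax outputs of three candidate stolen networks (4 inputs, 3 classes).
p1 = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]]
p2 = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.2, 0.5, 0.3], [0.4, 0.5, 0.1]]
p3 = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.3, 0.4, 0.3], [0.5, 0.3, 0.2]]

victim = [0, 1, 1, 0]   # labels returned by the (queried) victim model
print(fidelity(victim, ensemble_labels([p1, p2, p3])))  # 1.0 on this toy data
```

Note how averaging rescues the third input, on which one candidate network disagrees with the victim; the paper's finding is that in practice the candidates are too similar for this effect to help much.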
Password-Authenticated Key Exchange from Group Actions
We present two provably secure password-authenticated key exchange (PAKE) protocols based on a commutative group action. To date, the most important instantiation of isogeny-based group actions is given by CSIDH. To model its properties more accurately, we extend the framework of cryptographic group actions (Alamati et al., ASIACRYPT 2020) by the ability to compute the quadratic twist of an elliptic curve. This property is always present in the CSIDH setting and turns out to be crucial in the security analysis of our PAKE protocols.
Despite the resemblance, the translation of Diffie-Hellman-based PAKE protocols to group actions either does not work with known techniques or is insecure (“How not to create an isogeny-based PAKE,” Azarderakhsh et al., ACNS 2020). We overcome the difficulties mentioned in previous work by using a bit-by-bit approach, where each password bit is considered separately.
Our first protocol can be executed in a single round. Both parties need to send two set elements for each password bit in order to prevent offline dictionary attacks. The second protocol requires only one set element per password bit, but one party has to send a commitment on its message first. We also discuss different optimizations that can be used to reduce the computational cost. We provide comprehensive security proofs for our base protocols and deduce security for the optimized versions.
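The Diffie-Hellman-like core that a commutative group action provides can be sketched with a toy action: exponentiation modulo a small prime, where act(a, act(b, x)) == act(b, act(a, x)) is the only structural property the exchange needs. This stands in for the CSIDH action; it is not the PAKE protocol itself (which additionally derives per-password-bit set elements), and all parameters here are hopelessly small illustrations.

```python
p = 101   # toy prime modulus (illustration only, far too small for security)
s = 2     # public starting set element, a generator of the multiplicative group mod p

def act(g, x):
    # Toy commutative "group action": the group of exponents coprime to p-1
    # acts on set elements by exponentiation. Commutativity,
    # act(a, act(b, x)) == act(b, act(a, x)), is what the exchange relies on.
    return pow(x, g, p)

a, b = 17, 29                            # Alice's and Bob's secret group elements
msg_a, msg_b = act(a, s), act(b, s)      # set elements exchanged over the wire
key_a, key_b = act(a, msg_b), act(b, msg_a)
print(key_a == key_b)                    # True: both sides derive the same element
```

In the bit-by-bit PAKE, a construction like this runs per password bit, with the starting set element depending on the bit, so that a wrong password guess leads to an unrelated shared element.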
Security of machine learning systems
Machine learning (ML) models are not programmed explicitly but are learnt from a set of data points. This data-centric paradigm allows for diverse applications, and ML models are now widely deployed in practice as internal components of ML systems. This inclusion of machine learning, however, introduces a new attack surface to these systems, since ML models are vulnerable to a myriad of possible attacks. While prior work made remarkable progress in understanding such attacks from the perspective of the model, the deployment in practical systems introduces additional constraints, and commonly studied threat models do not sufficiently express the knowledge, capabilities, and goals of practical adversaries. In this work, we therefore investigate the security of machine learning with a systems security approach. By viewing the ML model as part of a system, we study the increased attack surface of practical systems and how such systems can be secured.
Exploring accidental triggers of smart speakers
Voice assistants like Amazon’s Alexa, Google’s Assistant, Tencent’s Xiaowei, or Apple’s Siri have become the primary (voice) interface in smart speakers that can be found in millions of households. For privacy reasons, these speakers analyze every sound in their environment for their respective wake word, like “Alexa,” “Jiǔsì’èr líng,” or “Hey Siri,” before uploading the audio stream to the cloud for further processing. Previous work reported on examples of inaccurate wake word detection, which can be tricked using similar words or sounds like “cocaine noodles” instead of “OK Google.”
In this paper, we perform a comprehensive analysis of such accidental triggers, i.e., sounds that should not have triggered the voice assistant, but did. More specifically, we automate the process of finding accidental triggers and measure their prevalence across 11 smart speakers from 8 different manufacturers using everyday media such as TV shows, news, and other kinds of audio datasets. To systematically detect accidental triggers, we describe a method to artificially craft such triggers using a pronouncing dictionary and a weighted, phone-based Levenshtein distance. In total, we have found hundreds of accidental triggers. Moreover, we explore potential gender and language biases and analyze the reproducibility. Finally, we discuss the resulting privacy implications of accidental triggers and explore countermeasures to reduce and limit their impact on users’ privacy. To foster additional research on these sounds that mislead machine learning models, we publish a dataset of more than 350 verified triggers as a research artifact.
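A weighted, phone-based Levenshtein distance of the kind used to craft trigger candidates can be sketched as a standard edit-distance dynamic program with a per-pair substitution cost. The cost table and phone sequences below are hypothetical stand-ins, not the paper's actual weights.

```python
def weighted_levenshtein(src, dst, sub_cost):
    # Edit distance over phone sequences; substitutions between
    # similar-sounding phones are cheaper via sub_cost.
    n, m = len(src), len(dst)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if src[i - 1] == dst[j - 1] else sub_cost(src[i - 1], dst[j - 1])
            d[i][j] = min(d[i - 1][j] + 1.0,       # deletion
                          d[i][j - 1] + 1.0,       # insertion
                          d[i - 1][j - 1] + sub)   # (weighted) substitution
    return d[n][m]

# Hypothetical cost table: acoustically confusable phone pairs at half cost.
CONFUSABLE = {("K", "G"), ("G", "K"), ("S", "Z"), ("Z", "S")}
cost = lambda a, b: 0.5 if (a, b) in CONFUSABLE else 1.0

# ARPAbet-style phones for "alexa" vs. a made-up near-homophone.
wake = ["AH", "L", "EH", "K", "S", "AH"]
cand = ["AH", "L", "EH", "G", "Z", "AH"]
print(weighted_levenshtein(wake, cand, cost))  # 1.0: two half-cost substitutions
```

Ranking dictionary words by this distance to a wake word's phone sequence yields exactly the near-homophones that are likely to trigger the speaker accidentally.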
VENOMAVE: Targeted Poisoning Against Speech Recognition
Despite remarkable improvements, automatic speech recognition is susceptible to adversarial perturbations. Compared to standard machine learning architectures, these attacks against speech recognition are significantly more challenging, especially since the inputs to a speech recognition system are time series that contain both acoustic and linguistic properties of speech. Extracting all recognition-relevant information requires more complex pipelines and an ensemble of specialized components. Consequently, an attacker needs to consider the entire pipeline.
In this paper, we present VENOMAVE, the first training-time poisoning attack against speech recognition. Similar to the predominantly studied evasion attacks, we pursue the same goal: leading the system to an incorrect, attacker-chosen transcription of a target audio waveform. In contrast to evasion attacks, however, we assume that the attacker can only manipulate a small part of the training data without altering the target audio waveform at run time. We evaluate our attack on two datasets: TIDIGITS and Speech Commands. When poisoning less than 0.17% of the dataset, VENOMAVE achieves attack success rates of over 80.0%, without access to the victim’s network architecture or hyperparameters. In a more realistic scenario, when the target audio waveform is played over the air in different rooms, VENOMAVE maintains a success rate of up to 73.3%. Finally, VENOMAVE achieves an attack transferability rate of 36.4% between two different model architectures.
Drone Security and the Mysterious Case of DJI's DroneID
Consumer drones enable high-class aerial video photography, promise to reform the logistics industry, and are already used for humanitarian rescue operations and during armed conflicts. Contrasting their widespread adoption and high popularity, the low entry barrier for air mobility - a traditionally heavily regulated sector - poses many risks to safety, security, and privacy. Malicious parties could, for example, (mis-)use drones for surveillance, transportation of illegal goods, or cause economic damage by intruding into the closed airspace over airports. To prevent harm, drone manufacturers employ several countermeasures to enforce safe and secure use of drones, e.g., they impose software limits regarding speed and altitude, or use geofencing to implement no-fly zones around airports or prisons. Complementing traditional countermeasures, drones from the market leader DJI implement a protocol called DroneID, which is designed to transmit the position of both the drone and its operator to authorized entities such as law enforcement or operators of critical infrastructures.
In this paper, we analyze security and privacy claims for drones, focusing on the leading manufacturer DJI with a market share of 94%. We first systematize the drone attack surface and investigate an attacker capable of eavesdropping on the drone's over-the-air data traffic. Based on reverse engineering of DJI firmware, we design and implement a decoder for DJI's proprietary tracking protocol DroneID using only cheap COTS hardware. We show that the transmitted data is not encrypted, but accessible to anyone, compromising the drone operator's privacy. Second, we conduct a comprehensive analysis of drone security: Using a combination of reverse engineering, a novel fuzzing approach tailored to DJI's communication protocol, and hardware analysis, we uncover several critical flaws in drone firmware that allow attackers to gain elevated privileges on two different DJI drones and their remote control. Such root access paves the way to disable or bypass countermeasures and abuse drones. These vulnerabilities have the potential to be triggered remotely, causing the drone to crash mid-flight.