CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Large language models (LLMs) for automatic code generation have achieved
breakthroughs in several programming tasks. Their advances in competition-level
programming problems have made them an essential pillar of AI-assisted pair
programming, and tools such as GitHub Copilot have emerged as part of the daily
programming workflow used by millions of developers. The training data for
these models is usually collected from the Internet (e.g., from open-source
repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these
vulnerabilities and propagate them during the code generation procedure. While
these models have been extensively assessed for their ability to produce
functionally correct programs, there remains a lack of comprehensive
investigations and benchmarks addressing the security aspects of these models.
In this work, we propose a method to systematically study the security issues
of code language models to assess their susceptibility to generating vulnerable
code. To this end, we introduce the first approach for automatically finding
vulnerable code generated by black-box code generation models. To achieve this,
we present a few-shot prompting method that approximately inverts the
black-box code generation model. We evaluate the
effectiveness of our approach by examining code language models in generating
high-risk security weaknesses. Furthermore, we establish a collection of
diverse non-secure prompts for various vulnerability scenarios using our
method. This dataset forms a benchmark for evaluating and comparing the
security weaknesses in code language models.
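As a rough illustration of the few-shot inversion idea described above, the sketch below assembles a prompt from hand-written (vulnerable code, prompt) demonstration pairs and appends a new vulnerable snippet, so that a black-box code model asked to complete it would emit a candidate non-secure prompt. The demonstration pairs, the CWE choices, and the final target snippet are illustrative assumptions, not the paper's actual prompt set or benchmark data.

```python
# A minimal sketch of few-shot "inversion" prompting, assuming hand-written
# (vulnerable code -> prompt) demonstration pairs; the examples and CWE
# choices here are illustrative, not the paper's actual prompts.

FEW_SHOT_PAIRS = [
    (
        # Vulnerable code: SQL query built via string formatting (CWE-89).
        'def get_user(db, name):\n'
        '    return db.execute("SELECT * FROM users WHERE name = \'%s\'" % name)',
        # A prompt that plausibly elicits such code from a code model.
        "# Fetch a user row from the database by name\ndef get_user(db, name):",
    ),
    (
        # Vulnerable code: shell command built from unsanitized input (CWE-78).
        'import os\n'
        'def ping(host):\n'
        '    os.system("ping -c 1 " + host)',
        "# Ping a host given by the user\ndef ping(host):",
    ),
]

def build_inversion_prompt(new_vulnerable_code: str) -> str:
    """Compose a few-shot prompt: each demonstration maps code back to a
    prompt; the black-box model is asked to complete the final mapping."""
    parts = []
    for code, prompt in FEW_SHOT_PAIRS:
        parts.append(f"Code:\n{code}\nPrompt:\n{prompt}\n")
    parts.append(f"Code:\n{new_vulnerable_code}\nPrompt:\n")
    return "\n".join(parts)

if __name__ == "__main__":
    target_code = (
        'import pickle\n'
        'def load(path):\n'
        '    return pickle.load(open(path, "rb"))  # unsafe deserialization'
    )
    # In practice this string would be sent to the black-box code model,
    # whose completion is a candidate non-secure prompt for the benchmark.
    print(build_inversion_prompt(target_code))
```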
On the Limitations of Model Stealing with Uncertainty Quantification Models
Model stealing aims at inferring a victim model's functionality at a fraction
of the original training cost. While the goal is clear, in practice the model's
architecture, weight dimensions, and original training data cannot be
determined exactly, leading to mutual uncertainty during stealing. In this
work, we explicitly tackle this uncertainty by generating multiple possible
networks and combining their predictions to improve the quality of the stolen
model. For this, we compare five popular uncertainty quantification models in a
model stealing task. Surprisingly, our results indicate that the considered
models lead to only marginal improvements in the stolen model's label agreement
with the victim (i.e., fidelity). To find the cause, we inspect the diversity of
the models' predictions by tracking the prediction variance over training
iterations. We observe that, during training, the models tend toward similar
predictions, indicating that the network diversity we aimed to leverage through
uncertainty quantification is not high enough to improve the model stealing task.
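To make the setup concrete, here is a small, hedged sketch of the kind of experiment the abstract describes: a victim classifier is queried for labels, several surrogate networks are trained on those labels with different random seeds (a deep ensemble standing in for the uncertainty quantification models compared in the paper), and fidelity is measured as label agreement between the combined surrogate and the victim. The dataset, model sizes, and query budget are arbitrary assumptions.

```python
# Sketch of ensemble-based model stealing with fidelity measurement.
# Victim, surrogates, data, and query budget are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=4000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_victim, X_attack, y_victim, _ = train_test_split(X, y, test_size=0.5,
                                                   random_state=0)

# Victim model: trained on data the attacker never sees.
victim = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                       random_state=0).fit(X_victim, y_victim)

# The attacker queries the victim on its own (unlabeled) data.
X_query, X_eval = X_attack[:1500], X_attack[1500:]
stolen_labels = victim.predict(X_query)

# Surrogates differing only in random seed, combined as a deep ensemble.
surrogates = [
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                  random_state=seed).fit(X_query, stolen_labels)
    for seed in range(5)
]

# Combine predictions by averaging class probabilities.
avg_proba = np.mean([m.predict_proba(X_eval) for m in surrogates], axis=0)
ensemble_pred = avg_proba.argmax(axis=1)

# Fidelity: label agreement between the stolen model(s) and the victim.
victim_pred = victim.predict(X_eval)
fidelity_single = np.mean(surrogates[0].predict(X_eval) == victim_pred)
fidelity_ensemble = np.mean(ensemble_pred == victim_pred)
print(f"single surrogate fidelity: {fidelity_single:.3f}")
print(f"ensemble fidelity:         {fidelity_ensemble:.3f}")
```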
VenoMave: Targeted Poisoning Against Speech Recognition
The wide adoption of Automatic Speech Recognition (ASR) remarkably enhanced
human-machine interaction. Prior research has demonstrated that modern ASR
systems are susceptible to adversarial examples, i.e., malicious audio inputs
that lead to misclassification by the victim's model at run time. The research
question of whether ASR systems are also vulnerable to data-poisoning attacks
is still unanswered. In such an attack, a manipulation happens during the
training phase: an adversary injects malicious inputs into the training set to
compromise the neural network's integrity and performance. Prior work in the
image domain demonstrated several types of data-poisoning attacks, but these
results cannot directly be applied to the audio domain. In this paper, we
present the first data-poisoning attack against ASR, called VenoMave. We
evaluate our attack on an ASR system that detects sequences of digits. When
poisoning only 0.17% of the dataset on average, we achieve an attack success
rate of 86.67%. To demonstrate the practical feasibility of our attack, we also
evaluate whether the target audio waveform can be played over the air via
simulated room transmissions. In this more realistic threat model, VenoMave
still maintains a success rate of up to 73.33%. We further extend our evaluation to the
Speech Commands corpus and demonstrate the scalability of VenoMave to a larger
vocabulary. During a transcription test with human listeners, we verify that
more than 85% of the original text of poisons can be correctly transcribed. We
conclude that data-poisoning attacks against ASR represent a real threat, and
we are able to perform poisoning for arbitrary target input files while the
crafted poison samples remain inconspicuous.
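The training-time threat model can be illustrated with a deliberately simplified, non-audio toy (this is not the VenoMave attack itself): a few poison points carrying an attacker-chosen label are placed near a chosen target sample, and a model retrained on the poisoned set misclassifies that target while the rest of the data is untouched. The 2-D data, the poison budget, and the nearest-neighbour classifier are assumptions for illustration only.

```python
# Toy illustration of targeted, training-time data poisoning (not VenoMave):
# a few poison points with an attacker-chosen label are placed near a target
# sample so that a model retrained on the poisoned set misclassifies it.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Two well-separated classes in 2-D feature space (stand-in for audio features).
X0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(200, 2))  # class 0
X1 = rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(200, 2))  # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

# The attacker wants this class-1 test point to be predicted as class 0.
target = np.array([[1.5, 0.0]])

# Small poison budget: a few class-0-labelled points right next to the target.
n_poison = 5
poisons = target + rng.normal(scale=0.05, size=(n_poison, 2))
X_poisoned = np.vstack([X, poisons])
y_poisoned = np.concatenate([y, np.zeros(n_poison, dtype=int)])

clean = KNeighborsClassifier(n_neighbors=3).fit(X, y)
poisoned = KNeighborsClassifier(n_neighbors=3).fit(X_poisoned, y_poisoned)

print("poison fraction:         ", n_poison / len(X_poisoned))
print("clean model on target:   ", clean.predict(target)[0])     # expected: 1
print("poisoned model on target:", poisoned.predict(target)[0])  # expected: 0
```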
Conning the Crypto Conman: End-to-End Analysis of Cryptocurrency-based Technical Support Scams
The mainstream adoption of cryptocurrencies has led to a surge in
wallet-related issues reported by ordinary users on social media platforms. In
parallel, an emerging fraud trend is on the rise: the cryptocurrency-based
technical support scam, in which fraudsters offer fake wallet recovery services
and target users experiencing wallet-related issues.
In this paper, we perform a comprehensive study of cryptocurrency-based
technical support scams. We present an analysis apparatus called HoneyTweet to
analyze this kind of scam. Through HoneyTweet, we lure over 9K scammers by
posting 25K fake wallet support tweets (so-called honey tweets). We then deploy
automated systems to interact with scammers to analyze their modus operandi. In
our experiments, we observe that scammers use Twitter as a starting point for
the scam, after which they pivot to other communication channels (e.g., email,
Instagram, or Telegram) to complete the fraud activity. We track scammers
across those communication channels and bait them into revealing their payment
methods. Based on the modes of payment, we uncover two categories of scammers
that either request secret key phrase submissions from their victims or direct
payments to their digital wallets. Furthermore, we obtain scam confirmation by
deploying honey wallet addresses and validating private key theft. We also
collaborate with a prominent payment service provider by sharing our scammer
data collection. The provider's feedback was consistent with our
findings, thereby supporting our methodology and results. By consolidating our
analysis across various vantage points, we provide an end-to-end scam lifecycle
analysis and propose recommendations for scam mitigation.
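A small sketch of the honey-tweet idea, under the assumption that the lure posts are generated from simple templates combining wallet names and common wallet issues; the wallet list, issues, and templates below are invented for illustration and are not the paper's actual dataset.

```python
# Sketch: generating "honey tweet" lure texts from templates (illustrative
# wallets, issues, and templates; not the paper's actual honey tweets).
import itertools
import random

WALLETS = ["MetaMask", "Trust Wallet", "Coinbase Wallet"]
ISSUES = [
    "my transaction has been pending for hours",
    "I can't access my funds after the last update",
    "my balance shows zero even though the blockchain says otherwise",
]
TEMPLATES = [
    "Help! {issue} on {wallet}. Anyone know how to fix this? #crypto #{tag}",
    "Having trouble with {wallet}: {issue}. Is support even real? #{tag}",
]

def generate_honey_tweets(seed: int = 0):
    """Yield lure tweets that mimic ordinary users reporting wallet issues."""
    rng = random.Random(seed)
    for wallet, issue in itertools.product(WALLETS, ISSUES):
        template = rng.choice(TEMPLATES)
        yield template.format(wallet=wallet, issue=issue,
                              tag=wallet.replace(" ", ""))

if __name__ == "__main__":
    for tweet in generate_honey_tweets():
        print(tweet)
```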
Adversarially robust speech and speaker recognition
Voice assistants answer questions, play music, and control smart homes. This thesis investigates the robustness of speech and speaker recognition. In the first part, we show that a weighted recognition approach that accounts for environmental conditions is more accurate than recognition based on a single modality. In addition, we propose a method for detecting spoofing attacks against audio-visual speaker recognition. In the second part, we show how to compute inconspicuous adversarial examples with the help of psychoacoustics. We further extend the attack so that the resulting adversarial examples remain robust across different rooms, and we present a mechanism for detecting adversarial examples. Finally, we analyze the sensitivity of smart speakers to accidental activations: we investigate how frequently accidental activations occur and propose an approach for artificially producing them.
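One building block of the psychoacoustic hiding described in the second part can be sketched as follows: the absolute threshold of hearing (Terhardt's approximation, as used in MPEG psychoacoustic models) gives a frequency-dependent level below which added perturbation energy is inaudible. The sketch only evaluates that threshold; the full attack additionally exploits masking by the carrier signal, which is not reproduced here.

```python
# Absolute threshold of hearing (Terhardt approximation, in dB SPL), one
# ingredient of psychoacoustic hiding: perturbations kept below this curve
# (and below the signal's masking threshold) are inaudible to human listeners.
import numpy as np

def threshold_in_quiet(freq_hz: np.ndarray) -> np.ndarray:
    """Terhardt's approximation of the absolute hearing threshold."""
    f_khz = freq_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * np.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)

if __name__ == "__main__":
    freqs = np.array([100.0, 500.0, 1000.0, 4000.0, 10000.0, 16000.0])
    for f, t in zip(freqs, threshold_in_quiet(freqs)):
        print(f"{f:8.0f} Hz -> threshold {t:7.2f} dB SPL")
```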