30,218 research outputs found

    Adversarial content manipulation for analyzing and improving model robustness

    Get PDF
    The recent rapid progress in machine learning systems has opened up many real-world applications --- from recommendation engines on web platforms to safety critical systems like autonomous vehicles. A model deployed in the real-world will often encounter inputs far from its training distribution. For example, a self-driving car might come across a black stop sign in the wild. To ensure safe operation, it is vital to quantify the robustness of machine learning models to such out-of-distribution data before releasing them into the real-world. However, the standard paradigm of benchmarking machine learning models with fixed size test sets drawn from the same distribution as the training data is insufficient to identify these corner cases efficiently. In principle, if we could generate all valid variations of an input and measure the model response, we could quantify and guarantee model robustness locally. Yet, doing this with real world data is not scalable. In this thesis, we propose an alternative, using generative models to create synthetic data variations at scale and test robustness of target models to these variations. We explore methods to generate semantic data variations in a controlled fashion across visual and text modalities. We build generative models capable of performing controlled manipulation of data like changing visual context, editing appearance of an object in images or changing writing style of text. Leveraging these generative models we propose tools to study robustness of computer vision systems to input variations and systematically identify failure modes. In the text domain, we deploy these generative models to improve diversity of image captioning systems and perform writing style manipulation to obfuscate private attributes of the user. Our studies quantifying model robustness explore two kinds of input manipulations, model-agnostic and model-targeted. The model-agnostic manipulations leverage human knowledge to choose the kinds of changes without considering the target model being tested. This includes automatically editing images to remove objects not directly relevant to the task and create variations in visual context. Alternatively, in the model-targeted approach the input variations performed are directly adversarially guided by the target model. For example, we adversarially manipulate the appearance of an object in the image to fool an object detector, guided by the gradients of the detector. Using these methods, we measure and improve the robustness of various computer vision systems -- specifically image classification, segmentation, object detection and visual question answering systems -- to semantic input variations.Der schnelle Fortschritt von Methoden des maschinellen Lernens hat viele neue Anwendungen ermöglicht – von Recommender-Systemen bis hin zu sicherheitskritischen Systemen wie autonomen Fahrzeugen. In der realen Welt werden diese Systeme oft mit Eingaben außerhalb der Verteilung der Trainingsdaten konfrontiert. Zum Beispiel könnte ein autonomes Fahrzeug einem schwarzen Stoppschild begegnen. Um sicheren Betrieb zu gewährleisten, ist es entscheidend, die Robustheit dieser Systeme zu quantifizieren, bevor sie in der Praxis eingesetzt werden. Aktuell werden diese Modelle auf festen Eingaben von derselben Verteilung wie die Trainingsdaten evaluiert. Allerdings ist diese Strategie unzureichend, um solche Ausnahmefälle zu identifizieren. Prinzipiell könnte die Robustheit “lokal” bestimmt werden, indem wir alle zulässigen Variationen einer Eingabe generieren und die Ausgabe des Systems überprüfen. Jedoch skaliert dieser Ansatz schlecht zu echten Daten. In dieser Arbeit benutzen wir generative Modelle, um synthetische Variationen von Eingaben zu erstellen und so die Robustheit eines Modells zu überprüfen. Wir erforschen Methoden, die es uns erlauben, kontrolliert semantische Änderungen an Bild- und Textdaten vorzunehmen. Wir lernen generative Modelle, die kontrollierte Manipulation von Daten ermöglichen, zum Beispiel den visuellen Kontext zu ändern, die Erscheinung eines Objekts zu bearbeiten oder den Schreibstil von Text zu ändern. Basierend auf diesen Modellen entwickeln wir neue Methoden, um die Robustheit von Bilderkennungssystemen bezüglich Variationen in den Eingaben zu untersuchen und Fehlverhalten zu identifizieren. Im Gebiet von Textdaten verwenden wir diese Modelle, um die Diversität von sogenannten Automatische Bildbeschriftung-Modellen zu verbessern und Schreibtstil-Manipulation zu erlauben, um private Attribute des Benutzers zu verschleiern. Um die Robustheit von Modellen zu quantifizieren, werden zwei Arten von Eingabemanipulationen untersucht: Modell-agnostische und Modell-spezifische Manipulationen. Modell-agnostische Manipulationen basieren auf menschlichem Wissen, um bestimmte Änderungen auszuwählen, ohne das entsprechende Modell miteinzubeziehen. Dies beinhaltet das Entfernen von für die Aufgabe irrelevanten Objekten aus Bildern oder Variationen des visuellen Kontextes. In dem alternativen Modell-spezifischen Ansatz werden Änderungen vorgenommen, die für das Modell möglichst ungünstig sind. Zum Beispiel ändern wir die Erscheinung eines Objekts um ein Modell der Objekterkennung täuschen. Dies ist durch den Gradienten des Modells möglich. Mithilfe dieser Werkzeuge können wir die Robustheit von Systemen zur Bildklassifizierung oder -segmentierung, Objekterkennung und Visuelle Fragenbeantwortung quantifizieren und verbessern

    Modeling Adversarial Attack on Pre-trained Language Models as Sequential Decision Making

    Full text link
    Pre-trained language models (PLMs) have been widely used to underpin various downstream tasks. However, the adversarial attack task has found that PLMs are vulnerable to small perturbations. Mainstream methods adopt a detached two-stage framework to attack without considering the subsequent influence of substitution at each step. In this paper, we formally model the adversarial attack task on PLMs as a sequential decision-making problem, where the whole attack process is sequential with two decision-making problems, i.e., word finder and word substitution. Considering the attack process can only receive the final state without any direct intermediate signals, we propose to use reinforcement learning to find an appropriate sequential attack path to generate adversaries, named SDM-Attack. Extensive experimental results show that SDM-Attack achieves the highest attack success rate with a comparable modification rate and semantic similarity to attack fine-tuned BERT. Furthermore, our analyses demonstrate the generalization and transferability of SDM-Attack. The code is available at https://github.com/fduxuan/SDM-Attack

    SPEC5G: A Dataset for 5G Cellular Network Protocol Analysis

    Full text link
    5G is the 5th generation cellular network protocol. It is the state-of-the-art global wireless standard that enables an advanced kind of network designed to connect virtually everyone and everything with increased speed and reduced latency. Therefore, its development, analysis, and security are critical. However, all approaches to the 5G protocol development and security analysis, e.g., property extraction, protocol summarization, and semantic analysis of the protocol specifications and implementations are completely manual. To reduce such manual effort, in this paper, we curate SPEC5G the first-ever public 5G dataset for NLP research. The dataset contains 3,547,586 sentences with 134M words, from 13094 cellular network specifications and 13 online websites. By leveraging large-scale pre-trained language models that have achieved state-of-the-art results on NLP tasks, we use this dataset for security-related text classification and summarization. Security-related text classification can be used to extract relevant security-related properties for protocol testing. On the other hand, summarization can help developers and practitioners understand the high level of the protocol, which is itself a daunting task. Our results show the value of our 5G-centric dataset in 5G protocol analysis automation. We believe that SPEC5G will enable a new research direction into automatic analyses for the 5G cellular network protocol and numerous related downstream tasks. Our data and code are publicly available

    Improving Question Generation with Multi-level Content Planning

    Full text link
    This paper addresses the problem of generating questions from a given context and an answer, specifically focusing on questions that require multi-hop reasoning across an extended context. Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly for long context. To mitigate this issue, we propose MultiFactor, a novel QG framework based on multi-level content planning. Specifically, MultiFactor includes two components: FA-model, which simultaneously selects key phrases and generates full answers, and Q-model which takes the generated full answer as an additional input to generate questions. Here, full answer generation is introduced to connect the short answer with the selected key phrases, thus forming an answer-aware summary to facilitate QG. Both FA-model and Q-model are formalized as simple-yet-effective Phrase-Enhanced Transformers, our joint model for phrase selection and text generation. Experimental results show that our method outperforms strong baselines on two popular QG datasets. Our code is available at https://github.com/zeaver/MultiFactor.Comment: Camera-ready. Accepted by EMNLP 2023 Finding
    • …
    corecore