39 research outputs found

    Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models

    Full text link
    Deployable Large Language Models (LLMs) must conform to the criterion of helpfulness and harmlessness, thereby achieving consistency between LLMs outputs and human values. Red-teaming techniques constitute a critical way towards this criterion. Existing work rely solely on manual red team designs and heuristic adversarial prompts for vulnerability detection and optimization. These approaches lack rigorous mathematical formulation, thus limiting the exploration of diverse attack strategy within quantifiable measure and optimization of LLMs under convergence guarantees. In this paper, we present Red-teaming Game (RTG), a general game-theoretic framework without manual annotation. RTG is designed for analyzing the multi-turn attack and defense interactions between Red-team language Models (RLMs) and Blue-team Language Model (BLM). Within the RTG, we propose Gamified Red-teaming Solver (GRTS) with diversity measure of the semantic space. GRTS is an automated red teaming technique to solve RTG towards Nash equilibrium through meta-game analysis, which corresponds to the theoretically guaranteed optimization direction of both RLMs and BLM. Empirical results in multi-turn attacks with RLMs show that GRTS autonomously discovered diverse attack strategies and effectively improved security of LLMs, outperforming existing heuristic red-team designs. Overall, RTG has established a foundational framework for red teaming tasks and constructed a new scalable oversight technique for alignment

    Continuous Monitoring and Automated Fault Detection and Diagnosis of Large Air-Handling Units

    Get PDF

    Continuous Monitoring and Automated Fault Detection and Diagnosis of Large Air-Handling Units

    Get PDF

    Motion Synthesis and Control for Autonomous Agents using Generative Models and Reinforcement Learning

    Get PDF
    Imitating and predicting human motions have wide applications in both graphics and robotics, from developing realistic models of human movement and behavior in immersive virtual worlds and games to improving autonomous navigation for service agents deployed in the real world. Traditional approaches for motion imitation and prediction typically rely on pre-defined rules to model agent behaviors or use reinforcement learning with manually designed reward functions. Despite impressive results, such approaches cannot effectively capture the diversity of motor behaviors and the decision making capabilities of human beings. Furthermore, manually designing a model or reward function to explicitly describe human motion characteristics often involves laborious fine-tuning and repeated experiments, and may suffer from generalization issues. In this thesis, we explore data-driven approaches using generative models and reinforcement learning to study and simulate human motions. Specifically, we begin with motion synthesis and control of physically simulated agents imitating a wide range of human motor skills, and then focus on improving the local navigation decisions of autonomous agents in multi-agent interaction settings. For physics-based agent control, we introduce an imitation learning framework built upon generative adversarial networks and reinforcement learning that enables humanoid agents to learn motor skills from a few examples of human reference motion data. Our approach generates high-fidelity motions and robust controllers without needing to manually design and finetune a reward function, allowing at the same time interactive switching between different controllers based on user input. Based on this framework, we further propose a multi-objective learning scheme for composite and task-driven control of humanoid agents. Our multi-objective learning scheme balances the simultaneous learning of disparate motions from multiple reference sources and multiple goal-directed control objectives in an adaptive way, enabling the training of efficient composite motion controllers. Additionally, we present a general framework for fast and robust learning of motor control skills. Our framework exploits particle filtering to dynamically explore and discretize the high-dimensional action space involved in continuous control tasks, and provides a multi-modal policy as a substitute for the commonly used Gaussian policies. For navigation learning, we leverage human crowd data to train a human-inspired collision avoidance policy by combining knowledge distillation and reinforcement learning. Our approach enables autonomous agents to take human-like actions during goal-directed steering in fully decentralized, multi-agent environments. To inform better control in such environments, we propose SocialVAE, a variational autoencoder based architecture that uses timewise latent variables with socially-aware conditions and a backward posterior approximation to perform agent trajectory prediction. Our approach improves current state-of-the-art performance on trajectory prediction tasks in daily human interaction scenarios and more complex scenes involving interactions between NBA players. We further extend SocialVAE by exploiting semantic maps as context conditions to generate map-compliant trajectory prediction. Our approach processes context conditions and social conditions occurring during agent-agent interactions in an integrated manner through the use of a dual-attention mechanism. We demonstrate the real-time performance of our approach and its ability to provide high-fidelity, multi-modal predictions on various large-scale vehicle trajectory prediction tasks

    Adversarial machine learning for cyber security

    Get PDF
    This master thesis aims to take advantage of state of the art and tools that have been developed in Adversarial Machine Learning (AML) and related research branches to strengthen Machine Learning (ML) models used in cyber security. First, it seeks to collect, organize and summarize the most recent and potential state-of-the-art techniques in AML, considering that it is a research branch in an unstable state with a great diversity of difficult to contrast proposals, which rapidly evolve but are quickly replaced by attacks or defenses with greater potential. This summary is important considering that the AML literature is far from being able to create defensive techniques that effectively protect a ML model from all possible attacks, and it is relevant to analyze them both in detail and with criteria in order to apply them in practice. It is also useful to find biases in state-of-the-art to be considered regarding the measurement of the attack or defense effectiveness, which can be addressed by proposing methodologies and metrics to mitigate them. Additionally, it is considered inappropriate to analyze AML in isolation, considering that the robustness of a ML model to adversarial attacks is totally related to its generalization capacity to in-distribution cases, to its robustness to out-of-distribution cases, and to the possibility of overinterpretation, using spurious (but statistically valid) patterns in the model that may give a false sense of high performance. Therefore, this thesis proposes a methodology to previously evaluate the exposure of a model to these considerations, focusing on improving it in progressive order of priorities in each of its stages, and to guarantee satisfactory overall robustness. Based on this methodology, two interesting case studies are chosen to be explored in greater depth to evaluate their robustness to adversarial attacks, perform attacks to gain insights about their strengths and weaknesses, and finally propose improvements. In this process, all kinds of approaches are used depending on the type of problem evaluated and its assumptions, performing exploratory analysis, applying AML attacks and detailing their implications, proposing improvements and implementation of defenses such as Adversarial Training, and finally creating and proposing a methodology to correctly evaluate the effectiveness of a defense avoiding the biases of the state of the art. For each of the case studies, it is possible to create efficient adversarial attacks, analyze the strengths of each model, and in the case of the second case study, it is possible to increase the adversarial robustness of a Classification Convolutional Neural Network using Adversarial Training. This leads to other positive effects on the model, such as a better representation of the data, easier implementation of techniques to detect adversarial cases through anomaly analysis, and insights concerning its performance to reinforce the model from other viewp

    AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model

    Get PDF
    © 2020, The Author(s). The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. The previous methods such as Bayesian-based and genetic-based optimisation, which are implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Therefore, the pipeline composition and optimisation of these methods requires a tremendous amount of time that prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method to evaluate the validity of ML pipelines using a surrogate model (AVATAR). The AVATAR enables to accelerate automatic ML pipeline composition and optimisation by quickly ignoring invalid pipelines. Our experiments show that the AVATAR is more efficient in evaluating complex pipelines in comparison with the traditional evaluation approaches requiring their execution

    Preclinical Evaluation of Lipid-Based Nanosystems

    Get PDF
    The use of lipid-based nanosystems, including lipid nanoparticles (solid lipid nanoparticles (SLN) and nanostructured lipid carriers (NLC)), nanoemulsions, and liposomes, among others, is widespread. Several researchers have described the advantages of different applications of these nanosystems. For instance, they can increase the targeting and bioavailability of drugs, improving therapeutic effects. Their use in the cosmetic field is also promising, owing to their moisturizing properties and ability to protect labile cosmetic actives. Thus, it is surprising that only a few lipid-based nanosystems have reached the market. This can be explained by the strict regulatory requirements of medicines and the occurrence of unexpected in vivo failure, which highlights the need to conduct more preclinical studies.Current research is focused on testing the in vitro, ex vivo, and in vivo efficacy of lipid-based nanosystems to predict their clinical performance. However, there is a lack of method validation, which compromises the comparison between different studies.This book brings together the latest research and reviews that report on in vitro, ex vivo, and in vivo preclinical studies using lipid-based nanosystems. Readers can find up-to-date information on the most common experiments performed to predict the clinical behavior of lipid-based nanosystems. A series of 15 research articles and a review are presented, with authors from 15 different countries, which demonstrates the universality of the investigations that have been carried out in this area

    Cartography

    Get PDF
    The terrestrial space is the place of interaction of natural and social systems. The cartography is an essential tool to understand the complexity of these systems, their interaction and evolution. This brings the cartography to an important place in the modern world. The book presents several contributions at different areas and activities showing the importance of the cartography to the perception and organization of the territory. Learning with the past or understanding the present the use of cartography is presented as a way of looking to almost all themes of the knowledge
    corecore