484 research outputs found

    Optimization Based Motion Planning for Multi-Limbed Vertical Climbing Robots

    Planning motion trajectories for a multi-limbed robot to climb walls requires a unique combination of constraints on torque, contact force, and posture. This paper focuses on motion planning for one particular setup in which a six-legged robot braces itself between two vertical walls and climbs vertically with end effectors that rely only on friction. Instead of motion planning with a single nonlinear programming (NLP) solver, we decouple the problem into two parts with distinct physical meaning: torso postures and contact forces. The first part can be formulated as either a mixed-integer convex programming (MICP) or an NLP problem, while the second part is formulated as a series of standard convex optimization problems. Variants of the two-wall climbing problem, e.g., obstacle avoidance, uneven surfaces, and angled walls, help verify the proposed method in simulation and experiment.
    Comment: IROS 2019 accepted
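    The contact-force part of this decomposition is a standard convex program, so it can be sketched directly. Below is a minimal, hypothetical illustration in Python with cvxpy: given a fixed torso posture (here, made-up contact positions and wall normals), it solves for contact forces that balance gravity while keeping each end effector inside its friction cone. This is a sketch of the general technique under assumed values, not the paper's actual formulation.

```python
import numpy as np
import cvxpy as cp

def skew(r):
    """Skew-symmetric matrix: skew(r) @ f equals the cross product r x f."""
    return np.array([[0.0, -r[2], r[1]],
                     [r[2], 0.0, -r[0]],
                     [-r[1], r[0], 0.0]])

# Hypothetical bracing posture: four contacts, two on each wall (x = +/-0.3 m),
# with inward-pointing wall normals. None of these numbers come from the paper.
contacts = np.array([[ 0.3,  0.2, 0.0],
                     [ 0.3, -0.2, 0.0],
                     [-0.3,  0.2, 0.0],
                     [-0.3, -0.2, 0.0]])
normals = np.array([[-1.0, 0.0, 0.0],
                    [-1.0, 0.0, 0.0],
                    [ 1.0, 0.0, 0.0],
                    [ 1.0, 0.0, 0.0]])
mu, mass = 0.5, 10.0                       # friction coefficient, robot mass (kg)
gravity = mass * np.array([0.0, 0.0, -9.81])

F = cp.Variable((4, 3))                    # one 3-D contact force per end effector
constraints = [cp.sum(F, axis=0) + gravity == 0]                           # force balance
constraints += [sum(skew(r) @ F[i] for i, r in enumerate(contacts)) == 0]  # torque balance
for i in range(4):
    f_n = F[i] @ normals[i]                         # normal component (pushes on wall)
    f_t = F[i] - cp.multiply(f_n, normals[i])       # tangential component
    constraints += [f_n >= 0, cp.norm(f_t) <= mu * f_n]   # friction cone

# Among all feasible force distributions, prefer small, evenly spread forces.
problem = cp.Problem(cp.Minimize(cp.sum_squares(F)), constraints)
problem.solve()
print("status:", problem.status)
print("contact forces (N):\n", F.value)
```

    Because the posture is fixed, every constraint above is affine or second-order-cone representable, which is why this stage reduces to a series of standard convex problems once the torso trajectory is chosen.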

    How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs

    Most traditional AI safety research has approached AI models as machines and centered on algorithm-focused attacks developed by security experts. As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective that treats LLMs as human-like communicators to jailbreak them, exploring the overlooked intersection between everyday language interaction and AI safety. Specifically, we study how to persuade LLMs to jailbreak them. First, we propose a persuasion taxonomy derived from decades of social science research. Then, we apply the taxonomy to automatically generate interpretable persuasive adversarial prompts (PAP) to jailbreak LLMs. Results show that persuasion significantly increases jailbreak performance across all risk categories: PAP consistently achieves an attack success rate of over 92% on Llama 2-7b Chat, GPT-3.5, and GPT-4 in 10 trials, surpassing recent algorithm-focused attacks. On the defense side, we explore various mechanisms against PAP, find a significant gap in existing defenses, and advocate for more fundamental mitigations for highly interactive LLMs.
    Comment: 14 pages of main text; qualitative examples of jailbreaks may be harmful in nature
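    Since the headline result is an attack success rate (ASR) measured over repeated trials, here is a minimal Python sketch of how such a metric can be tallied. It assumes, as one plausible reading rather than a detail the abstract states, that a query counts as a success if any of its 10 trials elicits a jailbroken response; all names and data below are hypothetical.

```python
from typing import Dict, List

def attack_success_rate(trial_outcomes: Dict[str, List[bool]]) -> float:
    """Fraction of queries jailbroken at least once across their trials.

    trial_outcomes maps each test query to a list of per-trial flags
    (True = the model produced a policy-violating response).
    """
    successes = sum(any(trials) for trials in trial_outcomes.values())
    return successes / len(trial_outcomes)

# Hypothetical results: 2 of 3 queries jailbroken at least once in 10 trials.
outcomes = {
    "query_a": [False] * 9 + [True],
    "query_b": [False] * 10,
    "query_c": [True] + [False] * 9,
}
print(f"ASR = {attack_success_rate(outcomes):.0%}")  # -> ASR = 67%
```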