6 research outputs found
Prompt Injection attack against LLM-integrated Applications
Large Language Models (LLMs), renowned for their superior proficiency in
language comprehension and generation, stimulate a vibrant ecosystem of
applications around them. However, their extensive assimilation into various
services introduces significant security risks. This study deconstructs the
complexities and implications of prompt injection attacks on actual
LLM-integrated applications. Initially, we conduct an exploratory analysis on
ten commercial applications, highlighting the constraints of current attack
strategies in practice. Prompted by these limitations, we subsequently
formulate HouYi, a novel black-box prompt injection attack technique, which
draws inspiration from traditional web injection attacks. HouYi is
compartmentalized into three crucial elements: a seamlessly-incorporated
pre-constructed prompt, an injection prompt inducing context partition, and a
malicious payload designed to fulfill the attack objectives. Leveraging HouYi,
we unveil previously unknown and severe attack outcomes, such as unrestricted
arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi
on 36 actual LLM-integrated applications and discern 31 applications
susceptible to prompt injection. 10 vendors have validated our discoveries,
including Notion, which has the potential to impact millions of users. Our
investigation illuminates both the possible risks of prompt injection attacks
and the possible tactics for mitigation
Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
Large Language Models (LLMs), like ChatGPT, have demonstrated vast potential
but also introduce challenges related to content constraints and potential
misuse. Our study investigates three key research questions: (1) the number of
different prompt types that can jailbreak LLMs, (2) the effectiveness of
jailbreak prompts in circumventing LLM constraints, and (3) the resilience of
ChatGPT against these jailbreak prompts. Initially, we develop a classification
model to analyze the distribution of existing prompts, identifying ten distinct
patterns and three categories of jailbreak prompts. Subsequently, we assess the
jailbreak capability of prompts with ChatGPT versions 3.5 and 4.0, utilizing a
dataset of 3,120 jailbreak questions across eight prohibited scenarios.
Finally, we evaluate the resistance of ChatGPT against jailbreak prompts,
finding that the prompts can consistently evade the restrictions in 40 use-case
scenarios. The study underscores the importance of prompt structures in
jailbreaking LLMs and discusses the challenges of robust jailbreak prompt
generation and prevention
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool
Penetration testing, a crucial industrial practice for ensuring system
security, has traditionally resisted automation due to the extensive expertise
required by human professionals. Large Language Models (LLMs) have shown
significant advancements in various domains, and their emergent abilities
suggest their potential to revolutionize industries. In this research, we
evaluate the performance of LLMs on real-world penetration testing tasks using
a robust benchmark created from test machines with platforms. Our findings
reveal that while LLMs demonstrate proficiency in specific sub-tasks within the
penetration testing process, such as using testing tools, interpreting outputs,
and proposing subsequent actions, they also encounter difficulties maintaining
an integrated understanding of the overall testing scenario.
In response to these insights, we introduce PentestGPT, an LLM-empowered
automatic penetration testing tool that leverages the abundant domain knowledge
inherent in LLMs. PentestGPT is meticulously designed with three
self-interacting modules, each addressing individual sub-tasks of penetration
testing, to mitigate the challenges related to context loss. Our evaluation
shows that PentestGPT not only outperforms LLMs with a task-completion increase
of 228.6\% compared to the \gptthree model among the benchmark targets but also
proves effective in tackling real-world penetration testing challenges. Having
been open-sourced on GitHub, PentestGPT has garnered over 4,700 stars and
fostered active community engagement, attesting to its value and impact in both
the academic and industrial spheres