38 research outputs found
Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
Large Language Models (LLMs), like ChatGPT, have demonstrated vast potential
but also introduce challenges related to content constraints and potential
misuse. Our study investigates three key research questions: (1) the number of
different prompt types that can jailbreak LLMs, (2) the effectiveness of
jailbreak prompts in circumventing LLM constraints, and (3) the resilience of
ChatGPT against these jailbreak prompts. Initially, we develop a classification
model to analyze the distribution of existing prompts, identifying ten distinct
patterns and three categories of jailbreak prompts. Subsequently, we assess the
jailbreak capability of prompts with ChatGPT versions 3.5 and 4.0, utilizing a
dataset of 3,120 jailbreak questions across eight prohibited scenarios.
Finally, we evaluate the resistance of ChatGPT against jailbreak prompts,
finding that the prompts can consistently evade the restrictions in 40 use-case
scenarios. The study underscores the importance of prompt structures in
jailbreaking LLMs and discusses the challenges of robust jailbreak prompt
generation and prevention
Jailbreaker: Automated Jailbreak Across Multiple Large Language Model Chatbots
Large Language Models (LLMs) have revolutionized Artificial Intelligence (AI)
services due to their exceptional proficiency in understanding and generating
human-like text. LLM chatbots, in particular, have seen widespread adoption,
transforming human-machine interactions. However, these LLM chatbots are
susceptible to "jailbreak" attacks, where malicious users manipulate prompts to
elicit inappropriate or sensitive responses, contravening service policies.
Despite existing attempts to mitigate such threats, our research reveals a
substantial gap in our understanding of these vulnerabilities, largely due to
the undisclosed defensive measures implemented by LLM service providers.
In this paper, we present Jailbreaker, a comprehensive framework that offers
an in-depth understanding of jailbreak attacks and countermeasures. Our work
makes a dual contribution. First, we propose an innovative methodology inspired
by time-based SQL injection techniques to reverse-engineer the defensive
strategies of prominent LLM chatbots, such as ChatGPT, Bard, and Bing Chat.
This time-sensitive approach uncovers intricate details about these services'
defenses, facilitating a proof-of-concept attack that successfully bypasses
their mechanisms. Second, we introduce an automatic generation method for
jailbreak prompts. Leveraging a fine-tuned LLM, we validate the potential of
automated jailbreak generation across various commercial LLM chatbots. Our
method achieves a promising average success rate of 21.58%, significantly
outperforming the effectiveness of existing techniques. We have responsibly
disclosed our findings to the concerned service providers, underscoring the
urgent need for more robust defenses. Jailbreaker thus marks a significant step
towards understanding and mitigating jailbreak threats in the realm of LLM
chatbots
Prompt Injection attack against LLM-integrated Applications
Large Language Models (LLMs), renowned for their superior proficiency in
language comprehension and generation, stimulate a vibrant ecosystem of
applications around them. However, their extensive assimilation into various
services introduces significant security risks. This study deconstructs the
complexities and implications of prompt injection attacks on actual
LLM-integrated applications. Initially, we conduct an exploratory analysis on
ten commercial applications, highlighting the constraints of current attack
strategies in practice. Prompted by these limitations, we subsequently
formulate HouYi, a novel black-box prompt injection attack technique, which
draws inspiration from traditional web injection attacks. HouYi is
compartmentalized into three crucial elements: a seamlessly-incorporated
pre-constructed prompt, an injection prompt inducing context partition, and a
malicious payload designed to fulfill the attack objectives. Leveraging HouYi,
we unveil previously unknown and severe attack outcomes, such as unrestricted
arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi
on 36 actual LLM-integrated applications and discern 31 applications
susceptible to prompt injection. 10 vendors have validated our discoveries,
including Notion, which has the potential to impact millions of users. Our
investigation illuminates both the possible risks of prompt injection attacks
and the possible tactics for mitigation
Understanding Large Language Model Based Fuzz Driver Generation
Fuzz drivers are a necessary component of API fuzzing. However, automatically
generating correct and robust fuzz drivers is a difficult task. Compared to
existing approaches, LLM-based (Large Language Model) generation is a promising
direction due to its ability to operate with low requirements on consumer
programs, leverage multiple dimensions of API usage information, and generate
human-friendly output code. Nonetheless, the challenges and effectiveness of
LLM-based fuzz driver generation remain unclear.
To address this, we conducted a study on the effects, challenges, and
techniques of LLM-based fuzz driver generation. Our study involved building a
quiz with 86 fuzz driver generation questions from 30 popular C projects,
constructing precise effectiveness validation criteria for each question, and
developing a framework for semi-automated evaluation. We designed five query
strategies, evaluated 36,506 generated fuzz drivers. Furthermore, the drivers
were compared with manually written ones to obtain practical insights. Our
evaluation revealed that:
while the overall performance was promising (passing 91% of questions), there
were still practical challenges in filtering out the ineffective fuzz drivers
for large scale application; basic strategies achieved a decent correctness
rate (53%), but struggled with complex API-specific usage questions. In such
cases, example code snippets and iterative queries proved helpful; while
LLM-generated drivers showed competent fuzzing outcomes compared to manually
written ones, there was still significant room for improvement, such as
incorporating semantic oracles for logical bugs detection.Comment: 17 pages, 14 figure
PentestGPT: An LLM-empowered Automatic Penetration Testing Tool
Penetration testing, a crucial industrial practice for ensuring system
security, has traditionally resisted automation due to the extensive expertise
required by human professionals. Large Language Models (LLMs) have shown
significant advancements in various domains, and their emergent abilities
suggest their potential to revolutionize industries. In this research, we
evaluate the performance of LLMs on real-world penetration testing tasks using
a robust benchmark created from test machines with platforms. Our findings
reveal that while LLMs demonstrate proficiency in specific sub-tasks within the
penetration testing process, such as using testing tools, interpreting outputs,
and proposing subsequent actions, they also encounter difficulties maintaining
an integrated understanding of the overall testing scenario.
In response to these insights, we introduce PentestGPT, an LLM-empowered
automatic penetration testing tool that leverages the abundant domain knowledge
inherent in LLMs. PentestGPT is meticulously designed with three
self-interacting modules, each addressing individual sub-tasks of penetration
testing, to mitigate the challenges related to context loss. Our evaluation
shows that PentestGPT not only outperforms LLMs with a task-completion increase
of 228.6\% compared to the \gptthree model among the benchmark targets but also
proves effective in tackling real-world penetration testing challenges. Having
been open-sourced on GitHub, PentestGPT has garnered over 4,700 stars and
fostered active community engagement, attesting to its value and impact in both
the academic and industrial spheres
Digger: Detecting Copyright Content Mis-usage in Large Language Model Training
Pre-training, which utilizes extensive and varied datasets, is a critical
factor in the success of Large Language Models (LLMs) across numerous
applications. However, the detailed makeup of these datasets is often not
disclosed, leading to concerns about data security and potential misuse. This
is particularly relevant when copyrighted material, still under legal
protection, is used inappropriately, either intentionally or unintentionally,
infringing on the rights of the authors.
In this paper, we introduce a detailed framework designed to detect and
assess the presence of content from potentially copyrighted books within the
training datasets of LLMs. This framework also provides a confidence estimation
for the likelihood of each content sample's inclusion. To validate our
approach, we conduct a series of simulated experiments, the results of which
affirm the framework's effectiveness in identifying and addressing instances of
content misuse in LLM training processes. Furthermore, we investigate the
presence of recognizable quotes from famous literary works within these
datasets. The outcomes of our study have significant implications for ensuring
the ethical use of copyrighted materials in the development of LLMs,
highlighting the need for more transparent and responsible data management
practices in this field
Kebutuhan Sistem Informasi Komunikasi Pasar Tepat Waktu Bagi Pemasaran Lokal, Domestik, dan Ekspor Produk Agribisnis Kabupaten Kerinci
Fenomena yang terjadi selama ini pengusaha agribisnis/petani produsen belum menggunakan jasa sistem informasi komunikasi pasar tepat waktu. Sehingga pengusaha agribisnis tidak mengetahui harga pasar yang sebenarnya sehingga menjual produk dengan harga murah, produsen akan selalu dirugikan karena menjual produk dengan harga murah, pendapatan petani produsen tetap selalu rendah dan hidup di bawah garis kemiskinan. Penelitian ini menggunakan pendekatan kualitatif dengan metode riset partisipatoris yaitu kegiatan yang dilakukan oleh peneliti mengikuti dan memahami serta membahas masalah-masalah yang dihadapi masyarakat produsen/pedagang agribisnis dan juga melakukan pola pembinaannya. Untuk itu, peneliti bertemu langsung dengan masyarakat produsen/pedagang agribisnis dan menjalin persahabatan dengan sejumlah anggota masyarakat dalam menyelesaikan permasalahan penelitian. Metode ini berdampingan dengan menggunakan pendekatan Descriptive analysis untuk menjawab masalah yang memerlukan keterangan, gambaran dan sejenisnya secara faktual dan aktual. Populasi dalam penelitian ini adalah populasi terhingga. Sampel Penelitian ada di wilayah sentra produksi agribisnis Kecamatan Kayu Aro Kabupaten Kerinci. Hasil penelitian, pengusaha agribisnis terutama petani agribisnis sebagian belum menggunakan komunikasi pasar tepat waktu dan sebagian sudah menggunakan komunikasi pasar tepat waktu. Bagi pedagang perantara sudah melakukan sistem informasi komunikasi pasar tepat waktu baik untuk pemasaran lokal, domestik dan ekspor dan dapat menggunakannya dengan baik dan benar. Pengusaha agribisnis sudah dibina untuk dapat mengetahui harga pasar dengan menggunakan media komunikasi telepon rumah dan handphone. Oleh karena itu petani produsen dan pedagang agribisnis agar membentuk jejaring informasi komunikasi pasar pada semua pedagang agen di luar provinsi dan semua pedagang agen di luar pulau dan membentuk wadah persatuan informasi harga serta bersatu dalam menentukan harga pasar dan tidak saling menjatuhkan harga