38 research outputs found

    Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study

    Full text link
    Large Language Models (LLMs), like ChatGPT, have demonstrated vast potential but also introduce challenges related to content constraints and potential misuse. Our study investigates three key research questions: (1) the number of different prompt types that can jailbreak LLMs, (2) the effectiveness of jailbreak prompts in circumventing LLM constraints, and (3) the resilience of ChatGPT against these jailbreak prompts. Initially, we develop a classification model to analyze the distribution of existing prompts, identifying ten distinct patterns and three categories of jailbreak prompts. Subsequently, we assess the jailbreak capability of prompts with ChatGPT versions 3.5 and 4.0, utilizing a dataset of 3,120 jailbreak questions across eight prohibited scenarios. Finally, we evaluate the resistance of ChatGPT against jailbreak prompts, finding that the prompts can consistently evade the restrictions in 40 use-case scenarios. The study underscores the importance of prompt structures in jailbreaking LLMs and discusses the challenges of robust jailbreak prompt generation and prevention

    Jailbreaker: Automated Jailbreak Across Multiple Large Language Model Chatbots

    Full text link
    Large Language Models (LLMs) have revolutionized Artificial Intelligence (AI) services due to their exceptional proficiency in understanding and generating human-like text. LLM chatbots, in particular, have seen widespread adoption, transforming human-machine interactions. However, these LLM chatbots are susceptible to "jailbreak" attacks, where malicious users manipulate prompts to elicit inappropriate or sensitive responses, contravening service policies. Despite existing attempts to mitigate such threats, our research reveals a substantial gap in our understanding of these vulnerabilities, largely due to the undisclosed defensive measures implemented by LLM service providers. In this paper, we present Jailbreaker, a comprehensive framework that offers an in-depth understanding of jailbreak attacks and countermeasures. Our work makes a dual contribution. First, we propose an innovative methodology inspired by time-based SQL injection techniques to reverse-engineer the defensive strategies of prominent LLM chatbots, such as ChatGPT, Bard, and Bing Chat. This time-sensitive approach uncovers intricate details about these services' defenses, facilitating a proof-of-concept attack that successfully bypasses their mechanisms. Second, we introduce an automatic generation method for jailbreak prompts. Leveraging a fine-tuned LLM, we validate the potential of automated jailbreak generation across various commercial LLM chatbots. Our method achieves a promising average success rate of 21.58%, significantly outperforming the effectiveness of existing techniques. We have responsibly disclosed our findings to the concerned service providers, underscoring the urgent need for more robust defenses. Jailbreaker thus marks a significant step towards understanding and mitigating jailbreak threats in the realm of LLM chatbots

    Prompt Injection attack against LLM-integrated Applications

    Full text link
    Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation

    Understanding Large Language Model Based Fuzz Driver Generation

    Full text link
    Fuzz drivers are a necessary component of API fuzzing. However, automatically generating correct and robust fuzz drivers is a difficult task. Compared to existing approaches, LLM-based (Large Language Model) generation is a promising direction due to its ability to operate with low requirements on consumer programs, leverage multiple dimensions of API usage information, and generate human-friendly output code. Nonetheless, the challenges and effectiveness of LLM-based fuzz driver generation remain unclear. To address this, we conducted a study on the effects, challenges, and techniques of LLM-based fuzz driver generation. Our study involved building a quiz with 86 fuzz driver generation questions from 30 popular C projects, constructing precise effectiveness validation criteria for each question, and developing a framework for semi-automated evaluation. We designed five query strategies, evaluated 36,506 generated fuzz drivers. Furthermore, the drivers were compared with manually written ones to obtain practical insights. Our evaluation revealed that: while the overall performance was promising (passing 91% of questions), there were still practical challenges in filtering out the ineffective fuzz drivers for large scale application; basic strategies achieved a decent correctness rate (53%), but struggled with complex API-specific usage questions. In such cases, example code snippets and iterative queries proved helpful; while LLM-generated drivers showed competent fuzzing outcomes compared to manually written ones, there was still significant room for improvement, such as incorporating semantic oracles for logical bugs detection.Comment: 17 pages, 14 figure

    PentestGPT: An LLM-empowered Automatic Penetration Testing Tool

    Full text link
    Penetration testing, a crucial industrial practice for ensuring system security, has traditionally resisted automation due to the extensive expertise required by human professionals. Large Language Models (LLMs) have shown significant advancements in various domains, and their emergent abilities suggest their potential to revolutionize industries. In this research, we evaluate the performance of LLMs on real-world penetration testing tasks using a robust benchmark created from test machines with platforms. Our findings reveal that while LLMs demonstrate proficiency in specific sub-tasks within the penetration testing process, such as using testing tools, interpreting outputs, and proposing subsequent actions, they also encounter difficulties maintaining an integrated understanding of the overall testing scenario. In response to these insights, we introduce PentestGPT, an LLM-empowered automatic penetration testing tool that leverages the abundant domain knowledge inherent in LLMs. PentestGPT is meticulously designed with three self-interacting modules, each addressing individual sub-tasks of penetration testing, to mitigate the challenges related to context loss. Our evaluation shows that PentestGPT not only outperforms LLMs with a task-completion increase of 228.6\% compared to the \gptthree model among the benchmark targets but also proves effective in tackling real-world penetration testing challenges. Having been open-sourced on GitHub, PentestGPT has garnered over 4,700 stars and fostered active community engagement, attesting to its value and impact in both the academic and industrial spheres

    Digger: Detecting Copyright Content Mis-usage in Large Language Model Training

    Full text link
    Pre-training, which utilizes extensive and varied datasets, is a critical factor in the success of Large Language Models (LLMs) across numerous applications. However, the detailed makeup of these datasets is often not disclosed, leading to concerns about data security and potential misuse. This is particularly relevant when copyrighted material, still under legal protection, is used inappropriately, either intentionally or unintentionally, infringing on the rights of the authors. In this paper, we introduce a detailed framework designed to detect and assess the presence of content from potentially copyrighted books within the training datasets of LLMs. This framework also provides a confidence estimation for the likelihood of each content sample's inclusion. To validate our approach, we conduct a series of simulated experiments, the results of which affirm the framework's effectiveness in identifying and addressing instances of content misuse in LLM training processes. Furthermore, we investigate the presence of recognizable quotes from famous literary works within these datasets. The outcomes of our study have significant implications for ensuring the ethical use of copyrighted materials in the development of LLMs, highlighting the need for more transparent and responsible data management practices in this field

    Kebutuhan Sistem Informasi Komunikasi Pasar Tepat Waktu Bagi Pemasaran Lokal, Domestik, dan Ekspor Produk Agribisnis Kabupaten Kerinci

    Full text link
    Fenomena yang terjadi selama ini pengusaha agribisnis/petani produsen belum menggunakan jasa sistem informasi komunikasi pasar tepat waktu. Sehingga pengusaha agribisnis tidak mengetahui harga pasar yang sebenarnya sehingga menjual produk dengan harga murah, produsen akan selalu dirugikan karena menjual produk dengan harga murah, pendapatan petani produsen tetap selalu rendah dan hidup di bawah garis kemiskinan. Penelitian ini menggunakan pendekatan kualitatif dengan metode riset partisipatoris yaitu kegiatan yang dilakukan oleh peneliti mengikuti dan memahami serta membahas masalah-masalah yang dihadapi masyarakat produsen/pedagang agribisnis dan juga melakukan pola pembinaannya. Untuk itu, peneliti bertemu langsung dengan masyarakat produsen/pedagang agribisnis dan menjalin persahabatan dengan sejumlah anggota masyarakat dalam menyelesaikan permasalahan penelitian. Metode ini berdampingan dengan menggunakan pendekatan Descriptive analysis untuk menjawab masalah yang memerlukan keterangan, gambaran dan sejenisnya secara faktual dan aktual. Populasi dalam penelitian ini adalah populasi terhingga. Sampel Penelitian ada di wilayah sentra produksi agribisnis Kecamatan Kayu Aro Kabupaten Kerinci. Hasil penelitian, pengusaha agribisnis terutama petani agribisnis sebagian belum menggunakan komunikasi pasar tepat waktu dan sebagian sudah menggunakan komunikasi pasar tepat waktu. Bagi pedagang perantara sudah melakukan sistem informasi komunikasi pasar tepat waktu baik untuk pemasaran lokal, domestik dan ekspor dan dapat menggunakannya dengan baik dan benar. Pengusaha agribisnis sudah dibina untuk dapat mengetahui harga pasar dengan menggunakan media komunikasi telepon rumah dan handphone. Oleh karena itu petani produsen dan pedagang agribisnis agar membentuk jejaring informasi komunikasi pasar pada semua pedagang agen di luar provinsi dan semua pedagang agen di luar pulau dan membentuk wadah persatuan informasi harga serta bersatu dalam menentukan harga pasar dan tidak saling menjatuhkan harga
    corecore