7 research outputs found
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Large language models (LLMs) have strong capabilities in solving diverse
natural language processing tasks. However, the safety and security issues of
LLM systems have become the major obstacle to their widespread application.
Many studies have extensively investigated risks in LLM systems and developed
the corresponding mitigation strategies. Leading-edge enterprises such as
OpenAI, Google, Meta, and Anthropic have also made lots of efforts on
responsible LLMs. Therefore, there is a growing need to organize the existing
studies and establish comprehensive taxonomies for the community. In this
paper, we delve into four essential modules of an LLM system, including an
input module for receiving prompts, a language model trained on extensive
corpora, a toolchain module for development and deployment, and an output
module for exporting LLM-generated content. Based on this, we propose a
comprehensive taxonomy, which systematically analyzes potential risks
associated with each module of an LLM system and discusses the corresponding
mitigation strategies. Furthermore, we review prevalent benchmarks, aiming to
facilitate the risk assessment of LLM systems. We hope that this paper can help
LLM participants embrace a systematic perspective to build their responsible
LLM systems