27 research outputs found

    LLM for SoC Security: A Paradigm Shift

    Full text link
    As the ubiquity and complexity of system-on-chip (SoC) designs increase across electronic devices, the task of incorporating security into an SoC design flow poses significant challenges. Existing security solutions are inadequate to provide effective verification of modern SoC designs due to their limitations in scalability, comprehensiveness, and adaptability. On the other hand, Large Language Models (LLMs) are celebrated for their remarkable success in natural language understanding, advanced reasoning, and program synthesis tasks. Recognizing an opportunity, our research delves into leveraging the emergent capabilities of Generative Pre-trained Transformers (GPTs) to address the existing gaps in SoC security, aiming for a more efficient, scalable, and adaptable methodology. By integrating LLMs into the SoC security verification paradigm, we open a new frontier of possibilities and challenges to ensure the security of increasingly complex SoCs. This paper offers an in-depth analysis of existing works, showcases practical case studies, demonstrates comprehensive experiments, and provides practical guidelines. We also present the achievements, prospects, and challenges of employing LLMs in different SoC security verification tasks. Comment: 42 pages.

    Unlocking Hardware Security Assurance: The Potential of LLMs

    Full text link
    System-on-Chips (SoCs) form the crux of modern computing systems. SoCs enable high-level integration through the utilization of multiple Intellectual Property (IP) cores. However, the integration of multiple IP cores also presents unique challenges owing to their inherent vulnerabilities, thereby compromising the security of the entire system. Hence, it is imperative to perform hardware security validation to address these concerns. The efficiency of this validation procedure is contingent on the quality of the SoC security properties provided. However, generating security properties with traditional approaches often requires expert intervention and is limited to a few IPs, thereby resulting in a time-consuming and non-robust process. To address this issue, we, for the first time, propose a novel and automated Natural Language Processing (NLP)-based Security Property Generator (NSPG). Specifically, our approach utilizes hardware documentation in order to propose the first hardware security-specific language model, HS-BERT, for extracting security properties dedicated to hardware design. To evaluate our proposed technique, we trained the HS-BERT model using sentences from RISC-V, OpenRISC, MIPS, OpenSPARC, and OpenTitan SoC documentation. When assessed on five untrained OpenTitan hardware IP documents, NSPG was able to extract 326 security properties from 1723 sentences. This, in turn, aided in identifying eight security bugs in the OpenTitan SoC design presented in the hardware hacking competition, Hack@DAC 2022.
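    The abstract describes NSPG as a pipeline that reads hardware documentation sentences and emits security-property candidates. As a rough illustration of that input/output shape only, a minimal keyword-based stand-in might look like the sketch below; the real system uses the fine-tuned HS-BERT language model, not a heuristic, and the cue words and example sentences here are hypothetical.

```python
# Hypothetical stand-in for NSPG's sentence-classification step.
# NSPG actually uses a fine-tuned BERT model (HS-BERT); this sketch
# only shows the pipeline shape: documentation sentences in,
# security-property candidates out.

MODAL_CUES = ("must", "shall", "should")  # normative-language cues (illustrative)

def extract_property_candidates(sentences):
    """Return sentences that read like normative security requirements."""
    candidates = []
    for s in sentences:
        lowered = s.lower()
        if any(cue in lowered for cue in MODAL_CUES):
            candidates.append(s.strip())
    return candidates

# Hypothetical documentation sentences:
doc = [
    "The debug interface must be locked before boot completes.",
    "Figure 3 shows the block diagram of the AES core.",
    "Software shall not access the key register after lock is set.",
]
print(extract_property_candidates(doc))
```

    A learned classifier replaces the keyword test precisely because normative requirements in real documentation are often phrased without obvious modal cues.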

    Survey of Approaches and Techniques for Security Verification of Computer Systems

    Get PDF
    This paper surveys the landscape of security verification approaches and techniques for computer systems at various levels: from a software-application level all the way to the physical hardware level. Different existing projects are compared, based on the tools used and security aspects being examined. Since many systems require both hardware and software components to work together to provide the system's promised security protections, it is not sufficient to verify just the software levels or just the hardware levels in a mutually exclusive fashion. This survey especially highlights system levels that are verified by the different existing projects and presents to the readers the state of the art in hardware and software system security verification. Few approaches come close to providing full-system verification, and there is still much room for improvement.

    Learned interpreters : structural and learned systematicity in neural networks for program execution

    Full text link
    General-purpose deep neural network architectures have made startling advances in machine learning for code, advancing code completion, enabling natural language programming, detecting and repairing bugs, and even solving competitive programming problems at a human level of performance. Nevertheless, these methods struggle to understand the execution behavior of code, even when it is code they write themselves. To this end, we explore interpreter-inspired neural network architectures, introducing a novel architecture family called Instruction Pointer Attention Graph Neural Networks (IPA-GNN). We apply this family of approaches to several tasks that require reasoning about the execution behavior of programs: learning to execute full and partial programs, code coverage prediction for hardware verification, and predicting runtime errors in competition programs. 
    Through this series of works we make several contributions and encounter multiple surprising and promising results. We introduce a Python library for constructing graph representations of programs for use in machine learning research, which serves as a bedrock for the research in this thesis and in the broader research community. We also introduce rich large-scale datasets of programs annotated with program behavior, such as outputs and raised errors, to facilitate research in this domain. We find that IPA-GNN methods exhibit improved strong generalization over general-purpose methods, performing well when trained to execute only short programs and tested on significantly longer programs. In fact, we find that IPA-GNN methods outperform generic methods on each of the behavior modeling tasks we consider across both hardware and software domains. We even find that interpreter-inspired methods that model exception handling explicitly have a desirable interpretability property, enabling the prediction of error locations even when only trained on error presence and kind. In total, interpreter-inspired architectures like the IPA-GNN represent a promising path forward for imbuing neural networks with novel capabilities for learning to reason about program executions.
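    The core mechanism suggested by the IPA-GNN name is a "soft" instruction pointer: instead of sitting at one program location, execution is a probability distribution over locations, advanced each step by (learned) branch probabilities. A minimal numeric sketch of that idea, with a fixed branch probability standing in for the network's learned one and a toy three-location loop program, might be:

```python
# Minimal sketch of the soft-instruction-pointer idea behind IPA-GNN
# (not the thesis implementation, which learns branch probabilities
# with a neural network; here the branch probability b is a constant).

# Toy program with 3 locations: 0 = loop test, 1 = loop body, 2 = exit.
# From the test, control flows to the body with probability b and to
# the exit with probability 1 - b. The body returns to the test; the
# exit location absorbs all mass that reaches it.

def step(p, b):
    """One soft execution step: redistribute location probabilities."""
    return [
        p[1],                   # loc 0 (test): reached from the body
        p[0] * b,               # loc 1 (body): taken branch from the test
        p[0] * (1 - b) + p[2],  # loc 2 (exit): fall-through + absorbed mass
    ]

p = [1.0, 0.0, 0.0]  # execution starts at the loop test
for _ in range(20):
    p = step(p, b=0.5)

print([round(x, 4) for x in p])  # probability mass accumulates at the exit
```

    Because every step conserves total probability, the distribution stays normalized while mass drains into the exit location, mirroring how a real execution eventually terminates.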

    Semantic Fuzzing with Zest

    Get PDF
    Programs expecting structured inputs often consist of both a syntactic analysis stage, which parses raw input, and a semantic analysis stage, which conducts checks on the parsed input and executes the core logic of the program. Generator-based testing tools in the lineage of QuickCheck are a promising way to generate random syntactically valid test inputs for these programs. We present Zest, a technique which automatically guides QuickCheck-like random-input generators to better explore the semantic analysis stage of test programs. Zest converts random-input generators into deterministic parametric generators. We present the key insight that mutations in the untyped parameter domain map to structural mutations in the input domain. Zest leverages program feedback in the form of code coverage and input validity to perform feedback-directed parameter search. We evaluate Zest against AFL and QuickCheck on five Java programs: Maven, Ant, BCEL, Closure, and Rhino. Zest covers 1.03x-2.81x as many branches within the benchmarks' semantic analysis stages as baseline techniques. Further, we find 10 new bugs in the semantic analysis stages of these benchmarks. Zest is the most effective technique in finding these bugs reliably and quickly, requiring at most 10 minutes on average to find each bug. Comment: To appear in Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA '19).
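    The key insight stated in the abstract, that a random generator becomes deterministic when its choices are driven by a fixed parameter sequence, so a mutation of one untyped parameter yields a structural mutation of the generated input, can be sketched roughly as follows (an illustration only; Zest itself instruments Java QuickCheck-style generators, and this toy expression generator is hypothetical):

```python
# Rough sketch of Zest's deterministic parametric generator idea.
# The generator's "random" choices are read from a list of integer
# parameters, so identical parameters always yield identical inputs,
# and mutating a single parameter yields a structurally different
# but still syntactically valid input.

def gen_expr(params, depth=0):
    """Deterministically build an arithmetic expression from parameters."""
    choice = params.pop(0) if params else 0
    if depth >= 2 or choice % 3 == 0:        # leaf: a small literal
        return str(choice % 10)
    op = "+" if choice % 2 == 0 else "*"     # internal node: binary operator
    left = gen_expr(params, depth + 1)
    right = gen_expr(params, depth + 1)
    return f"({left} {op} {right})"

base = gen_expr([4, 3, 7])     # same parameters -> same expression
mutated = gen_expr([4, 4, 7])  # one-parameter mutation -> new tree shape
print(base, mutated)
```

    Every output is a well-formed expression, so the semantic analysis stage of a program under test is always exercised; coverage and validity feedback then steer the search over parameter sequences rather than over raw bytes.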