148 research outputs found

Automatic translation of assembly shellcodes to printable byte codes

Authors: Géczi Zsolt; Iványi Péter
Publication venue: Akademiai Kiado Zrt.
Publication date: 01/01/2018

The generation of printable shellcode is an important computer security research area. The original idea of printable shellcode generation was to write binary, executable code in such a way that the generated byte code contains only bytes that represent English letters, numbers and punctuation characters. Unfortunately, only a limited number of CPU instructions can be used this way. In the originally published paper, a small decoder is written with instructions represented by printable characters, and the shellcode is decoded on the stack to be executed later. This paper, however, describes a proof-of-concept project that converts the source code of a full assembly program or shellcode into new source code whose compiled binary contains only printable characters. The paper also presents new printable-character implementations of some CPU instructions.

Crossref

Repository of the Academy's Library
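
The printable-byte constraint described in the abstract above can be checked mechanically. The following is a minimal sketch, not the authors' tool: it assumes that "printable" means ASCII letters, digits and punctuation only (no whitespace), and the names `PRINTABLE`, `non_printable_offsets` and `is_printable` are hypothetical.

```python
import string
from typing import List

# Assumption: the allowed alphabet is ASCII letters, digits and punctuation,
# following the wording of the abstract; whitespace is deliberately excluded.
PRINTABLE = frozenset(
    (string.ascii_letters + string.digits + string.punctuation).encode("ascii")
)


def non_printable_offsets(code: bytes) -> List[int]:
    """Return the offsets of all bytes that fall outside the allowed set."""
    return [offset for offset, byte in enumerate(code) if byte not in PRINTABLE]


def is_printable(code: bytes) -> bool:
    """True when every byte of the compiled output is in the allowed set."""
    return not non_printable_offsets(code)
```

A checker of this kind would typically be run on the assembled output; the offsets it reports point to the instructions that still need a printable replacement.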

EVIL: Exploiting Software via Natural Language

Authors: Al-Hossami E.; Cotroneo D.; Cukic B.; Liguori P.; Natella R.; Orbinato V.; Shaikh S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

Writing exploits for security assessment is a challenging task. The writer needs to master programming and obfuscation techniques to develop a successful exploit. To make the task easier, we propose an approach (EVIL) to automatically generate exploits in assembly/Python language from descriptions in natural language. The approach leverages Neural Machine Translation (NMT) techniques and a dataset that we developed for this work. We present an extensive experimental study to evaluate the feasibility of EVIL, using both automatic and manual analysis, and both at generating individual statements and entire exploits. The generated code achieved high accuracy in terms of syntactic and semantic correctness.

Archivio della ricerca - Università degli studi di Napoli Federico II

Enhancing Robustness of AI Offensive Code Generators via Data Augmentation

Authors: Cotroneo Domenico; Cukic Bojan; Improta Cristina; Liguori Pietro; Natella Roberto
Publication date: 08/06/2023

In this work, we present a method to add perturbations to the code descriptions, i.e., new inputs in natural language (NL) from well-intentioned developers, in the context of security-oriented code, and analyze how and to what extent perturbations affect the performance of AI offensive code generators. Our experiments show that the performance of the code generators is highly affected by perturbations in the NL descriptions. To enhance the robustness of the code generators, we use the method to perform data augmentation, i.e., to increase the variability and diversity of the training data, proving its effectiveness against both perturbed and non-perturbed code descriptions.

arXiv.org e-Print Archive
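
The data-augmentation idea in the abstract above can be illustrated with a small sketch. The abstract does not list the perturbations used, so the single perturbation here (dropping one word from a description) is an assumption chosen for illustration, and the function names `drop_one_word` and `augment` are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (natural-language description, code snippet)


def drop_one_word(description: str, rng: random.Random) -> str:
    """Hypothetical perturbation: remove one randomly chosen word."""
    words = description.split()
    if len(words) < 2:
        return description  # nothing sensible to drop
    index = rng.randrange(len(words))
    return " ".join(words[:index] + words[index + 1:])


def augment(pairs: List[Pair], seed: int = 0) -> List[Pair]:
    """Return the original pairs plus one perturbed copy of each.

    Only the description is perturbed; the target code is left unchanged,
    so the model sees several phrasings that map to the same output.
    """
    rng = random.Random(seed)
    augmented = list(pairs)
    for description, code in pairs:
        augmented.append((drop_one_word(description, rng), code))
    return augmented
```

Fixing the seed keeps the augmented training set reproducible between runs.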

Smashing the Stack with Hydra: The Many Heads of Advanced Polymorphic Shellcode

Authors: Prabhu Pratap; Song Yingbo; Stolfo Salvatore
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2009

Recent work on the analysis of polymorphic shellcode engines suggests that modern obfuscation methods will soon eliminate the usefulness of signature-based network intrusion detection methods, and supports the growing view that the new generation of shellcode cannot be accurately and efficiently represented by the string signatures which current IDS and AV scanners rely upon. In this paper, we expand on this area of study by demonstrating never-before-seen concepts in advanced shellcode polymorphism with a proof-of-concept engine which we call Hydra. Hydra distinguishes itself by integrating an array of obfuscation techniques, such as recursive NOP sleds and multi-layer ciphering, into one system while offering multiple improvements upon existing strategies. We also introduce never-before-seen attack methods such as byte-splicing statistical mimicry, safe-returns with forking shellcode and syscall-time-locking. In total, Hydra simultaneously attacks signature, statistical, disassembly, behavioral and emulation-based sensors, as well as frustrates offline forensics. This engine was developed to present an updated view of the frontier of modern polymorphic shellcode and provide an effective tool for evaluation of IDS systems, cyber test ranges and other related security technologies.

Columbia University Academic Commons

Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation

Authors: Cotroneo Domenico; Cukic Bojan; De Vivo Simona; Improta Cristina; Liguori Pietro; Natella Roberto
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2022

Neural Machine Translation (NMT) has reached a level of maturity to be recognized as the premier method for translation between different languages, and has aroused interest in different research areas, including software engineering. A key step in validating the robustness of NMT models consists in evaluating their performance on adversarial inputs, i.e., inputs obtained from the original ones by adding small amounts of perturbation. However, for the specific task of code generation (i.e., the generation of code starting from a description in natural language), no approach has yet been defined to validate the robustness of NMT models. In this work, we address the problem by identifying a set of perturbations and metrics tailored for the robustness assessment of such models. We present a preliminary experimental evaluation, showing what type of perturbations affect the model the most and deriving useful insights for future directions. Comment: Paper accepted for publication in the proceedings of The 1st Intl. Workshop on Natural Language-based Software Engineering (NLBSE) to be held with ICSE 202

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

06. Computer Science

Author: Northeastern State University
Publication venue: SWOSU Digital Commons
Publication date: 01/01/2015

SWOSU Digital Commons (Southwestern Oklahoma State University)

Automating the Correctness Assessment of AI-generated Code for Security Contexts

Authors: Cotroneo Domenico; Foggia Alessio; Improta Cristina; Liguori Pietro; Natella Roberto
Publication date: 28/10/2023

In this paper, we propose a fully automated method, named ACCA, to evaluate the correctness of AI-generated code for security purposes. The method uses symbolic execution to assess whether the AI-generated code behaves as a reference implementation. We use ACCA to assess four state-of-the-art models trained to generate security-oriented assembly code and compare the results of the evaluation with different baseline solutions, including output similarity metrics, widely used in the field, and the well-known ChatGPT, the AI-powered language model developed by OpenAI. Our experiments show that our method outperforms the baseline solutions and assesses the correctness of the AI-generated code similarly to the human-based evaluation, which is considered the ground truth for the assessment in the field. Moreover, ACCA has a very strong correlation with human evaluation (Pearson's correlation coefficient r=0.84 on average). Finally, since it is a fully automated solution that does not require any human intervention, the proposed method performs the assessment of every code snippet in ~0.17s on average, which is definitely lower than the average time required by human analysts to manually inspect the code, based on our experience.

arXiv.org e-Print Archive
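
The abstract above reports agreement with human evaluation as a Pearson correlation coefficient. As a reminder of what that statistic measures, here is a minimal sketch of the standard formula; the function name `pearson_r` and the sample scores in the usage note are hypothetical and are not data from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r: covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

With made-up per-snippet verdicts (1 = correct, 0 = incorrect) such as `pearson_r([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])`, the result quantifies how closely an automated verdict tracks the human one; values near 1 indicate strong agreement.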

Combatting Advanced Persistent Threat via Causality Inference and Program Analysis

Author: Kwon Yonghwi
Publication venue: Purdue University (bepress)
Publication date: 01/08/2018

Cyber attackers are becoming more and more sophisticated. In particular, Advanced Persistent Threat (APT) is a new class of attack that targets a specific organization and compromises systems over a long time without being detected. Over the years, we have seen notorious examples of APTs including Stuxnet which disrupted Iranian nuclear centrifuges and data breaches affecting millions of users. Investigating APT is challenging as it occurs over an extended period of time and the attack process is highly sophisticated and stealthy. Also, preventing APTs is difficult due to ever-expanding attack vectors. In this dissertation, we present proposals for dealing with challenges in attack investigation. Specifically, we present LDX which conducts precise counter-factual causality inference to determine dependencies between system calls (e.g., between input and output system calls) and allows investigators to determine the origin of an attack (e.g., receiving a spam email) and the propagation path of the attack, and assess the consequences of the attack. LDX is four times more accurate and two orders of magnitude faster than state-of-the-art taint analysis techniques. Moreover, we then present a practical model-based causality inference system, MCI, which achieves precise and accurate causality inference without requiring any modification or instrumentation in end-user systems. Second, we show a general protection system against a wide spectrum of attack vectors and methods. Specifically, we present A2C that prevents a wide range of attacks by randomizing inputs such that any malicious payloads contained in the inputs are corrupted. The protection provided by A2C is both general (e.g., against various attack vectors) and practical (7% runtime overhead).

Purdue E-Pubs

A novel intrusion detection system for internet of things devices and data

Authors: Al-Raweshidy H.; Kaushik A.
Publication venue: Springer
Publication date: 01/01/2023

As we enter the new age of the Internet of Things (IoT), wearable gadgets, sensors, and embedded devices are extensively used for data aggregation and transmission. The extent of the data processed by IoT networks makes it vulnerable to outside attacks. Therefore, it is important to design an intrusion detection system (IDS) that ensures the security, integrity, and confidentiality of IoT networks and their data. State-of-the-art IDSs have poor detection capabilities and incur high communication and device overhead, which is not ideal for IoT applications requiring secured and real-time processing. This research presents a teaching-learning-based optimization enabled intrusion detection system (TLBO-IDS) which effectively protects IoT networks from intrusion attacks while keeping overhead low. The proposed TLBO-IDS can detect analysis attacks, fuzzing attacks, shellcode attacks, worms, denial of service (DoS) attacks, exploits, and backdoor intrusion attacks. TLBO-IDS is extensively tested and its performance is compared with state-of-the-art algorithms. In particular, TLBO-IDS outperforms the bat algorithm and genetic algorithm (GA).

UDORA - University of Derby Online Research Archive