
    Data Embedding Scheme for Efficient Program Behavior Modeling With Neural Networks

    As modern programs grow in size and complexity, the importance of program behavior modeling is emerging in various areas. Because of the large amount of data generated by a target program and the difficulty of runtime analysis, previous works in these areas employ deep learning. However, they did not sufficiently consider the input of a target program, since, in our view, program behavior is a history of computational steps consisting of a function and its input arguments. A naive, intuitive way to embed the value of x as it is in a vector representation creates a tremendously large vector size. Instead, we found that all the values inducing the same runtime behavior can be represented as one identical characteristic value (CV). In this paper, we show that not only can a characteristic value sequence replace the argument input, but it is also efficient to use it as an input vector for a neural network. This efficiency comes from modeling the whole program with multiple LSTM-RNN models and reducing the input space of the neural network. To demonstrate the effectiveness of this replacement, we performed experiments on the problem of program behavior anomaly detection. Our results show that our model achieves better detection performance compared to previous models and similar detection performance even with smaller model sizes. We also provide a visualization of the embedded vectors extracted from the embedding layer in the neural network model to prove that the CV sequence well represents the arguments.
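
The abstract does not say how characteristic values are derived, so the following is only a toy sketch of the general idea that arguments inducing the same runtime behavior can share one value. The function `classify`, the `THRESHOLDS` list, and the helper names are all hypothetical, and bucketing by branch thresholds is an assumption made for illustration, not the paper's method.

```python
from bisect import bisect_right

# Hypothetical target function: its control flow depends only on
# which range the argument falls in, not on the exact value.
def classify(x: int) -> str:
    if x < 0:
        return "negative"
    if x < 100:
        return "small"
    return "large"

# Assumed branch thresholds of `classify`. Any two arguments that lie
# between the same pair of thresholds take the same path.
THRESHOLDS = [0, 100]

def characteristic_value(x: int) -> int:
    """Map an argument to the index of the behavior class it falls in."""
    return bisect_right(THRESHOLDS, x)

def to_cv_sequence(args):
    """Replace a sequence of raw arguments with their characteristic values."""
    return [characteristic_value(a) for a in args]

if __name__ == "__main__":
    raw_args = [-5, 0, 7, 99, 100, 12345]
    print(to_cv_sequence(raw_args))
```

In this sketch an unbounded integer argument collapses to three possible values, which illustrates why an embedding over characteristic values could need a far smaller input space than one over raw values.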

    Real-Time Anomalous Branch Behavior Inference with a GPU-inspired Engine for Machine Learning Models

    Attacks on embedded devices are likely to occur at any time and in unexpected ways. Thus, defense systems based on fixed sets of rules will easily be subverted by such unexpected, unknown attacks. Learning-based anomaly detection may potentially prevent new, unknown zero-day attacks by leveraging the capability of machine learning (ML) to learn the intricate true nature of software hidden within raw information. This paper introduces our work to develop an MPSoC, called RTAD, which can efficiently support in hardware various ML models that run to detect anomalous behaviors on embedded devices in real time, and thus enable the devices to counteract the anomalies in the field. In the IoT era, the importance of security for embedded devices cannot be overstated because they will become an enticing target for adversaries as they are integrated into everyday life to provide users with various services. The above-mentioned potential of learning-based detection is believed to benefit deployed devices that come under attack at any time during their field operation and in unexpected ways. We assume that ML models are trained with runtime branch information as their data features, since a sequence of branches serves as a record of control flow transfers during program execution. In fact, there have been numerous ML studies that examine various types of branches in order to infer (or detect) anomalies in branch behavior that may be induced by diverse attacks that cause deviant control flow in software. Our goal of real-time anomalous branch behavior inference poses two challenges to our development of RTAD. First, RTAD must collect and transfer a sequence of branches in a timely fashion as the input to the ML model. Second, RTAD must be able to promptly process the delivered branch data with the ML model. To tackle these challenges, we have implemented two core components in RTAD: an input generation module and a GPU-inspired ML processing engine. According to our experiments, RTAD enables various ML models to infer an anomaly immediately after the victim program behaves aberrantly as a result of attacks injected into the system.