71,058 research outputs found

    Obfuscated memory malware detection in resource-constrained iot devices for smart city applications

    Get PDF
    Obfuscated Memory Malware (OMM) presents significant threats to interconnected systems, including smart city applications, for its ability to evade detection through concealment tactics. Existing OMM detection methods primarily focus on binary detection. Their multiclass versions consider a few families only and, thereby, fail to detect much existing and emerging malware. Moreover, their large memory size makes them unsuitable to be executed in resource-constrained embedded/IoT devices. To address this problem, in this paper, we propose a multiclass but lightweight malware detection method capable of identifying recent malware and is suitable to execute in embedded devices. For this, the method considers a hybrid model by combining the feature-learning capabilities of convolutional neural networks with the temporal modeling advantage of bidirectional long short-term memory. The proposed architecture exhibits compact size and fast processing speed, making it suitable for deployment in IoT devices that constitute the major components of smart city systems. Extensive experiments with the recent CIC-Malmem-2022 OMM dataset demonstrate that our method outperforms other machine learning-based models proposed in the literature in both detecting OMM and identifying specific attack types. Our proposed method thus offers a robust yet compact model executable in IoT devices for defending against obfuscated malware

    Hibernus: sustaining computation during intermittent supply for energy-harvesting systems

    No full text
    A key challenge to the future of energy-harvesting systems is the discontinuous power supply that is often generated. We propose a new approach, Hibernus, which enables computation to be sustained during intermittent supply. The approach has a low energy and time overhead which is achieved by reactively hibernating: saving system state only once, when power is about to be lost, and then sleeping until the supply recovers. We validate the approach experimentally on a processor with FRAM nonvolatile memory, allowing it to reactively hibernate using only energy stored in its decoupling capacitance. When compared to a recently proposed technique, the approach reduces processor time and energy overheads by 76-100% and 49-79% respectively

    Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks

    Full text link
    Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers. Building an analytical model to do so is especially complicated in modern x86-64 Complex Instruction Set Computer (CISC) machines with sophisticated processor microarchitectures in that it is tedious, error prone, and must be performed from scratch for each processor generation. In this paper we present Ithemal, the first tool which learns to predict the throughput of a set of instructions. Ithemal uses a hierarchical LSTM--based approach to predict throughput based on the opcodes and operands of instructions in a basic block. We show that Ithemal is more accurate than state-of-the-art hand-written tools currently used in compiler backends and static machine code analyzers. In particular, our model has less than half the error of state-of-the-art analytical models (LLVM's llvm-mca and Intel's IACA). Ithemal is also able to predict these throughput values just as fast as the aforementioned tools, and is easily ported across a variety of processor microarchitectures with minimal developer effort.Comment: Published at 36th International Conference on Machine Learning (ICML) 201

    Implementing a protected zone in a reconfigurable processor for isolated execution of cryptographic algorithms

    Get PDF
    We design and realize a protected zone inside a reconfigurable and extensible embedded RISC processor for isolated execution of cryptographic algorithms. The protected zone is a collection of processor subsystems such as functional units optimized for high-speed execution of integer operations, a small amount of local memory, and general and special-purpose registers. We outline the principles for secure software implementation of cryptographic algorithms in a processor equipped with the protected zone. We also demonstrate the efficiency and effectiveness of the protected zone by implementing major cryptographic algorithms, namely RSA, elliptic curve cryptography, and AES in the protected zone. In terms of time efficiency, software implementations of these three cryptographic algorithms outperform equivalent software implementations on similar processors reported in the literature. The protected zone is designed in such a modular fashion that it can easily be integrated into any RISC processor; its area overhead is considerably moderate in the sense that it can be used in vast majority of embedded processors. The protected zone can also provide the necessary support to implement TPM functionality within the boundary of a processor

    Transparent code authentication at the processor level

    Get PDF
    The authors present a lightweight authentication mechanism that verifies the authenticity of code and thereby addresses the virus and malicious code problems at the hardware level eliminating the need for trusted extensions in the operating system. The technique proposed tightly integrates the authentication mechanism into the processor core. The authentication latency is hidden behind the memory access latency, thereby allowing seamless on-the-fly authentication of instructions. In addition, the proposed authentication method supports seamless encryption of code (and static data). Consequently, while providing the software users with assurance for authenticity of programs executing on their hardware, the proposed technique also protects the software manufacturers’ intellectual property through encryption. The performance analysis shows that, under mild assumptions, the presented technique introduces negligible overhead for even moderate cache sizes

    XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference

    Full text link
    Binary Neural Networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR Neural Engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM / standard cell memory. The XNE is able to fully compute convolutional and dense layers in autonomy or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65nm and 22nm technology for the XNE IP and post-layout results in 22nm for the full MCU indicating that this system can drop the energy cost per binary operation to 21.6fJ per operation at 0.4V, and at the same time is flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2mJ per frame at 8.9 fps.Comment: 11 pages, 8 figures, 2 tables, 3 listings. Accepted for presentation at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issu
    corecore