
    Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

    In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection that leverages the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, every software process in the system starts execution subject to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis via more computationally expensive and more accurate DL methods, using our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for the DL analysis and to rate-limit the impact of possible malware on the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, while for modern DL methods it is usually below 6%; however, the classification time for DL can be 100x longer than for conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25% and reduced the detection time by 54.86%. The percentage of software subjected to DL analysis was approximately 40% on average, and the application of delays to software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discuss a discrepancy between detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time. (17 pages, 7 figures)
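    The abstract describes a two-stage "fast and slow" escalation policy. The sketch below is a hypothetical illustration of that control flow in Python: the thresholds, the model interfaces (predict_proba), and the throttle callback are assumptions for illustration, not the paper's actual implementation or the DEEPMALWARE architecture.

```python
# Hedged sketch of a fast/slow escalation policy in the spirit of PROPEDEUTICA.
# Thresholds, model interfaces, and the throttling mechanism are illustrative
# assumptions, not the paper's exact design.
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str   # "benign" or "malware"
    source: str  # "ml" (fast path) or "dl" (slow path)


class TwoStageDetector:
    def __init__(self, ml_model, dl_model, low=0.3, high=0.7):
        self.ml_model = ml_model          # fast conventional classifier
        self.dl_model = dl_model          # slower, more accurate DL model
        self.low, self.high = low, high   # borderline band that triggers escalation

    def classify(self, trace, throttle):
        p = self.ml_model.predict_proba(trace)   # fast path: cheap per-process score
        if p >= self.high:
            return Verdict("malware", "ml")
        if p <= self.low:
            return Verdict("benign", "ml")
        # Borderline score: rate-limit the process to "buy time" while the DL model runs.
        throttle(trace)
        p_dl = self.dl_model.predict_proba(trace)
        return Verdict("malware" if p_dl >= 0.5 else "benign", "dl")
```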

    CONFPROFITT: A Configuration-Aware Performance Profiling, Testing, and Tuning Framework

    Modern computer software systems are complicated. Developers can change the behavior of a software system through software configurations, and the large number of configuration options and their interactions makes tuning, testing, and debugging very challenging. Performance is one of the key non-functional qualities: performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs as well as specific configurations. While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, many of these techniques are not effective in highly configurable systems. To improve the non-functional qualities of configurable software systems, testing engineers need to be able to understand the performance influence of configuration options, adjust the performance of a system under different configurations, and detect configuration-related performance bugs. This research provides an automated framework that allows engineers to effectively analyze performance-influencing configuration options, detect performance bugs in highly configurable software systems, and adjust configuration options to achieve higher long-term performance gains. To understand real-world performance bugs in highly configurable software systems, we first perform a study of performance bug characteristics in three large-scale open-source projects. Many researchers have studied the characteristics of performance bugs from bug reports, but few have reported the experience of trying to replicate confirmed performance bugs from the perspective of non-domain experts such as researchers. This study reports the challenges of replicating confirmed performance bugs and potential workarounds, and we also share a performance benchmark of real-world performance bugs for evaluating future performance testing techniques. Inspired by our performance bug study, we propose a performance profiling approach that helps developers understand how configuration options and their interactions influence the performance of a system. The approach uses a combination of dynamic analysis and machine learning techniques, together with configuration sampling, to profile program execution and identify the configuration options relevant to performance. Next, the framework leverages natural language processing and information retrieval techniques to automatically generate test inputs and configurations that expose performance bugs. Finally, the framework combines reinforcement learning and dynamic state reduction techniques to guide the subject application toward higher long-term performance gains.
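    One core idea in the abstract is attributing performance cost to configuration options and their interactions from sampled configurations. The following minimal sketch illustrates that idea with random sampling and a sparse regression over pairwise interaction terms; the option names, the benchmark stand-in, and the model choice are assumptions for illustration and are not CONFPROFITT's actual dynamic-analysis pipeline.

```python
# Minimal sketch of performance-influence modelling over sampled configurations.
# Option names, the workload function, and the regression model are illustrative.
import random
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

OPTIONS = ["cache", "compress", "encrypt", "prefetch"]   # hypothetical binary options

def sample_configs(n):
    return [{o: random.randint(0, 1) for o in OPTIONS} for _ in range(n)]

def run_benchmark(cfg):
    # Stand-in for running the instrumented system under `cfg` and measuring time;
    # a real harness would execute the subject application here.
    return 1.0 + 2.0 * cfg["compress"] + 3.0 * cfg["compress"] * cfg["encrypt"]

configs = sample_configs(200)
X = np.array([[c[o] for o in OPTIONS] for c in configs])
y = np.array([run_benchmark(c) for c in configs])

# Pairwise interaction terms let the model attribute cost to option combinations.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
model = Lasso(alpha=0.01).fit(poly.fit_transform(X), y)
for name, coef in zip(poly.get_feature_names_out(OPTIONS), model.coef_):
    if abs(coef) > 0.1:
        print(f"{name}: {coef:+.2f}")   # options/interactions that influence performance
```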

    ALPACAS: A Language for Parametric Assessment of Critical Architecture Safety

    This paper introduces Alpacas, a domain-specific language and accompanying algorithms for architecture modeling and safety assessment of critical systems. It allows studying the effects of random and systematic faults on complex critical systems and their reliability. The underlying semantic framework of the language is Stochastic Guarded Transition Systems, for which Alpacas provides a feature-rich declarative modeling language and algorithms for symbolic analysis and Monte-Carlo simulation, allowing the computation of safety indicators such as minimal cutsets and reliability. Built as a domain-specific language deeply embedded in Scala 3, Alpacas offers generic modeling capabilities and type safety unparalleled in other existing safety assessment frameworks. This improved expressive power makes it possible to address complex system modeling tasks, such as formalizing the architectural design space of a critical function and exploring it to identify the most reliable variant. The features and algorithms of Alpacas are illustrated on a case study of a thrust allocation and power dispatch system for an electric vertical takeoff and landing aircraft.
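    To make the underlying semantics concrete, here is a toy Monte-Carlo reliability estimate for a two-component model with one guarded (cold-spare) transition. The component structure and failure rates are invented for illustration; the actual Alpacas DSL is embedded in Scala 3 and additionally supports symbolic analyses such as minimal cutset computation.

```python
# Toy stochastic guarded transition model: a primary channel with a cold-spare
# backup whose failure exposure starts only once the guard "primary failed" holds.
# Rates and mission time are assumptions for illustration.
import random

FAIL_PRIMARY = 1e-3   # assumed per-hour failure rate of the primary channel
FAIL_BACKUP = 5e-4    # assumed per-hour failure rate of the backup channel

def run(mission_hours=1000.0):
    t_primary = random.expovariate(FAIL_PRIMARY)   # time at which the primary fails
    if t_primary >= mission_hours:
        return True                                # primary survived the whole mission
    # Guarded transition: the backup is activated only after the primary fails.
    t_backup = t_primary + random.expovariate(FAIL_BACKUP)
    return t_backup >= mission_hours               # system survives iff the backup lasts

def reliability(runs=200_000):
    return sum(run() for _ in range(runs)) / runs

print(f"Estimated mission reliability: {reliability():.4f}")
```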

    Power Line Monitoring through Data Integrity Analysis with Q-Learning Based Data Analysis Network

    To handle the big data obtained from electrical, electronic, electro-mechanical, and other equipment linked to the power grid effectively and efficiently, it is important to monitor this equipment continually to gather information on power line integrity. We propose that analysis of data transmissions and data collected from tools such as digital power meters can be used to undertake predictive maintenance on power lines without specialized hardware such as power line modems or synthetic data streams. Neural network models such as deep learning can be used for power line integrity analysis effectively, safely, and reliably. We adopt a Q-learning based data analysis network for analyzing and monitoring power line integrity. The results of experiments performed over a 32 km long power line under different scenarios are presented. The proposed framework may be useful for monitoring traditional power lines as well as alternative energy source parks and large users such as industries. We found that the quantity of data transferred changes depending on the problem and the size of the planned data packet. When all phases were absent from all meters, we observed a significant decrease in the amount of data collected from the power line of interest, implying a power outage during monitoring. When even one phase is reconnected, we obtain only a portion of the information, and a way to interpret this was necessary. Our Q-network was able to identify and classify 190 simulated complete power outages and 700 single-phase outages. The mean square error (MSE) did not exceed 0.10% of the total number of instances, the MSE of the smart meters for a complete disturbance was only 0.20%, and the average rate of conceivable error and disturbance cases over the whole operation was 0.12%.
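    The abstract classifies line state (normal, single-phase outage, full outage) from how much meter data arrives. The sketch below shows one hedged way to frame that as tabular Q-learning over a coarse data-volume state; the state encoding, labels, reward, and update are assumptions, not the paper's actual Q-network.

```python
# Hedged sketch: tabular Q-learning used as a classifier over discretised
# data-volume observations from smart meters. Encoding and rewards are assumed.
import random
from collections import defaultdict

ACTIONS = ["normal", "single_phase_outage", "full_outage"]

def discretise(kilobytes_received):
    # Coarse state: how much telemetry arrived in the last monitoring interval.
    if kilobytes_received == 0:
        return "none"
    return "partial" if kilobytes_received < 50 else "full"

Q = defaultdict(float)        # Q[(state, action)] -> value
alpha, epsilon = 0.1, 0.1     # learning rate and exploration rate

def choose(state):
    if random.random() < epsilon:
        return random.choice(ACTIONS)               # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit best-known label

def train(samples, epochs=50):
    """samples: list of (kilobytes_received, true_label) pairs."""
    for _ in range(epochs):
        for kb, label in samples:
            s = discretise(kb)
            a = choose(s)
            reward = 1.0 if a == label else -1.0        # supervised reward signal
            Q[(s, a)] += alpha * (reward - Q[(s, a)])   # one-step (bandit-style) update
```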

    CyberSpec: Intelligent Behavioral Fingerprinting to Detect Attacks on Crowdsensing Spectrum Sensors

    Integrated sensing and communication (ISAC) is a novel paradigm that uses crowdsensing spectrum sensors to help manage spectrum scarcity. However, well-known vulnerabilities of resource-constrained spectrum sensors and the possibility of manipulation by users with physical access complicate their protection against spectrum sensing data falsification (SSDF) attacks. Most recent literature suggests using behavioral fingerprinting and machine/deep learning (ML/DL) to address similar cybersecurity issues. Nevertheless, the applicability of these techniques to resource-constrained devices, the impact of attacks affecting spectrum data integrity, and the performance and scalability of models suitable for heterogeneous sensor types are still open challenges. To address these limitations, this work presents seven SSDF attacks affecting spectrum sensors and introduces CyberSpec, an ML/DL-oriented framework that uses device behavioral fingerprinting to detect anomalies produced by SSDF attacks affecting resource-constrained spectrum sensors. CyberSpec has been implemented and validated in ElectroSense, a real crowdsensing RF monitoring platform, where several configurations of the proposed SSDF attacks have been executed on different sensors. A pool of experiments with different unsupervised ML/DL-based models has demonstrated the suitability of CyberSpec for detecting these attacks within an acceptable time frame.
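    As a hedged illustration of the unsupervised fingerprinting idea, the sketch below trains an anomaly detector on vectors of per-sensor behavioral counters collected during normal operation and flags deviations at monitoring time. The feature set, the Isolation Forest model, and the contamination value are assumptions; CyberSpec evaluates several ML/DL model families rather than this specific one.

```python
# Minimal sketch of behavioural-fingerprint anomaly detection for a sensor.
# Features and model choice are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row: one monitoring window's fingerprint (e.g. CPU %, memory, syscall
# counts, network bytes) -- placeholder data standing in for real telemetry.
normal_windows = np.random.rand(500, 4)

detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_windows)

def is_anomalous(window):
    # IsolationForest returns -1 for outliers and 1 for inliers.
    return detector.predict(window.reshape(1, -1))[0] == -1

print(is_anomalous(np.array([5.0, 5.0, 5.0, 5.0])))   # far from training data, likely flagged
```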

    On-device Security and Privacy Mechanisms for Resource-limited Devices: A Bottom-up Approach

    This doctoral dissertation introduces novel mechanisms to provide on-device security and privacy for resource-limited smart devices and their applications. These mechanisms cover five fundamental contributions to the emerging Cyber-Physical Systems (CPS), Internet of Things (IoT), and Industrial IoT (IIoT) fields. First, we present a host-based fingerprinting solution for device identification that is complementary to other security services such as device authentication and access control. Then, we design a kernel- and user-level detection framework that aims to discover compromised resource-limited devices based on behavioral analysis. Further, we apply dynamic analysis of smart devices' applications to uncover security and privacy risks in real time. Then, we describe a solution to enable digital forensics analysis on data extracted from the interconnected resource-limited devices that form a smart environment. Finally, we offer researchers from industry and academia a collection of benchmark solutions for evaluating the discussed security mechanisms in different smart domains. For each contribution, this dissertation comprises specific novel tools and techniques that can be applied either independently or in combination to enable broader security services for the CPS, IoT, and IIoT domains.

    Large-scale Wireless Local-area Network Measurement and Privacy Analysis

    The edge of the Internet is increasingly becoming wireless, so understanding the wireless edge is important for understanding the performance and security aspects of the Internet experience. This need is especially pressing for enterprise-wide wireless local-area networks (WLANs), as organizations increasingly depend on WLANs for mission-critical tasks. Studying a live production WLAN, especially a large-scale network, is a difficult undertaking. Two fundamental difficulties are (1) building a scalable network measurement infrastructure to collect traces from a large-scale production WLAN, and (2) preserving user privacy while sharing the collected traces with the network research community. In this dissertation, we present our experience in designing and implementing one of the largest distributed WLAN measurement systems in the United States, the Dartmouth Internet Security Testbed (DIST), with a particular focus on our solutions to the challenges of efficiency, scalability, and security. We also present an extensive evaluation of the DIST system. To understand the severity of some potential trace-sharing risks for an enterprise-wide large-scale wireless network, we conduct a privacy analysis on one kind of wireless network trace, a user-association log, collected from a large-scale WLAN. We introduce a machine-learning based approach that can extract and quantify sensitive information from a user-association log even though it has been sanitized. Finally, we present a case study that evaluates the tradeoff between utility and privacy in WLAN trace sanitization.
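    The following is a hedged illustration of why sanitized association logs can still leak identity: users' access-point visit profiles are distinctive, so pseudonyms from two sanitized epochs can often be re-linked by profile similarity. The log format, the cosine-similarity matching rule, and the function names are assumptions for illustration, not the dissertation's exact method.

```python
# Hedged sketch: re-linking pseudonyms across two sanitised user-association
# logs by comparing access-point visit profiles. Log format is assumed.
from collections import Counter

def profile(log):
    """log: iterable of (pseudonym, access_point) association events."""
    profiles = {}
    for user, ap in log:
        profiles.setdefault(user, Counter())[ap] += 1
    return profiles

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return dot / norm if norm else 0.0

def link(epoch1, epoch2):
    """Match each pseudonym in epoch 1 to its most similar pseudonym in epoch 2."""
    p1, p2 = profile(epoch1), profile(epoch2)
    return {u: max(p2, key=lambda v: cosine(p1[u], p2[v])) for u in p1}
```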