6,636 research outputs found

    A Security Monitoring Framework For Virtualization Based HEP Infrastructures

    Full text link
    High Energy Physics (HEP) distributed computing infrastructures require automatic tools to monitor, analyze and react to potential security incidents. These tools should collect and inspect data such as resource consumption, logs and sequence of system calls for detecting anomalies that indicate the presence of a malicious agent. They should also be able to perform automated reactions to attacks without administrator intervention. We describe a novel framework that accomplishes these requirements, with a proof of concept implementation for the ALICE experiment at CERN. We show how we achieve a fully virtualized environment that improves the security by isolating services and Jobs without a significant performance impact. We also describe a collected dataset for Machine Learning based Intrusion Prevention and Detection Systems on Grid computing. This dataset is composed of resource consumption measurements (such as CPU, RAM and network traffic), logfiles from operating system services, and system call data collected from production Jobs running in an ALICE Grid test site and a big set of malware. This malware was collected from security research sites. Based on this dataset, we will proceed to develop Machine Learning algorithms able to detect malicious Jobs.Comment: Proceedings of the 22nd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2016, 10-14 October 2016, San Francisco. Submitted to Journal of Physics: Conference Series (JPCS

    Identifying Native Applications with High Assurance

    Get PDF
    The work described in this paper investigates the problem of identifying and deterring stealthy malicious processes on a host. We point out the lack of strong application iden- tication in main stream operating systems. We solve the application identication problem by proposing a novel iden- tication model in which user-level applications are required to present identication proofs at run time to be authenti- cated by the kernel using an embedded secret key. The se- cret key of an application is registered with a trusted kernel using a key registrar and is used to uniquely authenticate and authorize the application. We present a protocol for secure authentication of applications. Additionally, we de- velop a system call monitoring architecture that uses our model to verify the identity of applications when making critical system calls. Our system call monitoring can be integrated with existing policy specication frameworks to enforce application-level access rights. We implement and evaluate a prototype of our monitoring architecture in Linux as device drivers with nearly no modication of the ker- nel. The results from our extensive performance evaluation shows that our prototype incurs low overhead, indicating the feasibility of our model

    Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

    Full text link
    In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subjected to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis via more performance expensive and more accurate DL methods, via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the percentage of software subjected to DL analysis was approximately 40% on average. Further, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time.Comment: 17 pages, 7 figure

    A Comparison of Clustering Techniques for Malware Analysis

    Get PDF
    In this research, we apply clustering techniques to the malware detection problem. Our goal is to classify malware as part of a fully automated detection strategy. We compute clusters using the well-known �-means and EM clustering algorithms, with scores obtained from Hidden Markov Models (HMM). The previous work in this area consists of using HMM and �-means clustering technique to achieve the same. The current effort aims to extend it to use EM clustering technique for detection and also compare this technique with the �-means clustering
    corecore