1,584 research outputs found

    Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

    Full text link
    Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDOS) attacks, data breaches, and account hijacking) in an unsupervised manner using just a limited fixed set of seed event triggers. A new query expansion strategy based on convolutional kernels and dependency parses helps model reporting structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods.Comment: 13 single column pages, 5 figures, submitted to KDD 201

    Malware Detection using Machine Learning and Deep Learning

    Full text link
    Research shows that over the last decade, malware has been growing exponentially, causing substantial financial losses to various organizations. Different anti-malware companies have been proposing solutions to defend attacks from these malware. The velocity, volume, and the complexity of malware are posing new challenges to the anti-malware community. Current state-of-the-art research shows that recently, researchers and anti-virus organizations started applying machine learning and deep learning methods for malware analysis and detection. We have used opcode frequency as a feature vector and applied unsupervised learning in addition to supervised learning for malware classification. The focus of this tutorial is to present our work on detecting malware with 1) various machine learning algorithms and 2) deep learning models. Our results show that the Random Forest outperforms Deep Neural Network with opcode frequency as a feature. Also in feature reduction, Deep Auto-Encoders are overkill for the dataset, and elementary function like Variance Threshold perform better than others. In addition to the proposed methodologies, we will also discuss the additional issues and the unique challenges in the domain, open research problems, limitations, and future directions.Comment: 11 Pages and 3 Figure

    Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

    Full text link
    In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subjected to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis via more performance expensive and more accurate DL methods, via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the percentage of software subjected to DL analysis was approximately 40% on average. Further, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time.Comment: 17 pages, 7 figure

    Hidden and Uncontrolled - On the Emergence of Network Steganographic Threats

    Full text link
    Network steganography is the art of hiding secret information within innocent network transmissions. Recent findings indicate that novel malware is increasingly using network steganography. Similarly, other malicious activities can profit from network steganography, such as data leakage or the exchange of pedophile data. This paper provides an introduction to network steganography and highlights its potential application for harmful purposes. We discuss the issues related to countering network steganography in practice and provide an outlook on further research directions and problems.Comment: 11 page

    Model Extraction Warning in MLaaS Paradigm

    Full text link
    Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud that are offered on a pay-per-query basis to application developers and end users. However recent work has shown that the hosted models are susceptible to extraction attacks. Adversaries may launch queries to steal the model and compromise future query payments or privacy of the training data. In this work, we present a cloud-based extraction monitor that can quantify the extraction status of models by observing the query and response streams of both individual and colluding adversarial users. We present a novel technique that uses information gain to measure the model learning rate by users with increasing number of queries. Additionally, we present an alternate technique that maintains intelligent query summaries to measure the learning rate relative to the coverage of the input feature space in the presence of collusion. Both these approaches have low computational overhead and can easily be offered as services to model owners to warn them of possible extraction attacks from adversaries. We present performance results for these approaches for decision tree models deployed on BigML MLaaS platform, using open source datasets and different adversarial attack strategies

    A Survey of Techniques for Improving Security of GPUs

    Full text link
    Graphics processing unit (GPU), although a powerful performance-booster, also has many security vulnerabilities. Due to these, the GPU can act as a safe-haven for stealthy malware and the weakest `link' in the security `chain'. In this paper, we present a survey of techniques for analyzing and improving GPU security. We classify the works on key attributes to highlight their similarities and differences. More than informing users and researchers about GPU security techniques, this survey aims to increase their awareness about GPU security vulnerabilities and potential countermeasures
    corecore