    Early Warning Analysis for Social Diffusion Events

    There is considerable interest in developing predictive capabilities for social diffusion processes, for instance to permit early identification of emerging contentious situations, rapid detection of disease outbreaks, or accurate forecasting of the ultimate reach of potentially viral ideas or behaviors. This paper proposes a new approach to this predictive analytics problem, in which analysis of meso-scale network dynamics is leveraged to generate useful predictions for complex social phenomena. We begin by deriving a stochastic hybrid dynamical systems (S-HDS) model for diffusion processes taking place over social networks with realistic topologies; this modeling approach is inspired by recent work in biology demonstrating that S-HDS offer a useful mathematical formalism with which to represent complex, multi-scale biological network dynamics. We then perform formal stochastic reachability analysis with this S-HDS model and conclude that the outcomes of social diffusion processes may depend crucially upon the way the early dynamics of the process interact with the underlying network's community structure and core-periphery structure. This theoretical finding provides the foundation for developing a machine learning algorithm that enables accurate early warning analysis for social diffusion events. The utility of the warning algorithm, and the power of network-based predictive metrics, are demonstrated through an empirical investigation of the propagation of political memes over social media networks. Additionally, we illustrate the potential of the approach for security informatics applications through case studies involving early warning analysis of large-scale protest events and politically motivated cyber attacks.

    RedAI: A Machine Learning Approach to Cyber Threat Intelligence

    The world is continually demanding more effective and intelligent solutions and strategies to combat adversary groups across the cyber defense landscape. Cyber Threat Intelligence (CTI) is a field within cyber security that lets organizations use threat intelligence to proactively harden their defense posture. However, the volume of CTI is large, and it is often a daunting task for organizations to effectively consume, utilize, and apply it to their defense strategies. In this thesis we develop a machine learning solution, named RedAI, to investigate whether open-source intelligence (OSINT) can be effectively integrated into a working approach that accurately classifies cyber threat intelligence. By focusing on open-source and easily available resources, RedAI demonstrates how to use the Structured Threat Information Expression (STIX) (OASIS, 2017) language to objectify, collect, and integrate intelligence and align it to the MITRE ATT&CK framework (MITRE ATT&CK Enterprise, 2021). To test the accuracy of this solution, machine learning models were built using training data and then further tested with test data to determine the model's effectiveness at classifying unknown threat intelligence. The results showed that RedAI could, with high accuracy, use OSINT cyber threat intelligence data to build a machine learning model and then classify unknown test threat intelligence. Based on these findings, it is apparent that organizations have the ability to leverage OSINT and advanced solutions to augment their cyber defense posture.

    NLP Methods in Host-based Intrusion Detection Systems: A Systematic Review and Future Directions

    A Host-based Intrusion Detection System (HIDS) is an effective last line of defense against cyber security attacks after perimeter defenses (e.g., Network-based Intrusion Detection System and Firewall) have failed or been bypassed. HIDS is widely adopted in industry and is ranked among the top two most used security tools by the Security Operation Centers (SOC) of organizations. Although effective and efficient HIDS is highly desirable for industrial organizations, the evolution of increasingly complex attack patterns causes several challenges that degrade HIDS performance (e.g., a high false alert rate creating alert fatigue for SOC staff). Since Natural Language Processing (NLP) methods are better suited for identifying complex attack patterns, an increasing number of HIDS are leveraging advances in NLP that have shown effective and efficient performance in precisely detecting low-footprint, zero-day attacks and predicting the next steps of attackers. This active research trend of using NLP in HIDS demands a synthesized and comprehensive body of knowledge of NLP-based HIDS. Thus, we conducted a systematic review of the literature on the end-to-end pipeline of the use of NLP in HIDS development. For the end-to-end NLP-based HIDS development pipeline, we identify, taxonomically categorize, and systematically compare the state of the art of NLP methods used in HIDS, the attacks detected by these NLP methods, and the datasets and evaluation metrics used to evaluate NLP-based HIDS. We highlight the relevant prevalent practices, considerations, advantages, and limitations to support HIDS developers. We also outline future research directions for NLP-based HIDS development.

    Ransomware note detection techniques using supervised machine learning

    This project is about the detection of ransomware by detecting ransomware notes using supervised machine learning. The goal of the project is to study old ransom note data to detect notes used in new ransomware campaigns. This is done by extracting the word combinations out of fifty-nine ransom notes and fifty-nine non-ransom notes to define a binary (is or is-not) system of text classification. The hypothesis posed by this project is: a machine learning model trained using ransom notes from past campaigns will be able to detect notes made in future campaigns. Two machine learning (ML) algorithms are studied: Decision Trees and Support Vector Machines (SVM). These ML algorithms were chosen for their ease of implementation and low data requirements. The studied dataset has fewer than sixty raw text documents; therefore, models requiring a minimal amount of training data, such as SVM, are prioritized. After training and testing the ML models, the performance of the models is verified using a separate and newer dataset. Most of the project is implemented using Python for application logic and data manipulation, while Scikit Learn (sklearn) was used for the training and analysis of the machine learning models. Data is stored using regular files. Incremental comparisons are made using varying levels of data cleaning and feature selection to study which methodologies produce ideal ML models capable of detecting ransomware notes with a low false positive rate. The results of this project are favorable to the goal: it is demonstrated that a single ML model can recognize a ransom note by checking as few as twenty features. Shorter notes tend to have fewer features to check and therefore require an ML model biased towards false positives for reliable detection. It is proposed to combine the output of multiple models in a stacked or "ensemble" configuration [1] to create a system for indicating how confident a positive detection is.
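
The pipeline described in this abstract (word-combination features, a binary decision, and an ensemble whose agreement indicates confidence) can be illustrated with a minimal standard-library sketch. It is not the project's implementation: the abstract names Decision Trees and SVMs trained with sklearn, whereas the voters below are simple threshold rules used as stand-ins. Treating "word combinations" as adjacent word pairs (bigrams) is an assumption, and the function names `word_bigrams`, `select_features`, `make_rule`, and `ensemble_confidence`, along with the sample notes, are hypothetical.

```python
import re
from collections import Counter


def word_bigrams(text):
    """Return the set of adjacent lowercase word pairs found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {f"{a} {b}" for a, b in zip(words, words[1:])}


def select_features(ransom_notes, benign_notes, k=20):
    """Keep up to k bigrams that occur in ransom notes and in no benign note,
    ranked by the number of ransom notes containing them (ties alphabetical)."""
    benign = set().union(*[word_bigrams(t) for t in benign_notes])
    counts = Counter(b for t in ransom_notes for b in word_bigrams(t))
    ranked = sorted((b for b in counts if b not in benign),
                    key=lambda b: (-counts[b], b))
    return ranked[:k]


def make_rule(features, min_hits):
    """Stand-in classifier (assumption): positive when at least min_hits
    of the selected features occur in the text."""
    feature_set = set(features)

    def predict(text):
        return len(feature_set & word_bigrams(text)) >= min_hits

    return predict


def ensemble_confidence(models, text):
    """Fraction of models that flag the text as a ransom note."""
    votes = [model(text) for model in models]
    return sum(votes) / len(votes)


# Invented sample data, for illustration only.
ransom = [
    "Your files have been encrypted. Send bitcoin to recover your files.",
    "All your files have been encrypted! Pay in bitcoin to get the decryption key.",
]
benign = [
    "Your files have been uploaded to the shared drive.",
    "Please send the meeting notes to the team.",
]

features = select_features(ransom, benign, k=20)
# min_hits=1 is the lenient voter (biased towards false positives).
models = [make_rule(features, n) for n in (1, 2, 3)]

print(ensemble_confidence(models, "Files have been encrypted, pay in bitcoin now."))  # 1.0
print(ensemble_confidence(models, "Lunch is at noon."))  # 0.0
```

A short note that matches only one feature is flagged by the lenient voter alone, so its confidence is low; this mirrors the abstract's remark that shorter notes need a model biased towards false positives.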