46 research outputs found
A Blockchain-Based Retribution Mechanism for Collaborative Intrusion Detection
Collaborative intrusion detection approach uses the shared detection signature between the collaborative participants to facilitate coordinated defense. In the context of collaborative intrusion detection system (CIDS), however, there is no research focusing on the efficiency of the shared detection signature. The inefficient detection signature costs not only the IDS resource but also the process of the peer-to-peer (P2P) network. In this paper, we therefore propose a blockchain-based retribution mechanism, which aims to incentivize the participants to contribute to verifying the efficiency of the detection signature in terms of certain distributed consensus. We implement a prototype using Ethereum blockchain, which instantiates a token-based retribution mechanism and a smart contract-enabled voting-based distributed consensus. We conduct a number of experiments built on the prototype, and the experimental results demonstrate the effectiveness of the proposed approach
Twitter Bots’ Detection with Benford’s Law and Machine Learning
Online Social Networks (OSNs) have grown exponentially in terms of active users and have now become an influential factor in the formation of public opinions. For this reason, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. Identifying bots and botnets on Twitter can require complex statistical methods to score a profile based on multiple features. Benford’s Law, or the Law of Anomalous Numbers, states that, in any naturally occurring sequence of numbers, the First Significant Leading Digit (FSLD) frequency follows a particular pattern such that they are unevenly distributed and reducing. This principle can be applied to the first-degree egocentric network of a Twitter profile to assess its conformity to such law and, thus, classify it as a bot profile or normal profile. This paper focuses on leveraging Benford’s Law in combination with various Machine Learning (ML) classifiers to identify bot profiles on Twitter. In addition, a comparison with other statistical methods is produced to confirm our classification results
Word Embeddings for Fake Malware Generation
Signature and anomaly-based techniques are the fundamental methods to detect malware. However, in recent years this type of threat has advanced to become more complex and sophisticated, making these techniques less effective. For this reason, researchers have resorted to state-of-the-art machine learning techniques to combat the threat of information security. Nevertheless, despite the integration of the machine learning models, there is still a shortage of data in training that prevents these models from performing at their peak. In the past, generative models have been found to be highly effective at generating image-like data that are similar to the actual data distribution. In this paper, we leverage the knowledge of generative modeling on opcode sequences and aim to generate malware samples by taking advantage of the contextualized embeddings from BERT. We obtained promising results when differentiating between real and generated samples. We observe that generated malware has such similar characteristics to actual malware that the classifiers are having difficulty in distinguishing between the two, in which the classifiers falsely identify the generated malware as actual malware almost of the time
A Blockchain-Based Tamper-Resistant Logging Framework
Since its introduction in Bitcoin, the blockchain has proven to be a versatile data structure. In its role as an immutable ledger, it has grown beyond its initial use in financial transactions to be used in recording a wide variety of other useful information. In this paper, we explore the application of the blockchain outside of its traditional decentralized, financial domain. We show how, even with only a single “mining” node, a proof-of-work blockchain can be the cornerstone of a tamper resistant logging framework. By attaching a proof-of-work to blocks of logging messages, we make it increasingly difficult for an attacker to modify those logs even after totally compromising the system. Furthermore, we discuss various strategies an attacker might take to modify the logs without detection and show how effective those evasion techniques are against statistical analysis
Robustness of Image-Based Malware Analysis
In previous work, “gist descriptor” features extracted from images have been used in malware classification problems and have shown promising results. In this research, we determine whether gist descriptors are robust with respect to malware obfuscation techniques, as compared to Convolutional Neural Networks (CNN) trained directly on malware images. Using the Python Image Library (PIL), we create images from malware executables and from malware that we obfuscate. We conduct experiments to compare classifying these images with a CNN as opposed to extracting the gist descriptor features from these images to use in classification. For the gist descriptors, we consider a variety of classification algorithms including k-nearest neighbors, random forest, support vector machine, and multi-layer perceptron. We find that gist descriptors are more robust than CNNs, with respect to the obfuscation techniques that we consider
Impact of Location Spoofing Attacks on Performance Prediction in Mobile Networks
Performance prediction in wireless mobile networks is essential for diverse purposes in network management and operation. Particularly, the position of mobile devices is crucial to estimating the performance in the mobile communication setting. With its importance, this paper investigates mobile communication performance based on the coordinate information of mobile devices. We analyze a recent 5G data collection and examine the feasibility of location-based performance prediction. As location information is key to performance prediction, the basic assumption of making a relevant prediction is the correctness of the coordinate information of devices given. With its criticality, this paper also investigates the impact of position falsification on the ML-based performance predictor, which reveals the significant degradation of the prediction performance under such attacks, suggesting the need for effective defense mechanisms against location spoofing threats
Recommended from our members
Robust behavioral malware detection
Computer security attacks evolve to evade deployed defenses. Recent attacks have ranged from exploiting generic software vulnerabilities in memory-unsafe languages such as buffer overflows and format string vulnerabilities to exploiting logic errors in web applications, through means such as SQL injection and cross-site scripting. Furthermore, recent attacks have focused on escalating privileges
and stealing sensitive information by exploiting new hardware or operating system (OS) interfaces. Computer security attacks are also now relying on social engineering techniques to run malicious programs on victims' machines; instances of such abuse include phishing and watering hole attacks, both of which trick people into running malicious code or divulging confidential information. Thus, traditional computer security methods, such as OS confinement and program analysis, will not prevent new attacks that do not violate OS confinement or present illegal program behaviors.
Another challenge is that traditional security approaches have large trusted code bases (TCBs), which include hardware, OSs, and other software components that implement authentication and authorization logic across a distributed system. This is a vulnerable area because these components are complex and often contain vulnerabilities that undermine the overall system's integrity or confidentiality.
Evasive attacks on vulnerable systems -- especially in instances where trusted components turn malicious -- inspire the creation of defenses that can augment formally specified mechanisms against known threats. Specifically, this thesis advances the state of the art in behavioral malware detection -- detecting previously unknown malware in the very early stages of infection within an enterprise network.
Here we assess three fundamental insights of modern-day attacks and then describe a cross-layer defense against such attacks. First, we make a low-level machine state visible to behavioral analysis, significantly minimizing the TCB and its associated vulnerabilities. Specifically, our behavioral detector utilizes an executable code's dynamic properties, with architectural and micro-architectural states as input. Second, we evaluate behavioral detectors against adaptive adversaries. For this purpose, we introduce a new metric to determine a detector's robustness against malware modifications, which serves as a step toward explainability of machine learning-based malware detectors. Finally, we exploit the fact that attacks spread through only a limited number of vectors and propose new techniques to analyze the resulting dynamic correlations created among machines. These insights show that behavioral detectors can efficiently protect both individual devices and end hosts within enterprise networks. We present three types of such behavioral detectors.
Sherlock protects resource-constrained devices, such as mobile phones and Internet-of-things (IoT) devices, without modifying the software/hardware stack. Sherlock's supervised and unsupervised versions outperform prior work by 24.7% and 12.5% (area under the curve (AUC) metric), respectively, and detects stealthy malware that often evades static analysis tools.
The second behavioral detector, Shape-GD, protects devices within an enterprise network. It monitors devices on the network, aggregates data from weak local detectors, overlays that with network-level information, and then makes early, robust predictions regarding malicious activity. Shape-GD achieves its goals by exploiting latent attack semantics. Specifically, it analyzes communication patterns across multiple devices, partitioning them into neighborhoods. Devices within the same neighborhood are likely to be exposed to the same attack vector. Furthermore, we hypothesize that the conditional distribution of false positives is different from that of true positives; i.e., given a neighborhood of nodes, we can compute the aggregate distributional shape of alert feature vectors from the neighborhood itself and provide robust labels.
We evaluate Shape-GD by emulating a large community of Windows systems using the system call traces from a few thousand malicious and benign applications; we simulate both a phishing attack in a corporate email network as well as a watering hole attack through a popular website. In both scenarios, Shape-GD identifies malware early on (~100 infected nodes in a ~100K-node system for watering hole attacks, and ~10 of ~1,000 for phishing attacks) and robustly (with ~100% global true-positive and ~1% global false-positive rates).
The third behavioral detector, Centurion, detects malware across machines monitored by an anti-virus company. It is able to analyze behavior from 5 million Symantec client machines in real time and discovers malware by correlating file downloads across multiple machines. Compared with a recent local detector that analyzes metadata from file downloads, Centurion reduced the number of false positives from ~1M to ~110K and increased the true-positive rate by a factor of ~2.5. In addition, on average, Centurion detects malware 345 days earlier than commercial anti-virus products.Electrical and Computer Engineerin
A Novel Malware Target Recognition Architecture for Enhanced Cyberspace Situation Awareness
The rapid transition of critical business processes to computer networks potentially exposes organizations to digital theft or corruption by advanced competitors. One tool used for these tasks is malware, because it circumvents legitimate authentication mechanisms. Malware is an epidemic problem for organizations of all types. This research proposes and evaluates a novel Malware Target Recognition (MaTR) architecture for malware detection and identification of propagation methods and payloads to enhance situation awareness in tactical scenarios using non-instruction-based, static heuristic features. MaTR achieves a 99.92% detection accuracy on known malware with false positive and false negative rates of 8.73e-4 and 8.03e-4 respectively. MaTR outperforms leading static heuristic methods with a statistically significant 1% improvement in detection accuracy and 85% and 94% reductions in false positive and false negative rates respectively. Against a set of publicly unknown malware, MaTR detection accuracy is 98.56%, a 65% performance improvement over the combined effectiveness of three commercial antivirus products