Detection of Lying Electrical Vehicles in Charging Coordination Application Using Deep Learning
The simultaneous charging of many electric vehicles (EVs) stresses the
distribution system and may cause grid instability in severe cases. The best
way to avoid this problem is by charging coordination. The idea is that the EVs
should report data (such as state-of-charge (SoC) of the battery) to run a
mechanism to prioritize the charging requests and select the EVs that should
charge during this time slot and defer other requests to future time slots.
However, EVs may lie and send false data to receive high charging priority
illegally. In this paper, we first study this attack to evaluate the gains of
the lying EVs and how their behavior impacts the honest EVs and the performance
of the charging coordination mechanism. Our evaluations indicate that lying EVs
have a greater chance of being charged than honest EVs and that they degrade
the performance of the charging coordination mechanism. We then devise an
anomaly-based detector that uses deep neural networks (DNNs) to identify the
lying EVs. To do that, we first create an honest dataset for the charging
coordination application using real driving traces and information revealed by
EV manufacturers, and then we also propose a number of attacks to create
malicious data. We train and evaluate two models on this dataset, a multi-layer
perceptron (MLP) and a gated recurrent unit (GRU), and the GRU detector gives
better results. Our evaluations indicate that our detector can detect lying EVs
with high accuracy and a low false positive rate.
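The attack described above can be illustrated with a minimal sketch. It assumes a priority rule that the abstract does not specify (lowest reported SoC charges first), and the names `ChargingRequest` and `select_for_slot` and all EV identifiers are hypothetical. It is not the coordination mechanism or the detector from the paper.

```python
from dataclasses import dataclass


@dataclass
class ChargingRequest:
    ev_id: str
    reported_soc: float  # state of charge reported by the EV, in [0, 1]


def select_for_slot(requests, capacity):
    """Pick up to `capacity` EVs for this time slot, lowest reported SoC first.

    Assumed rule for illustration only: a lower reported SoC means a higher
    charging priority. Requests that are not selected are deferred.
    """
    ranked = sorted(requests, key=lambda r: r.reported_soc)
    selected = [r.ev_id for r in ranked[:capacity]]
    deferred = [r.ev_id for r in ranked[capacity:]]
    return selected, deferred


# True SoC values for four hypothetical EVs.
true_soc = {"ev_a": 0.20, "ev_b": 0.35, "ev_c": 0.60, "ev_d": 0.80}

# Case 1: every EV reports its true SoC.
honest = [ChargingRequest(ev, soc) for ev, soc in true_soc.items()]
honest_selected, honest_deferred = select_for_slot(honest, capacity=2)

# Case 2: ev_d lies and reports a nearly empty battery.
reported = dict(true_soc, ev_d=0.05)
with_liar = [ChargingRequest(ev, soc) for ev, soc in reported.items()]
liar_selected, liar_deferred = select_for_slot(with_liar, capacity=2)

print("honest slot:", honest_selected)
print("slot with liar:", liar_selected, "deferred:", liar_deferred)
```

In the second case the lying EV takes a slot that would otherwise go to an honest EV with a lower true SoC, which is the kind of unfairness the detector is meant to catch.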
PhishDef: URL Names Say It All
Phishing is an increasingly sophisticated method to steal personal user
information using sites that pretend to be legitimate. In this paper, we take
the following steps to identify phishing URLs. First, we carefully select
lexical features of the URLs that are resistant to obfuscation techniques used
by attackers. Second, we evaluate the classification accuracy when using only
lexical features, both automatically selected and hand-selected, vs. when using
additional features. We show that lexical features are sufficient for all
practical purposes. Third, we thoroughly compare several classification
algorithms, and we propose to use an online method (AROW) that is able to
overcome noisy training data. Based on the insights gained from our analysis,
we propose PhishDef, a phishing detection system that uses only URL names and
combines the above three elements. PhishDef is a highly accurate method (when
compared to state-of-the-art approaches over real datasets), lightweight (thus
appropriate for online and client-side deployment), proactive (based on online
classification rather than blacklists), and resilient to training data
inaccuracies (thus enabling the use of large noisy training data).
Comment: 9 pages, submitted to IEEE INFOCOM 201
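As a rough illustration of what "lexical features of the URL" can mean, the sketch below derives a few features from the URL string alone. The feature set and the function name `lexical_features` are assumptions made for this example; the abstract does not list the features PhishDef uses, and the example URL is made up.

```python
import re
from urllib.parse import urlsplit


def lexical_features(url):
    """Compute simple features from the URL string only (no page fetch)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Split the whole URL on non-alphanumeric characters to get tokens.
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", url) if t]
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_path_segments": len([s for s in parts.path.split("/") if s]),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_sign": "@" in url,
        "longest_token": max((len(t) for t in tokens), default=0),
    }


features = lexical_features(
    "http://login.example.com.verify-account.test/secure/update.php?id=123"
)
for name, value in features.items():
    print(name, value)
```

A feature dictionary like this could then be fed to any classifier, including an online one such as the AROW method the abstract mentions.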
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have evolved over the last
decade in response to the development of malware techniques that evade
network-based and host-based security protections. The fast growth in the
variety and number of malware species has made it very difficult for forensic
investigators to provide a timely response. Therefore, Machine Learning (ML)
aided malware analysis has become a necessity for automating different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classifying static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop a taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be used to extract and
analyze a variety of static characteristics of PE binaries, and we evaluate the
accuracy and practical generalization of these techniques. Finally, we present
the results of an experimental study of all the methods on common data to
demonstrate their accuracy and complexity. This paper may serve as a stepping
stone for future researchers in the cross-disciplinary field of machine learning
aided malware forensics.
Comment: 37 Page
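To make "static characteristics" concrete, here is a minimal sketch of two features that can be computed from a file's raw bytes without executing it: Shannon byte entropy and byte n-gram counts. These are assumed examples of static features, not the specific feature set surveyed in the paper, and the sketch does not parse the PE format.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def byte_ngrams(data, n=2):
    """Count overlapping byte n-grams in the data."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Toy byte strings standing in for file contents (hypothetical data).
padding = b"\x00" * 64
mixed = bytes(range(256))

print("entropy of padding:", byte_entropy(padding))
print("entropy of mixed bytes:", byte_entropy(mixed))
print("most common 2-grams:", byte_ngrams(b"MZMZ", 2).most_common(2))
```

High entropy is often taken as a hint that a section is packed or encrypted, which is why it shows up as a feature in static analysis. The resulting numbers would be one column each in a feature vector passed to a classifier.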
Agnostically Learning Halfspaces
We consider the problem of learning a halfspace in the agnostic framework of Kearns et al., where a learner is given access to a distribution on labelled examples but the labelling may be arbitrary. The learner's goal is to output a hypothesis which performs almost as well as the optimal halfspace with respect to future draws from this distribution. Although the agnostic learning framework does not explicitly deal with noise, it is closely related to learning in worst-case noise models such as malicious noise. We give the first polynomial-time algorithm for agnostically learning halfspaces with respect to several distributions, such as the uniform distribution over the n-dimensional Boolean cube {0,1}^n or unit sphere in n-dimensional Euclidean space, as well as any log-concave distribution in n-dimensional Euclidean space. Given any constant additive factor eps>0, our algorithm runs in poly(n) time and constructs a hypothesis whose error rate is within an additive eps of the optimal halfspace. We also show this algorithm agnostically learns Boolean disjunctions in time roughly 2^{\sqrt{n}} with respect to any distribution; this is the first subexponential-time algorithm for this problem. Finally, we obtain a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere which can tolerate the highest level of malicious noise of any algorithm to date. Our main tool is a polynomial regression algorithm which finds a polynomial that best fits a set of points with respect to a particular metric. We show that, in fact, this algorithm is an arbitrary-distribution generalization of the well-known "low-degree" Fourier algorithm of Linial, Mansour, and Nisan and has excellent noise tolerance properties when minimizing with respect to the L_1 norm.
We apply this algorithm in conjunction with a non-standard Fourier transform (which does not use the traditional parity basis) for learning halfspaces over the uniform distribution on the unit sphere; we believe this technique is of independent interest.
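The polynomial regression step with respect to the L_1 norm can be written out as a short sketch. The symbols m (sample size), d (degree bound), and t (threshold) are notation assumed here for illustration, as is the thresholding step that turns the fitted polynomial into a classifier; the abstract itself only states that a best-fitting polynomial is found.

```latex
% Sample: (x^1, y^1), ..., (x^m, y^m) with labels y^i in {-1, +1}.
% Fit a polynomial of degree at most d by minimizing empirical L_1 error,
% then threshold it to obtain a Boolean hypothesis h.
\[
  p^{*} \in \arg\min_{p \,:\, \deg(p) \le d} \;
  \frac{1}{m} \sum_{i=1}^{m} \bigl| p(x^{i}) - y^{i} \bigr|,
  \qquad
  h(x) = \operatorname{sign}\bigl( p^{*}(x) - t \bigr)
\]
```

Replacing the absolute value with a squared difference gives the least-squares variant, which is the setting closer to the "low-degree" Fourier algorithm mentioned above.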
Detecting cyberattacks in industrial control systems using online learning algorithms
Industrial control systems are critical to the operation of industrial
facilities, especially for critical infrastructures, such as refineries, power
grids, and transportation systems. As with other information systems, a
significant threat to industrial control systems is attack from
cyberspace---offensive maneuvers launched by "anonymous" in the digital
world that target computer-based assets with the goal of compromising a
system's functions or probing for information. Owing to the importance of
industrial control systems, and the possibly devastating consequences of an
attack, significant efforts have been made to secure industrial
control systems from cyberattacks. Among them are intrusion detection systems
that serve as the first line of defense by monitoring and reporting potentially
malicious activities. Classical machine-learning-based intrusion detection
methods usually generate prediction models by learning from modest-sized training
samples all at once. Such an approach is not always applicable to industrial
control systems, which must process continuous control commands nonstop
with limited computational resources. To satisfy such
requirements, we propose using online learning to learn prediction models from
the controlling data stream. We introduce several state-of-the-art online
learning algorithms by category and illustrate their efficacy on two
commonly used testbeds: a power system and a gas pipeline. Further, we explore a
new cost-sensitive online learning algorithm to solve the class-imbalance
problem that is pervasive in industrial intrusion detection systems. Our
experimental results indicate that the proposed algorithm can achieve an
overall improvement in the detection rate of cyberattacks in industrial control
systems.
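The idea of cost-sensitive online learning can be sketched with a perceptron that processes one sample at a time and takes larger update steps for mistakes on the rare attack class. This is a generic illustration under assumed names and parameters (`train_online`, `cost_pos`, `cost_neg`, `lr`), not the algorithm proposed in the paper, and the toy stream is made up.

```python
def train_online(stream, n_features, cost_pos=5.0, cost_neg=1.0, lr=0.1):
    """Cost-sensitive perceptron trained in one pass over a data stream.

    Each item is (x, y) with y = +1 for attack and y = -1 for normal.
    Mistakes on attacks are scaled by cost_pos, mistakes on normal
    traffic by cost_neg, so the rare class moves the model more.
    """
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score <= 0:  # misclassified (or on the boundary): update
            step = lr * (cost_pos if y == 1 else cost_neg)
            w = [wi + step * y * xi for wi, xi in zip(w, x)]
            b += step * y
    return w, b


def predict(w, b, x):
    """Return +1 (attack) or -1 (normal) for one feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy imbalanced stream: many normal samples, few attacks (hypothetical).
stream = [([0.1, 0.2], -1)] * 8 + [([0.9, 0.8], 1)] + [([0.2, 0.1], -1)] * 8
w, b = train_online(stream, n_features=2)
print("weights:", w, "bias:", b)
print("prediction for [0.9, 0.8]:", predict(w, b, [0.9, 0.8]))
```

Because the model is updated per sample and keeps only a weight vector in memory, the cost per control command stays constant, which matches the resource constraints described above.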