3,106 research outputs found
Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data
Recent years have seen the rise of more sophisticated attacks including
advanced persistent threats (APTs) which pose severe risks to organizations and
governments by targeting confidential proprietary information. Additionally,
new malware strains are appearing at a higher rate than ever before. Since many
of these strains are designed to evade existing security products, traditional
defenses deployed by most enterprises today, e.g., anti-virus, firewalls,
intrusion detection systems, often fail at detecting infections at an early
stage.
We address the problem of detecting early-stage infection in an enterprise
setting by proposing a new framework based on belief propagation inspired by
graph theory. Belief propagation can be used either with "seeds" of compromised
hosts or malicious domains (provided by the enterprise security operation
center -- SOC) or without any seeds. In the latter case we develop a detector
of C&C communication particularly tailored to enterprises which can detect a
stealthy compromise of only a single host communicating with the C&C server.
We demonstrate that our techniques perform well on detecting enterprise
infections. We achieve high accuracy with low false detection and false
negative rates on two months of anonymized DNS logs released by Los Alamos
National Lab (LANL), which include APT infection attacks simulated by LANL
domain experts. We also apply our algorithms to 38TB of real-world web proxy
logs collected at the border of a large enterprise. Through careful manual
investigation in collaboration with the enterprise SOC, we show that our
techniques identified hundreds of malicious domains overlooked by
state-of-the-art security products.
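The seeded mode described in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified label propagation over a bipartite host-domain graph, and the function name `propagate`, the scoring rule (share of a domain's visitors that are already flagged), and the `threshold` and `max_rounds` parameters are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate(edges, seed_hosts, threshold=0.5, max_rounds=10):
    """Spread 'compromised' labels over a host-domain graph.

    edges: iterable of (host, domain) pairs, e.g. taken from DNS or proxy logs.
    seed_hosts: hosts already known to be compromised (e.g. supplied by a SOC).
    Returns (flagged_hosts, flagged_domains).
    """
    hosts_of = defaultdict(set)    # domain -> hosts that contacted it
    domains_of = defaultdict(set)  # host -> domains it contacted
    for host, domain in edges:
        hosts_of[domain].add(host)
        domains_of[host].add(domain)

    bad_hosts = set(seed_hosts)
    bad_domains = set()
    for _ in range(max_rounds):
        # Candidates: domains contacted by a flagged host and not yet flagged.
        candidates = {d for h in bad_hosts for d in domains_of[h]} - bad_domains
        # Hypothetical score: fraction of a domain's visitors that are flagged.
        scores = {d: len(hosts_of[d] & bad_hosts) / len(hosts_of[d])
                  for d in candidates}
        if not scores:
            break
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(scores), key=scores.get)
        if scores[best] < threshold:
            break
        # Flag the best domain, then every host that contacted it.
        bad_domains.add(best)
        bad_hosts |= hosts_of[best]
    return bad_hosts, bad_domains
```

With this scoring rule, a domain visited mostly by unflagged hosts scores low and stops the propagation, which loosely mirrors the idea that popular domains are unlikely to be command-and-control servers. A real detector would need richer features than this single ratio.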
SQL Injection Detection Using Machine Learning Techniques and Multiple Data Sources
SQL injection continues to be one of the most damaging security exploits in terms of both personal information exposure and monetary loss. Injection attacks are the number one vulnerability in the most recent OWASP Top 10 report, and the number of these attacks continues to increase. Traditional defense strategies often rely on static, signature-based IDS (Intrusion Detection System) rules, which are mostly effective only against previously observed attacks and not against unknown, or zero-day, attacks. Much current research involves machine learning techniques, which are able to detect unknown attacks but, depending on the algorithm, can be costly in terms of performance. In addition, most current intrusion detection strategies collect traffic coming into the web application, either from a network device or from the web application host, while other strategies collect data from the database server logs. In this project, we collect traffic from two points: the web application host, and a Datiphy appliance node located between the webapp host and the associated MySQL database server. In our analysis of these two datasets, and of a third dataset correlated between the two, we demonstrate that the accuracy obtained on the correlated dataset with rule-based and decision tree algorithms is nearly the same as that obtained with a neural network algorithm, but with greatly improved performance.
DeepSQLi: Deep Semantic Learning for Testing SQL Injection
Security is unarguably the most serious concern for Web applications, for
which SQL injection (SQLi) is one of the most devastating attacks.
Automatically testing for SQLi vulnerabilities is of utmost importance, yet is
unfortunately far from trivial to implement. This is because of the existence of a
huge, or potentially infinite, number of variants and semantic possibilities of
SQL leading to SQLi attacks on various Web applications. In this paper, we
propose a deep natural language processing based tool, dubbed DeepSQLi, to
generate test cases for detecting SQLi vulnerabilities. By adopting a deep
learning based neural language model and sequence-of-words prediction, DeepSQLi
is equipped with the ability to learn the semantic knowledge embedded in SQLi
attacks, allowing it to translate user inputs (or a test case) into a new test
case, which is semantically related and potentially more sophisticated.
Experiments are conducted to compare DeepSQLi with SQLmap, a state-of-the-art
SQLi testing automation tool, on six real-world Web applications that are of
different scales, characteristics and domains. Empirical results demonstrate
the effectiveness and the remarkable superiority of DeepSQLi over SQLmap, in
that more SQLi vulnerabilities can be identified using fewer test
cases, while running much faster.
PhishDef: URL Names Say It All
Phishing is an increasingly sophisticated method to steal personal user
information using sites that pretend to be legitimate. In this paper, we take
the following steps to identify phishing URLs. First, we carefully select
lexical features of the URLs that are resistant to obfuscation techniques used
by attackers. Second, we evaluate the classification accuracy when using only
lexical features, both automatically selected and hand-selected, vs. when using
additional features. We show that lexical features are sufficient for all
practical purposes. Third, we thoroughly compare several classification
algorithms, and we propose to use an online method (AROW) that is able to
overcome noisy training data. Based on the insights gained from our analysis,
we propose PhishDef, a phishing detection system that uses only URL names and
combines the above three elements. PhishDef is a highly accurate method (when
compared to state-of-the-art approaches over real datasets), lightweight (thus
appropriate for online and client-side deployment), proactive (based on online
classification rather than blacklists), and resilient to training data
inaccuracies (thus enabling the use of large noisy training data).
Comment: 9 pages, submitted to IEEE INFOCOM 201
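The combination of lexical URL features with an online AROW classifier can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not list the features PhishDef uses, so `lexical_features` and every feature name in it are hypothetical, and `DiagonalAROW` is a simplified diagonal-covariance variant of AROW with an assumed regularization parameter `r`.

```python
import re
from urllib.parse import urlparse


def lexical_features(url):
    """Hypothetical lexical features computed from the URL string alone."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    feats = {
        "bias": 1.0,
        "url_length": len(url) / 100.0,
        "host_dots": float(host.count(".")),
        "host_is_ip": 1.0 if re.fullmatch(r"[0-9.]+", host) else 0.0,
    }
    # Bag-of-tokens features for the host and for the path plus query.
    for tok in re.split(r"[^a-z0-9]+", host):
        if tok:
            feats["host_tok=" + tok] = 1.0
    for tok in re.split(r"[^a-z0-9]+", (parsed.path + "?" + parsed.query).lower()):
        if tok:
            feats["path_tok=" + tok] = 1.0
    return feats


class DiagonalAROW:
    """AROW with a diagonal covariance; labels are +1 (phishing) or -1 (benign)."""

    def __init__(self, r=1.0):
        self.r = r      # regularization: larger values give smaller updates
        self.mean = {}  # weight means, default 0.0
        self.var = {}   # per-weight variances, default 1.0

    def score(self, feats):
        return sum(self.mean.get(f, 0.0) * v for f, v in feats.items())

    def update(self, feats, label):
        margin = label * self.score(feats)
        if margin >= 1.0:
            return  # hinge loss is zero, so no update is made
        confidence = sum(self.var.get(f, 1.0) * v * v for f, v in feats.items())
        beta = 1.0 / (confidence + self.r)
        alpha = (1.0 - margin) * beta
        for f, v in feats.items():
            s = self.var.get(f, 1.0)
            self.mean[f] = self.mean.get(f, 0.0) + alpha * label * s * v
            self.var[f] = s - beta * s * s * v * v  # variance shrinks after each use
```

A typical online loop would call `model.update(lexical_features(url), label)` once per labeled URL as it arrives and `model.score(lexical_features(url)) > 0` to classify a new URL. The per-weight variance is what lets AROW damp the effect of a few mislabeled examples, which matches the noise tolerance the abstract attributes to the method.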
Command & Control: Understanding, Denying and Detecting - A review of malware C2 techniques, detection and defences
In this survey, we first briefly review the current state of cyber attacks,
highlighting significant recent changes in how and why such attacks are
performed. We then investigate the mechanics of malware command and control
(C2) establishment: we provide a comprehensive review of the techniques used by
attackers to set up such a channel and to hide its presence from the attacked
parties and the security tools they use. We then switch to the defensive side
of the problem, and review approaches that have been proposed for the detection
and disruption of C2 channels. We also map such techniques to widely-adopted
security controls, emphasizing gaps or limitations (and success stories) in
current best practices.
Comment: Work commissioned by CPNI, available at c2report.org. 38 pages.
Listing abstract compressed from version appearing in repor
The Challenges in SDN/ML Based Network Security : A Survey
Machine Learning is gaining popularity in the network security domain as many
more network-enabled devices get connected, as malicious activities become
stealthier, and as new technologies like Software Defined Networking (SDN)
emerge. Sitting at the application layer and communicating with the control
layer, machine learning based SDN security models exercise a huge influence on
the routing/switching of the entire SDN. Compromising the models is
consequently a very desirable goal. Previous surveys have been done on either
adversarial machine learning or the general vulnerabilities of SDNs but not
both. Through examination of the latest ML-based SDN security applications and
a good look at ML/SDN specific vulnerabilities accompanied by common attack
methods on ML, this paper serves as a unique survey, making a case for more
secure development processes of ML-based SDN security applications.
Comment: 8 pages. arXiv admin note: substantial text overlap with
arXiv:1705.0056
Survey on detecting and preventing web application broken access control attacks
Web applications are an essential component of today's wide range of digital services, including financial and governmental services as well as social networking and communications. Broken access control vulnerabilities pose a huge risk to that ecosystem because they allow an attacker to circumvent the allocated permissions and rights and perform actions that they are not authorized to perform. This paper gives a broad survey of current research on approaches used to detect the exploitation of access control vulnerabilities and related attacks in web application components. It categorizes these approaches by their key techniques, compares the different detection methods, and evaluates their strengths and weaknesses. We also identify and elaborate on some research gaps in the current literature. Finally, the paper summarizes the general detection approaches and suggests potential research directions for the future.