Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data
Recent years have seen the rise of more sophisticated attacks including
advanced persistent threats (APTs) which pose severe risks to organizations and
governments by targeting confidential proprietary information. Additionally,
new malware strains are appearing at a higher rate than ever before. Since many
of these strains are designed to evade existing security products, traditional
defenses deployed by most enterprises today, e.g., anti-virus, firewalls,
intrusion detection systems, often fail at detecting infections at an early
stage.
We address the problem of detecting early-stage infection in an enterprise
setting by proposing a new framework based on belief propagation inspired by
graph theory. Belief propagation can be used either with "seeds" of compromised
hosts or malicious domains (provided by the enterprise security operation
center -- SOC) or without any seeds. In the latter case we develop a detector
of C&C communication particularly tailored to enterprises which can detect a
stealthy compromise of only a single host communicating with the C&C server.
We demonstrate that our techniques perform well on detecting enterprise
infections. We achieve high accuracy with low false detection and false
negative rates on two months of anonymized DNS logs released by Los Alamos
National Lab (LANL), which include APT infection attacks simulated by LANL
domain experts. We also apply our algorithms to 38TB of real-world web proxy
logs collected at the border of a large enterprise. Through careful manual
investigation in collaboration with the enterprise SOC, we show that our
techniques identified hundreds of malicious domains overlooked by
state-of-the-art security products.
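The seeded mode described above can be illustrated with a minimal sketch. It is not the authors' belief propagation algorithm: it is a simplified score-spreading scheme on a bipartite host-domain graph, and the function name `propagate_scores`, the `damping` factor, and the update rules are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_scores(edges, seed_domains, rounds=3, damping=0.5):
    """Spread suspicion from seed domains across a host-domain graph.

    edges: iterable of (host, domain) pairs, e.g. extracted from DNS or
    web proxy logs. seed_domains: set of domains assumed malicious.
    The update rules below are illustrative assumptions, not the paper's.
    """
    hosts_of = defaultdict(set)    # domain -> hosts that contacted it
    domains_of = defaultdict(set)  # host -> domains it contacted
    for host, domain in edges:
        hosts_of[domain].add(host)
        domains_of[host].add(domain)

    domain_score = {d: (1.0 if d in seed_domains else 0.0) for d in hosts_of}
    host_score = {h: 0.0 for h in domains_of}

    for _ in range(rounds):
        # A host is as suspicious as the worst domain it contacted.
        for h, ds in domains_of.items():
            host_score[h] = max(domain_score[d] for d in ds)
        # A non-seed domain inherits a damped share of its hosts' suspicion.
        for d, hs in hosts_of.items():
            if d in seed_domains:
                continue
            avg = sum(host_score[h] for h in hs) / len(hs)
            domain_score[d] = max(domain_score[d], damping * avg)
    return host_score, domain_score
```

In this sketch a domain contacted by a host that also reached a seed domain ends up with a higher score than a domain that is two hops further from the seed, which is the qualitative behavior a seed-based detector relies on.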
Inferring processes of cultural transmission: the critical role of rare variants in distinguishing neutrality from novelty biases
Neutral evolution assumes that there are no selective forces distinguishing
different variants in a population. Despite this striking assumption, many
recent studies have sought to assess whether neutrality can provide a good
description of different episodes of cultural change. One approach has been to
test whether neutral predictions are consistent with observed progeny
distributions, recording the number of variants that have produced a given
number of new instances within a specified time interval: a classic example is
the distribution of baby names. Using an overlapping generations model we show
that these distributions consist of two phases: a power law phase with a
constant exponent of -3/2, followed by an exponential cut-off for variants with
very large numbers of progeny. Maximum likelihood estimations of the model
parameters provide a direct way to establish whether observed empirical
patterns are consistent with neutral evolution. We apply our approach to a
complete data set of baby names from Australia. Crucially we show that analyses
based on only the most popular variants, as is often the case in studies of
cultural evolution, can provide misleading evidence for underlying transmission
hypotheses. While neutrality provides a plausible description of progeny
distributions of abundant variants, rare variants deviate from neutrality.
Further, we develop a simulation framework that allows for the detection of
alternative cultural transmission processes. We show that anti-novelty bias is
able to replicate the complete progeny distribution of the Australian data set.
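The two-phase shape described above can be sketched numerically. The abstract only states a power-law phase with exponent -3/2 followed by an exponential cut-off; the product form k^(-3/2) * exp(-k / cutoff), the parameter name `cutoff`, and the helper names below are assumptions chosen for illustration, not the paper's model.

```python
import math


def progeny_pmf(k_max, cutoff):
    """Normalized P(k) proportional to k^(-3/2) * exp(-k / cutoff), k = 1..k_max.

    The functional form is an illustrative assumption.
    """
    weights = [k ** -1.5 * math.exp(-k / cutoff) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def local_slope(pmf, k):
    """Log-log slope of the distribution between progeny counts k and 2k."""
    return math.log(pmf[2 * k - 1] / pmf[k - 1]) / math.log(2)
```

For small k the local slope stays close to -3/2, while for k near the cut-off scale it becomes much steeper, which is why a fit restricted to only the most popular or only the rarest variants can look different from a fit to the full distribution.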
Emerging Phishing Trends and Effectiveness of the Anti-Phishing Landing Page
Each month, more attacks are launched with the aim of making web users
believe that they are communicating with a trusted entity, which compels them to
share their personal and financial information. Phishing costs Internet users
billions of dollars every year. Researchers at Carnegie Mellon University (CMU)
created an anti-phishing landing page supported by the Anti-Phishing Working Group
(APWG) with the aim of training users to protect themselves from phishing
attacks. It is used by financial institutions, phish site take down vendors,
government organizations, and online merchants. When a potential victim clicks
on a phishing link that has been taken down, he / she is redirected to the
landing page. In this paper, we present a comparative analysis of two
datasets that we obtained from APWG's landing page log files: one from
September 7, 2008 to November 11, 2009, and the other from January 1, 2014 to
April 30, 2014. We found that the landing page has been successful in training
users against phishing. Forty-six percent of users clicked on fewer phishing
URLs from January 2014 to April 2014, which shows that training from the landing
page helped users avoid falling for phishing attacks. Our analysis shows that
phishers have started to modify their techniques by creating more legitimate-looking
URLs and buying large numbers of domains to increase their activity. We
observed that phishers are exploiting ICANN accredited registrars to launch
their attacks even after strict surveillance. We saw that phishers are trying
to exploit free subdomain registration services to carry out attacks. In this
paper, we also compared the phishing e-mails used by phishers to lure victims
in 2008 and 2014. We found that the phishing e-mails have changed considerably
over time. Phishers have adopted new techniques such as sending promotional
e-mails and targeting users emotionally to make them click phishing URLs.
Graph-theoretic characterization of cyber-threat infrastructures
In this paper, we investigate cyber-threats and the underlying infrastructures. More precisely, we detect and analyze cyber-threat infrastructures for the purpose of unveiling key players (owners, domains, IPs, organizations, malware families, etc.) and the relationships between these players. To this end, we propose metrics to measure the badness of different infrastructure elements using graph-theoretic concepts such as centrality and Google PageRank. In addition, we quantify the sharing of infrastructure elements among different malware samples and families to unveil potential groups that are behind specific attacks. Moreover, we study the evolution of cyber-threat infrastructures over time to infer patterns of cyber-criminal activities. The proposed study provides the capability to derive insights and intelligence about cyber-threat infrastructures. Using a one-year dataset, we generate notable results regarding emerging threats and campaigns, important players behind threats, linkages between cyber-threat infrastructure elements, patterns of cyber-crimes, etc.
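As a rough illustration of how a PageRank-style score could rank infrastructure elements, the sketch below runs a standard power iteration on a small directed graph. The abstract does not specify the paper's badness metrics; the graph encoding (malware samples pointing at the IPs they contact), the node names, and the parameter defaults are assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    graph: dict mapping a node to the list of nodes it links to.
    Here, hypothetically, malware samples link to infrastructure elements.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Split this node's rank evenly among its out-links.
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / n
        rank = new_rank
    return rank
```

Under this encoding, an IP address shared by many samples accumulates more rank than one used by a single sample, which matches the intuition that heavily shared infrastructure is a key player.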
An Evasion Attack against ML-based Phishing URL Detectors
Background: Over the years, Machine Learning Phishing URL classification
(MLPU) systems have gained tremendous popularity to detect phishing URLs
proactively. Despite this vogue, the security vulnerabilities of MLPUs remain
mostly unknown. Aim: To address this concern, we conduct a study to understand
the test time security vulnerabilities of the state-of-the-art MLPU systems,
aiming at providing guidelines for the future development of these systems.
Method: In this paper, we propose an evasion attack framework against MLPU
systems. To achieve this, we first develop an algorithm to generate adversarial
phishing URLs. We then reproduce 41 MLPU systems and record their baseline
performance. Finally, we simulate an evasion attack to evaluate these MLPU
systems against our generated adversarial URLs. Results: In comparison to
previous works, our attack is: (i) effective as it evades all the models with
an average success rate of 66% and 85% for famous (such as Netflix, Google) and
less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively;
(ii) realistic as it requires only 23ms to produce a new adversarial URL
variant that is available for registration with a median cost of only
$11.99/year. We also found that popular online services such as Google
SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that
Adversarial training (successful defence against evasion attack) does not
significantly improve the robustness of these systems as it decreases the
success rate of our attack by only 6% on average for all the models. (iv)
Further, we identify the security vulnerabilities of the considered MLPU
systems. Our findings lead to promising directions for future research.
Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but
also highlights implications for future study towards assessing and improving
these systems. Comment: Draft for ACM TOP
Linguistic Diversity on the Internet: Arabic, Chinese and Cyrillic Script Top-Level Domain Names
The deployment of Arabic, Chinese, and Cyrillic top-level domain names is explored in this research by analyzing technical and policy documents of the Internet Corporation for Assigned Names and Numbers (ICANN), as well as newspaper articles in the respective language regions. The tension between English uniformity at the root level of the Internet's domain name system, and language diversity in the global Internet community, has resulted in various technological solutions surrounding Arabic, Chinese, and Cyrillic language domain names. These standards and technological solutions ensure the security and stability of the Internet; however, they do not comprehensively address the linguistic diversity needs of the Internet. ICANN has been transforming into an international policy organization, yet its linguistic diversity policies appear disconnected from the diversity policies of the United Nations, and remain technically oriented. Linguistic diversity in relation to IDNs at this stage mostly focuses on the language representation of major languages that are spoken in powerful nation-states, which use the rhetoric of national pride, local business branding, and inclusion of non-English speakers. This situation surfaces the tension between nation-states and the new international governing institution ICANN.