
    Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data

    Recent years have seen the rise of more sophisticated attacks, including advanced persistent threats (APTs), which pose severe risks to organizations and governments by targeting confidential proprietary information. Additionally, new malware strains are appearing at a higher rate than ever before. Since many of these strains are designed to evade existing security products, traditional defenses deployed by most enterprises today (e.g., anti-virus, firewalls, intrusion detection systems) often fail to detect infections at an early stage. We address the problem of detecting early-stage infection in an enterprise setting by proposing a new framework based on belief propagation, inspired by graph theory. Belief propagation can be used either with "seeds" of compromised hosts or malicious domains (provided by the enterprise security operation center -- SOC) or without any seeds. In the latter case, we develop a detector of C&C communication particularly tailored to enterprises, which can detect a stealthy compromise of only a single host communicating with the C&C server. We demonstrate that our techniques perform well on detecting enterprise infections. We achieve high accuracy with low false detection and false negative rates on two months of anonymized DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38TB of real-world web proxy logs collected at the border of a large enterprise. Through careful manual investigation in collaboration with the enterprise SOC, we show that our techniques identified hundreds of malicious domains overlooked by state-of-the-art security products.
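
The seeded mode described in this abstract can be illustrated with a deliberately simplified sketch. It is a plain label-propagation loop over a bipartite host-domain graph, not the paper's belief propagation algorithm: the function name, the scoring rule (fraction of visiting hosts already flagged), the threshold, and the example data are all assumptions made for illustration.

```python
from collections import defaultdict

def propagate_from_seeds(edges, seed_domains, threshold=0.5, max_rounds=10):
    """Expand sets of suspicious hosts and domains starting from seed domains.

    edges: iterable of (host, domain) pairs, e.g. taken from DNS or proxy logs.
    The scoring rule below is a hypothetical stand-in for a real detector.
    """
    hosts_of = defaultdict(set)
    for host, domain in edges:
        hosts_of[domain].add(host)

    bad_domains = set(seed_domains)
    bad_hosts = set()
    for _ in range(max_rounds):
        # Any host that contacted a flagged domain is treated as compromised.
        new_hosts = {h for d in bad_domains for h in hosts_of.get(d, ())} - bad_hosts
        bad_hosts |= new_hosts

        # Score each unflagged domain by the share of its visitors already flagged.
        new_domains = set()
        for domain, hosts in hosts_of.items():
            if domain in bad_domains:
                continue
            score = len(hosts & bad_hosts) / len(hosts)
            if score >= threshold:
                new_domains.add(domain)

        # Stop once a full round adds nothing new.
        if not new_hosts and not new_domains:
            break
        bad_domains |= new_domains
    return bad_hosts, bad_domains

# Hypothetical log excerpt: h1 contacted the seed domain and a second rare domain.
example_edges = [
    ("h1", "evil.example"), ("h1", "c2.example"),
    ("h2", "c2.example"), ("h2", "news.example"),
    ("h3", "news.example"), ("h4", "news.example"),
]
hosts, domains = propagate_from_seeds(example_edges, {"evil.example"})
```

In this toy input the popular domain `news.example` stays unflagged because only one of its three visitors is flagged, which mirrors the intuition that widely visited domains are weak evidence of compromise.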

    Inferring processes of cultural transmission: the critical role of rare variants in distinguishing neutrality from novelty biases

    Neutral evolution assumes that there are no selective forces distinguishing different variants in a population. Despite this striking assumption, many recent studies have sought to assess whether neutrality can provide a good description of different episodes of cultural change. One approach has been to test whether neutral predictions are consistent with observed progeny distributions, which record the number of variants that have produced a given number of new instances within a specified time interval; a classic example is the distribution of baby names. Using an overlapping generations model, we show that these distributions consist of two phases: a power law phase with a constant exponent of -3/2, followed by an exponential cut-off for variants with very large numbers of progeny. Maximum likelihood estimations of the model parameters provide a direct way to establish whether observed empirical patterns are consistent with neutral evolution. We apply our approach to a complete data set of baby names from Australia. Crucially, we show that analyses based on only the most popular variants, as is often the case in studies of cultural evolution, can provide misleading evidence for underlying transmission hypotheses. While neutrality provides a plausible description of progeny distributions of abundant variants, rare variants deviate from neutrality. Further, we develop a simulation framework that allows for the detection of alternative cultural transmission processes. We show that anti-novelty bias is able to replicate the complete progeny distribution of the Australian data set.
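
The two-phase shape described in this abstract (a power law with exponent -3/2 followed by an exponential cut-off) can be sketched numerically. The functional form p(k) proportional to k^(-3/2) exp(-k / cutoff), the parameter name `cutoff`, the truncation at `k_max`, and the grid-search fit are assumptions for illustration; the abstract does not give the paper's exact likelihood.

```python
import math

def progeny_pmf(cutoff, k_max=1000):
    """Normalized p(k) ~ k^(-3/2) * exp(-k / cutoff) for k = 1..k_max (assumed form)."""
    weights = [k ** -1.5 * math.exp(-k / cutoff) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(progeny_counts, cutoff, k_max=1000):
    """Log-likelihood of observed progeny counts (each between 1 and k_max)."""
    pmf = progeny_pmf(cutoff, k_max)
    return sum(math.log(pmf[k - 1]) for k in progeny_counts)

def fit_cutoff(progeny_counts, candidates, k_max=1000):
    """Pick the candidate cut-off with the highest likelihood (simple grid search)."""
    return max(candidates, key=lambda c: log_likelihood(progeny_counts, c, k_max))
```

A grid search keeps the sketch dependency-free; a numerical optimizer would be the more usual choice for a real maximum likelihood fit.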

    Emerging Phishing Trends and Effectiveness of the Anti-Phishing Landing Page

    Each month, more attacks are launched with the aim of making web users believe that they are communicating with a trusted entity, which leads them to share their personal and financial information. Phishing costs Internet users billions of dollars every year. Researchers at Carnegie Mellon University (CMU) created an anti-phishing landing page, supported by the Anti-Phishing Working Group (APWG), with the aim of training users to protect themselves from phishing attacks. It is used by financial institutions, phish site take-down vendors, government organizations, and online merchants. When a potential victim clicks on a phishing link that has been taken down, he or she is redirected to the landing page. In this paper, we present a comparative analysis of two datasets that we obtained from APWG's landing page log files: one from September 7, 2008 to November 11, 2009, and the other from January 1, 2014 to April 30, 2014. We found that the landing page has been successful in training users against phishing. Forty-six percent of users clicked on fewer phishing URLs from January 2014 to April 2014, which shows that training from the landing page helped users avoid falling for phishing attacks. Our analysis shows that phishers have started to modify their techniques by creating more legitimate-looking URLs and buying large numbers of domains to increase their activity. We observed that phishers are exploiting ICANN-accredited registrars to launch their attacks despite strict surveillance. We saw that phishers are trying to exploit free subdomain registration services to carry out attacks. We also compared the phishing e-mails used by phishers to lure victims in 2008 and 2014 and found that they have changed considerably over time. Phishers have adopted new techniques such as sending promotional e-mails and emotionally targeting users to get them to click on phishing URLs.

    Graph-theoretic characterization of cyber-threat infrastructures

    In this paper, we investigate cyber-threats and the underlying infrastructures. More precisely, we detect and analyze cyber-threat infrastructures for the purpose of unveiling key players (owners, domains, IPs, organizations, malware families, etc.) and the relationships between these players. To this end, we propose metrics to measure the badness of different infrastructure elements using graph-theoretic concepts such as centrality and Google PageRank. In addition, we quantify the sharing of infrastructure elements among different malware samples and families to unveil potential groups that are behind specific attacks. Moreover, we study the evolution of cyber-threat infrastructures over time to infer patterns of cyber-criminal activities. The proposed study provides the capability to derive insights and intelligence about cyber-threat infrastructures. Using a one-year dataset, we generate notable results regarding emerging threats and campaigns, important players behind threats, linkages between cyber-threat infrastructure elements, patterns of cyber-crimes, etc.
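
As a rough illustration of scoring infrastructure elements with PageRank, the sketch below runs a standard power-iteration PageRank on a tiny directed graph. The edge direction (malware sample pointing to the infrastructure it contacts), the node names, and the use of raw PageRank as a "badness" score are assumptions; the abstract does not specify the paper's actual metrics.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank. graph maps each node to a list of nodes it points to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical data: three malware samples and the domains they contact.
threat_graph = {
    "malware_a": ["shared.example"],
    "malware_b": ["shared.example"],
    "malware_c": ["shared.example", "other.example"],
}
scores = pagerank(threat_graph)
most_central = max(scores, key=scores.get)
```

Here the domain contacted by all three samples ends up with the highest score, which is the kind of shared element such an analysis would surface first.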

    An Evasion Attack against ML-based Phishing URL Detectors

    Background: Over the years, Machine Learning Phishing URL classification (MLPU) systems have gained tremendous popularity for detecting phishing URLs proactively. Despite this vogue, the security vulnerabilities of MLPUs remain mostly unknown. Aim: To address this concern, we conduct a study to understand the test-time security vulnerabilities of state-of-the-art MLPU systems, aiming to provide guidelines for the future development of these systems. Method: In this paper, we propose an evasion attack framework against MLPU systems. To achieve this, we first develop an algorithm to generate adversarial phishing URLs. We then reproduce 41 MLPU systems and record their baseline performance. Finally, we simulate an evasion attack to evaluate these MLPU systems against our generated adversarial URLs. Results: In comparison to previous works, our attack is: (i) effective, as it evades all the models with an average success rate of 66% and 85% for famous (such as Netflix, Google) and less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively; (ii) realistic, as it requires only 23ms to produce a new adversarial URL variant that is available for registration with a median cost of only $11.99/year. We also found that popular online services such as Google SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that adversarial training (a successful defence against evasion attacks) does not significantly improve the robustness of these systems, as it decreases the success rate of our attack by only 6% on average for all the models. (iv) Further, we identify the security vulnerabilities of the considered MLPU systems. Our findings lead to promising directions for future research. Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but also highlights implications for future study towards assessing and improving these systems. Comment: Draft for ACM TOP

    Linguistic Diversity on the Internet: Arabic, Chinese and Cyrillic Script Top-Level Domain Names

    The deployment of Arabic, Chinese, and Cyrillic top-level domain names is explored in this research by analyzing technical and policy documents of the Internet Corporation for Assigned Names and Numbers (ICANN), as well as newspaper articles in the respective language regions. The tension between English uniformity at the root level of the Internet's domain name system and language diversity in the global Internet community has resulted in various technological solutions surrounding Arabic, Chinese, and Cyrillic language domain names. These standards and technological solutions ensure the security and stability of the Internet; however, they do not comprehensively address the linguistic diversity needs of the Internet. ICANN has been transforming into an international policy organization, yet its linguistic diversity policies appear disconnected from the diversity policies of the United Nations, and remain technically oriented. Linguistic diversity in relation to IDNs at this stage mostly focuses on the representation of major languages that are spoken in powerful nation-states, which use the rhetoric of national pride, local business branding, and inclusion of non-English speakers. This situation surfaces the tension between nation-states and the new international governing institution ICANN.