655 research outputs found

    Real-time classification of malicious URLs on Twitter using Machine Activity Data

    Massive online social networks with hundreds of millions of active users are increasingly being used by cybercriminals to spread malicious software (malware) that exploits vulnerabilities on users' machines for personal gain. Twitter is particularly susceptible to such activity because, with its 140 character limit, people commonly include URLs in their tweets to link to more detailed information, evidence, news reports and so on. URLs are often shortened, so the endpoint is not obvious before a person clicks the link. Cybercriminals can exploit this to propagate malicious URLs on Twitter, for which the endpoint is a malicious server that performs unwanted actions on the person's machine. This is known as a drive-by download. In this paper we develop a machine classification system to distinguish between malicious and benign URLs within seconds of the URL being clicked (i.e. 'real-time'). We train the classifier using machine activity logs created while interacting with URLs extracted from Twitter data collected during a large global event (the Superbowl) and test it using data from another large sporting event (the Cricket World Cup). The results show that machine activity logs produce a precision of up to 0.975 on training data from the first event and 0.747 on test data from the second event. Furthermore, we examine the properties of the learned model to explain the relationship between machine activity and malicious software behaviour, and build a learning curve for the classifier to illustrate that very small samples of training data can be used with only a small detriment to performance.

    Prediction of drive-by download attacks on Twitter

    The popularity of Twitter for information discovery, coupled with the automatic shortening of URLs to save space given the 140 character limit, provides cybercriminals with an opportunity to obfuscate the URL of a malicious Web page within a tweet. Once the URL is obfuscated, the cybercriminal can lure a user to click on it with enticing text and images before carrying out a cyber attack using a malicious Web server. This is known as a drive-by download. In a drive-by download, a user's computer system is infected while interacting with the malicious endpoint, often without the user being made aware that the attack has taken place. An attacker can gain control of the system by exploiting unpatched system vulnerabilities, and this form of attack currently represents one of the most common methods employed. In this paper we build a machine learning model using machine activity data and tweet metadata to move beyond post-execution classification of such URLs as malicious, and to predict that a URL will be malicious with an F-measure of 0.99 (using 10-fold cross-validation) and 0.833 (using an unseen test set) at 1 second into the interaction with the URL. This provides a basis from which to kill the connection to the server before an attack has completed, proactively blocking and preventing an attack rather than reacting and repairing at a later date.

    IntelliFlow: a proactive approach to adding cyber threat intelligence to software-defined networks

    Advisor: Christian Rodolfo Esteve Rothenberg. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Security is a major concern in computer networking, which faces increasing threats as the commercial Internet and related economies continue to grow. Virtualization technologies enabling scalable Cloud services pose further challenges to the security of computer infrastructures, demanding novel mechanisms combining the best-of-breed to counter certain types of attacks. Our work aims to explore advances in Cyber Threat Intelligence (CTI) in the context of Software Defined Networking (SDN) architectures. While CTI represents a recent approach to combat threats based on reliable sources, by sharing information and knowledge about computer criminal activities, SDN is a recent trend in architecting computer networks based on modularization and programmability principles. In this dissertation, we propose IntelliFlow, an intelligent detection system for SDN that follows a proactive approach using OpenFlow to deploy countermeasures to the threats learned through a distributed intelligent plane. We show through a proof of concept implementation that the proposed system is capable of delivering a number of benefits in terms of effectiveness and efficiency, altogether contributing to the security of modern computer network designs.
    Master's degree; Computer Engineering; Master in Electrical Engineering; 159905/2013-3; CNP

    The Dark Side(-Channel) of Mobile Devices: A Survey on Network Traffic Analysis

    In recent years, mobile devices (e.g., smartphones and tablets) have met with increasing commercial success and have become a fundamental element of everyday life for billions of people all around the world. Mobile devices are used not only for traditional communication activities (e.g., voice calls and messages) but also for more advanced tasks made possible by an enormous number of multi-purpose applications (e.g., finance, gaming, and shopping). As a result, those devices generate significant network traffic (a considerable part of overall Internet traffic). For this reason, the research community has been investigating security and privacy issues related to the network traffic generated by mobile devices, which could be analyzed to obtain information useful for a variety of goals (ranging from device security and network optimization to fine-grained user profiling). In this paper, we review the works that contributed to the state of the art of network traffic analysis targeting mobile devices. In particular, we present a systematic classification of the works in the literature according to three criteria: (i) the goal of the analysis; (ii) the point where the network traffic is captured; and (iii) the targeted mobile platforms. In this survey, we consider capture points such as Wi-Fi Access Points, software simulation, and inside real mobile devices or emulators. For the surveyed works, we review and compare analysis techniques, validation methods, and achieved results. We also discuss possible countermeasures, challenges and possible directions for future research on mobile traffic analysis and other emerging domains (e.g., Internet of Things). We believe our survey will be a reference work for researchers and practitioners in this research field.
    Comment: 55 pages

    Performance Evaluation of Machine Learning Techniques for Identifying Forged and Phony Uniform Resource Locators (URLs)

    Since the invention of Information and Communication Technology (ICT), there has been a great shift across the globe from the traditional approach of handling information to the use of this innovation. Its application cuts across almost all areas of human endeavour. ICT is widely utilized in the education and production sectors as well as in various financial institutions. Many people use it legitimately to carry out their day-to-day activities, while others use it to perform nefarious activities to the detriment of other cyber users. According to several reports discussed in the introductory part of this work, millions of people have become victims of fake Uniform Resource Locators (URLs) sent to their mailboxes by spammers. Financial institutions are not left out of the monumental losses recorded through this illicit act over the years. Despite several approaches currently in place, none could confidently be confirmed to provide the best and most reliable solution. According to several research findings reported in the literature, researchers have demonstrated how machine learning algorithms could be employed to verify and confirm compromised and fake URLs in cyberspace. However, inconsistencies have been noticed in the researchers' findings, and their corresponding results are not dependable based on the values obtained and the conclusions drawn from them. Against this backdrop, the authors carried out a comparative analysis of three learning algorithms (Naïve Bayes, Decision Tree and Logistic Regression) for verification of compromised, suspicious and fake URLs, and determined which is the best based on the metrics used for evaluation (F-measure, precision and recall). Based on the confusion matrix measurements, the results show that the Decision Tree (ID3) algorithm achieves the highest values for recall, precision and F-measure. It provides an efficient and credible means of maximizing the detection of compromised and malicious URLs. Finally, for future work, the authors are of the opinion that two or more supervised learning algorithms can be hybridized to form a single, more effective and more efficient algorithm for fake URL verification.
    Keywords: Learning-algorithms, Forged-URL, Phoney-URL, performance-comparison
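
    The three evaluation metrics named in this abstract follow directly from confusion matrix counts. The sketch below is only an illustration of the standard definitions, not the authors' code; the function name and the example counts are hypothetical.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F-measure (F1) from confusion matrix counts.

    tp: malicious URLs correctly flagged (true positives)
    fp: benign URLs wrongly flagged (false positives)
    fn: malicious URLs missed (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r, f = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

    A classifier comparison like the one described would compute these values once per algorithm on the same test set and rank the algorithms by them.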

    Emotions behind drive-by download propagation on Twitter

    Twitter has emerged as one of the most popular platforms to get updates on entertainment and current events. However, due to its 280 character restriction and automatic shortening of URLs, it is continuously targeted by cybercriminals to carry out drive-by download attacks, where a user's system is infected by merely visiting a Web page. Popular events that attract a large number of users are used by cybercriminals to infect and propagate malware by using popular hashtags and creating misleading tweets to lure users to malicious Web pages. A drive-by download attack is carried out by obfuscating a malicious URL in an enticing tweet, which is used as clickbait to lure users to a malicious Web page. In this paper we answer the following two questions: Why are certain malicious tweets retweeted more than others? Do emotions reflected in a tweet drive virality? We gathered tweets from seven different sporting events over three years and identified those tweets that were used to carry out a drive-by download attack. From the malicious (N=105,642) and benign (N=169,178) data samples identified, we built models to predict information flow size and survival. We define size as the number of retweets of an original tweet, and survival as the duration of the original tweet's presence in the study window. We selected the zero-truncated negative binomial (ZTNB) regression method for our analysis based on the distribution exhibited by our dependent size measure and the comparison of results with other predictive models. We used the Cox regression technique to model the survival of information flows, as it estimates proportional hazard rates for independent measures. Our results show that both social and content factors are statistically significant for the size and survival of information flows for both malicious and benign tweets. In the benign data sample, positive emotions and positive sentiment reflected in the tweet significantly predict size and survival. In contrast, for the malicious data sample, negative emotions, especially fear, are associated with both size and survival of information flows.
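
    For readers unfamiliar with the zero-truncated negative binomial distribution mentioned above, the sketch below shows one common parameterization of its probability mass function. It is an assumption made for illustration: the abstract does not state the parameterization used, the names `r` and `p` are hypothetical, and the regression step (linking the distribution's mean to tweet features) is not shown.

```python
import math


def nb_pmf(k: int, r: float, p: float) -> float:
    """Negative binomial pmf: probability of k failures before the r-th success,
    with success probability p (r may be non-integer)."""
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1.0 - p))


def ztnb_pmf(k: int, r: float, p: float) -> float:
    """Zero-truncated pmf: the negative binomial conditioned on k >= 1.

    Zero counts are excluded, so the remaining mass is renormalized
    by 1 - P(K = 0).
    """
    if k < 1:
        return 0.0
    return nb_pmf(k, r, p) / (1.0 - nb_pmf(0, r, p))


# Hypothetical parameters: the truncated probabilities sum to (almost exactly) 1.
total = sum(ztnb_pmf(k, r=2.0, p=0.5) for k in range(1, 200))
print(round(total, 6))
```

    Truncation at zero suits count data in which a zero can never be observed, which is the usual reason for choosing this model over an ordinary negative binomial.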

    A Reputation Score Driven E-Mail Mitigation System

    E-mail inspection and mitigation systems are necessary in today's world due to the frequent bombardment of adversarial attacks that leverage phishing techniques. The process and accuracy of identifying a phishing attack present significant challenges because data encryption hinders signature matching, context analysis of a message, and synchronization of alerts in distributed detection systems. The author recognizes a grand challenge: an increase in the number of data analysis systems corresponds to an overall increase in the delivery delay of an e-mail message. This work enhances PhishLimiter as a solution to combat phishing attacks, using machine learning techniques to analyze 27 e-mail features and Software-Defined Networking (SDN) to optimize network transactions. PhishLimiter uses a two-lane inspection approach of Store-and-Forward (SF) and Forward-and-Inspect (FI) to determine whether traffic is held for analysis or immediately forwarded to the destination. The results of the work demonstrate PhishLimiter as a viable solution to combat phishing attacks while minimizing the delivery time of e-mail messages.