56 research outputs found

    Spatiotemporal Patterns and Predictability of Cyberattacks

    Get PDF
    Y.C.L. was supported by Air Force Office of Scientific Research (AFOSR) under grant no. FA9550-10-1-0083 and Army Research Office (ARO) under grant no. W911NF-14-1-0504. S.X. was supported by Army Research Office (ARO) under grant no. W911NF-13-1-0141. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.Peer reviewedPublisher PD

    DAG-Based Attack and Defense Modeling: Don't Miss the Forest for the Attack Trees

    Full text link
    This paper presents the current state of the art on attack and defense modeling approaches that are based on directed acyclic graphs (DAGs). DAGs allow for a hierarchical decomposition of complex scenarios into simple, easily understandable and quantifiable actions. Methods based on threat trees and Bayesian networks are two well-known approaches to security modeling. However there exist more than 30 DAG-based methodologies, each having different features and goals. The objective of this survey is to present a complete overview of graphical attack and defense modeling techniques based on DAGs. This consists of summarizing the existing methodologies, comparing their features and proposing a taxonomy of the described formalisms. This article also supports the selection of an adequate modeling technique depending on user requirements

    ‎An Artificial Intelligence Framework for Supporting Coarse-Grained Workload Classification in Complex Virtual Environments

    Get PDF
    Cloud-based machine learning tools for enhanced Big Data applications}‎, ‎where the main idea is that of predicting the ``\emph{next}'' \emph{workload} occurring against the target Cloud infrastructure via an innovative \emph{ensemble-based approach} that combines the effectiveness of different well-known \emph{classifiers} in order to enhance the whole accuracy of the final classification‎, ‎which is very relevant at now in the specific context of \emph{Big Data}‎. ‎The so-called \emph{workload categorization problem} plays a critical role in improving the efficiency and reliability of Cloud-based big data applications‎. ‎Implementation-wise‎, ‎our method proposes deploying Cloud entities that participate in the distributed classification approach on top of \emph{virtual machines}‎, ‎which represent classical ``commodity'' settings for Cloud-based big data applications‎. ‎Given a number of known reference workloads‎, ‎and an unknown workload‎, ‎in this paper we deal with the problem of finding the reference workload which is most similar to the unknown one‎. ‎The depicted scenario turns out to be useful in a plethora of modern information system applications‎. ‎We name this problem as \emph{coarse-grained workload classification}‎, ‎because‎, ‎instead of characterizing the unknown workload in terms of finer behaviors‎, ‎such as CPU‎, ‎memory‎, ‎disk‎, ‎or network intensive patterns‎, ‎we classify the whole unknown workload as one of the (possible) reference workloads‎. ‎Reference workloads represent a category of workloads that are relevant in a given applicative environment‎. ‎In particular‎, ‎we focus our attention on the classification problem described above in the special case represented by \emph{virtualized environments}‎. ‎Today‎, ‎\emph{Virtual Machines} (VMs) have become very popular because they offer important advantages to modern computing environments such as cloud computing or server farms‎. ‎In virtualization frameworks‎, ‎workload classification is very useful for accounting‎, ‎security reasons‎, ‎or user profiling‎. ‎Hence‎, ‎our research makes more sense in such environments‎, ‎and it turns out to be very useful in a special context like Cloud Computing‎, ‎which is emerging now‎. ‎In this respect‎, ‎our approach consists of running several machine learning-based classifiers of different workload models‎, ‎and then deriving the best classifier produced by the \emph{Dempster-Shafer Fusion}‎, ‎in order to magnify the accuracy of the final classification‎. ‎Experimental assessment and analysis clearly confirm the benefits derived from our classification framework‎. ‎The running programs which produce unknown workloads to be classified are treated in a similar way‎. ‎A fundamental aspect of this paper concerns the successful use of data fusion in workload classification‎. ‎Different types of metrics are in fact fused together using the Dempster-Shafer theory of evidence combination‎, ‎giving a classification accuracy of slightly less than 80%80\%‎. ‎The acquisition of data from the running process‎, ‎the pre-processing algorithms‎, ‎and the workload classification are described in detail‎. ‎Various classical algorithms have been used for classification to classify the workloads‎, ‎and the results are compared‎

    Avatar captcha : telling computers and humans apart via face classification and mouse dynamics.

    Get PDF
    Bots are malicious, automated computer programs that execute malicious scripts and predefined functions on an affected computer. They pose cybersecurity threats and are one of the most sophisticated and common types of cybercrime tools today. They spread viruses, generate spam, steal personal sensitive information, rig online polls and commit other types of online crime and fraud. They sneak into unprotected systems through the Internet by seeking vulnerable entry points. They access the system’s resources like a human user does. Now the question arises how do we counter this? How do we prevent bots and on the other hand allow human users to access the system resources? One solution is by designing a CAPTCHA (Completely Automated Public Turing Tests to tell Computers and Humans Apart), a program that can generate and grade tests that most humans can pass but computers cannot. It is used as a tool to distinguish humans from malicious bots. They are a class of Human Interactive Proofs (HIPs) meant to be easily solvable by humans and economically infeasible for computers. Text CAPTCHAs are very popular and commonly used. For each challenge, they generate a sequence of alphabets by distorting standard fonts, requesting users to identify them and type them out. However, they are vulnerable to character segmentation attacks by bots, English language dependent and are increasingly becoming too complex for people to solve. A solution to this is to design Image CAPTCHAs that use images instead of text and require users to identify certain images to solve the challenges. They are user-friendly and convenient for human users and a much more challenging problem for bots to solve. In today’s Internet world the role of user profiling or user identification has gained a lot of significance. Identity thefts, etc. can be prevented by providing authorized access to resources. To achieve timely response to a security breach frequent user verification is needed. However, this process must be passive, transparent and non-obtrusive. In order for such a system to be practical it must be accurate, efficient and difficult to forge. Behavioral biometric systems are usually less prominent however, they provide numerous and significant advantages over traditional biometric systems. Collection of behavior data is non-obtrusive and cost-effective as it requires no special hardware. While these systems are not unique enough to provide reliable human identification, they have shown to be highly accurate in identity verification. In accomplishing everyday tasks, human beings use different styles, strategies, apply unique skills and knowledge, etc. These define the behavioral traits of the user. Behavioral biometrics attempts to quantify these traits to profile users and establish their identity. Human computer interaction (HCI)-based biometrics comprise of interaction strategies and styles between a human and a computer. These unique user traits are quantified to build profiles for identification. A specific category of HCI-based biometrics is based on recording human interactions with mouse as the input device and is known as Mouse Dynamics. By monitoring the mouse usage activities produced by a user during interaction with the GUI, a unique profile can be created for that user that can help identify him/her. Mouse-based verification approaches do not record sensitive user credentials like usernames and passwords. Thus, they avoid privacy issues. An image CAPTCHA is proposed that incorporates Mouse Dynamics to help fortify it. It displays random images obtained from Yahoo’s Flickr. To solve the challenge the user must identify and select a certain class of images. Two theme-based challenges have been designed. They are Avatar CAPTCHA and Zoo CAPTCHA. The former displays human and avatar faces whereas the latter displays different animal species. In addition to the dynamically selected images, while attempting to solve the CAPTCHA, the way each user interacts with the mouse i.e. mouse clicks, mouse movements, mouse cursor screen co-ordinates, etc. are recorded nonobtrusively at regular time intervals. These recorded mouse movements constitute the Mouse Dynamics Signature (MDS) of the user. This MDS provides an additional secure technique to segregate humans from bots. The security of the CAPTCHA is tested by an adversary executing a mouse bot attempting to solve the CAPTCHA challenges

    Improving the accuracy of spoofed traffic inference in inter-domain traffic

    Get PDF
    Ascertaining that a network will forward spoofed traffic usually requires an active probing vantage point in that network, effectively preventing a comprehensive view of this global Internet vulnerability. We argue that broader visibility into the spoofing problem may lie in the capability to infer lack of Source Address Validation (SAV) compliance from large, heavily aggregated Internet traffic data, such as traffic observable at Internet Exchange Points (IXPs). The key idea is to use IXPs as observatories to detect spoofed packets, by leveraging Autonomous System (AS) topology knowledge extracted from Border Gateway Protocol (BGP) data to infer which source addresses should legitimately appear across parts of the IXP switch fabric. In this thesis, we demonstrate that the existing literature does not capture several fundamental challenges to this approach, including noise in BGP data sources, heuristic AS relationship inference, and idiosyncrasies in IXP interconnec- tivity fabrics. We propose Spoofer-IX, a novel methodology to navigate these challenges, leveraging Customer Cone semantics of AS relationships to guide precise classification of inter-domain traffic as In-cone, Out-of-cone ( spoofed ), Unverifiable, Bogon, and Unas- signed. We apply our methodology on extensive data analysis using real traffic data from two distinct IXPs in Brazil, a mid-size and a large-size infrastructure. In the mid-size IXP with more than 200 members, we find an upper bound volume of Out-of-cone traffic to be more than an order of magnitude less than the previous method inferred on the same data, revealing the practical importance of Customer Cone semantics in such analysis. We also found no significant improvement in deployment of SAV in networks using the mid-size IXP between 2017 and 2019. In hopes that our methods and tools generalize to use by other IXPs who want to avoid use of their infrastructure for launching spoofed-source DoS attacks, we explore the feasibility of scaling the system to larger and more diverse IXP infrastructures. To promote this goal, and broad replicability of our results, we make the source code of Spoofer-IX publicly available. This thesis illustrates the subtleties of scientific assessments of operational Internet infrastructure, and the need for a community focus on reproducing and repeating previous methods.A constatação de que uma rede encaminharå tråfego falsificado geralmente requer um ponto de vantagem ativo de medição nessa rede, impedindo efetivamente uma visão abrangente dessa vulnerabilidade global da Internet. Isto posto, argumentamos que uma visibilidade mais ampla do problema de spoofing pode estar na capacidade de inferir a falta de conformidade com as pråticas de Source Address Validation (SAV) a partir de dados de tråfego da Internet altamente agregados, como o tråfego observåvel nos Internet Exchange Points (IXPs). A ideia chave Ê usar IXPs como observatórios para detectar pacotes falsificados, aproveitando o conhecimento da topologia de sistemas autônomos extraído dos dados do protocolo BGP para inferir quais endereços de origem devem aparecer legitimamente nas comunicaçþes atravÊs da infra-estrutura de um IXP. Nesta tese, demonstramos que a literatura existente não captura diversos desafios fundamentais para essa abordagem, incluindo ruído em fontes de dados BGP, inferência heurística de relacionamento de sistemas autônomos e características específicas de interconectividade nas infraestruturas de IXPs. Propomos o Spoofer-IX, uma nova metodologia para superar esses desafios, utilizando a semântica do Customer Cone de relacionamento de sistemas autônomos para guiar com precisão a classificação de tråfego inter-domínio como In-cone, Out-of-cone ( spoofed ), Unverifiable, Bogon, e Unassigned. Aplicamos nossa metodologia em anålises extensivas sobre dados reais de tråfego de dois IXPs distintos no Brasil, uma infraestrutura de mÊdio porte e outra de grande porte. No IXP de tamanho mÊdio, com mais de 200 membros, encontramos um limite superior do volume de tråfego Out-of-cone uma ordem de magnitude menor que o mÊtodo anterior inferiu sob os mesmos dados, revelando a importância pråtica da semântica do Customer Cone em tal anålise. AlÊm disso, não encontramos melhorias significativas na implantação do Source Address Validation (SAV) em redes usando o IXP de tamanho mÊdio entre 2017 e 2019. Na esperança de que nossos mÊtodos e ferramentas sejam aplicåveis para uso por outros IXPs que desejam evitar o uso de sua infraestrutura para iniciar ataques de negação de serviço atravÊs de pacotes de origem falsificada, exploramos a viabilidade de escalar o sistema para infraestruturas IXP maiores e mais diversas. Para promover esse objetivo e a ampla replicabilidade de nossos resultados, disponibilizamos publicamente o código fonte do Spoofer-IX. Esta tese ilustra as sutilezas das avaliaçþes científicas da infraestrutura operacional da Internet e a necessidade de um foco da comunidade na reprodução e repetição de mÊtodos anteriores

    A Multi-Agent Systems Approach for Analysis of Stepping Stone Attacks

    Get PDF
    Stepping stone attacks are one of the most sophisticated cyber-attacks, in which attackers make a chain of compromised hosts to reach a victim target. In this Dissertation, an analytic model with Multi-Agent systems approach has been proposed to analyze the propagation of stepping stones attacks in dynamic vulnerability graphs. Because the vulnerability configuration in a network is inherently dynamic, in this Dissertation a biased min-consensus technique for dynamic graphs with fixed and switching topology is proposed as a distributed technique to calculate the most vulnerable path for stepping stones attacks in dynamic vulnerability graphs. We use min-plus algebra to analyze and provide necessary and sufficient convergence conditions to the shortest path in the fixed topology case. A necessary condition for the switching topology case is provided. Most cyber-attacks involve an attacker launching a multi-stage attack by exploiting a sequence of hosts. This multi-stage attack generates a chain of ``stepping stones” from the origin to target. The choice of stepping stones is a function of the degree of exploitability, the impact, attacker’s capability, masking origin location, and intent. In this Dissertation, we model and analyze scenarios wherein an attacker employs multiple strategies to choose stepping stones. The problem is modeled as an Adjacency Quadratic Shortest Path using dynamic vulnerability graphs with multi-agent dynamic system approach. With this approach, the shortest stepping stone path with maximum node degree and the shortest stepping stone path with maximum impact are modeled and analyzed. Because embedded controllers are omnipresent in networks, in this Dissertation as a Risk Mitigation Strategy, a cyber-attack tolerant control strategy for embedded controllers is proposed. A dual redundant control architecture that combines two identical controllers that are switched periodically between active and restart modes is proposed. The strategy is addressed to mitigate the impact due to the corruption of the controller software by an adversary. We analyze the impact of the resetting and restarting the controller software and performance of the switching process. The minimum requirements in the control design, for effective mitigation of cyber-attacks to the control software that implies a “fast” switching period is provided. The simulation results demonstrate the effectiveness of the proposed strategy when the time to fully reset and restart the controller is faster than the time taken by an adversary to compromise the controller. The results also provide insights into the stability and safety regions and the factors that determine the effectiveness of the proposed strategy

    Multibiometric security in wireless communication systems

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 05/08/2010.This thesis has aimed to explore an application of Multibiometrics to secured wireless communications. The medium of study for this purpose included Wi-Fi, 3G, and WiMAX, over which simulations and experimental studies were carried out to assess the performance. In specific, restriction of access to authorized users only is provided by a technique referred to hereafter as multibiometric cryptosystem. In brief, the system is built upon a complete challenge/response methodology in order to obtain a high level of security on the basis of user identification by fingerprint and further confirmation by verification of the user through text-dependent speaker recognition. First is the enrolment phase by which the database of watermarked fingerprints with memorable texts along with the voice features, based on the same texts, is created by sending them to the server through wireless channel. Later is the verification stage at which claimed users, ones who claim are genuine, are verified against the database, and it consists of five steps. Initially faced by the identification level, one is asked to first present one’s fingerprint and a memorable word, former is watermarked into latter, in order for system to authenticate the fingerprint and verify the validity of it by retrieving the challenge for accepted user. The following three steps then involve speaker recognition including the user responding to the challenge by text-dependent voice, server authenticating the response, and finally server accepting/rejecting the user. In order to implement fingerprint watermarking, i.e. incorporating the memorable word as a watermark message into the fingerprint image, an algorithm of five steps has been developed. The first three novel steps having to do with the fingerprint image enhancement (CLAHE with 'Clip Limit', standard deviation analysis and sliding neighborhood) have been followed with further two steps for embedding, and extracting the watermark into the enhanced fingerprint image utilising Discrete Wavelet Transform (DWT). In the speaker recognition stage, the limitations of this technique in wireless communication have been addressed by sending voice feature (cepstral coefficients) instead of raw sample. This scheme is to reap the advantages of reducing the transmission time and dependency of the data on communication channel, together with no loss of packet. Finally, the obtained results have verified the claims

    Dynamic Protocol Reverse Engineering a Grammatical Inference Approach

    Get PDF
    Round trip engineering of software from source code and reverse engineering of software from binary files have both been extensively studied and the state-of-practice have documented tools and techniques. Forward engineering of protocols has also been extensively studied and there are firmly established techniques for generating correct protocols. While observation of protocol behavior for performance testing has been studied and techniques established, reverse engineering of protocol control flow from observations of protocol behavior has not received the same level of attention. State-of-practice in reverse engineering the control flow of computer network protocols is comprised of mostly ad hoc approaches. We examine state-of-practice tools and techniques used in three open source projects: Pidgin, Samba, and rdesktop . We examine techniques proposed by computational learning researchers for grammatical inference. We propose to extend the state-of-art by inferring protocol control flow using grammatical inference inspired techniques to reverse engineer automata representations from captured data flows. We present evidence that grammatical inference is applicable to the problem domain under consideration

    Private and censorship-resistant communication over public networks

    Get PDF
    Society’s increasing reliance on digital communication networks is creating unprecedented opportunities for wholesale surveillance and censorship. This thesis investigates the use of public networks such as the Internet to build robust, private communication systems that can resist monitoring and attacks by powerful adversaries such as national governments. We sketch the design of a censorship-resistant communication system based on peer-to-peer Internet overlays in which the participants only communicate directly with people they know and trust. This ‘friend-to-friend’ approach protects the participants’ privacy, but it also presents two significant challenges. The first is that, as with any peer-to-peer overlay, the users of the system must collectively provide the resources necessary for its operation; some users might prefer to use the system without contributing resources equal to those they consume, and if many users do so, the system may not be able to survive. To address this challenge we present a new game theoretic model of the problem of encouraging cooperation between selfish actors under conditions of scarcity, and develop a strategy for the game that provides rational incentives for cooperation under a wide range of conditions. The second challenge is that the structure of a friend-to-friend overlay may reveal the users’ social relationships to an adversary monitoring the underlying network. To conceal their sensitive relationships from the adversary, the users must be able to communicate indirectly across the overlay in a way that resists monitoring and attacks by other participants. We address this second challenge by developing two new routing protocols that robustly deliver messages across networks with unknown topologies, without revealing the identities of the communication endpoints to intermediate nodes or vice versa. The protocols make use of a novel unforgeable acknowledgement mechanism that proves that a message has been delivered without identifying the source or destination of the message or the path by which it was delivered. One of the routing protocols is shown to be robust to attacks by malicious participants, while the other provides rational incentives for selfish participants to cooperate in forwarding messages
    • …
    corecore