
    Application of information theory and statistical learning to anomaly detection

    In today's highly networked world, computer intrusions and other attacks are a constant threat. The detection of such attacks, especially attacks that are new or previously unknown, is important to secure networks and computers. A major focus of current research efforts in this area is on anomaly detection. In this dissertation, we explore applications of information theory and statistical learning to anomaly detection. Specifically, we look at two difficult detection problems in network and system security: (1) detecting covert channels, and (2) determining whether a user is a human or a bot. We link both of these problems to entropy, a measure of randomness, information content, or complexity, a concept that is central to information theory. The behavior of bots is low in entropy when tasks are rigidly repeated or high in entropy when behavior is pseudo-random. In contrast, human behavior is complex and medium in entropy. Similarly, covert channels either create regularity, resulting in low entropy, or encode extra information, resulting in high entropy. Meanwhile, legitimate traffic is characterized by complex interdependencies and moderate entropy. In addition, we utilize statistical learning algorithms (Bayesian learning, neural networks, and maximum likelihood estimation) in both modeling and detecting covert channels and bots. Our results using entropy and statistical learning techniques are excellent. By using entropy to detect covert channels, we detected three different covert timing channels that were not detected by previous detection methods. Then, using entropy and Bayesian learning to detect chat bots, we detected 100% of chat bots with a false positive rate of only 0.05% in over 1400 hours of chat traces. Lastly, using neural networks and the idea of human observational proofs to detect game bots, we detected 99.8% of game bots with no false positives in 95 hours of traces. Our work shows that a combination of entropy measures and statistical learning algorithms is a powerful and highly effective tool for anomaly detection.
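    The dissertation links detection to entropy, but the abstract does not spell out how entropy is estimated from traffic, so the following is only a minimal sketch of the idea: bin inter-event delays, compute Shannon entropy, and flag behavior whose entropy is unusually low (rigid repetition, regularity-based covert channels) or unusually high (pseudo-random bots, channels encoding extra information). The bin width and both thresholds are illustrative assumptions, not values from the dissertation.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def discretize(delays, bin_width=0.5):
    """Map inter-event delays (seconds) into coarse bins so entropy is comparable."""
    return [int(d // bin_width) for d in delays]

def classify_by_entropy(delays, low=1.0, high=3.5):
    """Three-way decision, purely for illustration: too regular, too random, or human-like."""
    h = shannon_entropy(discretize(delays))
    if h < low:
        return "suspicious: low entropy (rigid repetition)"
    if h > high:
        return "suspicious: high entropy (pseudo-random or extra encoding)"
    return "normal: moderate entropy"

# A bot posting every 2.0 s versus irregular, human-like pauses.
print(classify_by_entropy([2.0] * 50))
print(classify_by_entropy([0.8, 3.1, 1.4, 6.2, 2.7] * 10))
```

    In the dissertation itself, fixed thresholds like these are replaced by learned models (Bayesian learning, neural networks, maximum likelihood estimation) trained on real traces.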

    Quantitative analysis of the release order of defensive mechanisms

    PhD Thesis. Dependency on information technology (IT) and computer and information security (CIS) has become a critical concern for many organizations. This concern has essentially centred on protecting the secrecy, confidentiality, integrity and availability of information. To address this concern, defensive mechanisms, which encompass a variety of services and protections, have been proposed to protect system resources from misuse. Most of these defensive mechanisms, such as CAPTCHAs and spam filters, rely in the first instance on a single algorithm. Attackers eventually break each such algorithm, at which point it becomes useless and the system is no longer protected. Although a broken algorithm is then replaced by a new one, little attention has been paid to a set of algorithms as a defensive mechanism. This thesis looks at a set of algorithms as a holistic defensive mechanism. Our hypothesis is that the order in which a set of defensive algorithms is released has a significant impact on the time taken by attackers to break the combined set of algorithms. The rationale behind this hypothesis is that attackers learn from their attempts, and that the release schedule of defensive mechanisms can be adjusted so as to impair the learning process. To demonstrate the correctness of our hypothesis, an experimental study involving forty participants was conducted to evaluate the effect of the algorithms' release order on the time taken to break them. In addition, this experiment explores how the learning process of attackers can be observed. The results showed that the order in which algorithms are released has a statistically significant impact on the time attackers take to break all algorithms. Based on these results, a model has been constructed using Stochastic Petri Nets, which facilitates theoretical analysis of the release-order approach for a set of algorithms. Moreover, a tailored optimization algorithm is proposed using a Markov Decision Process model in order to efficiently obtain the optimal release strategy for any given model by maximizing the time taken to break the set of algorithms. As our hypothesis is based on the learning ability of attackers while interacting with the system, the Attacker Learning Curve (ALC) concept is developed. Based on empirical results of the ALC, an attack strategy detection approach is introduced and evaluated, which achieved a detection success rate higher than 70%. The empirical findings of this detection approach provide a new understanding not only of how to detect the attack strategy used, but also of how to track the attack strategy through the classification probabilities, which may provide an advantage for optimising the release order of defensive mechanisms.
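    The thesis models release scheduling with Stochastic Petri Nets and solves for an optimal release strategy with a Markov Decision Process; the toy sketch below only illustrates the objective being optimised: attacker learning shrinks later break times, so the release order changes the total time to break the set. The break times and learning-transfer factors are invented for the example, and exhaustive search stands in for the MDP solution.

```python
from itertools import permutations

# Illustrative (invented) break times in hours for three hypothetical algorithms.
base = {"A": 10.0, "B": 8.0, "C": 12.0}

# learning[i][j]: fraction of base[j] still needed after the attacker has
# already broken algorithm i (values below 1 model learning transfer).
learning = {
    "A": {"B": 0.6, "C": 0.9},
    "B": {"A": 0.7, "C": 0.8},
    "C": {"A": 0.9, "B": 0.5},
}

def total_break_time(order):
    """Total time to break every algorithm when released in the given order."""
    total, broken = 0.0, []
    for alg in order:
        factor = 1.0
        for prev in broken:  # every previously broken algorithm helps the attacker
            factor *= learning[prev][alg]
        total += base[alg] * factor
        broken.append(alg)
    return total

# Exhaustive search stands in for the MDP solution on this toy instance.
best = max(permutations(base), key=total_break_time)
print("best release order:", best, "-> hours to break all:", round(total_break_time(best), 2))
```

    On realistic models the state space grows quickly with the number of algorithms, which is where the MDP formulation and the tailored optimisation algorithm described in the thesis come in.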

    Detecting Abnormal Behavior in Web Applications

    The rapid advance of web technologies has made the Web an essential part of our daily lives. However, network attacks have exploited vulnerabilities of web applications and caused substantial damage to Internet users. Detecting network attacks is an important first step in network security. A major branch in this area is anomaly detection. This dissertation concentrates on detecting abnormal behaviors in web applications by employing the following methodology. For a web application, we conduct a set of measurements to reveal the existence of abnormal behaviors in it. We observe the differences between normal and abnormal behaviors. By applying a variety of methods in information extraction, such as heuristic algorithms, machine learning, and information theory, we extract features useful for building a classification system to detect abnormal behaviors. In particular, we have studied four detection problems in web security. The first is detecting the unauthorized hotlinking behavior that plagues hosting servers on the Internet. We analyze a group of common hotlinking attacks and the web resources targeted by them. Then we present an anti-hotlinking framework for protecting materials on hosting servers. The second problem is detecting aggressive automation behavior on Twitter. Our work determines whether a Twitter user is a human, bot, or cyborg based on the degree of automation. We observe the differences among the three categories in terms of tweeting behavior, tweet content, and account properties. We propose a classification system that uses the combination of features extracted from an unknown user to determine the likelihood of its being a human, bot, or cyborg. Furthermore, we shift the detection perspective from automation to spam and introduce the third problem, namely detecting social spam campaigns on Twitter. Evolved from individual spammers, spam campaigns manipulate and coordinate multiple accounts to spread spam on Twitter and display some collective characteristics. We design an automatic classification system based on machine learning and apply multiple features to classifying spam campaigns. Complementary to conventional spam detection methods, our work brings efficiency and robustness. Finally, we extend our detection research into the blogosphere to capture blog bots. In this problem, detecting the human presence is an effective defense against the automatic posting ability of blog bots. We introduce behavioral biometrics, mainly mouse and keyboard dynamics, to distinguish between human and bot. By passively monitoring user browsing activities, this detection method does not require any direct user participation and improves the user experience.
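    For the human/bot/cyborg problem, the abstract names three feature groups (tweeting behavior, tweet content, account properties) but not their exact definitions, so the sketch below uses assumed stand-ins: entropy of binned inter-tweet gaps, the fraction of tweets containing URLs, and the follower-to-friend ratio. The fixed rule at the end is only a placeholder for the learned classifier described in the dissertation.

```python
import math
from collections import Counter

def interval_entropy(timestamps, bin_seconds=60):
    """Entropy of binned inter-tweet gaps; near-zero entropy suggests automation."""
    gaps = [int((b - a) // bin_seconds) for a, b in zip(timestamps, timestamps[1:])]
    n = len(gaps)
    if n == 0:
        return 0.0
    counts = Counter(gaps)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def extract_features(timestamps, tweets, followers, friends):
    """Illustrative feature vector covering timing, content, and account properties."""
    url_ratio = sum("http" in t for t in tweets) / max(len(tweets), 1)
    return {
        "interval_entropy": interval_entropy(timestamps),
        "url_ratio": url_ratio,
        "follower_friend_ratio": followers / max(friends, 1),
    }

def crude_label(f):
    """Fixed rule used only as a placeholder for a learned classifier."""
    if f["interval_entropy"] < 0.5 and f["url_ratio"] > 0.8:
        return "likely bot"
    if f["interval_entropy"] < 1.5:
        return "possibly cyborg"
    return "likely human"

f = extract_features([0, 60, 120, 180], ["great deal http://spam.example"] * 4,
                     followers=10, friends=2000)
print(crude_label(f))  # metronomic timing + all-URL tweets -> "likely bot"
```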

    Dagstuhl News January - December 2008

    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

    Criminal Innovation and the Warrant Requirement: Reconsidering the Rights-Police Efficiency Trade-Off

    It is routinely assumed that there is a trade-off between police efficiency and the warrant requirement. But existing analysis ignores the interaction between law-enforcement investigative practices and criminal innovation. Narrowing the definition of a search or otherwise limiting the requirement for a warrant gives criminals greater incentive to innovate to avoid detection. With limited resources to develop countermeasures, law enforcement officers will often be just as effective at capturing criminals when facing higher Fourth Amendment hurdles. We provide a game-theoretic model that shows that when law-enforcement investigation and criminal innovation are considered in a dynamic context, the police efficiency rationale for lowering Fourth Amendment rights is often inapt. We analyze how this impacts both criminal activity and innocent communications that individuals seek to keep private in the digital age. We show that both law-enforcement and noncriminal privacy concerns may be better promoted by maintaining the warrant requirement.

    A taxonomy of phishing research

    Phishing is a widespread threat that has attracted a lot of attention from the security community. A significant amount of research has focused on designing automated mitigation techniques. However, these techniques have largely proven successful only at catching previously witnessed phishing campaigns. Characteristics of phishing emails and web pages have been thoroughly analyzed, but not enough emphasis has been put on exploring alternate attack vectors. Novel education approaches have been shown to be effective at teaching users to recognize phishing attacks and are adaptable to other kinds of threats. In this thesis, we explore a large amount of existing literature on phishing and present a comprehensive taxonomy of the current state of phishing research. With this extensive literature review, we illuminate both the areas of phishing research that we believe will prove fruitful and the areas that appear to be oversaturated.

    Money & Trust in Digital Society, Bitcoin and Stablecoins in ML enabled Metaverse Telecollaboration

    We present a state-of-the-art and positioning book about digital society tools, namely Web3, Bitcoin, the Metaverse, AI/ML, accessibility, safeguarding, and telecollaboration. A high-level overview of Web3 technologies leads to a description of blockchain, and the Bitcoin network is specifically selected for detailed examination. Suitable components of the extended Bitcoin ecosystem are described in more depth. Other mechanisms for native digital value transfer are described, with a focus on `money'. Metaverse technology is overviewed, primarily from the perspective of Bitcoin and extended reality. Bitcoin is selected as the best contender for value transfer in metaverses because of its free and open-source nature and its network effect. Challenges and risks of this approach are identified. A cloud-deployable, virtual-machine-based technology stack deployment guide, with a focus on cybersecurity best practice, can be downloaded from GitHub to experiment with the technologies. This deployable lab is designed to inform the development of secure value transactions for small and medium-sized companies.

    Image Understanding for Automatic Human and Machine Separation.

    PhD. The research presented in this thesis aims to extend the capabilities of human interaction proofs in order to improve security in web applications and services. The research focuses on developing a more robust and efficient Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) to increase the gap between human recognition and machine recognition. Two main novel approaches are presented, each of them targeting a different area of human and machine recognition: a character recognition test and an image recognition test. Along with the novel approaches, a categorisation of the available CAPTCHA methods is also introduced. The character recognition CAPTCHA is based on the creation of depth perception by using shadows to represent characters. The characters are created by the imaginary shadows produced by a light source, on the basis of the gestalt principle that human beings can perceive whole forms instead of just a collection of simple lines and curves. This approach was developed in two stages: firstly, two-dimensional characters, and secondly, three-dimensional character models. The image recognition CAPTCHA is based on the creation of cartoons out of faces. The faces used belong to people in the entertainment business, politicians, and sportsmen. The principal basis of this approach is that face perception is a cognitive process that humans perform easily and with a high rate of success. The process involves the use of face morphing techniques to distort the faces into cartoons, making the resulting image more robust against machine recognition. Exhaustive tests on both approaches using OCR software, SIFT image recognition, and face recognition software show an improvement in the human recognition rate whilst preventing robots from breaking through the tests.
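    The abstract describes the character recognition CAPTCHA only at the level of the idea (characters implied by shadows cast from a light source), so the snippet below is a loose illustration rather than the thesis's rendering pipeline: it composites only faded, diagonally offset copies of a glyph, so the character is suggested by its "shadow" instead of being drawn outright. Pillow, the default bitmap font, and all offsets and fade factors are assumptions made for the example; a real test would use large TrueType glyphs and a proper 3D light model.

```python
from PIL import Image, ImageChops, ImageDraw, ImageFilter, ImageFont, ImageOps

def shadow_char(ch, size=(200, 120)):
    """Render only a blurred, diagonally smeared 'shadow' of a character."""
    glyph = Image.new("L", size, 0)
    ImageDraw.Draw(glyph).text((60, 20), ch, fill=255, font=ImageFont.load_default())

    shadow = Image.new("L", size, 0)
    for step in range(1, 25):
        # Each shifted, faded copy contributes to a cast shadow; the crisp glyph
        # itself is never pasted onto the output image.
        faded = glyph.point(lambda v, s=step: max(v - s * 9, 0))
        shadow = ImageChops.lighter(shadow, ImageChops.offset(faded, step, step))

    # Invert so the shadow appears dark on a white background.
    return ImageOps.invert(shadow.filter(ImageFilter.GaussianBlur(1.5)))

shadow_char("R").save("shadow_captcha.png")
```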

    Accountable Algorithms

    Many important decisions historically made by people are now made by computers. Algorithms count votes, approve loan and credit card applications, target citizens or neighborhoods for police scrutiny, select taxpayers for IRS audit, grant or deny immigration visas, and more. The accountability mechanisms and legal standards that govern such decision processes have not kept pace with technology. The tools currently available to policymakers, legislators, and courts were developed to oversee human decisionmakers and often fail when applied to computers instead. For example, how do you judge the intent of a piece of software? Because automated decision systems can return potentially incorrect, unjustified, or unfair results, additional approaches are needed to make such systems accountable and governable. This Article reveals a new technological toolkit to verify that automated decisions comply with key standards of legal fairness. We challenge the dominant position in the legal literature that transparency will solve these problems. Disclosure of source code is often neither necessary (because of alternative techniques from computer science) nor sufficient (because of the issues analyzing code) to demonstrate the fairness of a process. Furthermore, transparency may be undesirable, such as when it discloses private information or permits tax cheats or terrorists to game the systems determining audits or security screening. The central issue is how to assure the interests of citizens, and society as a whole, in making these processes more accountable. This Article argues that technology is creating new opportunities—subtler and more flexible than total transparency—to design decisionmaking algorithms so that they better align with legal and policy objectives. Doing so will improve not only the current governance of automated decisions, but also—in certain cases—the governance of decisionmaking in general. The implicit (or explicit) biases of human decisionmakers can be difficult to find and root out, but we can peer into the “brain” of an algorithm: computational processes and purpose specifications can be declared prior to use and verified afterward. The technological tools introduced in this Article apply widely. They can be used in designing decisionmaking processes from both the private and public sectors, and they can be tailored to verify different characteristics as desired by decisionmakers, regulators, or the public. By forcing a more careful consideration of the effects of decision rules, they also engender policy discussions and closer looks at legal standards. As such, these tools have far-reaching implications throughout law and society. Part I of this Article provides an accessible and concise introduction to foundational computer science techniques that can be used to verify and demonstrate compliance with key standards of legal fairness for automated decisions without revealing key attributes of the decisions or the processes by which the decisions were reached. Part II then describes how these techniques can assure that decisions are made with the key governance attribute of procedural regularity, meaning that decisions are made under an announced set of rules consistently applied in each case. We demonstrate how this approach could be used to redesign and resolve issues with the State Department’s diversity visa lottery. 
    In Part III, we go further and explore how other computational techniques can assure that automated decisions preserve fidelity to substantive legal and policy choices. We show how these tools may be used to assure that certain kinds of unjust discrimination are avoided and that automated decision processes behave in ways that comport with the social or legal standards that govern the decision. We also show how automated decisionmaking may even complicate existing doctrines of disparate treatment and disparate impact, and we discuss some recent computer science work on detecting and removing discrimination in algorithms, especially in the context of big data and machine learning. And lastly, in Part IV, we propose an agenda to further synergistic collaboration between computer science, law, and policy to advance the design of automated decision processes for accountability.
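    The abstract says that computational processes and purpose specifications "can be declared prior to use and verified afterward" but does not name a specific mechanism, so the sketch below assumes one standard way to obtain that property: a hash commitment to the decision rule and its randomness, published before the drawing, plus a deterministic, seeded selection that anyone can re-run once the seed is revealed. The rule identifier, the HMAC-based ranking, and the applicant list are illustrative assumptions, not the Article's actual redesign of the diversity visa lottery.

```python
import hashlib
import hmac
import secrets

def commit(rule_id, seed):
    """Publish H(rule_id || seed) before any decision is made."""
    return hashlib.sha256(f"{rule_id}|{seed.hex()}".encode()).hexdigest()

def select_winners(applicants, seed, k):
    """Deterministic 'lottery': rank applicants by a keyed hash of their ID."""
    ranked = sorted(
        applicants,
        key=lambda a: hmac.new(seed, a.encode(), hashlib.sha256).hexdigest(),
    )
    return ranked[:k]

# Before the drawing: the agency commits to its rule and its randomness.
seed = secrets.token_bytes(16)
commitment = commit("visa-lottery-v1", seed)
print("published commitment:", commitment)

# After the drawing: the seed is revealed, so anyone can recompute the result
# and check that the announced rule was applied consistently to every case.
applicants = ["alice", "bob", "carol", "dave"]
winners = select_winners(applicants, seed, k=2)
assert commit("visa-lottery-v1", seed) == commitment
print("winners:", winners)
```

    A full accountable design would combine this with techniques that verify properties of the decision without revealing its inputs, which is the kind of toolkit the abstract says Part I introduces.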