10 research outputs found

    Security Applications of Formal Language Theory

    Get PDF
    We present an approach to improving the security of complex, composed systems based on formal language theory, and show how this approach leads to advances in input validation, security modeling, attack surface reduction, and ultimately, software design and programming methodology. We cite examples based on real-world security flaws in common protocols representing different classes of protocol complexity. We also introduce a formalization of an exploit development technique, the parse tree differential attack, made possible by our conception of the role of formal grammars in security. These insights make possible future advances in software auditing techniques applicable to static and dynamic binary analysis, fuzzing, and general reverse-engineering and exploit development. Our work provides a foundation for verifying critical implementation components with considerably less burden to developers than is offered by the current state of the art. It additionally offers a rich basis for further exploration in the areas of offensive analysis and, conversely, automated defense tools and techniques. This report is divided into two parts. In Part I we address the formalisms and their applications; in Part II we discuss the general implications and recommendations for protocol and software design that follow from our formal analysis

    A Grammatical Inference Approach to Language-Based Anomaly Detection in XML

    Full text link
    False-positives are a problem in anomaly-based intrusion detection systems. To counter this issue, we discuss anomaly detection for the eXtensible Markup Language (XML) in a language-theoretic view. We argue that many XML-based attacks target the syntactic level, i.e. the tree structure or element content, and syntax validation of XML documents reduces the attack surface. XML offers so-called schemas for validation, but in real world, schemas are often unavailable, ignored or too general. In this work-in-progress paper we describe a grammatical inference approach to learn an automaton from example XML documents for detecting documents with anomalous syntax. We discuss properties and expressiveness of XML to understand limits of learnability. Our contributions are an XML Schema compatible lexical datatype system to abstract content in XML and an algorithm to learn visibly pushdown automata (VPA) directly from a set of examples. The proposed algorithm does not require the tree representation of XML, so it can process large documents or streams. The resulting deterministic VPA then allows stream validation of documents to recognize deviations in the underlying tree structure or datatypes.Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and Countermeasures ECTCM 201

    Speaking the Local Dialect: Exploiting differences between IEEE 802.15.4 Receivers with Commodity Radios for fingerprinting, targeted attacks, and WIDS evasion

    Get PDF
    Producing IEEE 802.15.4 PHY-frames reliably accepted by some digital radio receivers, but rejected by others---depending on the receiver chip\u27s make and model---has strong implications for wireless security. Attackers could target specific receivers by crafting shaped charges, attack frames that appear valid to the intended target and are ignored by all other recipients. By transmitting in the unique, slightly non-compliant dialect of the intended receivers, attackers would be able to create entire communication streams invisible to others, including wireless intrusion detection and prevention systems (WIDS/WIPS). These scenarios are no longer theoretic. We present methods of producing such IEEE 802.15.4 frames with commodity digital radio chips widely used in building inexpensive 802.15.4-conformant devices. Typically, PHY-layer fingerprinting requires software-defined radios that cost orders of magnitude more than the chips they fingerprint; however, our methods do not require a software-defined radio and use the same inexpensive chips. Knowledge of such differences, and the ability to fingerprint them is crucial for defenders. We investigate new methods of fingerprinting IEEE 802.15.4 devices by exploring techniques to differentiate between multiple 802.15.4-conformant radio-hardware manufacturers and firmware distributions. Further, we point out the implications of these results for WIDS, both with respect to WIDS evasion techniques and countering such evasion

    Automatic Generation of Input Grammars Using Symbolic Execution

    Get PDF
    Invalid input often leads to unexpected behavior in a program and is behind a plethora of known and unknown vulnerabilities. To prevent improper input from being processed, the input needs to be validated before the rest of the program executes. Formal language theory facilitates the definition and recognition of proper inputs. We focus on the problem of defining valid input after the program has already been written. We construct a parser that infers the structure of inputs which avoid vulnerabilities while existing work focuses on inferring the structure of input the program anticipates. We present a tool that constructs an input language, given the program as input, using symbolic execution on symbolic arguments. This differs from existing work which tracks the execution of concrete inputs to infer a grammar. We test our tool on programs with known vulnerabilities, including programs in the GNU Coreutils library, and we demonstrate how the parser catches known invalid inputs. We conclude that the synthesis of the complete parser cannot be entirely automated due to limitations of symbolic execution tools and issues of computability. A more comprehensive parser must additionally be informed by examples and counterexamples of the input language

    Parsing Protocol Standards to Parse Standard Protocols

    Get PDF
    Internet protocol standards have been slow to adopt formal protocol description languages and methodologies, and are still largely written as English prose. This makes it hard to check them for correctness, or to automatically derive implementations from standards. Reasons for this are both technical and social. Some methodologies effectively describe complex communication patterns, but cannot model protocol data. Others are unnecessarily tied to particular description formats, or use unfamiliar concepts and terminology, and don't address usability by standards developers. We assess the viability of existing approaches to modelling and parsing protocol data, and identify missing features needed to represent emerging protocols. We present a typed protocol representation that can describe: (i) the format of protocol data, including data-dependent formats; (ii) contextual information needed to maintain parser state, where correct parsing may depend on out-of-band information or prior packets; and (iii) transformations and helper functions needed for multi-stage parsing. We discuss social barriers to adoption, and describe a set of principles to encourage use of formal languages within the Internet standards process. We show how to integrate our approach with the existing standards process, using QUIC as an example

    Bridging the Gap Between Intent and Outcome: Knowledge, Tools & Principles for Security-Minded Decision-Making

    Get PDF
    Well-intentioned decisions---even ones intended to improve aggregate security--- may inadvertently jeopardize security objectives. Adopting a stringent password composition policy ostensibly yields high-entropy passwords; however, such policies often drive users to reuse or write down passwords. Replacing URLs in emails with safe URLs that navigate through a gatekeeper service that vets them before granting user access may reduce user exposure to malware; however, it may backfire by reducing the user\u27s ability to parse the URL or by giving the user a false sense of security if user expectations misalign with the security checks delivered by the vetting process. A short timeout threshold may ensure the user is promptly logged out when the system detects they are away; however, if an infuriated user copes by inserting a USB stick in their computer to emulate mouse movements, then not only will the detection mechanism fail but the insertion of the USB stick may present a new attack surface. These examples highlight the disconnect between decision-maker intentions and decision outcomes. Our focus is on bridging this gap. This thesis explores six projects bound together by the core objective of empowering people to make decisions that achieve their security and privacy objectives. First, we use grounded theory to examine Amazon reviews of password logbooks and to obtain valuable insights into users\u27 password management beliefs, motivations, and behaviors. Second, we present a discrete-event simulation we built to assess the efficacy of password policies. Third, we explore the idea of supplementing language-theoretic security with human-computability boundaries. Fourth, we conduct an eye-tracking study to understand users\u27 visual processes while parsing and classifying URLs. Fifth, we discuss preliminary findings from a study conducted on Amazon Mechanical Turk to examine why users fall for unsafe URLs. And sixth, we develop a logic-based representation of mismorphisms, which allows us to express the root causes of security problems. Each project demonstrates a key technique that can help in bridging the gap between intent and outcome
    corecore