585 research outputs found

    Quantitative And Qualitative Evaluation Of Metrics On Object Graphs Extracted By Abstract Interpretation

    Evaluating programming-language based techniques is crucial to judge their usefulness in practice but requires a careful selection of systems on which to evaluate the technique. Since it is particularly hard to evaluate a heavyweight technique, such as one that requires adding annotations to the code or rewriting the system in a radically different language, it is common to use a lightweight proxy to predict the technique's usefulness for a system. But the reliability of such a proxy is unclear. We propose a principled data-driven approach to derive a lightweight proxy for a heavyweight technique that requires adding annotations to the code. The approach involves the following: computing metrics (DiffMetrics) that measure differences between a system representation (e.g., the code structure) and the system representation extracted by the heavyweight technique (e.g., abstraction of the runtime structure); identifying the outliers of the DiffMetrics; identifying code patterns and classifying the outliers based on the identified code patterns; implementing visitors that look for the code patterns on systems with no annotations; identifying code metrics that correlate strongly with the DiffMetrics. For a new system with no annotations, a proxy predicts if the heavyweight technique may be useful based on the results from the visitors and the code metrics. To evaluate the approach, we run the visitors and compute code metrics on four systems that were previously not analyzed. The proxy predicts that the heavyweight technique may be useful for two of the systems. Thus, the abstract runtime structure may be significantly different from the code structure for those systems. To validate the proxy's predictions, we run the heavyweight technique on the two systems to confirm the predictions. 
Such a principled approach is reusable and can be applied to any programming-language based technique to identify systems for evaluation and to better understand the types of systems for which a technique is most useful
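
The outlier and correlation steps named in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how outliers are detected or which correlation coefficient is used, so Tukey's 1.5×IQR rule and Pearson's r are assumptions here, and the names `iqr_outliers`, `pearson`, `diff_metric`, and `code_metric`, along with all data values, are hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_outliers(values):
    """Return indices of values outside Tukey's 1.5*IQR fences (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-class values: a DiffMetric computed from annotated systems
# and a candidate code metric that needs no annotations.
diff_metric = [0.10, 0.12, 0.11, 0.13, 0.09, 0.95]
code_metric = [3, 4, 3, 5, 2, 40]

outlier_indices = iqr_outliers(diff_metric)   # classes worth inspecting by hand
r = pearson(code_metric, diff_metric)         # how well the cheap metric tracks it
```

In a setup like this, a code metric with a high `r` would be a candidate ingredient for the lightweight proxy, while the flagged outliers are where one would look for recurring code patterns.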

    Static Extraction Of Dataflow Communication For Security

    The cost of security vulnerabilities in widely-deployed code such as mobile applications is high. As a result, many companies are using Architectural Risk Analysis (ARA) to find security vulnerabilities before releasing their applications. The existing analyses are focused on finding local coding bugs such as a hard-coded password, rather than architectural flaws such as bypassing the authentication component. During ARA, to find vulnerabilities that are architectural flaws, security architects use a forest-level view of the runtime architecture instead of reading the code. Unfortunately, such a view is often missing from the documentation or is inconsistent with the code. This thesis contributes Scoria, a semi-automated approach for finding architectural flaws that uses a static analysis to extract from code with annotations an approximation of the runtime architecture as an abstract object graph with dataflow edges that refer to abstract objects. The annotations express local, modular hints about architectural tiers, logical containment, and strict encapsulation, such that the extracted object graph is hierarchical, which provides architects with both high-level and detailed understanding of the runtime architecture. Moreover, the abstract object graph is sound such that it has unique representatives for all objects and dataflow communication that may exist at runtime. Architects assisted by Scoria can write as machine-checkable constraints various security policies that are documented only informally. The constraints are in terms of object provenance and indirect communication and can find vulnerabilities missed by constraints that focus only on the presence or the absence of communication, or constraints that track only information flow from sources to sinks. The evaluation consists of expressing several rules from the CERT Secure Coding Standard for Java for which automated detection was previously unavailable. 
Scoria is also being used to find information disclosure in open-source Android apps. Based on an existing benchmark, Scoria performs better than commercial and research tools in terms of precision and recall. Scoria is thus making Architectural Risk Analysis, which is today mostly manual and informal, a more rigorous, principled and repeatable activity

    Evaluation Of An Architectural-Level Approach For Finding Security Vulnerabilities

    The cost of security vulnerabilities in a software system is high. As a result, many techniques have been developed to find the vulnerabilities at development time. Of particular interest are static analysis techniques that can consider all possible executions of a system. However, static analysis can suffer from a large number of false positives. A recently developed approach, Scoria, is a semi-automated static analysis that requires security architects to annotate the code, typecheck the annotations, extract a hierarchical object graph and write constraints in order to find security vulnerabilities in a system. This thesis evaluates Scoria on three systems (sizes 6 KLOC, 6 KLOC and 25 KLOC) from different application domains (Android and Web) and confirms that Scoria can find security vulnerabilities in those systems without an excessive number of false positives

    Automated Refinement Of Hierarchical Object Graphs

    Object graphs help explain the runtime structure of a system. To make object graphs convey design intent, one insight is to use abstraction by hierarchy, i.e., to show objects that are implementation details as children of architecturally-relevant objects from the application domain. But additional information is needed to express this object hierarchy, using ownership type qualifiers in the code. Adding qualifiers after the fact involves manual overhead, and requires developers to switch between adding qualifiers in the code and looking at abstract object graphs to understand the object structures that the qualifiers describe. We propose an approach where developers express their design intent by refining an object graph directly, while an inference analysis infers valid qualifiers in the code. We present, formalize and implement the inference analysis. Novel features of the inference analysis compared to closely related work include a larger set of qualifiers to support less restrictive object hierarchy (logical containment) in addition to strict hierarchy (strict encapsulation), as well as object uniqueness and object borrowing. A separate extraction analysis then uses these qualifiers and extracts an updated object graph. We evaluate the approach on two subject systems. One of the subject systems is reproduced from an experiment using related techniques and another ownership type system, which enables a meaningful comparison. For the other subject system, we use its documentation to pick refinements that express design intent. We compute metrics on the refinements (how many attempts on each subject system) and classify them by their type. We also compute metrics on the inferred qualifiers and metrics on the object graphs to enable quantitative comparison. Moreover, we qualitatively compare the hierarchical object graphs with the flat object graphs and with each other, by highlighting how they express design intent. 
Finally, we confirm that the approach can infer from refinements valid qualifiers such that the extracted object graphs reflect the design intent of the refinements

    Towards Implicit Parallel Programming for Systems

    Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard, especially when the program requires state, which many system programs use for optimization, such as a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realm of an associated optimizing compiler. This prevents the compiler from introducing parallelism automatically and requires the programmer to optimize the program manually. In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for parallel execution. We define the notion of a stateful function along with its composition and control structures. An example implementation of a highly scalable server shows that stateful functions integrate smoothly into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows existing code bases to be adapted gradually. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantics-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates

    A multi-paradigm language for reactive synthesis

    This paper proposes a language for describing reactive synthesis problems that integrates imperative and declarative elements. The semantics is defined in terms of two-player turn-based infinite games with full information. Currently, synthesis tools accept linear temporal logic (LTL) as input, but this description is less structured and does not facilitate the expression of sequential constraints. This motivates the use of a structured programming language to specify synthesis problems. Transition systems and guarded commands serve as imperative constructs, expressed in a syntax based on that of the modeling language Promela. The syntax allows defining which player controls data and control flow, and separating a program into assumptions and guarantees. These notions are necessary for input to game solvers. The integration of imperative and declarative paradigms allows using the paradigm that is most appropriate for expressing each requirement. The declarative part is expressed in the LTL fragment of generalized reactivity(1), which admits efficient synthesis algorithms, extended with past LTL. The implementation translates Promela to input for the Slugs synthesizer and is written in Python. The AMBA AHB bus case study is revisited and synthesized efficiently, identifying the need to reorder binary decision diagrams during strategy construction, in order to prevent the exponential blowup observed in previous work. Comment: In Proceedings SYNT 2015, arXiv:1602.0078
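
For orientation, the generalized reactivity(1) fragment mentioned in this abstract is commonly written as an implication from environment assumptions to system guarantees. The form below is the usual textbook shape, not a formula quoted from the paper, and the symbols ($\theta$ for initial conditions, $\rho$ for transition constraints, $J$ for recurrence goals, with superscripts $e$ and $s$ for environment and system) are notation assumed here.

```latex
\[
\Bigl(\theta^{e} \;\wedge\; \Box\,\rho^{e} \;\wedge\; \bigwedge_{i=1}^{m} \Box\Diamond\, J^{e}_{i}\Bigr)
\;\rightarrow\;
\Bigl(\theta^{s} \;\wedge\; \Box\,\rho^{s} \;\wedge\; \bigwedge_{j=1}^{n} \Box\Diamond\, J^{s}_{j}\Bigr)
\]
```

The left-hand side corresponds to the "assumptions" and the right-hand side to the "guarantees" into which, according to the abstract, a program is separated.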