585 research outputs found
Quantitative And Qualitative Evaluation Of Metrics On Object Graphs Extracted By Abstract Interpretation
Evaluating programming-language based techniques is crucial to judge their usefulness in practice but requires a careful selection of systems on
which to evaluate the technique. Since it is particularly hard to evaluate a heavyweight technique, such as one that requires adding annotations
to the code or rewriting the system in a radically different language, it is common to use a lightweight proxy to predict the technique\u27s usefulness
for a system. But the reliability of such a proxy is unclear.
We propose a principled data-driven approach to derive a lightweight proxy for a heavyweight technique that requires adding annotations to the code.
The approach involves the following: computing metrics (DiffMetrics) that measure differences between a system representation (e.g., the code structure)
and the system representation extracted by the heavyweight technique (e.g., abstraction of the runtime structure); identifying the outliers of the
DiffMetrics; identifying code patterns and classifying the outliers based on the identified code patterns; implementing visitors that look for the
code patterns on systems with no annotations; identifying code metrics that correlate strongly with the DiffMetrics. For a new system with no annotations,
a proxy predicts if the heavyweight technique may be useful based on the results from the visitors and the code metrics.
To evaluate the approach, we run the visitors and compute code metrics on four systems that were previously not analyzed. The proxy predicts that the
heavyweight technique may be useful two of the systems. Thus, the abstract runtime structure may be significantly different from the code structure for
those systems. To validate the proxy\u27s predictions, we run the heavyweight technique on the two systems to confirm the predictions.
Such a principled approach is reusable and can be applied on any programming-language based technique to identify systems for evaluation and for a
better understanding the types of systems for which a technique is most useful
Static Extraction Of Dataflow Communication For Security
The cost of security vulnerabilities in widely-deployed code such as mobile applications is high. As a result, many companies are using Architectural Risk Analysis (ARA) to find security vulnerabilities before releasing their applications. The existing analyses are focused on finding local coding bugs such as a hard-coded password, rather than architectural flaws such as bypassing the authentication component. During ARA, to find vulnerabilities that are architectural flaws, security architects use a forest-level view of the runtime architecture instead of reading the code. Unfortunately, such a view is often missing from the documentation or is inconsistent with the code.
This thesis contributes Scoria, a semi-automated approach for finding architectural flaws that uses a static analysis to extract from code with annotations an approximation of the runtime architecture as an abstract object graph with dataflow edges that refer to abstract objects. The annotations express local, modular hints about architectural tiers, logical containment, and strict encapsulation, such that the extracted object graph is hierarchical, which provides architects with both high-level and detailed understanding of the runtime architecture. Moreover, the abstract object graph is sound such that it has unique representatives for all objects and dataflow communication that may exist at runtime. Architects assisted by Scoria can write as machine-checkable constraints various security policies that are documented only informally. The constraints are in terms of object provenance and indirect communication and can find vulnerabilities missed by constraints that focus only on the presence or the absence of communication, or constraints that track only information flow from sources to sinks.
The evaluation consists of expressing several rules from the CERT Secure Coding Standard for Java for which automated detection was previously unavailable. Scoria is also being used to find information disclosure in open-source Android apps. Based on an existing benchmark, Scoria performs better than commercial and research tools in terms of precision and recall. Scoria is thus making Architectural Risk Analysis, which is today mostly manual and informal, a more rigorous, principled and repeatable activity
Static Extraction Of Dataflow Communication For Security
The cost of security vulnerabilities in widely-deployed code such as mobile applications is high. As a result, many companies are using Architectural Risk Analysis (ARA) to find security vulnerabilities before releasing their applications. The existing analyses are focused on finding local coding bugs such as a hard-coded password, rather than architectural flaws such as bypassing the authentication component. During ARA, to find vulnerabilities that are architectural flaws, security architects use a forest-level view of the runtime architecture instead of reading the code. Unfortunately, such a view is often missing from the documentation or is inconsistent with the code.
This thesis contributes Scoria, a semi-automated approach for finding architectural flaws that uses a static analysis to extract from code with annotations an approximation of the runtime architecture as an abstract object graph with dataflow edges that refer to abstract objects. The annotations express local, modular hints about architectural tiers, logical containment, and strict encapsulation, such that the extracted object graph is hierarchical, which provides architects with both high-level and detailed understanding of the runtime architecture. Moreover, the abstract object graph is sound such that it has unique representatives for all objects and dataflow communication that may exist at runtime. Architects assisted by Scoria can write as machine-checkable constraints various security policies that are documented only informally. The constraints are in terms of object provenance and indirect communication and can find vulnerabilities missed by constraints that focus only on the presence or the absence of communication, or constraints that track only information flow from sources to sinks.
The evaluation consists of expressing several rules from the CERT Secure Coding Standard for Java for which automated detection was previously unavailable. Scoria is also being used to find information disclosure in open-source Android apps. Based on an existing benchmark, Scoria performs better than commercial and research tools in terms of precision and recall. Scoria is thus making Architectural Risk Analysis, which is today mostly manual and informal, a more rigorous, principled and repeatable activity
Evaluation Of An Architectural-Level Approach For Finding Security Vulnerabilities
The cost of security vulnerabilities of a software system is high. As a result,
many techniques have been developed to find the vulnerabilities at development time. Of particular interest are static analysis techniques that can consider all possible executions of a system. But, static analysis can suffer from a large number of false positives.
A recently developed approach, Scoria, is a semi-automated static analysis that requires security architects to annotate the code, typecheck the annotations, extract a hierarchical object graph and write constraints in order to find security vulnerabilities in a system.
This thesis evaluates Scoria on three systems (sizes 6 KLOC, 6 KLOC and
25 KLOC) from different application domains (Android and Web) and confirms that Scoria can find security vulnerabilities in those systems without an excessive number of false positives
Automated Refinement Of Hierarchical Object Graphs
Object graphs help explain the runtime structure of a system. To make object graphs convey design intent, one insight is to use
abstraction by hierarchy, i.e., to show objects that are implementation details as children of architecturally-relevant objects from the application domain. But additional information is needed to express this object hierarchy, using ownership type qualifiers in the code. Adding qualifiers after the fact involves manual overhead, and requires developers to switch between adding qualifiers in the code and looking at abstract object graphs to understand the object structures that the qualifiers describe. We propose an approach where developers express their design intent by refining an object graph directly, while an inference analysis infers valid qualifiers in the code. We present, formalize and implement the inference analysis. Novel features of the inference analysis compared to closely related work include a larger set of qualifiers to support less restrictive object hierarchy (logical containment) in addition to strict hierarchy (strict encapsulation), as well as object uniqueness and object borrowing. A separate extraction analysis then uses these qualifiers and extracts an updated object graph. We evaluate the approach on two subject systems. One of the subject systems is reproduced from an experiment using related techniques and another ownership type system, which enables a meaningful comparison. For the other subject system, we use its documentation to pick refinements that express design intent. We compute metrics on the refinements (how many attempts on each subject system) and classify them by their type. We also compute metrics on the inferred qualifiers and metrics on the object graphs to enable quantitative comparison. Moreover, we qualitatively compare the hierarchical object graphs with the flat object graphs and with each other, by highlighting how they express design intent. Finally, we confirm that the approach can infer from refinements valid qualifiers such that the extracted object graphs reflect the design intent of the refinements
Towards Implicit Parallel Programming for Systems
Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard especially when the program requires state, which many system programs use for optimization, such as for example a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realms of an associated optimizing compiler. This prevents the compiler to introduce parallelism automatically and requires the programmer to optimize the program manually.
In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution.
We define the notion of a stateful function along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows to gradually adapt existing code bases. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates
Towards Implicit Parallel Programming for Systems
Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard especially when the program requires state, which many system programs use for optimization, such as for example a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realms of an associated optimizing compiler. This prevents the compiler to introduce parallelism automatically and requires the programmer to optimize the program manually.
In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution.
We define the notion of a stateful function along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows to gradually adapt existing code bases. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates
A multi-paradigm language for reactive synthesis
This paper proposes a language for describing reactive synthesis problems
that integrates imperative and declarative elements. The semantics is defined
in terms of two-player turn-based infinite games with full information.
Currently, synthesis tools accept linear temporal logic (LTL) as input, but
this description is less structured and does not facilitate the expression of
sequential constraints. This motivates the use of a structured programming
language to specify synthesis problems. Transition systems and guarded commands
serve as imperative constructs, expressed in a syntax based on that of the
modeling language Promela. The syntax allows defining which player controls
data and control flow, and separating a program into assumptions and
guarantees. These notions are necessary for input to game solvers. The
integration of imperative and declarative paradigms allows using the paradigm
that is most appropriate for expressing each requirement. The declarative part
is expressed in the LTL fragment of generalized reactivity(1), which admits
efficient synthesis algorithms, extended with past LTL. The implementation
translates Promela to input for the Slugs synthesizer and is written in Python.
The AMBA AHB bus case study is revisited and synthesized efficiently,
identifying the need to reorder binary decision diagrams during strategy
construction, in order to prevent the exponential blowup observed in previous
work.Comment: In Proceedings SYNT 2015, arXiv:1602.0078
- …