Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution
Constraint solution reuse is an effective approach to saving constraint-solving
time in symbolic execution. Most existing reuse approaches
are based on syntactic or semantic equivalence of constraints; e.g., the Green
framework is able to reuse constraints which have different representations but
are semantically equivalent, through canonizing constraints into syntactically
equivalent normal forms. However, syntactic/semantic equivalence is not a
necessary condition for reuse--some constraints are not syntactically or
semantically equivalent, but their solutions still have potential for reuse.
Existing approaches are unable to recognize and reuse such constraints.
In this paper, we present GreenTrie, an extension to the Green framework,
which supports constraint reuse based on the logical implication relations
among constraints. GreenTrie provides a component, called L-Trie, which stores
constraints and solutions into tries, indexed by an implication partial order
graph of constraints. L-Trie is able to carry out logical reduction and logical
subset and superset querying for given constraints, to check for reuse of
previously solved constraints. We report the results of an experimental
assessment of GreenTrie against the original Green framework, which shows that
our extension achieves better reuse of constraint-solving results and saves
significant symbolic execution time.
Comment: this paper has been submitted to conference ISSTA 201
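The subset/superset reuse logic can be illustrated with a toy model in which a constraint is a conjunction of atomic predicates, represented as a frozenset: a superset of conjuncts is a stronger constraint, so it implies every subset of itself. The store below, its linear scan, and the predicate strings are illustrative simplifications of my own, not the actual trie-based L-Trie:

```python
# Toy implication-based reuse store. A constraint is a frozenset of
# atomic-predicate strings, read as their conjunction, so
# "stored >= query" means the stored constraint implies the query.

class ReuseStore:
    def __init__(self):
        self.sat = []    # list of (conjunct_set, known solution)
        self.unsat = []  # list of conjunct sets proven unsatisfiable

    def record(self, conjuncts, solution):
        if solution is None:
            self.unsat.append(conjuncts)
        else:
            self.sat.append((conjuncts, solution))

    def lookup(self, query):
        # Superset query: a stored SAT constraint containing every
        # conjunct of the query implies it, so its model also
        # satisfies the query and the solution can be reused.
        for stored, solution in self.sat:
            if stored >= query:
                return ("SAT", solution)
        # Subset query: if the query contains every conjunct of a
        # stored UNSAT constraint, it implies that constraint and is
        # therefore UNSAT as well.
        for stored in self.unsat:
            if stored <= query:
                return ("UNSAT", None)
        return None  # no reuse possible; fall back to the solver

store = ReuseStore()
store.record(frozenset({"x>3", "y<5"}), {"x": 4, "y": 0})
store.record(frozenset({"x>3", "x<2"}), None)  # unsatisfiable

print(store.lookup(frozenset({"x>3"})))                # reuses the SAT solution
print(store.lookup(frozenset({"x>3", "x<2", "y>0"})))  # reuses the UNSAT verdict
```

Note that neither query is syntactically or semantically equivalent to a stored constraint; reuse succeeds purely through the implication relation.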
On The Use of Over-Approximate Analysis in Support of Software Development and Testing
The effectiveness of dynamic program analyses, such as profiling and memory-leak detection, crucially depends on the quality of the test inputs. However, adequate sets of inputs are rarely available. Existing automated input-generation techniques can help but tend to be either too expensive or ineffective. For example, traditional symbolic execution scales poorly to real-world programs, and random input generation may never reach deep states within the program.
For scalable, effective, automated input generation that can better support dynamic analysis, I propose an approach that extends traditional symbolic execution by targeting increasingly small fragments of a program. The approach starts by generating inputs for the whole program and progressively introduces additional unconstrained state until it reaches a given program coverage objective. This approach is applicable to any client dynamic analysis requiring high coverage that is also tolerant of over-approximated program behavior--behavior that cannot occur on a complete execution.
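The progressive loop described above can be sketched with the symbolic-execution backend stubbed out; every name here is an illustrative stand-in for the approach, not its implementation:

```python
# Schematic sketch of progressively under-constrained input generation:
# start from the whole-program entry point, and if the coverage
# objective is not yet met, re-target smaller fragments whose entry
# state is left unconstrained.

def progressive_inputs(fragments, explore, target_coverage):
    """fragments: entry points ordered whole-program first, then
    progressively smaller (more unconstrained) fragments.
    explore: stand-in for symbolic execution of one fragment,
    returning (generated inputs, branches they cover)."""
    covered = set()
    inputs = []
    for entry in fragments:
        new_inputs, branches = explore(entry)
        inputs.extend(new_inputs)
        covered |= branches
        if len(covered) >= target_coverage:
            break  # coverage objective met; stop under-constraining
    return inputs, covered

# Hypothetical per-fragment exploration results for illustration.
fake = {
    "main": (["i1"], {"b1", "b2"}),
    "parse_header": (["i2"], {"b3"}),
    "decode_field": (["i3"], {"b4"}),
}
inputs, covered = progressive_inputs(
    ["main", "parse_header", "decode_field"],
    lambda e: fake[e], target_coverage=3)
print(inputs, sorted(covered))  # stops after parse_header meets the target
```

The inputs generated for later, smaller fragments may exercise over-approximated states, which is acceptable for the tolerant client analyses discussed next.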
To assess the effectiveness of my approach, I applied it to two client techniques. The first technique infers the actual path taken by a program execution by observing the CPU's electromagnetic emanations and requires inputs to generate a model that can recognize executed path segments.
The client inference works by piecewise matching the observed emanation waveform against those recorded in a model.
It requires the model to be complete (i.e., contain every piece), and the waveforms are sufficiently distinct that including extra samples is unlikely to cause a misinference.
After applying my approach to generate inputs covering all subsegments of the program’s execution paths, I designed a source generator to automatically construct a harness and scaffolding to replay these inputs against fragments of the original program.
The inference client constructs the model by recording the harness execution.
The second technique performs automated regression testing by identifying behavioral differences between two program versions and requires inputs to perform differential testing.
It explores local behavior in a neighborhood of the program changes by generating inputs to functions near (as measured by call-graph distance) the modified code.
The inputs are then concretely executed on both versions, periodically checking internal state for behavioral differences.
The technique requires high coverage inputs for a full examination, and tolerates infeasible local state since both versions likely execute it equivalently.
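The neighborhood-selection step can be sketched as a bounded traversal of the call graph; the graph encoding, function names, and distance bound below are hypothetical:

```python
# Select differential-testing targets by call-graph distance, assuming
# the call graph is a dict mapping each function to the functions it
# calls. The radius bound is an illustrative parameter.

from collections import deque

def neighborhood(call_graph, changed, radius):
    """Functions within `radius` call-graph hops of any changed one."""
    # Explore caller and callee edges alike: a caller of a changed
    # function is as relevant a test target as one of its callees.
    undirected = {f: set(callees) for f, callees in call_graph.items()}
    for f, callees in call_graph.items():
        for g in callees:
            undirected.setdefault(g, set()).add(f)
    seen = {f: 0 for f in changed}
    queue = deque(changed)
    while queue:
        f = queue.popleft()
        if seen[f] == radius:
            continue  # at the boundary; do not expand further
        for g in undirected.get(f, ()):
            if g not in seen:
                seen[g] = seen[f] + 1
                queue.append(g)
    return set(seen)

graph = {"main": ["parse", "run"], "run": ["step"], "parse": [], "step": []}
print(sorted(neighborhood(graph, {"step"}, 1)))  # ['run', 'step']
```

Inputs would then be generated for each function in the returned set and replayed concretely on both program versions.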
I will then present a separate technique to improve the coverage obtained by symbolic execution of floating-point programs.
This technique is equally applicable to both traditional symbolic execution and my progressively under-constrained symbolic execution.
Its key idea is to approximate floating-point expressions with fixed-point analogs.
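The fixed-point idea can be illustrated by scaling reals to integers so that an integer solver could handle the constraint; the 16-bit scale and the example predicate are my own illustration, not the paper's actual encoding:

```python
# Toy fixed-point approximation of a floating-point constraint:
# represent a real x as the integer round(x * 2**16).

SCALE_BITS = 16
SCALE = 1 << SCALE_BITS  # integer n stands for the real n / 2**16

def to_fixed(x):
    return round(x * SCALE)

def fixed_mul(a, b):
    # Product of two fixed-point values, rescaled back to fixed-point.
    return (a * b) >> SCALE_BITS

# Floating-point constraint 0.1 * x > 0.5, rewritten over integers:
def holds(x):
    return fixed_mul(to_fixed(0.1), to_fixed(float(x))) > to_fixed(0.5)

print([x for x in range(3, 9) if holds(x)])  # -> [5, 6, 7, 8]
```

Note that x = 5 satisfies the fixed-point analog because 0.1 rounds up slightly at this scale; such boundary discrepancies are the price of the approximation, which the coverage-oriented clients above can tolerate.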
In concluding, I will also discuss future research directions, including additional empirical evaluations and the investigation of additional client analyses that could benefit from my approach.
Ph.D.