Understanding, Discovering and Leveraging a Software System's Effective Configuration Space
Many modern software systems are highly configurable. While a high degree of configurability has many benefits, such as extensibility, reusability and portability, it also has its costs. In the worst case, the full configuration space of a system is the exponentially large combination of all possible option settings and every configuration can potentially produce unique behavior in the software system. Therefore, this software configuration space explosion problem adds combinatorial complexity to many already difficult software engineering tasks.
To date, much of the research in this area has tackled this problem using black-box techniques, such as combinatorial interaction testing (CIT). Although these techniques are promising in systematizing the testing and analysis of configurable systems, they ignore a system's internal structure and we think that is a huge missed opportunity. We hypothesize that systems are often structured such that their effective configuration spaces -- the set of configurations needed to achieve a specific goal -- are often much smaller than their full configuration spaces. And if we can efficiently identify or approximate the effective configuration spaces, then we can use that information to greatly improve various software engineering tasks.
To understand the effective configuration spaces of software systems, we used symbolic evaluation, a white-box analysis, to capture all executions a system can take under any configuration. The symbolic evaluation results confirmed that the effective configuration spaces are in fact the composition of many small, self-contained groupings of options. And we developed analysis techniques to succinctly characterize how configurations interact with a system's internal structures. We showed that while the majority of a system's interactions are relatively low strength, some important high-strength interactions do exist, and that existing approaches such as CIT are highly unlikely to generate them in practice.
Results from our in-depth investigations serve as the foundation for developing new approaches to efficiently discovering effective configuration spaces. We proposed a new algorithm called interaction tree discovery (iTree) that aims to identify sets of configurations that are smaller than those generated by CIT, while also including important high-strength interactions missed by practical applications of CIT. On each iteration of iTree, we first use a low-strength covering array to test the system, and then apply machine learning techniques to discover new interactions that are potentially responsible for any new coverage seen. By repeating this process, iTree builds up a set of configurations likely to contain key high-strength interactions. We evaluated iTree, and our results strongly suggest that it can identify high-coverage sets of configurations more effectively than traditional CIT or random sampling.
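The iterate, test, and learn loop described above can be sketched roughly as follows. This is a minimal illustration and not the authors' iTree implementation: random sampling stands in for the low-strength covering array, a shared-settings heuristic stands in for the machine learning step, and `OPTIONS`, `run_system`, and the branch names are all hypothetical.

```python
import random

OPTIONS = ["a", "b", "c", "d", "e", "f"]  # hypothetical boolean options


def run_system(cfg):
    """Toy system under test (an assumption): returns the branch IDs a config covers."""
    covered = {"base"}
    if cfg["a"]:
        covered.add("A")
        if cfg["b"] and cfg["c"]:
            covered.add("ABC")
            if cfg["d"]:
                covered.add("ABCD")  # reachable only through a 4-way interaction
    if not cfg["e"]:
        covered.add("notE")
    return covered


def sample_configs(fixed, n, rng):
    """Random configs that respect `fixed`; a stand-in for a low-strength covering array."""
    return [{**{o: rng.random() < 0.5 for o in OPTIONS}, **fixed} for _ in range(n)]


def common_settings(configs):
    """Option settings shared by all given configs; a crude stand-in for the learning step."""
    first = configs[0]
    return {o: v for o, v in first.items() if all(c[o] == v for c in configs)}


def discover(iterations=8, per_iter=8, seed=0):
    rng = random.Random(seed)
    total, suite = set(), []
    frontier = [{}]  # candidate interactions to explore; start unconstrained
    for _ in range(iterations):
        if not frontier:
            break
        fixed = frontier.pop(0)
        results = [(cfg, run_system(cfg)) for cfg in sample_configs(fixed, per_iter, rng)]
        new_branches = set().union(*(cov for _, cov in results)) - total
        for branch in sorted(new_branches):
            hitters = [cfg for cfg, cov in results if branch in cov]
            suite.append(hitters[0])                   # keep one witness per new branch
            frontier.append(common_settings(hitters))  # candidate interaction for that branch
        total |= new_branches
    return suite, total


suite, total = discover()
print(len(suite), sorted(total))
```

The shared-settings heuristic can over-constrain when only one or two sampled configurations reach a branch, which is one reason a real learner and a real covering array generator would be preferable in practice.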
We next developed the interaction learning approach, which estimates the configuration interactions underlying the effective configuration space by building classification models for iTree execution results. This approach is lightweight yet produces accurate estimates of the interactions, which makes leveraging effective configuration spaces practical for many software engineering tasks. Using this approach, we were able to approximate the effective configuration space of the ~1M-LOC MySQL, something that is infeasible using existing techniques, at very low cost.
Beyond XSPEC: Towards Highly Configurable Analysis
We present a quantitative comparison between software features of the de facto standard X-ray spectral analysis tool, XSPEC, and ISIS, the Interactive Spectral Interpretation System. Our emphasis is on customized analysis, with ISIS offered as a strong example of configurable software. While noting that XSPEC has been of immense value to astronomers, and that its scientific core is moderately extensible--most commonly via the inclusion of user contributed "local models"--we identify a series of limitations with its use beyond conventional spectral modeling. We argue that from the viewpoint of the astronomical user, the XSPEC internal structure presents a Black Box Problem, with many of its important features hidden from the top-level interface, thus discouraging user customization. Drawing from examples in custom modeling, numerical analysis, parallel computation, visualization, data management, and automated code generation, we show how a numerically scriptable, modular, and extensible analysis platform such as ISIS facilitates many forms of advanced astrophysical inquiry.
Comment: Accepted by PASP, for July 2008 (15 pages
Improving Performance of M-to-N Processing and Data Redistribution in In Transit Analysis and Visualization
In an in transit setting, a parallel data producer, such as a numerical simulation, runs on one set of ranks M, while a data consumer, such as a parallel visualization application, runs on a different set of ranks N. One of the central challenges in this in transit setting is to determine the mapping of data from the set of M producer ranks to the set of N consumer ranks. This is a challenging problem for several reasons, such as the producer and consumer codes potentially having different scaling characteristics and different data models. The resulting mapping from M to N ranks can have a significant impact on aggregate application performance. In this work, we present an approach for performing this M-to-N mapping in a way that has broad applicability across a diversity of data producer and consumer applications. We evaluate its design and performance with a study that runs at high concurrency on a modern HPC platform. By leveraging design characteristics that facilitate an “intelligent” mapping from M to N, we observe that significant performance gains are possible in terms of several different metrics, including time-to-solution and amount of data moved.
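To make the M-to-N mapping problem concrete, here is a minimal sketch of a naive contiguous block mapping. It is only a baseline for illustration, not the approach presented in the work above; the function names are hypothetical, and it assumes M >= N so that every consumer rank receives data from at least one producer rank.

```python
def block_mapping(m, n):
    """Assign each of m producer ranks to one of n consumer ranks in contiguous blocks."""
    if m <= 0 or n <= 0:
        raise ValueError("rank counts must be positive")
    return {producer: producer * n // m for producer in range(m)}


def consumer_loads(mapping, n):
    """Count how many producer ranks feed each consumer rank."""
    loads = [0] * n
    for consumer in mapping.values():
        loads[consumer] += 1
    return loads


mapping = block_mapping(8, 3)
print(mapping)                     # {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2}
print(consumer_loads(mapping, 3))  # [3, 3, 2]
```

A mapping like this balances the number of producers per consumer but ignores data sizes, data models, and network placement, which are among the factors that make the real problem hard.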
Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models
The many configuration options of modern applications make it difficult for users to select a performance-optimal configuration. Performance models help users understand system performance and choose a fast configuration. Existing performance modeling approaches for applications and configurable systems require either a full-factorial experiment design or a sampling design based on heuristics. This results in high costs for achieving accurate models. Furthermore, they require repeated execution of experiments to account for measurement noise. We propose Performance-Detective, a novel code analysis tool that deduces insights on the interactions of program parameters. We use these insights to derive the smallest necessary experiment design and to avoid repeating measurements when possible, significantly lowering the cost of performance modeling. We evaluate Performance-Detective in two case studies, where we reduce the number of measurements from up to 3125 to only 25, decreasing cost to only 2.9% of the previously needed core hours, while maintaining a model accuracy of 91.5%, compared to 93.8% using all 3125 measurements.
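The gap between a full-factorial design and a reduced one can be illustrated with a small sketch. The parameter structure below (five parameters with five levels each, and no interactions between them) is purely an assumption chosen so the counts are easy to follow; the source does not state how its case studies are parameterized, and this is not the Performance-Detective analysis itself.

```python
from itertools import product

# Hypothetical parameters: five options with five levels each (an assumption).
LEVELS = {f"p{i}": [1, 2, 4, 8, 16] for i in range(5)}


def full_factorial(levels):
    """Every combination of every level: the product of the level counts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


def one_at_a_time(levels, baseline):
    """Vary one parameter at a time around a baseline.
    Only sufficient if no parameter interacts with another (assumed here)."""
    design = []
    for name, values in levels.items():
        for value in values:
            design.append({**baseline, name: value})
    return design


baseline = {name: values[0] for name, values in LEVELS.items()}
print(len(full_factorial(LEVELS)))           # 5**5 = 3125
print(len(one_at_a_time(LEVELS, baseline)))  # 5*5 = 25, baseline counted once per parameter
```

Knowing which parameters do or do not interact is what justifies dropping combinations; without that knowledge, a one-at-a-time design can miss interaction effects entirely.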