
    Using Machine Learning to Infer Constraints for Product Lines

    Variability-intensive systems may include several thousand features, allowing for an enormous number of possible configurations, including wrong ones (e.g., the derived product does not compile). For years, engineers have been using constraints to restrict the space of possible configurations a priori, i.e., to exclude configurations that would violate these constraints. The challenge is to find a set of constraints that is both precise (allows all correct configurations) and complete (never allows a wrong configuration with respect to some oracle). In this paper, we propose the use of a machine learning approach to infer such product-line constraints from an oracle that is able to assess whether a given product is correct. We propose to randomly generate products from the product line, keeping for each of them its resolution model. We then classify these products according to the oracle and use their resolution models to infer cross-tree constraints over the product line. We validate our approach on a product-line video generator, using a simple computer vision algorithm as an oracle. We show that an interesting set of cross-tree constraints can be generated, with reasonable precision and recall.
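    As an illustration of the general recipe described above (sample, classify with an oracle, learn, extract constraints), the sketch below trains a decision tree on oracle-labelled random configurations and reads candidate cross-tree constraints off the paths that end in "wrong" leaves. The feature names and the oracle are hypothetical stand-ins, not the paper's video-generator setup.

```python
# Illustrative sketch (hypothetical features and oracle, not the paper's
# setup): infer candidate cross-tree constraints from a decision tree
# trained on oracle-labelled random configurations.
import random
from sklearn.tree import DecisionTreeClassifier, _tree

FEATURES = ["A", "B", "C", "D"]  # hypothetical feature names

def random_config():
    return [random.randint(0, 1) for _ in FEATURES]

def oracle(config):
    # Hypothetical oracle: a product is wrong when A and B are both selected.
    return 0 if (config[0] == 1 and config[1] == 1) else 1

# 1. Randomly generate products, keeping each resolution model
#    (here: a plain 0/1 vector), and classify them with the oracle.
X = [random_config() for _ in range(500)]
y = [oracle(c) for c in X]

# 2. Learn a classifier separating correct (1) from wrong (0) products.
clf = DecisionTreeClassifier(max_depth=4).fit(X, y)

# 3. Each root-to-leaf path ending in a "wrong" leaf, once negated, is a
#    candidate cross-tree constraint over the product line.
def extract_constraints(tree, names):
    t = tree.tree_
    constraints = []
    def walk(node, path):
        if t.children_left[node] == _tree.TREE_LEAF:
            if t.value[node][0].argmax() == 0:  # leaf predicts "wrong"
                constraints.append("NOT (" + " AND ".join(path) + ")")
            return
        f = names[t.feature[node]]
        walk(t.children_left[node], path + [f"NOT {f}"])   # feature deselected
        walk(t.children_right[node], path + [f])           # feature selected
    walk(0, [])
    return constraints

for c in extract_constraints(clf, FEATURES):
    print(c)  # e.g. NOT (A AND B)
```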

    Towards quality assurance of software product lines with adversarial configurations

    Software product line (SPL) engineers put a lot of effort into ensuring that, through the setting of a large number of possible configuration options, products are acceptable and well tailored to customers’ needs. Unfortunately, options and their mutual interactions create a huge configuration space which is intractable to explore exhaustively. Instead of testing all products, machine learning (ML) is increasingly employed to approximate the set of acceptable products out of a small training sample of configurations. ML techniques can refine a software product line through learned constraints and prevent non-acceptable products from being derived in the first place. In this paper, we use adversarial ML techniques to generate adversarial configurations that fool ML classifiers and pinpoint incorrect classifications of products (videos) derived from an industrial video generator. Our attacks yield (up to) a 100% misclassification rate and a drop in accuracy of 5%. We discuss the implications these results have for SPL quality assurance.
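    A minimal sketch of the attack idea follows, assuming a toy binary configuration space and a scikit-learn classifier (the paper targets an industrial video generator with dedicated adversarial ML algorithms): greedily flip the option whose change most increases the probability of the opposite class.

```python
# Minimal sketch of an evasion-style attack (toy data and classifier,
# not the paper's setup): greedily flip the configuration option whose
# change most increases the probability of the opposite class.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_options = 10

# Synthetic "acceptability" data: a hidden rule over three options.
X = rng.integers(0, 2, size=(300, n_options))
y = (((X[:, 0] == 1) & (X[:, 1] == 0)) | (X[:, 2] == 1)).astype(int)
clf = LogisticRegression(max_iter=500).fit(X, y)

def adversarial(config, max_flips=3):
    """Flip up to max_flips options to invert the predicted class."""
    x = config.copy()
    target = 1 - clf.predict([x])[0]  # class we want the classifier to emit
    for _ in range(max_flips):
        # Score every single-bit flip by the resulting target probability.
        scores = []
        for i in range(n_options):
            x2 = x.copy()
            x2[i] ^= 1
            scores.append(clf.predict_proba([x2])[0][target])
        x[int(np.argmax(scores))] ^= 1
        if clf.predict([x])[0] == target:
            return x  # classifier fooled
    return None  # attack failed within the flip budget

print(adversarial(X[0]))
```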

    Monte Carlo Tree Search for Feature Model Analyses: A General Framework for Decision-Making

    The colossal solution spaces of most configurable systems make their exhaustive exploration intractable. Accordingly, relevant analyses remain open research problems. Analysis alternatives such as SAT solving or constraint programming exist; however, none of them has explored simulation-based methods. Monte Carlo-based decision making is a simulation-based method for dealing with colossal solution spaces using randomness. This paper proposes a conceptual framework that tackles several of those analyses using Monte Carlo methods, which have proven to succeed in vast search spaces (e.g., in game theory). Our general framework is described formally, and its flexibility to cope with a diversity of analysis problems is discussed (e.g., finding defective configurations, feature model reverse engineering, or finding optimal-performance configurations). Additionally, we present a Python implementation of the framework that shows the feasibility of our proposal. With this contribution, we envision that different problems can be addressed using Monte Carlo simulations and that our framework can be used to advance the state of the art. Funding: Ministerio de Economía y Competitividad RTI2018-101204-B-C22 (OPHELIA).
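    The following is a compact, self-contained sketch of the Monte Carlo tree search idea behind such a framework, not the paper's Python implementation: each tree level fixes one (hypothetical) boolean feature, random rollouts complete partial configurations, and UCB1 balances exploration against exploitation.

```python
# Compact MCTS sketch (hypothetical encoding and reward; not the paper's
# framework): search the configuration space of N_FEATURES boolean
# features for a configuration maximizing a black-box reward.
import math, random

N_FEATURES = 8

def reward(config):
    # Hypothetical oracle: configurations with features 0 and 1 together
    # are defective (reward 0); otherwise, more features is better.
    return 0.0 if (config[0] and config[1]) else sum(config) / N_FEATURES

class Node:
    def __init__(self, decisions):
        self.decisions = decisions  # fixed prefix of the configuration
        self.children = {}          # decision (0/1) -> child Node
        self.visits, self.value = 0, 0.0

    def terminal(self):
        return len(self.decisions) == N_FEATURES

    def ucb_child(self, c=1.4):
        return max(self.children.values(),
                   key=lambda n: n.value / n.visits
                   + c * math.sqrt(math.log(self.visits) / n.visits))

def rollout(decisions):
    tail = [random.randint(0, 1) for _ in range(N_FEATURES - len(decisions))]
    return reward(decisions + tail)

def mcts(root, iters=2000):
    for _ in range(iters):
        node, path = root, [root]
        while not node.terminal():          # selection / expansion
            if len(node.children) < 2:
                d = len(node.children)      # try the next untried decision
                node.children[d] = Node(node.decisions + [d])
                node = node.children[d]
                path.append(node)
                break
            node = node.ucb_child()
            path.append(node)
        r = rollout(node.decisions)         # simulation
        for n in path:                      # backpropagation
            n.visits += 1
            n.value += r

root = Node([])
mcts(root)
best = root
while best.children:                        # follow most-visited children
    best = max(best.children.values(), key=lambda n: n.visits)
print("best configuration found:", best.decisions)
```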

    Uniform and scalable sampling of highly configurable systems

    Many analyses of configurable software systems are intractable when confronted with colossal and highly constrained configuration spaces. These analyses could instead use statistical inference, where a tractable sample accurately predicts results for the entire space. To do so, the laws of statistical inference require each member of the population to be equally likely to be included in the sample, i.e., the sampling process needs to be “uniform”. SAT samplers have been developed to generate uniform random samples at a reasonable computational cost. However, there is a lack of experimental validation over colossal spaces to show whether these samplers indeed produce uniform samples or not. This paper (i) proposes a new sampler named BDDSampler, (ii) presents a new statistical test to verify sampler uniformity, and (iii) reports the evaluation of BDDSampler and five other state-of-the-art samplers: KUS, QuickSampler, Smarch, Spur, and Unigen2. Our experimental results show that only BDDSampler satisfies both scalability and uniformity. Funding: Universidad Nacional de Educación a Distancia (UNED) OPTIVAC 096-034091, 2021V/PUNED/008; Ministerio de Ciencia, Innovación y Universidades RTI2018-101204-B-C22 (OPHELIA); Comunidad Autónoma de Madrid ROBOCITY2030-DIH-CM S2018/NMT-4331; Agencia Estatal de Investigación TIN2017-90644-RED.
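    To make the uniformity question concrete, the sketch below applies a classical chi-square goodness-of-fit test to a toy sampler over a fully enumerable feature model; the paper's own statistical test and BDDSampler are not reproduced here.

```python
# Illustrative uniformity check (toy model and sampler; not the paper's
# new statistical test): chi-square goodness of fit over the enumerable
# valid configurations of a tiny feature model.
import random
from collections import Counter
from itertools import product
from scipy.stats import chisquare

# Hypothetical model: 4 boolean features, one constraint: not (A and B).
valid = [c for c in product([0, 1], repeat=4) if not (c[0] and c[1])]
valid_set = set(valid)

def toy_sampler():
    # Stand-in for a SAT/BDD sampler under test; rejection sampling is
    # uniform by construction, so the test should NOT reject it.
    while True:
        c = tuple(random.randint(0, 1) for _ in range(4))
        if c in valid_set:
            return c

draws = Counter(toy_sampler() for _ in range(12000))
observed = [draws[c] for c in valid]
stat, p = chisquare(observed)  # H0: every valid configuration equally likely
print(f"chi2 = {stat:.1f}, p = {p:.3f} (a small p suggests non-uniformity)")
```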

    Empirical Assessment of Generating Adversarial Configurations for Software Product Lines

    Software product line (SPL) engineering allows the derivation of products tailored to stakeholders' needs through the setting of a large number of configuration options. Unfortunately, options and their interactions create a huge configuration space which is either intractable or too costly to explore exhaustively. Instead of covering all products, machine learning (ML) approximates the set of acceptable products (e.g., successful builds, passing tests) out of a training set (a sample of configurations). However, ML techniques can make prediction errors, yielding non-acceptable products that waste time, energy, and other resources. We apply adversarial machine learning techniques to the world of SPLs and craft new configurations that appear to be acceptable but are not, and vice versa. This allows engineers to diagnose prediction errors and take appropriate actions. We develop two adversarial configuration generators, built on top of state-of-the-art attack algorithms, that are capable of synthesizing configurations that are both adversarial and conform to logical constraints. We empirically assess our generators in two case studies: an industrial video synthesizer (MOTIV) and an industry-strength, open-source Web-app configurator (JHipster). In both cases, our attacks yield (up to) a 100% misclassification rate without sacrificing the logical validity of the adversarial configurations. This work lays the foundations of a quality assurance framework for ML-based SPLs.
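    The distinctive requirement here is that adversarial configurations must stay logically valid. The sketch below (toy feature model, synthetic labels; not the MOTIV/JHipster generators) illustrates one way to enforce that: during a greedy bit-flip search, candidate flips that violate the model's constraints are simply discarded.

```python
# Sketch of a validity-preserving attack (toy feature model and labels;
# not the paper's generators): greedy bit flips, where any flip that
# violates the model's constraints is discarded.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
N = 8

def valid(cfg):
    # Hypothetical cross-tree constraints: feature 0 requires feature 3,
    # and features 1 and 2 are mutually exclusive.
    return cfg[0] <= cfg[3] and not (cfg[1] and cfg[2])

X = np.array([c for c in rng.integers(0, 2, (400, N)) if valid(c)])
y = (X[:, 4] | X[:, 5]).astype(int)  # synthetic acceptability labels
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def valid_adversarial(cfg, budget=4):
    """Invert the predicted class without ever leaving the valid space."""
    x, target = cfg.copy(), 1 - clf.predict([cfg])[0]
    for _ in range(budget):
        candidates = []
        for i in range(N):
            x2 = x.copy()
            x2[i] ^= 1
            if valid(x2):  # enforce logical validity at every step
                candidates.append((clf.predict_proba([x2])[0][target], x2))
        if not candidates:
            return None
        _, x = max(candidates, key=lambda t: t[0])
        if clf.predict([x])[0] == target:
            return x  # adversarial AND logically valid
    return None

print(valid_adversarial(X[0]))
```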

    Learning Very Large Configuration Spaces: What Matters for Linux Kernel Sizes

    Linux kernels are used in a wide variety of appliances, many of them having strong requirements on the kernel size due to constraints such as limited memory or instant boot. With more than ten thousand configuration options to choose from, obtaining a suitable trade-off between kernel size and functionality is an extremely hard problem. Developers, contributors, and users actually spend significant effort to document, understand, and eventually tune (combinations of) options to meet a kernel size requirement. In this paper, we investigate how machine learning can help explain what matters for predicting a given Linux kernel size. Unveiling what matters in such a very large configuration space is challenging for two reasons: (1) whatever the time we spend on it, we can only build and measure a tiny fraction of possible kernel configurations; (2) the prediction model should be both accurate and interpretable. We compare different machine learning algorithms and demonstrate the benefits of specific feature encoding and selection methods to learn an accurate model that is fast to compute and simple to interpret. Our results are validated over 95,854 kernel configurations and show that we can achieve low prediction errors over a reduced set of options. We also show that we can extract interpretable information for refining documentation and experts' knowledge of Linux, or even for assigning more sensible default values to options.
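    As a toy illustration of the learning task, assuming synthetic data in place of real kernel measurements, the sketch below fits a sparse linear (lasso) model over binary options and ranks the options with the largest estimated effect on size; sparsity is what keeps the model interpretable.

```python
# Toy illustration (synthetic data; not the paper's 95,854 real kernel
# measurements): a sparse linear model over binary options stays both
# accurate and easy to interpret.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n_cfgs, n_opts = 2000, 200  # tiny stand-in for >10k Linux options

X = rng.integers(0, 2, (n_cfgs, n_opts)).astype(float)
# Hypothetical ground truth: only a handful of options drive the size.
true_w = np.zeros(n_opts)
true_w[[3, 17, 42]] = [5.0, -2.0, 8.0]
y = 30.0 + X @ true_w + rng.normal(0, 0.5, n_cfgs)  # kernel size in MB

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = Lasso(alpha=0.05).fit(X_tr, y_tr)

print("test R^2:", round(model.score(X_te, y_te), 3))
for i in np.argsort(-np.abs(model.coef_))[:5]:
    print(f"option_{i}: {model.coef_[i]:+.2f} MB when enabled")
```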

    Constraint-based automated reconstruction of grape bunches from 3D range data for high-throughput phenotyping

    With an increasing global population, the resources for agriculture required to feed the growing number of people are becoming scarce. Estimates expect that by 2050, 60% more food will be necessary. Nowadays, 70% of fresh water is used by agriculture, and experts see no potential for new land to use for crop plants. This means that existing land has to be used efficiently in a sustainable way. To support this, plant breeders aim at the improvement of yield, quality, disease resistance, and other important characteristics of the crops. Reports show that grapevine cultivation uses more than three times the amount of fungicides used in the cultivation of fruit trees or vegetables. This is caused by grapevine being prone to various fungal diseases and pests that quickly spread over fields. A loose grape bunch architecture is one of the most important physical barriers that make the establishment of a fungal infection less likely. The grape bunch architecture is mostly defined by the inner stem skeleton. The phenotyping of grape bunches refers to the measurement of the phenotypes, i.e., the observable traits of a plant, like the diameter of berries or the lengths of stems. Because of their perishable nature, grape bunches have to be processed in a relatively short time. On the other hand, genetic analyses require data from a large number of them. Manual phenotyping is error-prone and highly labor- and time-intensive, creating the need for automated, high-throughput methods. The objective of this thesis is to develop a completely automated pipeline that takes as input a 3D point cloud showing a grape bunch and computes a 3D reconstruction of the complete grape bunch, including the inner stem skeleton. The result is a 3D estimation of the grape bunch that represents not only dimensions (e.g., berry diameters) or statistics (e.g., the number of berries), but the geometry and topology as well. All architectural (i.e., geometrical and topological) traits can be derived from this complete 3D reconstruction. We aim at high-throughput phenotyping by automating all steps and removing any requirement for interaction with the user, while still providing an interface for a detailed visualization and possible adjustments of the parameters. There are several challenges to this task: ripe grape bunches are subject to a high amount of self-occlusion, rendering a direct reconstruction of the stem skeleton impossible; the stem skeleton structure is complex, so the manual creation of training data is hard; and, since we aim at a cross-cultivar approach, there is high variability between cultivars and even between grape bunches of the same cultivar, so we cannot rely on statistical distributions for single plant organ dimensions. We employ geometrical and topological constraints to meet the challenge of cross-cultivar optimization and foster efficient sampling of infinitely large hypothesis spaces, resulting in Pearson correlation coefficients between 0.7 and 0.9 for established traits traditionally used by breeders. The active working time is reduced by a factor of 12. We evaluate the pipeline for application on scans taken in a lab environment and in the field.
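    As one hypothetical building block of such a pipeline (not taken from the thesis), the sketch below estimates a berry diameter by fitting a sphere to a 3D point cloud with linear least squares, the kind of geometric primitive from which architectural traits like berry diameters can be derived.

```python
# Hypothetical building block (not the thesis pipeline): estimate a berry
# diameter by fitting a sphere to 3D points with linear least squares,
# using the identity |p|^2 = 2 c.p + (r^2 - |c|^2).
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius)."""
    A = np.c_[2 * points, np.ones(len(points))]   # unknowns: cx, cy, cz, d
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)    # d = r^2 - |c|^2
    return center, radius

# Synthetic berry: noisy points on a sphere of radius 7.5 mm.
rng = np.random.default_rng(3)
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
pts = np.array([10.0, 20.0, 5.0]) + 7.5 * dirs + rng.normal(0, 0.1, (500, 3))

center, radius = fit_sphere(pts)
print(f"estimated berry diameter: {2 * radius:.2f} mm at {np.round(center, 2)}")
```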