82 research outputs found

    Dreaming of eReading Futures

    Understanding Variability-Aware Analysis in Low-Maturity Variant-Rich Systems

    Context: Software systems often exist in many variants to support varying stakeholder requirements, such as specific market segments or hardware constraints. Systems with many variants (a.k.a. variant-rich systems) are highly complex due to the variability introduced to support customization. Assuring the quality of these systems is therefore also challenging, since traditional single-system analysis techniques do not scale when applied to them. To tackle this complexity, several variability-aware analysis techniques have been conceived in the last two decades to assure the quality of a branch of variant-rich systems called software product lines. Unfortunately, these techniques find little application in practice, since many organizations do not use product-line engineering techniques but instead rely on low-maturity clone-and-own strategies to manage their software variants. For instance, to perform an analysis that checks that all possible variants that can be configured by customers (or vendors) in a car personalization system conform to specified performance requirements, an organization needs to explicitly model system variability. However, in low-maturity variant-rich systems, this and similar kinds of analyses are challenging to perform due to (i) immature architectures that do not systematically account for variability, (ii) redundancy that is not exploited to reduce analysis effort, and (iii) missing essential meta-information, such as relationships between features and their implementation in source code.
    Objective: The overarching goal of the PhD is to facilitate quality assurance in low-maturity variant-rich systems. Consequently, in the first part of the PhD (comprising this thesis) we focus on gaining a better understanding of quality assurance needs in such systems and of their properties.
    Method: Our objectives are met by means of (i) knowledge-seeking research through case studies of open-source systems as well as surveys and interviews with practitioners; and (ii) solution-seeking research through the implementation and systematic evaluation of a recommender system that supports recording the information necessary for quality assurance in low-maturity variant-rich systems. With the former, we investigate, among other things, industrial needs and practices for analyzing variant-rich systems; with the latter, we seek to understand how to obtain the information necessary to leverage variability-aware analyses.
    Results: Four main results emerge from this thesis: first, we present the state of practice in assuring the quality of variant-rich systems; second, we present our empirical understanding of features and their characteristics, including information sources for locating them; third, we present our understanding of how developers' proactive feature location activities can best be supported during development; and lastly, we present our understanding of how features are used in the code of non-modular variant-rich systems, taking the case of feature scattering in the Linux kernel.
    Future work: In the second part of the PhD, we will focus on processes for adapting variability-aware analyses to low-maturity variant-rich systems.
    Keywords: Variant-rich Systems, Quality Assurance, Low Maturity Software Systems, Recommender Syste

    Modelling, Reverse Engineering, and Learning Software Variability

    Society expects software to deliver the right functionality, in a short amount of time and with fewer resources, in every possible circumstance, whatever the hardware, the operating systems, the compilers, or the data fed as input. To fit such a diversity of needs, software commonly comes in many variants and is highly configurable through configuration options, runtime parameters, conditional compilation directives, menu preferences, configuration files, plugins, etc. As there is no one-size-fits-all solution, software variability ("the ability of a software system or artifact to be efficiently extended, changed, customized or configured for use in a particular context") has been studied over the last two decades and is a discipline of its own. Though highly desirable, software variability also introduces enormous complexity due to the combinatorial explosion of possible variants. For example, the Linux kernel has 15000+ options, and most of them can take 3 values: "yes", "no", or "module". Variability makes it challenging to maintain, verify, and configure software systems (Web applications, Web browsers, video tools, etc.). It is also a source of opportunities to better understand a domain, create reusable artefacts, deploy performance-wise optimal systems, or find specialized solutions to many kinds of problems. In many scenarios, a model of variability is either beneficial or mandatory to explore, observe, and reason about the space of possible variants. For instance, without a variability model, it is impossible to establish a sampling strategy that would satisfy the constraints among options and meet coverage or testing criteria. I address a central question in this HDR manuscript: How to model software variability? I detail several contributions related to modelling, reverse engineering, and learning software variability.
    I first contribute to supporting the persons in charge of manually specifying feature models, the de facto standard for modelling variability. I develop an algebra together with a language for supporting the composition, decomposition, diff, refactoring, and reasoning of feature models. I further establish the syntactic and semantic relationships between feature models and product comparison matrices, a large class of tabular data. I then empirically investigate how these feature models can be used to test configurable systems in the large with different sampling strategies. Along this effort, I report on the attempts and lessons learned when defining the "right" variability language. From a reverse engineering perspective, I contribute to synthesizing variability information into models from various kinds of artefacts. I develop foundations and methods for reverse engineering feature models from satisfiability formulae, product comparison matrices, dependency files and architectural information, and from Web configurators. I also report on the degree of automation and show that the involvement of developers and domain experts is beneficial to obtain high-quality models. Thirdly, I contribute to learning constraints and non-functional properties (performance) of a variability-intensive system. I describe a systematic process, "sampling, measuring, learning", that aims to enforce or augment a variability model, capturing variability knowledge that domain experts can hardly express. I show that supervised, statistical machine learning can be used to synthesize rules or build prediction models in an accurate and interpretable way. This process can even be applied to huge configuration spaces, such as that of the Linux kernel. Despite wide applicability and observed benefits, I show that each individual line of contributions has limitations.
    I defend the following answer: a supervised, iterative process (1) based on the combination of reverse engineering, modelling, and learning techniques; (2) capable of integrating multiple sources of variability information (e.g., expert knowledge, legacy artefacts, dynamic observations). Finally, this work opens different perspectives related to so-called deep software variability, security, smart build of configurations, and (threats to) science
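
    To make the combinatorial explosion and the role of a variability model in sampling concrete, here is a minimal Python sketch. It is not taken from the manuscript: the option names (USB, USB_STORAGE, DEBUG), the single constraint, and the rejection-sampling strategy are all illustrative assumptions. It first sizes the unconstrained space for 15000 three-valued options, then samples only configurations that a toy model accepts.

```python
import math
import random

VALUES = ("yes", "no", "module")  # the three values named in the abstract
N_OPTIONS = 15_000                # "15000+ options", used here as a lower bound

# The unconstrained upper bound is 3 ** 15000; count its decimal digits via logarithms.
digits = math.floor(N_OPTIONS * math.log10(len(VALUES))) + 1


def is_valid(config):
    """Toy variability model (hypothetical): USB_STORAGE requires USB to be enabled."""
    return config["USB"] != "no" or config["USB_STORAGE"] == "no"


def sample_valid_configs(options, k, seed=0, max_tries=10_000):
    """Rejection sampling: draw random configurations, keep those the model accepts."""
    rng = random.Random(seed)
    samples = []
    for _ in range(max_tries):
        if len(samples) == k:
            break
        config = {opt: rng.choice(VALUES) for opt in options}
        if is_valid(config):
            samples.append(config)
    return samples


samples = sample_valid_configs(["USB", "USB_STORAGE", "DEBUG"], k=5)
print(f"3^{N_OPTIONS} has {digits} decimal digits")
print(f"{len(samples)} valid configurations sampled")
```

    The unconstrained bound has more than 7,000 decimal digits, so exhaustive enumeration is out of reach. Rejection sampling only works here because the toy constraint rejects few configurations; for heavily constrained spaces a solver-based sampler would likely be needed.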

    An approach to safely evolve preprocessor-based C program families.

    Since the 1970s, the C preprocessor has been widely used in practice in a number of projects, including Apache, Linux, and Libssh, to tailor systems to different platforms and application scenarios. In academia, however, the preprocessor has received strong criticism since at least the early 1990s. Researchers have criticized its lack of separation of concerns, its proneness to introduce subtle errors, and its obfuscation of the source code. To better understand the problems of using the C preprocessor, taking the perception of developers into account, we conducted 40 interviews and a survey among 202 developers. We found that developers deal with three common problems in practice: configuration-related bugs, combinatorial testing, and code comprehension. Developers aggravate these problems when using undisciplined directives (i.e., bad smells regarding preprocessor use), which are preprocessor directives that do not respect the syntactic structure of the source code. To safely evolve preprocessor-based program families, we proposed strategies to detect configuration-related bugs and bad smells, and a set of 14 refactorings to remove bad smells.
    To better deal with exponential configuration spaces, our strategies use variability-aware analysis, which considers the entire set of possible configurations, and sampling, which allows reusing C tools that consider only one configuration at a time to detect bugs. To propose a suitable sampling algorithm, we compared 10 algorithms with respect to effort (i.e., number of configurations to test) and bug-detection capability (i.e., number of bugs detected in the sampled configurations). Based on the results, we proposed a sampling algorithm with a useful balance between effort and bug-detection capability. We performed empirical studies using a corpus of 40 real-world C systems. We detected 128 configuration-related bugs, submitted 43 patches to fix bugs not yet fixed, and developers accepted 65% of the patches. The results of our survey show that most developers prefer to use the refactored (i.e., disciplined) version of the code instead of the original code with undisciplined directives. Furthermore, developers accepted 21 (75%) out of 28 patches submitted to refactor undisciplined directives into disciplined ones. Our work presents useful findings for C developers during their development tasks, contributing to minimizing the chances of introducing configuration-related bugs and bad smells, improving code comprehension, and guiding developers in performing combinatorial testing
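
    The trade-off between effort and bug-detection capability can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the three macros, the two seeded bugs, and the three sampling strategies are not the thesis's corpus, nor necessarily among the 10 algorithms it compared.

```python
from itertools import product

MACROS = ["A", "B", "C"]  # hypothetical configuration macros


def one_enabled(macros):
    """One configuration per macro, with only that macro enabled."""
    return [{m: (m == on) for m in macros} for on in macros]


def one_disabled(macros):
    """One configuration per macro, with only that macro disabled."""
    return [{m: (m != off) for m in macros} for off in macros]


def exhaustive(macros):
    """Every on/off combination: 2 ** len(macros) configurations."""
    return [dict(zip(macros, bits))
            for bits in product([False, True], repeat=len(macros))]


# Hypothetical configuration-related bugs, each exposed only by some configurations.
BUGS = {
    "bug1": lambda c: c["A"] and not c["B"],
    "bug2": lambda c: c["B"] and c["C"],
}


def evaluate(sample):
    """Return (effort, detected): sample size and number of bugs the sample exposes."""
    detected = {name for name, triggers in BUGS.items()
                if any(triggers(c) for c in sample)}
    return len(sample), len(detected)


for strategy in (one_enabled, one_disabled, exhaustive):
    effort, detected = evaluate(strategy(MACROS))
    print(f"{strategy.__name__}: effort={effort}, bugs detected={detected}")
```

    In this toy setting, one_disabled exposes both seeded bugs with 3 configurations, one_enabled exposes only one of them at the same effort, and exhaustive needs all 8 configurations to match one_disabled. Real results depend entirely on where the bugs lie in the configuration space.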

    Variability Bugs: Program and Programmer Perspective

    Estratégias comutativas para análise de confiabilidade em linha de produtos de software

    Master's dissertation, Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2016.
    Software product line engineering is a means to systematically manage variability and commonality in software systems, enabling the automated synthesis of related programs (products) from a set of reusable assets. However, the number of products in a software product line may grow exponentially with the number of features, so it is practically infeasible to quality-check each of these products in isolation. There are a number of variability-aware approaches to product-line analysis that adapt single-product analysis techniques to cope with variability in an efficient way. Such approaches can be classified along three analysis dimensions (product-based, family-based, and feature-based), but, particularly in the context of reliability analysis, there is no theory comprising both (a) a formal specification of the three dimensions and resulting analysis strategies and (b) proof that such analyses are equivalent to one another. The lack of such a theory prevents formal reasoning on the relationship between the analysis dimensions and derived analysis techniques, thereby limiting the confidence in the corresponding results. To fill this gap, we present a product line that implements five approaches to reliability analysis of product lines. We have found empirical evidence that all five approaches are equivalent, in the sense that they yield equal reliabilities from analyzing a given product line. We also formalize three of the implemented strategies and prove that they are sound with respect to the probabilistic approach to reliability analysis of a single product. Furthermore, we present a commuting diagram of intermediate analysis steps, which relates different strategies and enables the reuse of soundness proofs between them
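
    The idea that different analysis strategies should yield equal reliabilities can be sketched in Python under strong simplifying assumptions. The feature names, the reliability values, and the series composition (a product's reliability is the product of its parts' reliabilities) are hypothetical and are not the dissertation's formalization.

```python
import math
from itertools import product

BASE = 0.99                                        # hypothetical core reliability
FEATURES = {"Logging": 0.98, "Encryption": 0.95}   # hypothetical optional features


def product_based(selected):
    """Derive one concrete product (its list of parts), then analyse only that product."""
    parts = [BASE] + [FEATURES[name] for name in selected]
    return math.prod(parts)


def family_based():
    """Analyse the family once, returning a formula over configurations."""
    def reliability(config):
        result = BASE
        for name, value in FEATURES.items():
            # Presence condition: a selected feature contributes its factor, else 1.
            result *= value if config[name] else 1.0
        return result
    return reliability


family_formula = family_based()
for bits in product([False, True], repeat=len(FEATURES)):
    config = dict(zip(FEATURES, bits))
    selected = [name for name, on in config.items() if on]
    # Both strategies should agree on every product of the line.
    assert math.isclose(product_based(selected), family_formula(config))
    print(selected, round(family_formula(config), 6))
```

    The product-based path repeats the analysis once per product, whereas the family-based path builds one formula and only evaluates it per configuration, which is where the savings would come from on larger product lines.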

    Systematic Reuse and Ad Hoc Forking to Develop Software Variants
