33 research outputs found

    Mutation Testing Advances: An Analysis and Survey

    Code Coverage Measurement and Fault Localization Approaches

    Code coverage measurement plays an important role in white-box testing, both in industrial practice and academic research. Several areas are highly dependent on code coverage as well, including test case generation, test prioritization, fault localization, and others. Out of these areas, this dissertation focuses on two main topics, and the thesis points are divided into two parts accordingly. The first part consists of one thesis point that discusses the differences between methods for measuring code coverage in Java and the effects of these differences. The second part focuses on a fault localization technique called spectrum-based fault localization, which uses code coverage to estimate the risk of each program element being faulty. More specifically, the corresponding two thesis points discuss improving the efficiency of spectrum-based approaches by incorporating external information, e.g., users’ knowledge, and context data extracted from call chains.
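To make the idea of spectrum-based fault localization concrete, here is a minimal sketch that ranks program elements by suspiciousness from per-test coverage and pass/fail outcomes. It uses the Ochiai formula, a common choice in the literature; the abstract does not say which formula the dissertation uses, so this choice, the function name `ochiai_scores`, and the toy data are all assumptions made for illustration.

```python
import math


def ochiai_scores(coverage, failed):
    """Return an Ochiai suspiciousness score for every covered element.

    coverage: dict mapping test name -> set of elements the test executes
    failed:   set of names of failing tests
    (Both the input shapes and the names are hypothetical.)
    """
    # Only count failing tests that actually appear in the coverage data.
    total_failed = sum(1 for test in coverage if test in failed)
    elements = set().union(*coverage.values())
    scores = {}
    for element in elements:
        # ef / ep: failing / passing tests that execute this element.
        ef = sum(1 for t, cov in coverage.items() if element in cov and t in failed)
        ep = sum(1 for t, cov in coverage.items() if element in cov and t not in failed)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[element] = ef / denom if denom else 0.0
    return scores


# Toy spectrum: three tests, three statements, one failing test.
coverage = {
    "t1": {"s1", "s2"},
    "t2": {"s1", "s3"},
    "t3": {"s1", "s2", "s3"},
}
failed = {"t2"}

scores = ochiai_scores(coverage, failed)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy spectrum, `s3` ranks highest because it is executed by the failing test and by only one passing test, while `s2` scores zero because no failing test reaches it.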

    Fundamental Approaches to Software Engineering

    computer software maintenance; computer software selection and evaluation; formal logic; formal methods; formal specification; programming languages; semantics; software engineering; specifications; verificatio

    Test Generation and Test Prioritization for Simulink Models with Dynamic Behavior

    All engineering disciplines are founded and rely on models, although they may differ on purposes and usages of modeling. Among the different disciplines, the engineering of Cyber Physical Systems (CPSs) particularly relies on models with dynamic behaviors (i.e., models that exhibit time-varying changes). The Simulink modeling platform greatly appeals to CPS engineers since it captures dynamic behavior models. It further provides seamless support for two indispensable engineering activities: (1) automated verification of abstract system models via model simulation, and (2) automated generation of system implementation via code generation. We identify three main challenges in the verification and testing of Simulink models with dynamic behavior, namely incompatibility, oracle and scalability challenges. We propose a Simulink testing approach that attempts to address these challenges. Specifically, we propose a black-box test generation approach, implemented based on meta-heuristic search, that aims to maximize diversity in test output signals generated by Simulink models. We argue that in the CPS domain test oracles are likely to be manual and therefore the main cost driver of testing. In order to lower the cost of manual test oracles, we propose a test prioritization algorithm to automatically rank test cases generated by our test generation algorithm according to their likelihood to reveal a fault. Engineers can then select, according to their test budget, a subset of the most highly ranked test cases. To demonstrate scalability, we evaluate our testing approach using industrial Simulink models. Our evaluation shows that our test generation and test prioritization approaches outperform baseline techniques that rely on random testing and structural coverage

    Certifications of Critical Systems – The CECRIS Experience

    In recent years, a considerable amount of effort has been devoted, both in industry and academia, to the development, validation and verification of critical systems, i.e. those systems whose malfunctions or failures reach a critical level both in terms of risks to human life as well as having a large economic impact. Certifications of Critical Systems – The CECRIS Experience documents the main insights on Cost Effective Verification and Validation processes that were gained during work in the European Research Project CECRIS (acronym for Certification of Critical Systems). The objective of the research was to tackle the challenges of certification by focusing on those aspects that turn out to be more difficult/important for current and future critical systems industry: the effective use of methodologies, processes and tools. The CECRIS project took a step forward in the growing field of development, verification and validation and certification of critical systems. It focused on the more difficult/important aspects of critical system development, verification and validation and certification process. Starting from both the scientific and industrial state of the art methodologies for system development and the impact of their usage on the verification and validation and certification of critical systems, the project aimed at developing strategies and techniques supported by automatic or semi-automatic tools and methods for these activities, setting guidelines to support engineers during the planning of the verification and validation phases.

    Where were the repair ingredients for Defects4j bugs?

    A significant body of automated program repair research has built approaches under the redundancy assumption. Patches are then heuristically generated by leveraging repair ingredients (change actions and donor code) that are found in code bases (either the buggy program itself or big code). For example, common change actions (i.e., fix patterns) are frequently mined offline and serve as an important ingredient for many patch generation engines. Although the repetitiveness of code changes has been studied in general, the literature provides little insight into the relationship between the performance of the repair system and the source code base where the change actions were mined. Similarly, donor code is another important repair ingredient to concretize patches guided by abstract patterns. Yet, little attention has been paid to where such ingredients can actually be found. Through a large-scale empirical study on the execution results of 24 repair systems evaluated on real-world bugs from Defects4J, we provide a comprehensive view on the distribution of repair ingredients that are relevant for these bugs. In particular, we show that (1) half of the bugs cannot be fixed simply because the relevant repair ingredient is not available in the search space of donor code; (2) bugs that are correctly fixed by literature tools are mostly addressed with shallow change actions; (3) programs with little history of changes can benefit from mining change actions in other programs; (4) parts of donor code to repair a given bug can be found separately at different search locations; (5) bug-triggering test cases are a rich source for donor code search.

    A multi-armed bandit approach for enhancing test case prioritization in continuous integration environments

    Advisor: Silvia Regina Vergilio. Thesis (doctorate), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 10/12/2021. Includes references. Area of concentration: Computer Science.
    Abstract: Continuous Integration (CI) is a practice commonly and widely adopted in industry to allow frequent integration of software changes, making software evolution faster and more cost-effective. In CI environments, Regression Testing (RT) is fundamental to ensure that changes have not adversely affected existing features of the system. However, RT is an expensive task. To reduce RT costs, Test Case Prioritization (TCP) techniques play an important role. These techniques attempt to identify the test case order that maximizes specific goals, such as early fault detection. Recently, many studies on TCP in CI environments (TCPCI) have appeared, but few consider CI particularities such as time constraints and test case volatility; that is, they do not consider the dynamic environment of the software life cycle, in which test cases can be added or removed (discontinued) over time. Test case volatility is related to the Exploration versus Exploitation (EvE) dilemma. To solve this dilemma, an approach needs to balance: i) the diversity of the test suite; and ii) the quantity of new test cases and of test cases that are error-prone or have high fault-detection capability. To deal with this, most approaches use, besides the failure history, other measures that rely on code instrumentation or require additional information, such as test coverage. However, keeping this information up to date can be difficult and time-consuming, and may not scale within the test budget of CI environments. In this context, and to properly deal with the TCPCI problem, this work presents an approach based on Multi-Armed Bandit (MAB) called COLEMAN (Combinatorial VOlatiLE Multi-Armed BANdiT). MAB problems are a class of sequential decision problems that are intensively studied for solving the EvE dilemma. The TCPCI problem falls into the category of volatile and combinatorial MAB, because multiple arms (test cases) need to be selected, and they are added or removed over the cycles. COLEMAN was evaluated on different real-world software systems, time budgets, reward functions, and MAB policies, against different approaches from the literature, and also in the Highly-Configurable Software (HCS) context. Different quality indicators were used to cover different perspectives, such as fault detection effectiveness (with and without cost consideration), early fault detection, test time reduction, prioritization time, and accuracy. The outcomes show that COLEMAN is promising and endorse its applicability to the TCPCI problem. COLEMAN outperforms RETECS, a state-of-the-art approach based on Reinforcement Learning, and stands out mainly regarding fault detection effectiveness (in ~82% of the cases) and early fault detection (in 100%). COLEMAN takes negligible time, less than one second to execute, and is more stable than RETECS; that is, it adapts better to peaks of faults. When compared with a search-based approach, COLEMAN provides near-optimal solutions in ~90% of the cases, and in comparison with a deterministic approach, it provides reasonable solutions in 92% of the cases. Thus, the main contribution of this work is an efficient and effective MAB-based approach for the TCPCI problem.
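The volatile, combinatorial bandit setting described in this abstract can be illustrated with a small sketch. This is not COLEMAN itself: it is a generic epsilon-greedy policy, and the class name `EpsilonGreedyPrioritizer`, the `initial_value` parameter, and the reward (1 if a test failed, 0 otherwise) are assumptions chosen for illustration. Each test case is an arm, the whole suite is ordered in every cycle (combinatorial), and tests may appear or disappear between cycles (volatile).

```python
import random


class EpsilonGreedyPrioritizer:
    """Toy epsilon-greedy test prioritizer (hypothetical, not COLEMAN)."""

    def __init__(self, epsilon=0.1, initial_value=0.5, seed=0):
        self.epsilon = epsilon              # probability of exploring
        self.initial_value = initial_value  # estimate used for unseen tests
        self.rng = random.Random(seed)
        self.pulls = {}                     # test -> number of executions
        self.mean_reward = {}               # test -> observed failure rate

    def _estimate(self, test):
        return self.mean_reward.get(test, self.initial_value)

    def prioritize(self, available_tests):
        """Order the tests available in this cycle, best-looking first."""
        remaining = list(available_tests)
        order = []
        while remaining:
            if self.rng.random() < self.epsilon:
                choice = self.rng.choice(remaining)          # explore
            else:
                choice = max(remaining, key=self._estimate)  # exploit
            remaining.remove(choice)
            order.append(choice)
        return order

    def update(self, results):
        """results: dict mapping test -> True if the test failed."""
        for test, did_fail in results.items():
            n = self.pulls.get(test, 0) + 1
            old = self.mean_reward.get(test, 0.0)
            self.pulls[test] = n
            # Incremental mean of the 0/1 reward.
            self.mean_reward[test] = old + (float(did_fail) - old) / n


# Cycle 1: two tests exist; "b" fails. Cycle 2: a new test "c" appears.
policy = EpsilonGreedyPrioritizer(epsilon=0.0)
policy.update({"a": False, "b": True})
order = policy.prioritize(["a", "b", "c"])
print(order)
```

With `epsilon=0.0` the ordering is purely greedy, so the previously failing test comes first, the unseen test is placed by its `initial_value`, and the test that passed comes last. Raising `epsilon` trades some of that exploitation for exploration of tests with little history.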

    Deep Learning Software Repositories

    Bridging the abstraction gap between artifacts and concepts is the essence of software engineering (SE) research problems. SE researchers regularly use machine learning to bridge this gap, but there are three fundamental issues with traditional applications of machine learning in SE research. Traditional applications are too reliant on labeled data. They are too reliant on human intuition, and they are not capable of learning expressive yet efficient internal representations. Ultimately, SE research needs approaches that can automatically learn representations of massive, heterogeneous, datasets in situ, apply the learned features to a particular task and possibly transfer knowledge from task to task. Improvements in both computational power and the amount of memory in modern computer architectures have enabled new approaches to canonical machine learning tasks. Specifically, these architectural advances have enabled machines that are capable of learning deep, compositional representations of massive data depots. The rise of deep learning has ushered in tremendous advances in several fields. Given the complexity of software repositories, we presume deep learning has the potential to usher in new analytical frameworks and methodologies for SE research and the practical applications it reaches. This dissertation examines and enables deep learning algorithms in different SE contexts. We demonstrate that deep learners significantly outperform state-of-the-practice software language models at code suggestion on a Java corpus. Further, these deep learners for code suggestion automatically learn how to represent lexical elements. We use these representations to transmute source code into structures for detecting similar code fragments at different levels of granularity—without declaring features for how the source code is to be represented. 
Then we use our learning-based framework for encoding fragments to intelligently select and adapt statements in a codebase for automated program repair. In our work on code suggestion, code clone detection, and automated program repair, everything for representing lexical elements and code fragments is mined from the source code repository. Indeed, our work aims to move SE research from the art of feature engineering to the science of automated discovery