1,005 research outputs found
MintHint: Automated Synthesis of Repair Hints
Being able to automatically repair programs is an extremely challenging task.
In this paper, we present MintHint, a novel technique for program repair that
is a departure from most of today's approaches. Instead of trying to fully
automate program repair, which is often an unachievable goal, MintHint performs
statistical correlation analysis to identify expressions that are likely to
occur in the repaired code and generates, using pattern-matching based
synthesis, repair hints from these expressions. Intuitively, these hints
suggest how to rectify a faulty statement and help developers find a complete,
actual repair. MintHint can address a variety of common faults, including
incorrect, spurious, and missing expressions.
We present a user study that shows that developers' productivity can improve
manyfold with the use of repair hints generated by MintHint -- compared to
having only traditional fault localization information. We also apply MintHint
to several faults of a widely used Unix utility program to further assess the
effectiveness of the approach. Our results show that MintHint performs well
even in situations where (1) the repair space searched does not contain the
exact repair, and (2) the operational specification obtained from the test
cases for repair is incomplete or even imprecise
Recommended from our members
Automated Testing and Debugging for Big Data Analytics
The prevalence of big data analytics in almost every large-scale software system has generated a substantial push to build data-intensive scalable computing (DISC) frameworks such as Google MapReduce and Apache Spark that can fully harness the power of existing data centers. However, frameworks once used by domain experts are now being leveraged by data scientists, business analysts, and researchers. This shift in user demographics calls for immediate advancements in the development, debugging, and testing practices of big data applications, which are falling behind compared to the DISC framework design and implementation. In practice, big data applications often fail as users are unable to test all behaviors emerging from interleaving dataflow operators, user-defined functions, and framework's code. "Testing based on a random sample" rarely guarantees the reliability and "trial and error" and "print" debugging methods are expensive and time-consuming. Thus, the current practice of developing a big data application must be improved and the tools built to enhance the developer's productivity must adapt to the distinct characteristics of data-intensive scalable computing. By synthesizing ideas from software engineering and database systems, our hypothesis is that we can design effective and scalable testing and debugging algorithms for big data analytics without compromising the performance and efficiency of the underlying DISC framework. To design such techniques, we investigate how we can build interactive and responsive debugging primitives that significantly reduce the debugging time, yet do not pose much performance overhead on big data applications. Furthermore, we investigate how we can leverage data provenance techniques from databases and fault-isolation algorithms from software engineering to pinpoint the minimal subset of failure-inducing inputs efficiently. To improve the reliability of big data analytics, we investigate how we can abstract the semantics of dataflow operators and use them in tandem with the semantics of user-defined functions to generate a minimum set of synthetic test inputs capable of revealing more defects than the entire input dataset.To examine the first hypothesis, we introduce interactive, real-time debugging primitives for big data analytics through innovative and scalable debugging features such as simulated breakpoint, dynamic watchpoint, and crash culprit identification. Second, we design a new automated fault localization approach that combines insights from both the software engineering and database literature to bring delta debugging closer to a reality in the big data applications by leveraging data provenance and by constructing systems optimizations for debugging provenance queries. Lastly, we devise a new symbolic-execution based white-box testing algorithm for big data applications that abstracts the implementation of dataflow operators using logical specifications instead of modeling their implementations and combines them with the semantics of any arbitrary user-defined function. We instantiate the idea of an interactive debugging algorithm as BigDebug, the idea of an automated debugging algorithm as BigSift, and the idea of symbolic execution-based testing as BigTest. Our investigation shows that the interactive debugging primitives can scale to terabytes---our record-level tracing incurs less than 25% overhead on average and provides up to 100% time saving compared to the baseline replay debugger. Second, we observe that by combining data provenance with delta debugging, we can identify the minimum faulty input in just under 30% of the original job execution time. Lastly, we verify that by abstracting dataflow operators using logical specifications, we can efficiently generate the most concise test data suitable for local testing while revealing twice as many faults as prior approaches. Our investigations collectively demonstrate that developer productivity can be significantly improved through effective and scalable testing and debugging techniques for big data analytics, without impacting the DISC framework's performance. This dissertation affirms the feasibility of automated debugging and testing techniques for big data analytics---techniques that were previously considered infeasible for large-scale data processing
Anchor: Locating Android Framework-specific Crashing Faults
Android framework-specific app crashes are hard to debug. Indeed, the
callback-based event-driven mechanism of Android challenges crash localization
techniques that are developed for traditional Java programs. The key challenge
stems from the fact that the buggy code location may not even be listed within
the stack trace. For example, our empirical study on 500 framework-specific
crashes from an open benchmark has revealed that 37 percent of the crash types
are related to bugs that are outside the stack traces. Moreover, Android
programs are a mixture of code and extra-code artifacts such as the Manifest
file. The fact that any artifact can lead to failures in the app execution
creates the need to position the localization target beyond the code realm. In
this paper, we propose Anchor, a two-phase suspicious bug location suggestion
tool. Anchor specializes in finding crash-inducing bugs outside the stack
trace. Anchor is lightweight and source code independent since it only requires
the crash message and the apk file to locate the fault. Experimental results,
collected via cross-validation and in-the-wild dataset evaluation, show that
Anchor is effective in locating Android framework-specific crashing faults.Comment: 12 page
Automated metamorphic testing on the analyses of feature models
Copyright © 2010 Elsevier B.V. All rights reserved.Context: A feature model (FM) represents the valid combinations of features in a domain. The automated extraction of information from FMs is a complex task that involves numerous analysis operations, techniques and tools. Current testing methods in this context are manual and rely on the ability of the tester to decide whether the output of an analysis is correct. However, this is acknowledged to be time-consuming, error-prone and in most cases infeasible due to the combinatorial complexity of the analyses, this is known as the oracle problem.Objective: In this paper, we propose using metamorphic testing to automate the generation of test data for feature model analysis tools overcoming the oracle problem. An automated test data generator is presented and evaluated to show the feasibility of our approach.Method: We present a set of relations (so-called metamorphic relations) between input FMs and the set of products they represent. Based on these relations and given a FM and its known set of products, a set of neighbouring FMs together with their corresponding set of products are automatically generated and used for testing multiple analyses. Complex FMs representing millions of products can be efficiently created by applying this process iteratively.Results: Our evaluation results using mutation testing and real faults reveal that most faults can be automatically detected within a few seconds. Two defects were found in FaMa and another two in SPLOT, two real tools for the automated analysis of feature models. Also, we show how our generator outperforms a related manual suite for the automated analysis of feature models and how this suite can be used to guide the automated generation of test cases obtaining important gains in efficiency.Conclusion: Our results show that the application of metamorphic testing in the domain of automated analysis of feature models is efficient and effective in detecting most faults in a few seconds without the need for a human oracle.This work has been partially supported by the European Commission(FEDER)and Spanish Government under CICYT project SETI(TIN2009-07366)and the Andalusian Government project ISABEL(TIC-2533)
Simulation-based testing of highly configurable cyber-physical systems: automation, optimization and debugging
Sistema Ziber-Fisikoek sistema ziber digitalak sistema fisikoekin uztartzen dituzte. Sistema hauen aldakortasuna handitzen ari da erabiltzaileen hainbat behar betetzeko. Ondorioz, sistema ziber-fisikoa aldakorrak edota produktu lerroak ari dira garatzen eta sistema hauek milaka edo milioika konfiguraziotan konfiguratu daitezke. Sistema ziber-fisiko aldakorren test eta balidazioa prozesua garestia da, batez ere probatu beharreko konfigurazio kopuruaren ondorioz. Konfigurazio kopuru altuak sistemaren prototipo bat erabiltzea ezinezkoa egiten du. Horregatik, sistema ziber-fisiko aldagarriak simulazio modeloak erabilita probatzen dira. Hala ere, simulazio bidez sistema
ziber-fisikoak probatzea erronka izaten jarraitzen du. Hasteko, simulazio denbora altua izaten da normalki, software-az aparte, sistema fisikoa simulatu behar delako. Sistema fisiko hau normalean modelo matematiko konplexuen bitartez modelatzen da, konputazionalki garestia delarik. Jarraitzeko, sistema ziber-fisikoek ingeniaritzaren domeinu ezberdinak dituzte tartean, adibidez mekanika edo elektronika. Domeinu bakoitzak bere simulazio erremienta erabiltzen du, eta erremienta guzti hauek interkonektatzeko ko-simulazioa erabiltzen da. Nahiz eta ko-simulazioa abantaila bat izan ematen duen flexibilitateagatik, simulagailu ezberdinen erabilerak simulazio denbora handiagotzen du. Azkenik, sistema ziber-fisikoak simulaziopean probatzean, probak maila ezberdinetan egin behar dira (adb., Model, Software eta Hardware-in-the-Loop mailak), eta honek, proba-kasuak exekutatzeko denbora handitzen du. Tesi honen helburua sistema ziber-fisiko aldakorren test jardunbideak hobetzea da, horretarako automatizazio, optimizazio eta arazketa metodoak proposatzen ditu. Automatizazioari dagokionez, lehenengo, erremienta-bidezko metodologia bat proposatzen da. Metodologia hau test sistema instantziak automatikoki sortzeko gai
da, test sistema hauek sistema ziber-fisiko aldagarrien konfigurazioak automatikoki probatzeko gai dira (adb., test orakuluen bitartez). Bigarren, test frogak automatikoki sortzeko planteamendu bat proposatzen da helburu anitzeko bilaketa algoritmoak erabilita. Optimizazioari dagokionez, test frogen aukeraketarako planteamendu bat eta test frogen priorizaziorako beste planteamendu bat proposatzen dira, biak bilaketa alix goritmoak erabiliz, sistema ziber-fisiko aldakorrak test maila ezberdinetan probatzeko helburuarekin. Arazketari dagokionez, “espektroan oinarritutako falten lokalizazioa” izeneko teknika bat produktu lerroen testuingurura adaptatu da, eta faltak isolatzeko metodo bat proposatzen da. Honek, falta ezberdinak lokalizatzea errezten du ez bakarrik sistema ziber-fisiko aldakorretan, baizik eta edozein produktu lerrotan non “feature model” delako modeloak erabiltzen diren aldakortasuna kudeatzeko.Los sistemas cyber-físicos (CPSs) integran tecnologías digitales con procesos físicos. La variabilidad de estos sistemas está creciendo para responder a la demanda de diferentes clientes. Como consecuencia de ello, los CPSs están volviéndose configurables e incluso líneas de producto, lo que significa que pueden ser configurados en miles y millones de configuraciones. El testeo de sistemas cyber-físicos configurables es un proceso costoso, en general debido a la cantidad de configuraciones que han de ser testeadas. El número de configuraciones a testear hace imposible el uso de un prototipo del sistema. Por ello, los sistemas CPSs configurables están siendo testeadas utilizando modelos de simulación. Sin embargo, el testeo de sistemas cyber-físicos
bajo simulación sigue siendo un reto. Primero, el tiempo de simulación es normalmente largo, ya que, además del software, la capa física del CPS ha de ser testeada. Esta capa física es típicamente modelada con modelos matemáticos complejos, lo cual es computacionalmente caro. Segundo, los sistemas cyber-físicos implican el uso de diferentes dominios de la ingeniería, como por ejemplo la mecánica o la electrónica. Por ello, para interconectar diferentes herramientas de modelado y simulación hace falta el uso de la co-simulación. A pesar de que la co-simulación es una ventaja en términos de flexibilidad para los ingenieros, el uso de diferentes simuladores hace que el tiempo de simulación sea más largo. Por último, al testear sistemas cyberfísicos haciendo uso de simulación, existen diferentes niveles (p.ej., Model, Software y Hardware-in-the-Loop), lo cual incrementa el tiempo para ejecutar casos de test.
Esta tesis tiene como objetivo avanzar en la práctica actual del testeo de sistemas cyber-físicos configurables, proponiendo métodos para la automatización, optimización y depuración. En cuanto a la automatización, primero, se propone una metodología soportada por una herramienta para generar automáticamente instancias de sistemas de test que permiten testear automáticamente configuraciones del sistema CPS configurable (p.ej., haciendo uso de oráculos de test). Segundo, se propone un enfoque para generación de casos de test basado en algoritmos de búsqueda multiobjetivo, los cuales generan un conjunto de casos de test. En cuanto a la optimización, se propone un enfoque para selección y otro para priorización de casos de test, ambos basados en algoritmos de búsqueda, de cara a testear eficientemente sistemas cyberfísicos configurables en diferentes niveles de test. En cuanto a la depuración, se adapta una técnica llamada “Localización de Fallos Basada en Espectro” al contexto de líneas de productos y proponemos un método de aislamiento de fallos. Esto permite localizar bugs no solo en sistemas cyber-físicos configurables sino también en cualquier línea de producto donde se utilicen modelos de características para gestionar la variabilidad.Cyber-Physical Systems (CPSs) integrate digital cyber technologies with physical processes. The variability of these systems is increasing in order to give solution to the different customers demands. As a result, CPSs are becoming configurable or even product lines, which means that they can be set into thousands or millions of configurations. Testing configurable CPSs is a time consuming process, mainly due to the large amount of configurations that need to be tested. The large amount of configurations that need to be tested makes it infeasible to use a prototype of the system. As a result, configurable CPSs are being tested using simulation. However, testing CPSs under simulation is still challenging. First, the simulation time is usually long, since apart of the software, the physical layer needs to be simulated. This physical layer is typically modeled with complex mathematical models, which is computationally very costly. Second, CPSs involve different domains, such as, mechanical and electrical. Engineers of different domains typically employ different tools for modeling their subsystems. As a result, co-simulation is being employed to interconnect different modeling and simulation tools. Despite co-simulation being an advantage in terms of engineers flexibility, the use of different simulation tools makes the simulation time
longer. Lastly, when testing CPSs employing simulation, different test levels exist (i.e., Model, Software and Hardware-in-the-Loop), what increases the time for executing test cases.
This thesis aims at advancing the current practice on testing configurable CPSs by proposing methods for automation, optimization and debugging. Regarding automation, first, we propose a tool supported methodology to automatically generate test system instances that permit automatically testing configurations of the configurable CPS (e.g., by employing test oracles). Second, we propose a test case generation approach based on multi-objective search algorithms that generate cost-effective test suites. As for optimization, we propose a test case selection and a test case prioritization approach, both of them based on search algorithms, to cost-effectively test configurable CPSs at different test levels. Regarding debugging, we adapt a technique
named Spectrum-Based Fault Localization to the product line engineering context and propose a fault isolation method. This permits localizing bugs not only in configurable CPSs but also in any product line where feature models are employed to model variability
- …