17 research outputs found

    mrstudyr: Retrospectively Studying the Effectiveness of Mutant Reduction Techniques

    Mutation testing is a well-known method for measuring a test suite’s quality. However, due to its computational expense and intrinsic difficulties (e.g., detecting equivalent mutants and potentially checking a mutant’s status for each test), mutation testing is often challenging to use in practice. To control the computational cost of mutation testing, many reduction strategies have been proposed (e.g., uniform random sampling over mutants). Yet, a stand-alone tool to compare the efficiency and effectiveness of these methods has so far been unavailable. Since existing mutation testing tools are often complex and language-dependent, this paper presents a tool, called mrstudyr, that enables the “retrospective” study of mutant reduction methods using the data collected from a prior analysis of all mutants. Focusing on the mutation operators and the mutants that they produce, the presented tool allows developers to prototype and evaluate mutant reducers without being burdened by the implementation details of mutation testing tools. Along with describing mrstudyr’s design and overviewing the experimental results from using it, this paper inaugurates the public release of this open-source tool.
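
The idea of a "retrospective" study can be illustrated with a minimal sketch. It is not mrstudyr's code or interface: the function names, the `full` data set, and the 20% sampling fraction are all assumptions made for illustration. Given the killed/alive status of every mutant from a prior full analysis, a reducer (here, uniform random sampling) picks a subset, and the mutation score of the subset is compared with the score of the full set.

```python
import random


def mutation_score(statuses):
    """Return the fraction of mutants killed; statuses maps mutant id -> True if killed."""
    return sum(statuses.values()) / len(statuses)


def uniform_sample(statuses, fraction, seed=0):
    """Hypothetical reducer: keep a uniformly random subset of the mutants."""
    rng = random.Random(seed)
    keep = max(1, round(fraction * len(statuses)))
    chosen = rng.sample(sorted(statuses), keep)
    return {mutant: statuses[mutant] for mutant in chosen}


# Made-up results of a prior analysis of all mutants: 70 of 100 were killed.
full = {f"m{i}": i % 10 < 7 for i in range(100)}

# Apply the reducer after the fact; no mutant needs to be executed again.
reduced = uniform_sample(full, fraction=0.2, seed=42)

# How far the reduced set's score is from the score of the full set.
error = abs(mutation_score(full) - mutation_score(reduced))
print(f"full={mutation_score(full):.2f} reduced={mutation_score(reduced):.2f} error={error:.2f}")
```

Because the reducer only reads recorded data, different reduction strategies can be swapped in and compared without touching a mutation testing tool.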

    Automated Layout Failure Detection for Responsive Web Pages Without an Explicit Oracle

    As the number and variety of devices being used to access the World Wide Web grows exponentially, ensuring the correct presentation of a web page, regardless of the device used to browse it, is an important and challenging task. When developers adopt responsive web design (RWD) techniques, web pages modify their appearance to accommodate a device’s display constraints. However, a current lack of automated support means that presentation failures may go undetected in a page’s layout when rendered for different viewport sizes. A central problem is the difficulty in providing an automated “oracle” to validate RWD layouts against, meaning that checking for failures is largely a manual process in practice, which results in layout failures in many live responsive web sites. This paper presents an automated failure detection technique that checks the consistency of a responsive page’s layout across a range of viewport widths, obviating the need for an explicit oracle. In an empirical study, this method found failures in 16 of 26 real-world production pages studied, detecting 33 distinct failures in total

    Hitchhikers Need Free Vehicles! Shared Repositories for Statistical Analysis in SBST

    As a means of improving the maturity of the data analysis methods used in the search-based software testing field, this paper presents the need for shared repositories of well-documented statistical analysis code and replication data. In addition to explaining the benefits associated with using these repositories, the paper gives suggestions (e.g., the testing of analysis code) for improving the study of data arising from experiments with randomized algorithms.

    Automated Search for Good Coverage Criteria: Moving from Code Coverage to Fault Coverage Through Search-Based Software Engineering

    We propose to use Search-Based Software Engineering to automatically evolve coverage criteria that are well correlated with fault revelation, through the use of existing fault databases. We explain how problems of bloat and overfitting can be ameliorated in our approach, and show how this new method will yield insight into faults, as well as better guidance for Search-Based Software Testing.

    Automatic detection and removal of ineffective mutants for the mutation analysis of relational database schemas

    Data is one of an organization’s most valuable and strategic assets. Testing the relational database schema, which protects the integrity of this data, is of paramount importance. Mutation analysis is a means of estimating the fault-finding “strength” of a test suite. As with program mutation, however, relational database schema mutation results in many “ineffective” mutants that both degrade test suite quality estimates and make mutation analysis more time consuming. This paper presents a taxonomy of ineffective mutants for relational database schemas, summarizing the root causes of ineffectiveness with a series of key patterns evident in database schemas. On the basis of these, we introduce algorithms that automatically detect and remove ineffective mutants. In an experimental study involving the mutation analysis of 34 schemas used with three popular relational database management systems—HyperSQL, PostgreSQL, and SQLite—the results show that our algorithms can identify and discard large numbers of ineffective mutants that can account for up to 24% of mutants, leading to a change in mutation score for 33 out of 34 schemas. The tests for seven schemas were found to achieve 100% scores, indicating that they were capable of detecting and killing all non-equivalent mutants. The results also reveal that the execution cost of mutation analysis may be significantly reduced, especially with “heavyweight” DBMSs like PostgreSQL

    Automatically identifying potential regressions in the layout of responsive web pages

    Providing a good user experience on the ever-increasing number and variety of devices being used to browse the web is a difficult, yet critical, task. With Responsive Web Design (RWD), front-end web developers design web pages so that they dynamically resize and rearrange content to best fit the dimensions of a device’s screen. However, when making code modifications to a responsive page, developers can easily introduce regressions from the correct layout that have detrimental effects at unpredictable screen sizes. For instance, the source code change that a developer makes to improve the layout at one screen size may obscure a page’s content at other sizes. Current approaches to testing are often insufficient because they rely on limited tools and error-prone manual inspections of a web page. As such, many unintended regressions in web page layout often go undetected and ultimately manifest in production web sites. To address the challenge of detecting regressions in responsive web pages, this paper presents an automated approach that extracts the responsive layout of two versions of a page and compares them, alerting developers to the differences in layout that they may wish to investigate further. We implemented the approach and empirically evaluated it on 15 real-world responsive web pages. Leveraging code mutations that a tool automatically injected into the pages as a systematic simulation of developer changes, the experiments show that the approach was highly effective. When compared with manual and automated baseline testing techniques, it detected 12.5% and 18.75% more injected changes, respectively. Along with identifying the best parameters for the method that extracts the responsive layout, the experiments show that the approach surpasses the baselines across changes that vary in their impact, but works particularly well for subtle, hard-to-detect mutants, showing the benefits of automatically identifying regressions in web page layout

    Automated visual classification of DOM-based presentation failure reports for responsive web pages

    Since it is common for the users of a web page to access it through a wide variety of devices—including desktops, laptops, tablets and phones—web developers rely on responsive web design (RWD) principles and frameworks to create sites that are useful on all devices. A correctly implemented responsive web page adjusts its layout according to the viewport width of the device in use, thereby ensuring that its design suitably features the content. Since the use of complex RWD frameworks often leads to web pages with hard‐to‐detect responsive layout failures (RLFs), developers employ testing tools that generate reports of potential RLFs. Since testing tools for responsive web pages, like ReDeCheck, analyse a web page representation called the Document Object Model (DOM), they may inadvertently flag concerns that are not human visible, thereby requiring developers to manually confirm and classify each potential RLF as a true positive (TP), false positive (FP), or non‐observable issue (NOI)—a process that is time consuming and error prone. The conference version of this paper presented Viser, a tool that automatically classified three types of RLFs reported by ReDeCheck. Since Viser was not designed to automatically confirm and classify two types of RLFs that ReDeCheck's DOM‐based analysis could surface, this paper introduces Verve, a tool that automatically classifies all RLF types reported by ReDeCheck. Along with manipulating the opacity of HTML elements in a web page, as does Viser, the Verve tool also uses histogram‐based image comparison to classify RLFs in web pages. Incorporating both the 25 web pages used in prior experiments and 20 new pages not previously considered, this paper's empirical study reveals that Verve's classification of all five types of RLFs frequently agrees with classifications produced manually by humans. The experiments also reveal that Verve took on average about 4 s to classify any of the RLFs among the 469 reported by ReDeCheck. 
    Since this paper demonstrates that classifying an RLF as a TP, FP, or NOI with Verve, a publicly available tool, is less subjective and error prone than the same manual process done by a human web developer, we argue that Verve is well-suited for supporting the testing of complex responsive web pages.
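
Histogram-based image comparison, which the abstract names as one of Verve's ingredients, can be sketched in general terms as follows. This is not Verve's implementation: the bin count, the histogram-intersection similarity measure, and the tiny grayscale "screenshots" are assumptions chosen to keep the example self-contained.

```python
def histogram(pixels, bins=8):
    """Normalized histogram of grayscale pixel values in the range 0..255."""
    counts = [0] * bins
    for value in pixels:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [count / len(pixels) for count in counts]


def intersection(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# Made-up 16-pixel grayscale regions standing in for screenshots of a page area.
before = [255] * 16                    # all-white region
after_same = [255] * 16                # nothing visibly changed
after_changed = [255] * 8 + [0] * 8    # half of the region turned black

unchanged_score = intersection(histogram(before), histogram(after_same))
changed_score = intersection(histogram(before), histogram(after_changed))
print(unchanged_score, changed_score)
```

A classifier built on this idea would treat a similarity close to 1.0 as "no visible difference" and a low similarity as a visible change; where to place the threshold is a tuning decision that this sketch leaves open.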

    What factors make SQL test cases understandable for testers? A human study of automated test data generation techniques

    Since relational databases are a key component of software systems ranging from small mobile to large enterprise applications, there are well-studied methods that automatically generate test cases for database-related functionality. Yet, there has been no research to analyze how well testers - who must often serve as an "oracle" - both understand tests involving SQL and decide if they reveal flaws. This paper reports on a human study of test comprehension in the context of automatically generated tests that assess the correct specification of the integrity constraints in a relational database schema. In this domain, a tool generates INSERT statements with data values designed to either satisfy (i.e., be accepted into the database) or violate the schema (i.e., be rejected from the database). The study reveals two key findings. First, the choice of data values in INSERTs influences human understandability: the use of default values for elements not involved in the test (but necessary for adhering to SQL's syntax rules) aided participants, allowing them to easily identify and understand the important test values. Yet, negative numbers and "garbage" strings hindered this process. The second finding is more far reaching: humans found the outcome of test cases very difficult to predict when NULL was used in conjunction with foreign keys and CHECK constraints. This suggests that, while including NULLs can surface the confusing semantics of database schemas, their use makes tests less understandable for humans
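
The NULL behavior that participants found hard to predict can be reproduced with a small sketch. The schema, table names, and data values below are hypothetical, and SQLite (through Python's standard `sqlite3` module) is used only for convenience; it is not claimed to be the setup used in the study. Each INSERT is meant either to satisfy the schema (accepted) or to violate it (rejected).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY);
    CREATE TABLE book (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id),
        price     INTEGER CHECK (price > 0)
    );
""")


def accepted(sql):
    """Return True if the database accepts the INSERT, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "parent row": accepted("INSERT INTO author VALUES (1)"),
    "valid book": accepted("INSERT INTO book VALUES (1, 1, 10)"),
    "negative price": accepted("INSERT INTO book VALUES (2, 1, -5)"),       # violates CHECK
    "missing author": accepted("INSERT INTO book VALUES (3, 99, 10)"),      # violates FOREIGN KEY
    "NULL foreign key": accepted("INSERT INTO book VALUES (4, NULL, 10)"),  # accepted
    "NULL price": accepted("INSERT INTO book VALUES (5, 1, NULL)"),         # accepted
}
for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'rejected'}")
```

The last two rows are accepted because a NULL foreign key is not checked against the parent table and a CHECK expression that evaluates to NULL does not count as a violation, which is the kind of outcome that is easy to mispredict when reading a test.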

    DOMINO: Fast and effective test data generation for relational database schemas

    An organization's databases are often one of its most valuable assets. Data engineers commonly use a relational database because its schema ensures the validity and consistency of the stored data through the specification and enforcement of integrity constraints. To ensure their correct specification, industry advice recommends the testing of the integrity constraints in a relational schema. Since manual schema testing is labor-intensive and error-prone, this paper presents DOMINO, a new automated technique that generates test data according to a coverage criterion for integrity constraint testing. In contrast to more generalized search-based approaches, which represent the current state of the art for this task, DOMINO uses tailored, domain-specific operators to efficiently generate test data for relational database schemas. In an empirical study incorporating 34 relational database schemas hosted by three different database management systems, the results show that DOMINO can not only generate test suites faster than the state-of-the-art search-based method but that its test suites can also detect more schema faults

    SchemaAnalyst: Search-Based Test Data Generation for Relational Database Schemas

    Data stored in relational databases plays a vital role in many aspects of society. When this data is incorrect, the services that depend on it may be compromised. The database schema is the artefact responsible for maintaining the integrity of stored data. Because of its critical function, the proper testing of the database schema is a task of great importance. Employing a search-based approach to generate high-quality test data for database schemas, SchemaAnalyst is a tool that supports testing this key software component. This presented tool is extensible and includes both an evaluation framework for assessing the quality of the generated tests and full-featured documentation. In addition to describing the design and implementation of SchemaAnalyst and overviewing its efficiency and effectiveness, this paper coincides with the tool’s public release, thereby enhancing practitioners’ ability to test relational database schemas