    Testing web enabled simulation at scale using metamorphic testing

    We report on Facebook's deployment of MIA (Metamorphic Interaction Automaton). MIA is used to test Facebook's Web Enabled Simulation, built on a web infrastructure of hundreds of millions of lines of code. MIA tackles the twin problems of test flakiness and the unknowable oracle. It uses metamorphic testing to automate continuous integration and regression test execution. MIA also plays the role of a test bot, automatically commenting on all relevant changes submitted for code review. It currently uses a suite of over 40 metamorphic test cases. Even at this extreme scale, a non-trivial subset of the metamorphic test suite yields outcomes within 20 minutes (sufficient for continuous integration and review processes). Furthermore, our offline mode simulation reduces test flakiness from approximately 50% (of all online tests) to 0% (offline). Metamorphic testing has been widely studied for 22 years. This paper is the first reported deployment into an industrial continuous integration system.
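As an illustration of the idea behind metamorphic testing (not of MIA itself, whose internals the abstract does not describe), the following minimal Python sketch checks two metamorphic relations on a toy simulation. The function `simulate_messages`, the bot names, and both relations are hypothetical stand-ins chosen for this example. The point is that neither check needs to know the "correct" output of a single run; each only compares related runs.

```python
import random


def simulate_messages(bots, rounds, seed):
    """Hypothetical toy simulation (an assumption, not Facebook's system).

    bots maps a bot name to its per-round probability of sending a message.
    Returns the total number of messages sent over all rounds.
    """
    total = 0
    for name, rate in bots.items():
        # Each bot draws from its own seeded stream, so its behaviour
        # does not depend on the order in which bots are processed.
        rng = random.Random(f"{seed}:{name}")
        total += sum(1 for _ in range(rounds) if rng.random() < rate)
    return total


def check_permutation_relation(bots, rounds, seed):
    """Relation 1: reordering the bots must not change the total."""
    reordered = dict(reversed(list(bots.items())))
    original = simulate_messages(bots, rounds, seed)
    follow_up = simulate_messages(reordered, rounds, seed)
    return original == follow_up


def check_monotonic_relation(bots, rounds, seed):
    """Relation 2: adding one more bot can never decrease the total."""
    extended = dict(bots, extra_bot=0.5)
    original = simulate_messages(bots, rounds, seed)
    follow_up = simulate_messages(extended, rounds, seed)
    return follow_up >= original


if __name__ == "__main__":
    population = {"alice": 0.2, "bob": 0.9, "carol": 0.5}
    print(check_permutation_relation(population, rounds=100, seed=7))
    print(check_monotonic_relation(population, rounds=100, seed=7))
```

Because every run is seeded, the checks are deterministic, which loosely mirrors the motivation for an offline mode that avoids flaky outcomes.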

    Combining different validation techniques for continuous software improvement - Implications in the development of TRNSYS 16

    Validation using published, high-quality test suites can serve to identify different problems in simulation software: modeling and coding errors, missing features, and frequent sources of user confusion. This paper discusses the application of different published validation procedures during the development of a new TRNSYS version: BESTEST/ASHRAE 140 (building envelope), HVAC BESTEST (mechanical systems) and IEA ECBCS Annex 21 / SHC Task 12 empirical validation (performance of a test cell with a very simple mechanical system). Each validation suite is shown to have identified different types of problems. These validation tools were also used to diagnose and fix some of the identified problems, and to assess the influence of code modifications. The paper also discusses some limitations of the selected validation tools.

    The CMS Simulation Software

    In this paper we present the features and the expected performance of the redesigned CMS simulation software, as well as the experience from the migration process. Today, the CMS simulation suite is based on two principal components: the Geant4 detector simulation toolkit and the new CMS offline Framework and Event Data Model. The simulation chain includes event generation, detector simulation, and digitization steps. With Geant4, we employ the full set of electromagnetic and hadronic physics processes and detailed particle tracking in the 4 Tesla magnetic field. The Framework provides "action on demand" mechanisms, allowing users to dynamically load the desired modules and to configure and tune the final application at run time. The simulation suite is used to model the complete central CMS detector (over 1 million geometrical volumes) and the forward systems, such as the Castor calorimeter and the Zero Degree Calorimeter, the Totem telescopes, Roman Pots, and the Luminosity Monitor. The design also foresees the use of electromagnetic and hadronic shower parametrization, instead of full modelling of the passage of high-energy particles through a complex hierarchy of volumes and materials, allowing a significant gain in speed while tuning the simulation to test beam and collider data. Physics simulation has been extensively validated by comparison with test beam data and previous simulation results. The redesigned and upgraded simulation software was exercised in performance and robustness tests. It went into production in July 2006, running on the US and EU grids, and has since delivered about 60 million events.

    Towards Reliable AI: Adequacy Metrics for Ensuring the Quality of System-level Testing of Autonomous Vehicles

    AI-powered systems have gained widespread popularity in various domains, including Autonomous Vehicles (AVs). However, ensuring their reliability and safety is challenging due to their complex nature. Conventional test adequacy metrics, designed to evaluate the effectiveness of traditional software testing, are often insufficient or impractical for these systems. White-box metrics, which are specifically designed for these systems, leverage neuron coverage information. These coverage metrics necessitate access to the underlying AI model and training data, which may not always be available. Furthermore, the existing adequacy metrics exhibit weak correlations with the ability to detect faults in the generated test suite, creating a gap that we aim to bridge in this study. In this paper, we introduce a set of black-box test adequacy metrics called "Test suite Instance Space Adequacy" (TISA) metrics, which can be used to gauge the effectiveness of a test suite. The TISA metrics offer a way to assess both the diversity and coverage of the test suite and the range of bugs detected during testing. Additionally, we introduce a framework that permits testers to visualise the diversity and coverage of the test suite in a two-dimensional space, facilitating the identification of areas that require improvement. We evaluate the efficacy of the TISA metrics by examining their correlation with the number of bugs detected in system-level simulation testing of AVs. A strong correlation, coupled with the short computation time, indicates their effectiveness and efficiency in estimating the adequacy of testing AVs. Comment: 12 pages, 7 figures

    The Effectiveness of Using a Modified “Beat Frequent Pick” Algorithm in the First International RoShamBo Tournament

    In this study, a bot is developed to compete in the first International RoShamBo Tournament test suite. The basic “Beat Frequent Pick (BFP)” algorithm was taken from the supplied test suite and was improved by adding a random choice tailored to the opponent's distribution of picks. A training program was also developed that finds the best performing bot variant by changing the bot's behavior in terms of the timing of the recomputation of the pick distribution. Simulation results demonstrate the significantly improved performance of the proposed variant over the original BFP. This indicates the potential of using the core technique (of the proposed variant) in Artificial Intelligence bots for similar computer games.
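For orientation, the basic "Beat Frequent Pick" idea can be sketched in a few lines of Python: count the opponent's past picks and play the move that beats the most frequent one. This is a minimal sketch under stated assumptions, not the tournament's reference code or the paper's modified variant. The move names, the opening move, and the tie-breaking rule are all choices made for this example, and the randomized component described in the abstract is deliberately left out because the abstract does not specify it.

```python
from collections import Counter

# Maps each move to the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def beat_frequent_pick(opponent_history):
    """Return the move that beats the opponent's most frequent past pick.

    opponent_history is a list of the opponent's previous moves.
    Assumptions for this sketch: the opening move is "rock", and ties
    between equally frequent picks are resolved in the key order of BEATS.
    """
    if not opponent_history:
        return "rock"  # arbitrary opening move (assumption)
    counts = Counter(opponent_history)
    # Counter returns 0 for moves the opponent has never played.
    most_frequent = max(BEATS, key=lambda move: counts[move])
    return BEATS[most_frequent]


if __name__ == "__main__":
    print(beat_frequent_pick(["rock", "rock", "paper"]))
```

In this sketch the pick distribution is recomputed on every call; the abstract indicates that the timing of that recomputation is one of the parameters the authors' training program varies.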