Software for physics of tau lepton decay in LHC experiments
Software development in high energy physics experiments offers unique experience with a rapidly changing environment and a variety of standards and frameworks to which the software must be adapted. Regular software development methods are therefore hard to apply, as they do not take into account how strongly some of these changes affect the overall structure. This thesis summarizes the development of the TAUOLA C++ Interface, which introduces tau decays into a new event record standard. Since the documentation of the program is already published, it is not repeated here. We focus instead on the development cycle and methodology used in the project, starting from the definition of the expectations, through planning and designing the abstract model, and concluding with the implementation. In the last part of the paper we present the installation of the software within the different experiments surrounding the Large Hadron Collider and the problems that emerged during this process.

Comment: Thesis submitted to the Applied Computer Science Department in partial fulfillment of the requirements for the MSc degree. This work is partially supported by the EU Marie Curie Research Training Network grant under contract No. MRTN-CT-2006-0355505, Polish Government grant N202 06434 (2008-2011) and EU-RTN Programme contract No. MRTN-CT-2006-035482 'Flavianet
Sapienz: Multi-objective automated testing for android applications
We introduce Sapienz, an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising their length while simultaneously maximising coverage and fault revelation. Sapienz combines random fuzzing, systematic and search-based exploration, exploiting seeding and multi-level instrumentation. Sapienz significantly outperforms (with large effect size) both the state-of-the-art technique Dynodroid and the widely-used tool Android Monkey, in 7/10 experiments for coverage, 7/10 for fault detection and 10/10 for fault-revealing sequence length. When applied to the top 1,000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. So far we have managed to make contact with the developers of 27 crashing apps. Of these, 14 have confirmed that the crashes are caused by real faults. Of those 14, six already have developer-confirmed fixes.
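The three objectives above can be compared via Pareto dominance over (coverage, faults revealed, sequence length). The following is a minimal sketch of that comparison, not Sapienz's actual implementation, which uses a full search-based evolutionary algorithm:

```python
# Illustrative sketch: Pareto dominance over the three objectives a
# multi-objective Android tester trades off -- maximise coverage,
# maximise faults revealed, minimise test sequence length.

def dominates(a, b):
    """True if suite `a` Pareto-dominates `b`.

    Each suite is (coverage, faults, length); coverage and faults
    are maximised, length is minimised.
    """
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better

def pareto_front(population):
    """Keep only the non-dominated suites."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]

# hypothetical candidate suites: (coverage, faults, length)
suites = [(0.62, 3, 120), (0.70, 3, 200), (0.70, 5, 200), (0.55, 1, 80)]
front = pareto_front(suites)
```

A search-based tester would evolve the population over many generations, carrying the non-dominated suites forward each time.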
Estimating Software Testing Complexity
Context: Complexity measures provide us with some information about software artifacts. A measure of the difficulty of testing a piece of code could be very useful for taking control of the test phase.
Objective: The aim of this paper is the definition of a new measure of the difficulty for a computer to generate test cases, which we call Branch Coverage Expectation (BCE). We also analyze the most common complexity measures and the most important features of a program. With this analysis we try to discover whether there exists a relationship between them and the code coverage of an automatically generated test suite.
Method: The definition of this measure is based on a Markov model of the program. This model is used not only to compute the BCE, but also to provide an estimation of the number of test cases needed to reach a given coverage level in the program. In order to check our proposal, we perform a theoretical validation and we carry out an empirical validation study using 2600 test programs.
Results: The results show that the previously existing measures are not very useful for estimating the difficulty of testing a program, because they are not highly correlated with the code coverage. Our proposed measure is much more highly correlated with the code coverage than the existing complexity measures.
Conclusion: The high correlation of our measure with the code coverage suggests that the BCE measure is a very promising way of measuring the difficulty of automatically testing a program. Our proposed measure is useful for predicting the behavior of an automatic test case generator.

This work has been partially funded by the Spanish Ministry of Science and Innovation and FEDER under contract TIN2011-28194 (the roadME project).
Controlled time series generation for automotive software-in-the-loop testing using GANs
Testing automotive mechatronic systems partly uses the software-in-the-loop
approach, where systematically covering inputs of the system-under-test remains
a major challenge. In current practice, there are two major techniques of input
stimulation. One approach is to craft input sequences, which eases control and feedback of the test process but falls short of exposing the system to realistic scenarios. The other is to replay sequences recorded from field operations, which accounts for reality but requires collecting a well-labeled dataset of sufficient size for widespread use, which is expensive. This
work applies the well-known unsupervised learning framework of Generative
Adversarial Networks (GAN) to learn an unlabeled dataset of recorded in-vehicle
signals and uses it for generation of synthetic input stimuli. Additionally, a
metric-based linear interpolation algorithm is demonstrated, which guarantees
that generated stimuli follow a customizable similarity relationship with
specified references. This combination of techniques enables controlled
generation of a rich range of meaningful and realistic input patterns,
improving virtual test coverage and reducing the need for expensive field
tests.

Comment: Preprint of paper accepted at The Second IEEE International Conference on Artificial Intelligence Testing, April 13-16, 2020, Oxford, UK
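The metric-based linear interpolation can be illustrated with a small sketch, assuming Euclidean distance as the similarity metric (the paper's actual metric and algorithm may differ): moving a generated signal along the line through a reference places the result at exactly the requested distance from that reference.

```python
import numpy as np

# Illustrative sketch, assuming Euclidean distance as the similarity
# metric: scaling the offset (generated - reference) by t scales the
# distance by t, so choosing t = target_dist / dist guarantees the
# result lies at exactly target_dist from the reference.

def interpolate_to_distance(generated, reference, target_dist):
    """Return a signal at Euclidean distance `target_dist` from
    `reference`, on the ray from `reference` through `generated`."""
    direction = generated - reference
    dist = np.linalg.norm(direction)
    if dist == 0:
        raise ValueError("generated equals reference; direction undefined")
    return reference + (target_dist / dist) * direction
```

This is what allows test stimuli to follow a customizable similarity relationship with a specified reference signal: the distance becomes a tunable test parameter rather than an accident of generation.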
Automatic Non-functional Testing of Code Generators Families
The intensive use of generative programming techniques provides an elegant engineering solution to deal with the heterogeneity of platforms and technological stacks. The use of domain-specific languages, for example, leads to the creation of numerous code generators that automatically translate high-level system specifications into multi-target executable code. Producing correct and efficient code generators is complex and error-prone. Although software designers generally provide high-level test suites to verify the functional outcome of generated code, it remains challenging and tedious to verify the behavior of the produced code in terms of non-functional properties. This paper describes a practical approach based on a runtime monitoring infrastructure to automatically detect potentially inefficient code generators. This infrastructure, based on system containers as execution platforms, allows code-generator developers to evaluate the performance of the generated code. We evaluate our approach by analyzing the performance of Haxe, a popular high-level programming language that involves a set of cross-platform code generators. Experimental results show that our approach is able to detect performance inconsistencies that reveal real issues in Haxe code generators.
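The comparison step of such a monitoring infrastructure can be sketched as follows. This is a hypothetical simplification, not the paper's tooling, and the backend names and threshold factor are illustrative: a backend whose measured runtime far exceeds the median across backends compiled from the same source program is flagged as a potential generator inefficiency.

```python
from statistics import median

# Illustrative sketch (not the paper's infrastructure): given runtime
# measurements of the same program generated for several backends,
# flag any backend whose runtime exceeds `factor` times the median.

def flag_inconsistencies(measurements, factor=3.0):
    """measurements: dict backend -> runtime in seconds.
    Return backends whose runtime exceeds factor * median runtime."""
    med = median(measurements.values())
    return sorted(b for b, t in measurements.items() if t > factor * med)

# hypothetical measurements for one generated program
runs = {"java": 1.1, "cpp": 0.9, "js": 1.3, "python": 6.5}
```

In the paper's setting the measurements would come from container-based runtime monitoring of each generated variant; the median-based rule is just one simple way to turn those measurements into a performance-inconsistency alarm.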
Towards ServMark, an Architecture for Testing Grid Services
Technical University of Delft - Technical Report ServMark-2006-002, July 2006

Grid computing provides a natural way to aggregate resources from different administrative domains for building large-scale distributed environments. The Web Services paradigm proposes a way by which virtual services can be seamlessly integrated into global-scale solutions to complex problems. While the usage of Grid technology ranges from academia and research to the business world and production, two issues must be considered: the promised functionality must be accurately quantified, and the performance must be evaluated by well-defined means. Without adequate functionality demonstrators, systems cannot be tuned or adequately configured, and Web Services cannot be stressed adequately in a production environment. Without performance evaluation systems, the system design and procurement processes are hampered, and the performance of Web Services in production cannot be assessed. In this paper, we present ServMark, a carefully researched tool for Grid performance evaluation. While we acknowledge that a lot of ground must be covered to fulfill the requirements of a system for testing Grid environments and Web (and Grid) Services, we believe that ServMark addresses the minimal set of critical issues.
RaidEnv: Exploring New Challenges in Automated Content Balancing for Boss Raid Games
The balance of game content significantly impacts the gaming experience.
Unbalanced game content diminishes engagement or increases frustration because
of repetitive failure. Although game designers intend to adjust the difficulty
of game content, this is a repetitive, labor-intensive, and challenging
process, especially for commercial-level games with extensive content. To
address this issue, the game research community has explored automated game
balancing using artificial intelligence (AI) techniques. However, previous
studies have focused on limited game content and did not consider the
importance of the generalization ability of playtesting agents when
encountering content changes. In this study, we propose RaidEnv, a new game
simulator that includes diverse and customizable content for the boss raid
scenario in MMORPG games. Additionally, we design two benchmarks for the boss
raid scenario that can aid in the practical application of game AI. These
benchmarks address two open problems in automatic content balancing, and we
introduce two evaluation metrics to provide guidance for AI in automatic
content balancing. This novel game research platform expands the frontiers of
automatic game balancing problems and offers a framework within a realistic
game production pipeline.

Comment: 14 pages, 6 figures, 6 tables, 2 algorithms
Falsification of Signal-Based Specifications for Cyber-Physical Systems
In the development of software for modern Cyber-Physical Systems, testing is an integral part that is rightfully given a lot of attention. Testing is done on many different abstraction levels, and especially for large-scale industrial systems, it can be difficult to know when the testing should conclude and the software can be considered correct enough for making its way into production. This thesis proposes new methods for analyzing and generating test cases as a means of being more certain that proper testing has been performed for the system under test. For analysis, the proposed approach includes automatically finding how much a given test suite has executed the physical properties of the simulated system. For test case generation, an up-and-coming approach to find errors in Cyber-Physical Systems is simulation-based falsification. While falsification is suitable also for some large-scale industrial systems, sometimes there is a gap between what has been researched and what problems need to be solved to make the approach tractable in the industry. This thesis attempts to close this gap by applying falsification techniques to real-world models from Volvo Car Corporation, and adapting the falsification procedure where it has shortcomings for certain classes of systems. Specifically, the thesis includes a method for automatically transforming a signal-based specification into a formal specification in temporal logic, as well as a modification to the underlying optimization problem that makes falsification more viable in an industrial setting. The proposed methods have been evaluated for both academic benchmark examples and real-world industrial models. One of the main conclusions is that the proposed additions and changes to analysis and generation of tests can be useful, given that one has enough information about the system under test. 
It is difficult to provide a general solution that will always work best -- instead, the challenge lies in identifying which properties of the given system should be taken into account when trying to find potential errors in the system.
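The core mechanism behind simulation-based falsification can be illustrated generically (this is a textbook-style sketch, not the thesis's specific transformation or optimizer): each simulated trace is scored with a quantitative robustness value, and the search looks for inputs that drive that value negative, i.e. violate the specification.

```python
# Illustrative sketch: robustness semantics for the specification
# "always y(t) <= c" over a sampled trace is min_t (c - y(t)).
# A negative robustness means the trace falsifies the specification.

def robustness_always_leq(trace, c):
    """Robustness of 'globally y <= c' over a sampled trace."""
    return min(c - y for y in trace)

def falsify(simulate, candidates, c):
    """Return the first input whose trace falsifies the spec, else None.
    A real falsifier would minimise robustness with an optimizer
    instead of enumerating candidates."""
    for u in candidates:
        if robustness_always_leq(simulate(u), c) < 0:
            return u
    return None

# hypothetical system-under-test: step input u overshoots to 1.5 * u
simulate = lambda u: [0.0, 1.5 * u, u]
```

The robustness value is what makes the search gradient-like even for a black-box simulation: traces that come closer to violating the specification score lower, guiding the optimizer toward a falsifying input.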