25,002 research outputs found

    Input Prioritization for Testing Neural Networks

    Full text link
    Deep neural networks (DNNs) are increasingly being adopted for sensing and control functions in a variety of safety and mission-critical systems such as self-driving cars, autonomous air vehicles, medical diagnostics, and industrial robotics. Failures of such systems can lead to loss of life or property, which necessitates stringent verification and validation for providing high assurance. Though formal verification approaches are being investigated, testing remains the primary technique for assessing the dependability of such systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining test oracle data---the expected output, a.k.a. label, for a given input---is high, which significantly impacts the amount and quality of testing that can be performed. Thus, prioritizing input data for testing DNNs in meaningful ways to reduce the cost of labeling can go a long way in increasing testing efficacy. This paper proposes using gauges of the DNN's sentiment derived from the computation performed by the model, as a means to identify inputs that are likely to reveal weaknesses. We empirically assessed the efficacy of three such sentiment measures for prioritization---confidence, uncertainty, and surprise---and compare their effectiveness in terms of their fault-revealing capability and retraining effectiveness. The results indicate that sentiment measures can effectively flag inputs that expose unacceptable DNN behavior. For MNIST models, the average percentage of inputs correctly flagged ranged from 88% to 94.8%

    Visualizing test diversity to support test optimisation

    Full text link
    Diversity has been used as an effective criteria to optimise test suites for cost-effective testing. Particularly, diversity-based (alternatively referred to as similarity-based) techniques have the benefit of being generic and applicable across different Systems Under Test (SUT), and have been used to automatically select or prioritise large sets of test cases. However, it is a challenge to feedback diversity information to developers and testers since results are typically many-dimensional. Furthermore, the generality of diversity-based approaches makes it harder to choose when and where to apply them. In this paper we address these challenges by investigating: i) what are the trade-off in using different sources of diversity (e.g., diversity of test requirements or test scripts) to optimise large test suites, and ii) how visualisation of test diversity data can assist testers for test optimisation and improvement. We perform a case study on three industrial projects and present quantitative results on the fault detection capabilities and redundancy levels of different sets of test cases. Our key result is that test similarity maps, based on pair-wise diversity calculations, helped industrial practitioners identify issues with their test repositories and decide on actions to improve. We conclude that the visualisation of diversity information can assist testers in their maintenance and optimisation activities

    CGIAR Excellence in Breeding Platform - Plan of Work and Budget 2020

    Get PDF
    At the end of 2019, all CGIAR centers had submitted improvement plans based on an EiB template and in close collaboration with EiB staff while – in a parallel process with breeding programs, funders and private sector representatives – a vision for breeding program modernization was developed and presented to CGIAR breeding leadership at the EiB Annual Meeting. This vision represents an evolution of EiB in the context of the Crops to End Hunger Initiative (CtEH) beyond the initial scope of providing tools, services and expert advice, and serves as a guide for Center leadership to drive changes with EiB support. In addition, EiB has taken the role of managing and disbursing funding, made available by Funders via CtEH to modernize breeding and enable CGIAR breeding programs to implement the vision provided by EiB

    The Progress, Challenges, and Perspectives of Directed Greybox Fuzzing

    Full text link
    Most greybox fuzzing tools are coverage-guided as code coverage is strongly correlated with bug coverage. However, since most covered codes may not contain bugs, blindly extending code coverage is less efficient, especially for corner cases. Unlike coverage-guided greybox fuzzers who extend code coverage in an undirected manner, a directed greybox fuzzer spends most of its time allocation on reaching specific targets (e.g., the bug-prone zone) without wasting resources stressing unrelated parts. Thus, directed greybox fuzzing (DGF) is particularly suitable for scenarios such as patch testing, bug reproduction, and specialist bug hunting. This paper studies DGF from a broader view, which takes into account not only the location-directed type that targets specific code parts, but also the behaviour-directed type that aims to expose abnormal program behaviours. Herein, the first in-depth study of DGF is made based on the investigation of 32 state-of-the-art fuzzers (78% were published after 2019) that are closely related to DGF. A thorough assessment of the collected tools is conducted so as to systemise recent progress in this field. Finally, it summarises the challenges and provides perspectives for future research.Comment: 16 pages, 4 figure

    A new test framework for communications-critical large scale systems

    Get PDF
    None of today’s large scale systems could function without the reliable availability of a varied range of network communications capabilities. Whilst software, hardware and communications technologies have been advancing throughout the past two decades, the methods commonly used by industry for testing large scale systems which incorporate critical communications interfaces have not kept pace. This paper argues for the need for a specifically tailored framework to achieve effective and precise testing of communications-critical large scale systems (CCLSSs). The paper briefly discusses how generic test approaches are leading to inefficient and costly test activities in industry. The paper then outlines the features of an alternative CCLSS domain-specific test framework, and then provides an example based on a real case study. The paper concludes with an evaluation of the benefits observed during the case study and an outline of the available evidence that such benefits can be realized with other comparable systems

    Sustainable Strategic Urban Planning: Methodology for Urban Renovation At District Level

    Get PDF
    Sustainable urban renovation is characterized by multiple factors (e.g. technical, socio-economic, environmental and ethical perspectives), different spatial scales and a number of administrative structures that should address the evaluation of alternative scenarios or solutions. This defines a complex decision problem that includes different stakeholders where several aspects need to be considered simultaneously. In spite of the knowledge and experiences during the recent years, there is a need of methods that lead the decision-making processes. In response, a methodology based on the global idea and implications of working towards a more sustainable and energy efficient cities as a holistic procedure for urban renovation at district level is proposed in the European Smart City project CITyFiED. The methodology has the energy efficiency as main pillar and the local authorities as client. It is composed of seven phases that ensures an effective dialogue among all the stakeholders, aiming to understand the objectives and needs of the city to define a set of Strategies for Sustainable Urban Renovation and their integration within the Strategic Urban Planning of the cities.This project has received funding from the European Union’s Seventh Programme for research, technological development and demonstration under grant agreement N° 609129. The authors would like to thank the rest of the partners of the CITyFiED project for their help and support

    Technical Debt Prioritization: State of the Art. A Systematic Literature Review

    Get PDF
    Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand if and when refactoring Technical Debt should be prioritized with respect to developing features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering to understand what Technical Debt prioritization approaches have been proposed in research and industry. Method. We conducted a Systematic Literature Review among 384 unique papers published until 2018, following a consolidated methodology applied in Software Engineering. We included 38 primary studies. Results. Different approaches have been proposed for Technical Debt prioritization, all having different goals and optimizing on different criteria. The proposed measures capture only a small part of the plethora of factors used to prioritize Technical Debt qualitatively in practice. We report an impact map of such factors. However, there is a lack of empirical and validated set of tools. Conclusion. We observed that technical Debt prioritization research is preliminary and there is no consensus on what are the important factors and how to measure them. Consequently, we cannot consider current research conclusive and in this paper, we outline different directions for necessary future investigations

    Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease

    Get PDF
    Context.-With the decrease in the cost of sequencing, the clinical testing paradigm has shifted from single gene to gene panel and now whole-exome and whole-genome sequencing. Clinical laboratories are rapidly implementing next-generation sequencing-based whole-exome and whole-genome sequencing. Because a large number of targets are covered by whole-exome and whole-genome sequencing, it is critical that a laboratory perform appropriate validation studies, develop a quality assurance and quality control program, and participate in proficiency testing. Objective.-To provide recommendations for wholeexome and whole-genome sequencing assay design, validation, and implementation for the detection of germline variants associated in inherited disorders. Data Sources.-An example of trio sequencing, filtration and annotation of variants, and phenotypic consideration to arrive at clinical diagnosis is discussed. Conclusions.-It is critical that clinical laboratories planning to implement whole-exome and whole-genome sequencing design and validate the assay to specifications and ensure adequate performance prior to implementation. Test design specifications, including variant filtering and annotation, phenotypic consideration, guidance on consenting options, and reporting of incidental findings, are provided. These are important steps a laboratory must take to validate and implement whole-exome and whole-genome sequencing in a clinical setting for germline variants in inherited disorders
    • …
    corecore