Input Prioritization for Testing Neural Networks
Deep neural networks (DNNs) are increasingly being adopted for sensing and
control functions in a variety of safety and mission-critical systems such as
self-driving cars, autonomous air vehicles, medical diagnostics, and industrial
robotics. Failures of such systems can lead to loss of life or property, which
necessitates stringent verification and validation for providing high
assurance. Though formal verification approaches are being investigated,
testing remains the primary technique for assessing the dependability of such
systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining
test oracle data---the expected output, a.k.a. label, for a given input---is
high, which significantly impacts the amount and quality of testing that can be
performed. Thus, prioritizing input data for testing DNNs in meaningful ways to
reduce the cost of labeling can go a long way in increasing testing efficacy.
This paper proposes using gauges of the DNN's sentiment derived from the
computation performed by the model, as a means to identify inputs that are
likely to reveal weaknesses. We empirically assessed the efficacy of three such
sentiment measures for prioritization---confidence, uncertainty, and
surprise---and compared their effectiveness in terms of their fault-revealing
capability and retraining effectiveness. The results indicate that sentiment
measures can effectively flag inputs that expose unacceptable DNN behavior. For
MNIST models, the average percentage of inputs correctly flagged ranged from
88% to 94.8%.
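The prioritization idea in the abstract above can be illustrated with a minimal sketch. It assumes the model exposes raw logits per input and ranks inputs so that the least confident predictions are labeled first. The function names (`softmax`, `entropy`, `prioritize`) are hypothetical, and predictive entropy is used here only as a simple stand-in for an uncertainty score; it is not claimed to be the measure used in the paper, and the surprise measure is not covered.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy of a probability vector; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def prioritize(logit_rows, score="confidence"):
    """Return input indices ordered from most to least suspicious.

    score="confidence": lowest top-class probability first.
    score="entropy":    highest predictive entropy first (assumed stand-in
                        for an uncertainty measure).
    """
    probs = [softmax(row) for row in logit_rows]
    indices = range(len(probs))
    if score == "confidence":
        return sorted(indices, key=lambda i: max(probs[i]))
    if score == "entropy":
        return sorted(indices, key=lambda i: -entropy(probs[i]))
    raise ValueError(f"unknown score: {score}")

if __name__ == "__main__":
    # Hypothetical logits for three inputs of a three-class model.
    logits = [[5.0, 0.1, 0.1], [1.0, 0.9, 0.8], [3.0, 2.5, 0.0]]
    print(prioritize(logits, "confidence"))
    print(prioritize(logits, "entropy"))
```

Under this sketch, only the top-ranked inputs would be sent for labeling, which is how prioritization is meant to reduce oracle cost.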
Visualizing test diversity to support test optimisation
Diversity has been used as an effective criterion to optimise test suites for
cost-effective testing. Particularly, diversity-based (alternatively referred
to as similarity-based) techniques have the benefit of being generic and
applicable across different Systems Under Test (SUT), and have been used to
automatically select or prioritise large sets of test cases. However, it is a
challenge to feed back diversity information to developers and testers since
results are typically many-dimensional. Furthermore, the generality of
diversity-based approaches makes it harder to choose when and where to apply
them. In this paper we address these challenges by investigating: i) what are
the trade-offs in using different sources of diversity (e.g., diversity of test
requirements or test scripts) to optimise large test suites, and ii) how
visualisation of test diversity data can assist testers for test optimisation
and improvement. We perform a case study on three industrial projects and
present quantitative results on the fault detection capabilities and redundancy
levels of different sets of test cases. Our key result is that test similarity
maps, based on pair-wise diversity calculations, helped industrial
practitioners identify issues with their test repositories and decide on
actions to improve. We conclude that the visualisation of diversity information
can assist testers in their maintenance and optimisation activities.
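The pair-wise diversity calculation mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming each test case is represented as a set of tokens (for example, step names) and that Jaccard distance is the diversity metric; the abstract does not state which metric or representation was used, and the function names are hypothetical.

```python
def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B|; 0.0 means identical, 1.0 means disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def distance_matrix(tests):
    """Map every ordered pair of test names to its distance."""
    names = sorted(tests)
    return {(x, y): jaccard_distance(tests[x], tests[y])
            for x in names for y in names}

def most_redundant_pair(tests):
    """Return (distance, name1, name2) for the two most similar tests."""
    names = sorted(tests)
    pairs = [(jaccard_distance(tests[x], tests[y]), x, y)
             for i, x in enumerate(names) for y in names[i + 1:]]
    return min(pairs)

if __name__ == "__main__":
    # Hypothetical test cases described by the steps they exercise.
    tests = {
        "t1": ["login", "open", "save"],
        "t2": ["login", "open", "save", "close"],
        "t3": ["export", "print"],
    }
    print(most_redundant_pair(tests))
```

A matrix like this is the kind of input a similarity map could be drawn from: pairs with low distance are candidates for redundancy review.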
CGIAR Excellence in Breeding Platform - Plan of Work and Budget 2020
At the end of 2019, all CGIAR centers had submitted improvement plans based on an EiB template and in close collaboration with EiB staff. In a parallel process with breeding programs, funders and private sector representatives, a vision for breeding program modernization was developed and presented to CGIAR breeding leadership at the EiB Annual Meeting. This vision represents an evolution of EiB in the context of the Crops to End Hunger Initiative (CtEH) beyond the initial scope of providing tools, services and expert advice, and serves as a guide for Center leadership to drive changes with EiB support. In addition, EiB has taken on the role of managing and disbursing funding made available by Funders via CtEH to modernize breeding and enable CGIAR breeding programs to implement the vision provided by EiB.
The Progress, Challenges, and Perspectives of Directed Greybox Fuzzing
Most greybox fuzzing tools are coverage-guided as code coverage is strongly
correlated with bug coverage. However, since most covered code may not contain
bugs, blindly extending code coverage is less efficient, especially for corner
cases. Unlike coverage-guided greybox fuzzers, which extend code coverage in an
undirected manner, a directed greybox fuzzer spends most of its time budget
on reaching specific targets (e.g., the bug-prone zone) without wasting
resources stressing unrelated parts. Thus, directed greybox fuzzing (DGF) is
particularly suitable for scenarios such as patch testing, bug reproduction,
and specialist bug hunting. This paper studies DGF from a broader view, which
takes into account not only the location-directed type that targets specific
code parts, but also the behaviour-directed type that aims to expose abnormal
program behaviours. Herein, the first in-depth study of DGF is made based on
the investigation of 32 state-of-the-art fuzzers (78% were published after
2019) that are closely related to DGF. A thorough assessment of the collected
tools is conducted so as to systemise recent progress in this field. Finally,
it summarises the challenges and provides perspectives for future research.
Comment: 16 pages, 4 figures
A new test framework for communications-critical large scale systems
None of today’s large scale systems could function without the reliable availability of a varied range of network communications capabilities. Whilst software, hardware and communications technologies have been advancing throughout the past two decades, the methods commonly used by industry for testing large scale systems which incorporate critical communications interfaces have not kept pace. This paper argues for the need for a specifically tailored framework to achieve effective and precise testing of communications-critical large scale systems (CCLSSs). The paper briefly discusses how generic test approaches are leading to inefficient and costly test activities in industry. The paper then outlines the features of an alternative CCLSS domain-specific test framework and provides an example based on a real case study. The paper concludes with an evaluation of the benefits observed during the case study and an outline of the available evidence that such benefits can be realized with other comparable systems.
Sustainable Strategic Urban Planning: Methodology for Urban Renovation At District Level
Sustainable urban renovation is characterized by multiple factors (e.g. technical, socio-economic, environmental and ethical perspectives), different spatial scales and a number of administrative structures that should address the evaluation of alternative scenarios or solutions. This defines a complex decision problem that includes different stakeholders where several aspects need to be considered simultaneously. In spite of the knowledge and experience gained in recent years, there is a need for methods that guide decision-making processes. In response, the European Smart City project CITyFiED proposes a methodology, based on the global idea and implications of working towards more sustainable and energy-efficient cities, as a holistic procedure for urban renovation at district level. The methodology has energy efficiency as its main pillar and the local authorities as its client. It is composed of seven phases that ensure an effective dialogue among all the stakeholders, aiming to understand the objectives and needs of the city in order to define a set of Strategies for Sustainable Urban Renovation and their integration within the Strategic Urban Planning of the cities.
This project has received funding from the European Union’s Seventh Programme for research, technological development and demonstration under grant agreement N° 609129. The authors would like to thank the rest of the partners of the CITyFiED project for their help and support.
Technical Debt Prioritization: State of the Art. A Systematic Literature Review
Background. Software companies need to manage and refactor Technical Debt
issues. Therefore, it is necessary to understand if and when refactoring
Technical Debt should be prioritized with respect to developing features or
fixing bugs. Objective. The goal of this study is to investigate the existing
body of knowledge in software engineering to understand what Technical Debt
prioritization approaches have been proposed in research and industry. Method.
We conducted a Systematic Literature Review among 384 unique papers published
until 2018, following a consolidated methodology applied in Software
Engineering. We included 38 primary studies. Results. Different approaches have
been proposed for Technical Debt prioritization, all having different goals and
optimizing on different criteria. The proposed measures capture only a small
part of the plethora of factors used to prioritize Technical Debt qualitatively
in practice. We report an impact map of such factors. However, there is a lack
of an empirical and validated set of tools. Conclusion. We observed that Technical
Debt prioritization research is preliminary and there is no consensus on what
the important factors are and how to measure them. Consequently, we cannot
consider current research conclusive, and in this paper we outline different
directions for necessary future investigations.
Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease
Context. With the decrease in the cost of sequencing, the clinical testing paradigm has shifted from single gene to gene panel and now whole-exome and whole-genome sequencing. Clinical laboratories are rapidly implementing next-generation sequencing-based whole-exome and whole-genome sequencing. Because a large number of targets are covered by whole-exome and whole-genome sequencing, it is critical that a laboratory perform appropriate validation studies, develop a quality assurance and quality control program, and participate in proficiency testing. Objective. To provide recommendations for whole-exome and whole-genome sequencing assay design, validation, and implementation for the detection of germline variants associated with inherited disorders. Data Sources. An example of trio sequencing, filtration and annotation of variants, and phenotypic consideration to arrive at clinical diagnosis is discussed. Conclusions. It is critical that clinical laboratories planning to implement whole-exome and whole-genome sequencing design and validate the assay to specifications and ensure adequate performance prior to implementation. Test design specifications, including variant filtering and annotation, phenotypic consideration, guidance on consenting options, and reporting of incidental findings, are provided. These are important steps a laboratory must take to validate and implement whole-exome and whole-genome sequencing in a clinical setting for germline variants in inherited disorders.