9 research outputs found
Towards a Measure of Trustworthiness to Evaluate CNNs During Operation
Due to the black-box nature of convolutional neural networks (CNNs), the
continuous validation of CNN classifiers during operation is infeasible. As a
result, it is difficult for developers or regulators to gain confidence
in the deployment of autonomous systems employing CNNs. We introduce the
trustworthiness in classification score (TCS), a metric to assist with
overcoming this challenge. The metric quantifies the trustworthiness of a
prediction by checking for the existence of certain features in the predictions
made by the CNN. A case study on person detection is used to demonstrate
our method and the usage of TCS.
An Agency-Directed Approach to Test Generation for Simulation-based Autonomous Vehicle Verification
Simulation-based verification is beneficial for assessing otherwise dangerous
or costly on-road testing of autonomous vehicles (AV). This paper addresses the
challenge of efficiently generating effective tests for simulation-based AV
verification using software testing agents. The multi-agent system (MAS)
programming paradigm offers rational agency, causality and strategic planning
between multiple agents. We exploit these aspects for test generation, focusing
in particular on the generation of tests that trigger the precondition of an
assertion. Using a key assertion as an example, we show that, by encoding a
variety of different behaviours responsive to the agent's perceptions of the
test environment, the agency-directed approach generates twice as many
effective tests as pseudo-random test generation, while being both efficient
and robust. Moreover, agents can be encoded to behave naturally without
compromising the effectiveness of test generation. Our results suggest that
generating tests using agency-directed testing significantly improves upon
random test generation and simultaneously provides more realistic driving
scenarios.
Comment: 18 pages, 8 figures
DIRA: Dynamic Domain Incremental Regularised Adaptation
Autonomous systems (AS) often use Deep Neural Network (DNN) classifiers to
allow them to operate in complex, high-dimensional, non-linear, and dynamically
changing environments. Due to the complexity of these environments, DNN
classifiers may output misclassifications during operation when they face
domains not identified during development. Removing a system from operation for
retraining becomes impractical as the number of such AS increases. To increase
AS reliability and overcome this limitation, DNN classifiers need to have the
ability to adapt during operation when faced with different operational domains
using a few samples (e.g. 100 samples). However, retraining DNNs on a few
samples is known to cause catastrophic forgetting. In this paper, we introduce
Dynamic Incremental Regularised Adaptation (DIRA), a framework for operational
domain adaptation of DNN classifiers using regularisation techniques to overcome
catastrophic forgetting and achieve adaptation when retraining using a few
samples of the target domain. Our approach shows improvements on different
image classification benchmarks aimed at evaluating robustness to distribution
shifts (e.g. CIFAR-10C/100C, ImageNet-C), and produces state-of-the-art
performance in comparison with other frameworks from the literature.
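The idea of regularised few-sample adaptation can be illustrated with a toy sketch. The abstract does not say which regulariser DIRA uses, so the code below assumes a plain quadratic penalty that pulls the adapted weight back towards its pre-adaptation value; the function name `adapt_with_anchor`, the one-parameter model, and all numeric values are hypothetical and chosen only for illustration.

```python
def adapt_with_anchor(w0, xs, ys, lam, lr=0.05, steps=500):
    """Fit y = w * x on a few target samples while penalising drift from w0.

    Assumption: the loss is mean squared error plus lam * (w - w0)**2.
    This is a generic anchor penalty, not necessarily the one used by DIRA.
    """
    n = len(xs)
    w = w0
    for _ in range(steps):
        # Gradient of the mean squared error on the few target samples.
        grad_fit = sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / n
        # Gradient of the anchor penalty lam * (w - w0)**2.
        grad_anchor = 2.0 * lam * (w - w0)
        w -= lr * (grad_fit + grad_anchor)
    return w


# Hypothetical data: the source-domain weight is 1.0, the target domain has slope 2.0.
w_source = 1.0
target_xs = [1.0, 2.0, 3.0]
target_ys = [2.0, 4.0, 6.0]

w_free = adapt_with_anchor(w_source, target_xs, target_ys, lam=0.0)
w_anchored = adapt_with_anchor(w_source, target_xs, target_ys, lam=10.0)
```

With `lam=0.0` the weight moves fully to the target slope, which is the situation in which the source-domain solution is forgotten; a larger `lam` keeps the adapted weight closer to `w_source` at the cost of a worse fit on the target samples.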
On Determinism of Game Engines used for Simulation-based Autonomous Vehicle Verification
Game engines are increasingly used as simulation platforms by the autonomous
vehicle (AV) community to develop vehicle control systems and test
environments. A key requirement for simulation-based development and
verification is determinism, since a deterministic process will always produce
the same output given the same initial conditions and event history. Thus, in a
deterministic simulation environment, tests are rendered repeatable and yield
simulation results that are trustworthy and straightforward to debug. However,
game engines are seldom deterministic. This paper reviews and identifies the
potential causes of non-deterministic behaviours in game engines. A case study
using CARLA, an open-source autonomous driving simulation environment powered
by Unreal Engine, is presented to highlight its inherent shortcomings in
providing sufficient precision in experimental results. Different
configurations and utilisations of the software and hardware are explored to
determine an operational domain where the simulation precision is sufficiently
low, i.e. variance between repeated executions becomes negligible for
development and testing work. Finally, a general method is
proposed that can be used to find the domains of permissible variance in game
engine simulations for any given system configuration.
Comment: 17 pages, 9 figures, 1 table
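The notion of variance between repeated executions can be made concrete with a small sketch. It assumes each run is logged as an equal-length list of scalar samples (for example, a vehicle position per simulation step); the function names and the tolerance value are hypothetical and are not taken from the paper or from CARLA.

```python
from statistics import pvariance


def max_variance_across_runs(runs):
    """Return the largest per-step variance across repeated executions.

    runs: list of trajectories, one per repeated execution, each a list of
    floats of the same length (assumed logging format).
    """
    if len(runs) < 2:
        raise ValueError("need at least two repeated executions")
    length = len(runs[0])
    if any(len(run) != length for run in runs):
        raise ValueError("all runs must have the same number of steps")
    # zip(*runs) groups the samples recorded at the same step across runs.
    return max(pvariance(step_samples) for step_samples in zip(*runs))


def is_within_tolerance(runs, tolerance):
    """True if the variance between runs never exceeds the given tolerance."""
    return max_variance_across_runs(runs) <= tolerance


# Hypothetical logs: two identical runs, and two runs that diverge at step 2.
identical_runs = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
diverging_runs = [[0.0, 1.0], [0.0, 3.0]]
```

Repeating such a check over different hardware and software configurations is one way to map out where the variance stays below a chosen tolerance.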
Evaluation Metrics for DNNs Compression
Considerable research effort goes into developing different techniques for
neural network compression. However, the community lacks standardised
evaluation metrics, which are key to identifying the most suitable compression
technique for different applications. This paper reviews existing neural
network compression evaluation metrics and implements them into a
standardisation framework called NetZIP. We introduce two novel metrics to
cover existing gaps of evaluation in the literature: 1) Compression and
Hardware Agnostic Theoretical Speed (CHATS) and 2) Overall Compression Success
(OCS). We demonstrate the use of NetZIP using two case studies focusing on
object classification and object detection.
On Determinism of Game Engines used for Simulation-based Autonomous Vehicle Verification
Game engines are increasingly used as simulation platforms by the autonomous
vehicle (AV) community to develop vehicle control systems and test
environments. A key requirement for simulation-based development and
verification is determinism, since a deterministic process will always produce
the same output given the same initial conditions and event history. Thus, in a
deterministic simulation environment, tests are rendered repeatable and yield
simulation results that are trustworthy and straightforward to debug. However,
game engines are seldom deterministic. This paper reviews and identifies the
potential causes of non-deterministic behaviours in game engines. A case study
using CARLA, an open-source autonomous driving simulation environment powered
by Unreal Engine, is presented to highlight its inherent shortcomings in
providing sufficient precision in experimental results. Different
configurations and utilisations of the software and hardware are explored to
determine an operational domain where the simulation precision is sufficiently
low, i.e. variance between repeated executions becomes negligible for
development and testing work. Finally, a general method is
proposed that can be used to find the domains of permissible variance in game
engine simulations for any given system configuration.
Comment: 16 pages, 9 figures, 1 table