9 research outputs found

    Towards a Measure of Trustworthiness to Evaluate CNNs During Operation

    Full text link
    Due to black box nature of Convolutional neural networks (CNNs), the continuous validation of CNN classifiers' during operation is infeasible. As a result this makes it difficult for developers or regulators to gain confidence in the deployment of autonomous systems employing CNNs. We introduce the trustworthiness in classification score (TCS), a metric to assist with overcoming this challenge. The metric quantifies the trustworthiness in a prediction by checking for the existence of certain features in the predictions made by the CNN. A case study on persons detection is used to to demonstrate our method and the usage of TCS

    An Agency-Directed Approach to Test Generation for Simulation-based Autonomous Vehicle Verification

    Get PDF
    Simulation-based verification is beneficial for assessing otherwise dangerous or costly on-road testing of autonomous vehicles (AV). This paper addresses the challenge of efficiently generating effective tests for simulation-based AV verification using software testing agents. The multi-agent system (MAS) programming paradigm offers rational agency, causality and strategic planning between multiple agents. We exploit these aspects for test generation, focusing in particular on the generation of tests that trigger the precondition of an assertion. On the example of a key assertion we show that, by encoding a variety of different behaviours respondent to the agent's perceptions of the test environment, the agency-directed approach generates twice as many effective tests than pseudo-random test generation, while being both efficient and robust. Moreover, agents can be encoded to behave naturally without compromising the effectiveness of test generation. Our results suggest that generating tests using agency-directed testing significantly improves upon random and simultaneously provides more realistic driving scenarios.Comment: 18 pages, 8 figure

    DIRA: Dynamic Domain Incremental Regularised Adaptation

    Full text link
    Autonomous systems (AS) often use Deep Neural Network (DNN) classifiers to allow them to operate in complex, high-dimensional, non-linear, and dynamically changing environments. Due to the complexity of these environments, DNN classifiers may output misclassifications during operation when they face domains not identified during development. Removing a system from operation for retraining becomes impractical as the number of such AS increases. To increase AS reliability and overcome this limitation, DNN classifiers need to have the ability to adapt during operation when faced with different operational domains using a few samples (e.g. 100 samples). However, retraining DNNs on a few samples is known to cause catastrophic forgetting. In this paper, we introduce Dynamic Incremental Regularised Adaptation (DIRA), a framework for operational domain adaption of DNN classifiers using regularisation techniques to overcome catastrophic forgetting and achieve adaptation when retraining using a few samples of the target domain. Our approach shows improvements on different image classification benchmarks aimed at evaluating robustness to distribution shifts (e.g.CIFAR-10C/100C, ImageNet-C), and produces state-of-the-art performance in comparison with other frameworks from the literature

    An Agency-Directed Approach to Test Generation for Simulation-based Autonomous Vehicle Verification

    Full text link
    Simulation-based verification is beneficial for assessing otherwise dangerous or costly on-road testing of autonomous vehicles (AV). This paper addresses the challenge of efficiently generating effective tests for simulation-based AV verification using software testing agents. The multi-agent system (MAS) programming paradigm offers rational agency, causality and strategic planning between multiple agents. We exploit these aspects for test generation, focusing in particular on the generation of tests that trigger the precondition of an assertion. On the example of a key assertion we show that, by encoding a variety of different behaviours respondent to the agent's perceptions of the test environment, the agency-directed approach generates twice as many effective tests than pseudo-random test generation, while being both efficient and robust. Moreover, agents can be encoded to behave naturally without compromising the effectiveness of test generation. Our results suggest that generating tests using agency-directed testing significantly improves upon random and simultaneously provides more realistic driving scenarios.Comment: 18 pages, 8 figure

    On Determinism of Game Engines used for Simulation-based Autonomous Vehicle Verification

    Full text link
    Game engines are increasingly used as simulation platforms by the autonomous vehicle (AV) community to develop vehicle control systems and test environments. A key requirement for simulation-based development and verification is determinism, since a deterministic process will always produce the same output given the same initial conditions and event history. Thus, in a deterministic simulation environment, tests are rendered repeatable and yield simulation results that are trustworthy and straightforward to debug. However, game engines are seldom deterministic. This paper reviews and identifies the potential causes of non-deterministic behaviours in game engines. A case study using CARLA, an open-source autonomous driving simulation environment powered by Unreal Engine, is presented to highlight its inherent shortcomings in providing sufficient precision in experimental results. Different configurations and utilisations of the software and hardware are explored to determine an operational domain where the simulation precision is sufficiently low i.e.\ variance between repeated executions becomes negligible for development and testing work. Finally, a method of a general nature is proposed, that can be used to find the domains of permissible variance in game engine simulations for any given system configuration.Comment: 17 pages, 9 figures, 1 tabl

    Evaluation Metrics for DNNs Compression

    Full text link
    There is a lot of research effort into developing different techniques for neural networks compression. However, the community lacks standardised evaluation metrics, which are key to identifying the most suitable compression technique for different applications. This paper reviews existing neural network compression evaluation metrics and implements them into a standardisation framework called NetZIP. We introduce two novel metrics to cover existing gaps of evaluation in the literature: 1) Compression and Hardware Agnostic Theoretical Speed (CHATS) and 2) Overall Compression Success (OCS). We demonstrate the use of NetZIP using two case studies focusing on object classification and object detection

    On Determinism of Game Engines used for Simulation-based Autonomous Vehicle Verification

    Get PDF
    Game engines are increasingly used as simulation platforms by the autonomous vehicle (AV) community to develop vehicle control systems and test environments. A key requirement for simulation-based development and verification is determinism, since a deterministic process will always produce the same output given the same initial conditions and event history. Thus, in a deterministic simulation environment, tests are rendered repeatable and yield simulation results that are trustworthy and straightforward to debug. However, game engines are seldom deterministic. This paper reviews and identifies the potential causes of non-deterministic behaviours in game engines. A case study using CARLA, an open-source autonomous driving simulation environment powered by Unreal Engine, is presented to highlight its inherent shortcomings in providing sufficient precision in experimental results. Different configurations and utilisations of the software and hardware are explored to determine an operational domain where the simulation precision is sufficiently low i.e.\ variance between repeated executions becomes negligible for development and testing work. Finally, a method of a general nature is proposed, that can be used to find the domains of permissible variance in game engine simulations for any given system configuration.Comment: 16 pages, 9 figures, 1 tabl
    corecore