Formal Analysis and Redesign of a Neural Network-Based Aircraft Taxiing System with VerifAI
We demonstrate a unified approach to rigorous design of safety-critical
autonomous systems using the VerifAI toolkit for formal analysis of AI-based
systems. VerifAI provides an integrated toolchain for tasks spanning the design
process, including modeling, falsification, debugging, and ML component
retraining. We evaluate all of these applications in an industrial case study
on an experimental autonomous aircraft taxiing system developed by Boeing,
which uses a neural network to track the centerline of a runway. We define
runway scenarios using the Scenic probabilistic programming language, and use
them to drive tests in the X-Plane flight simulator. We first perform
falsification, automatically finding environment conditions causing the system
to violate its specification by deviating significantly from the centerline (or
even leaving the runway entirely). Next, we use counterexample analysis to
identify distinct failure cases, and confirm their root causes with specialized
testing. Finally, we use the results of falsification and debugging to retrain
the network, eliminating several failure cases and improving the overall
performance of the closed-loop system.
Comment: Full version of a CAV 2020 paper.
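The falsification step described in this abstract can be illustrated with a minimal sketch: sample environment conditions, run the closed-loop system, and keep every sample whose centerline deviation exceeds a bound. Everything here is an assumption made for illustration. `toy_max_deviation` is a hypothetical stand-in for a simulator run, the parameter names and the 1.5 m threshold are invented, and none of this uses the Scenic, VerifAI, or X-Plane APIs.

```python
import random


def toy_max_deviation(cloud_cover: float, start_offset: float) -> float:
    """Hypothetical stand-in for one closed-loop simulation.

    Returns a made-up maximum centerline deviation in metres.
    """
    return abs(start_offset) * (1.0 + 2.0 * cloud_cover)


def falsify(simulate, n_samples: int, threshold: float, seed: int = 0):
    """Randomly sample environment parameters and collect counterexamples.

    A counterexample is any sample whose deviation exceeds `threshold`.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counterexamples = []
    for _ in range(n_samples):
        params = {
            "cloud_cover": rng.uniform(0.0, 1.0),    # assumed range
            "start_offset": rng.uniform(-2.0, 2.0),  # assumed range, metres
        }
        deviation = simulate(**params)
        if deviation > threshold:
            counterexamples.append((params, deviation))
    return counterexamples


if __name__ == "__main__":
    found = falsify(toy_max_deviation, n_samples=200, threshold=1.5, seed=1)
    print(f"{len(found)} counterexamples out of 200 samples")
```

Uniform random sampling is the simplest possible search strategy; the collected counterexamples would then be the input to the clustering and root-cause analysis that the abstract describes.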
Parallel and Multi-Objective Falsification with Scenic and VerifAI
Falsification has emerged as an important tool for simulation-based
verification of autonomous systems. In this paper, we present extensions to the
Scenic scenario specification language and VerifAI toolkit that improve the
scalability of sampling-based falsification methods by using parallelism and
extend falsification to multi-objective specifications. We first present a
parallelized framework that is interfaced with both the simulation and sampling
capabilities of Scenic and the falsification capabilities of VerifAI, reducing
the execution time bottleneck inherently present in simulation-based testing.
We then present an extension of VerifAI's falsification algorithms to support
multi-objective optimization during sampling, using the concept of rulebooks to
specify a preference ordering over multiple metrics that can be used to guide
the counterexample search process. Lastly, we evaluate the benefits of these
extensions with a comprehensive set of benchmarks written in the Scenic
language.
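The idea of a preference ordering over multiple metrics can be sketched in a simplified form. The sketch below assumes the metrics are totally ordered by priority and compares two simulation outcomes lexicographically; a rulebook in general allows a partial order over rules, which this sketch does not capture. The function name, the convention that lower values mean a worse violation, and the example numbers are all assumptions, not VerifAI's API.

```python
from typing import Sequence


def more_severe(a: Sequence[float], b: Sequence[float], tol: float = 0.0) -> bool:
    """Return True if outcome `a` is a worse violation than outcome `b`.

    Each sequence holds one score per metric, highest priority first.
    Lower scores are assumed to mean "closer to (or deeper in) violation".
    Lower-priority metrics only matter when all higher-priority ones tie
    within `tol`.
    """
    for score_a, score_b in zip(a, b):
        if score_a < score_b - tol:
            return True
        if score_a > score_b + tol:
            return False
    return False  # equal on every metric: neither is more severe


if __name__ == "__main__":
    # Hypothetical metrics: (collision margin, lane-keeping margin)
    print(more_severe([-1.0, 5.0], [0.5, -3.0]))  # True: top-priority metric decides
    print(more_severe([0.5, -3.0], [0.5, 2.0]))   # True: tie broken by second metric
```

A sampler guided by such an ordering would steer the search toward inputs that violate the highest-priority metric first, instead of collapsing all metrics into a single weighted sum.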
Formal Scenario-Based Testing of Autonomous Vehicles: From Simulation to the Real World
We present a new approach to automated scenario-based testing of the safety
of autonomous vehicles, especially those using advanced artificial
intelligence-based components, spanning both simulation-based evaluation as
well as testing in the real world. Our approach is based on formal methods,
combining formal specification of scenarios and safety properties, algorithmic
test case generation using formal simulation, test case selection for track
testing, executing test cases on the track, and analyzing the resulting data.
Experiments with a real autonomous vehicle at an industrial testing facility
support our hypotheses that (i) formal simulation can be effective at
identifying test cases to run on the track, and (ii) the gap between simulated
and real worlds can be systematically evaluated and bridged.
Comment: 9 pages, 6 figures. Full version of an ITSC 2020 paper.
3D Environment Modeling for Falsification and Beyond with Scenic 3.0
We present a major new version of Scenic, a probabilistic programming
language for writing formal models of the environments of cyber-physical
systems. Scenic has been successfully used for the design and analysis of CPS
in a variety of domains, but earlier versions are limited to environments which
are essentially two-dimensional. In this paper, we extend Scenic with native
support for 3D geometry, introducing new syntax which provides expressive ways
to describe 3D configurations while preserving the simplicity and readability
of the language. We replace Scenic's simplistic representation of objects as
boxes with precise modeling of complex shapes, including a ray tracing-based
visibility system that accounts for object occlusion. We also extend the
language to support arbitrary temporal requirements expressed in LTL, and build
an extensible Scenic parser generated from a formal grammar of the language.
Finally, we illustrate the new application domains these features enable with
case studies that would have been impossible to accurately model in Scenic 2.
Comment: 13 pages, 6 figures. Full version of a CAV 2023 tool paper, to appear
in the Springer Lecture Notes in Computer Science series.
Refining Perception Contracts: Case Studies in Vision-based Safe Auto-landing
Perception contracts provide a method for evaluating safety of control
systems that use machine learning for perception. A perception contract is a
specification for testing the ML components, and it gives a method for proving
end-to-end system-level safety requirements. The feasibility of contract-based
testing and assurance was established earlier in the context of straight lane
keeping: a 3-dimensional system with relatively simple dynamics. This paper
presents the analysis of two flight control systems, 6- and 12-dimensional, that
use multi-stage, heterogeneous, ML-enabled perception. The paper advances
methodology by introducing an algorithm for constructing data and requirement
guided refinement of perception contracts (DaRePC). The resulting analysis
provides testable contracts which establish the state and environment
conditions under which an aircraft can safely touch down on the runway and a
drone can safely pass through a sequence of gates. It can also discover
conditions (e.g., low-horizon sun) that can possibly violate the safety of the
vision-based control system.
Towards quantum enhanced adversarial robustness in machine learning
Machine learning algorithms are powerful tools for data driven tasks such as
image classification and feature detection; however, their vulnerability to
adversarial examples - input samples manipulated to fool the algorithm -
remains a serious challenge. The integration of machine learning with quantum
computing has the potential to yield tools offering not only better accuracy
and computational efficiency, but also superior robustness against adversarial
attacks. Indeed, recent work has employed quantum mechanical phenomena to
defend against adversarial attacks, spurring the rapid development of the field
of quantum adversarial machine learning (QAML) and potentially yielding a new
source of quantum advantage. Despite promising early results, there remain
challenges towards building robust real-world QAML tools. In this review we
discuss recent progress in QAML and identify key challenges. We also suggest
future research directions which could determine the route to practicality for
QAML approaches as quantum computing hardware scales up and noise levels are
reduced.
Comment: 10 Pages, 4 Figures.
Deep ConvNet: Non-random weight initialization for repeatable determinism, examined with FSGM
This paper presents a repeatable and deterministic non-random weight initialization method for the convolutional layers of neural networks, examined with the Fast Gradient Sign Method (FGSM). FGSM is used as a technique to measure the effect of the initialization under controlled distortions in transfer learning, varying the numerical similarity of the datasets. The focus is on convolutional layers, with earlier learning induced through the use of striped forms for image classification. The method provided higher accuracy in the first epoch, with improvements of 3–5% in a well-known benchmark model and of ~10% on a color image dataset (MTARSI2) using a dissimilar model architecture. The proposed method is robust to limit optimization approaches like Glorot/Xavier and He initialization. Arguably the approach belongs to a new category of weight initialization methods, as a number sequence substitution of random numbers, without a tether to the dataset. When examined under FGSM with transfer learning, the proposed method, when used with higher distortions (numerically dissimilar datasets), is less compromised against the original cross-validation dataset, at ~31% accuracy instead of ~9%. This indicates higher retention of the original fitting in transfer learning.
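For readers unfamiliar with the Fast Gradient Sign Method, the standard perturbation is x_adv = x + eps * sign(grad_x L(x, y)). The sketch below applies it to a tiny logistic-regression model in pure Python, where the input gradient of the cross-entropy loss has the closed form (p - y) * w. The model, weights, and eps value are assumptions chosen for illustration; this is not the paper's convolutional setup or its transfer-learning protocol.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b) -> float:
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm_perturb(x, y, w, b, eps):
    """Return x + eps * sign(dL/dx) for the cross-entropy loss L.

    For logistic regression, dL/dx = (p - y) * w, with p = predict(x, w, b).
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


if __name__ == "__main__":
    w, b = [2.0, -1.0], 0.0  # hypothetical trained weights
    x, y = [1.0, -2.0], 1    # an input with true label 1
    x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
    print(predict(x, w, b), predict(x_adv, w, b))  # confidence in class 1 drops
```

The step size eps bounds the per-feature change, which is what makes FGSM usable as a controlled-distortion probe.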
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review
Context: Machine Learning (ML) has been at the heart of many innovations over
the past years. However, including it in so-called 'safety-critical' systems
such as automotive or aeronautic has proven to be very challenging, since the
shift in paradigm that ML brings completely changes traditional certification
approaches.
Objective: This paper aims to elucidate challenges related to the
certification of ML-based safety-critical systems, as well as the solutions
that are proposed in the literature to tackle them, answering the question 'How
to Certify Machine Learning Based Safety-critical Systems?'.
Method: We conduct a Systematic Literature Review (SLR) of research papers
published between 2015 and 2020, covering topics related to the certification of
ML systems. In total, we identified 217 papers covering topics considered to be
the main pillars of ML certification: Robustness, Uncertainty, Explainability,
Verification, Safe Reinforcement Learning, and Direct Certification. We
analyzed the main trends and problems of each sub-field and provided summaries
of the papers extracted.
Results: The SLR results highlighted the enthusiasm of the community for this
subject, as well as the lack of diversity in terms of datasets and type of
models. It also emphasized the need to further develop connections between
academia and industries to deepen the domain study. Finally, it also
illustrated the necessity to build connections between the above-mentioned main
pillars, which are for now mainly studied separately.
Conclusion: We highlighted current efforts deployed to enable the
certification of ML-based software systems, and discussed some future research
directions.
Comment: 60 pages (92 pages with references and complements), submitted to a
journal (Automated Software Engineering). Changes: Emphasizing difference
traditional software engineering / ML approach. Adding Related Works, Threats
to Validity and Complementary Materials. Adding a table listing papers
reference for each section/subsection
Artificial intelligence methods for security and cyber security systems
This research is in threat analysis and countermeasures employing Artificial Intelligence (AI) methods within the civilian domain, where safety and mission-critical aspects are essential. AI has challenges of repeatable determinism and decision explanation. This research proposed methods for dense and convolutional networks that provided repeatable determinism. In dense networks, the proposed alternative method had equal performance with more structured learnt weights. The proposed method also had earlier learning and higher accuracy in convolutional networks. When demonstrated in colour image classification, the accuracy in the first epoch improved to 67%, from 29% in the existing scheme. When examined in transfer learning, with the Fast Gradient Sign Method (FGSM) used as an analytical method to control distortion of dissimilarity, the proposed method showed greater retention of the learnt model, with 31% accuracy instead of 9%. The research also proposed a threat analysis method with set-mappings and first-principle analytical steps applied to a Symbolic AI method using an algebraic expert system with virtualized neurons. The neural expert system method demonstrated the infilling of parameters by calculating beamwidths with variations in the uncertainty of the antenna type. When combined with a proposed formula extraction method, it provides the potential for machine learning of new rules as a Neuro-Symbolic AI method. The proposed method uses extra weights allocated to neuron input value ranges as activation strengths. The method simplifies the learnt representation, reducing model depth, and thus has less significant dropout potential. Finally, an image classification method for emitter identification is proposed with a synthetic dataset generation method; it accurately distinguishes between fourteen radar emission modes with high ambiguity between them, achieving 99.8% accuracy.
That method would be a mechanism to recognize non-threat civil radars, aimed at raising a threat alert when deviations from those civilian emitters are detected.