7 research outputs found
Formal Verification of Input-Output Mappings of Tree Ensembles
Recent advances in machine learning and artificial intelligence are now being
considered in safety-critical autonomous systems where software defects may
cause severe harm to humans and the environment. Design organizations in these
domains are currently unable to provide convincing arguments that their systems
are safe to operate when machine learning algorithms are used to implement
their software.
In this paper, we present an efficient method to extract equivalence classes
from decision trees and tree ensembles, and to formally verify that their
input-output mappings comply with requirements. The idea is that, given that
safety requirements can be traced to desirable properties on system
input-output patterns, we can use positive verification outcomes in safety
arguments. This paper presents the implementation of the method in the tool
VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case
studies presented in current literature.
We demonstrate that our method is practical for tree ensembles trained on
low-dimensional data with up to 25 decision trees and tree depths of up to 20.
Our work also studies the limitations of the method with high-dimensional data
and preliminarily investigates the trade-off between large number of trees and
time taken for verification
Genetic Adversarial Training of Decision Trees
We put forward a novel learning methodology for ensembles of decision trees
based on a genetic algorithm which is able to train a decision tree for
maximizing both its accuracy and its robustness to adversarial perturbations.
This learning algorithm internally leverages a complete formal verification
technique for robustness properties of decision trees based on abstract
interpretation, a well known static program analysis technique. We implemented
this genetic adversarial training algorithm in a tool called Meta-Silvae (MS)
and we experimentally evaluated it on some reference datasets used in
adversarial training. The experimental results show that MS is able to train
robust models that compete with and often improve on the current
state-of-the-art of adversarial training of decision trees while being much
more compact and therefore interpretable and efficient tree models
Verifiable Learning for Robust Tree Ensembles
Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, hence bound to be intractable for specific inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates the training of such restricted model classes which are amenable for efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds, using standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting
Improving Machine Learning Pipeline Creation using Visual Programming and Static Analysis
Tese de mestrado, Engenharia Informática (Engenharia de Software), Universidade de Lisboa, Faculdade de Ciências, 2021ML pipelines are composed of several steps that load data, clean it, process it, apply learning algorithms and produce either reports or deploy inference systems into production. In real-world scenarios, pipelines can take days, weeks, or months to train with large quantities of data. Unfortunately, current tools to design and orchestrate ML pipelines are oblivious to the semantics of each step, allowing developers to easily introduce errors when connecting two components that might not work together, either syntactically or semantically. Data scientists and engineers often find these bugs during or after the lengthy execution, which decreases their productivity. We propose a Visual Programming Language (VPL) enriched with semantic constraints regarding the behavior of each component and a verification methodology that verifies entire pipelines to detect common ML bugs that existing visual and textual programming languages do not. We evaluate this methodology on a set of six bugs taken from a data science company focused on preventing financial fraud on big data. We were able detect these data engineering and data balancing bugs, as well as detect unnecessary computation in the pipelines
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review
Context: Machine Learning (ML) has been at the heart of many innovations over
the past years. However, including it in so-called 'safety-critical' systems
such as automotive or aeronautic has proven to be very challenging, since the
shift in paradigm that ML brings completely changes traditional certification
approaches.
Objective: This paper aims to elucidate challenges related to the
certification of ML-based safety-critical systems, as well as the solutions
that are proposed in the literature to tackle them, answering the question 'How
to Certify Machine Learning Based Safety-critical Systems?'.
Method: We conduct a Systematic Literature Review (SLR) of research papers
published between 2015 to 2020, covering topics related to the certification of
ML systems. In total, we identified 217 papers covering topics considered to be
the main pillars of ML certification: Robustness, Uncertainty, Explainability,
Verification, Safe Reinforcement Learning, and Direct Certification. We
analyzed the main trends and problems of each sub-field and provided summaries
of the papers extracted.
Results: The SLR results highlighted the enthusiasm of the community for this
subject, as well as the lack of diversity in terms of datasets and type of
models. It also emphasized the need to further develop connections between
academia and industries to deepen the domain study. Finally, it also
illustrated the necessity to build connections between the above mention main
pillars that are for now mainly studied separately.
Conclusion: We highlighted current efforts deployed to enable the
certification of ML based software systems, and discuss some future research
directions.Comment: 60 pages (92 pages with references and complements), submitted to a
journal (Automated Software Engineering). Changes: Emphasizing difference
traditional software engineering / ML approach. Adding Related Works, Threats
to Validity and Complementary Materials. Adding a table listing papers
reference for each section/subsection