
    Systematic literature review of validation methods for AI systems

    Context: Artificial intelligence (AI) has made its way into everyday activities, particularly through new techniques such as machine learning (ML). These techniques are implementable with little domain knowledge. This, combined with the difficulty of testing AI systems with traditional methods, has made system trustworthiness a pressing issue. Objective: This paper studies the methods used to validate practical AI systems reported in the literature. Our goal is to classify and describe the methods that are used in realistic settings to ensure the dependability of AI systems. Method: A systematic literature review resulted in 90 papers. Systems presented in the papers were analysed based on their domain, task, complexity, and applied validation methods. Results: The validation methods were synthesized into a taxonomy consisting of trial, simulation, model-centred validation, and expert opinion. Failure monitors, safety channels, redundancy, voting, and input and output restrictions are methods used to continuously validate the systems after deployment. Conclusions: Our results clarify the existing strategies applied to validation. They form a basis for the synthesis, assessment, and refinement of AI system validation in research, and for guidelines for validating individual systems in practice. While the various validation strategies have all been applied relatively widely, only a few studies report on continuous validation.
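    As an illustration of the continuous-validation methods listed under Results (input and output restrictions in particular), a minimal Python sketch follows; the model, bounds, and fallback value are illustrative assumptions, not taken from the reviewed studies.

        # Minimal sketch of a runtime input/output restriction monitor, one of
        # the continuous validation mechanisms named in the abstract.
        # The toy model, bounds and fallback below are illustrative assumptions.

        def restricted_predict(model, x, in_lo, in_hi, out_lo, out_hi, fallback):
            """Run `model` only on in-range inputs; reject out-of-range outputs."""
            if not all(in_lo[i] <= v <= in_hi[i] for i, v in enumerate(x)):
                return fallback, "input rejected: outside validated envelope"
            y = model(x)
            if not (out_lo <= y <= out_hi):
                return fallback, "output rejected: outside safe range"
            return y, "ok"

        # A toy linear model stands in for the AI component under validation.
        toy_model = lambda x: 0.5 * x[0] + 0.1 * x[1]
        print(restricted_predict(toy_model, [2.0, 3.0],
                                 in_lo=[0, 0], in_hi=[10, 10],
                                 out_lo=0.0, out_hi=5.0, fallback=None))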

    Acceptance in Incomplete Argumentation Frameworks

    Abstract argumentation frameworks (AFs), originally proposed by Dung, constitute a central formal model for the study of computational aspects of argumentation in AI. Credulous and skeptical acceptance of arguments in a given AF are well-studied problems, both in terms of theoretical analysis (especially computational complexity) and the development of practical decision procedures. However, AFs make the assumption that all attacks between arguments are certain (i.e., present attacks are known to exist, and missing attacks are known not to exist), which can be a restrictive assumption in various settings. A generalization of AFs to incomplete AFs was recently proposed as a formalism that allows the representation of both uncertain attacks and uncertain arguments in AFs. In this article, we explore the impact of allowing for modeling such uncertainties in AFs on the computational complexity of natural generalizations of acceptance problems to incomplete AFs under various central AF semantics. Complementing the complexity-theoretic analysis, we also develop the first practical decision procedures for all of the NP-hard variants of acceptance in incomplete AFs. In terms of complexity analysis, we establish a full complexity landscape, showing that, depending on the variant of acceptance and the property/semantics, the complexity of acceptance in incomplete AFs ranges from polynomial-time decidable to completeness for Σ^p_3. In terms of algorithms, we show through an extensive empirical evaluation that an implementation of the proposed decision procedures, based on Boolean satisfiability (SAT) solving, is effective in deciding variants of acceptance under uncertainties. We also establish conditions on which types of atomic changes are guaranteed to be redundant from the perspective of preserving extensions of completions of incomplete AFs, and show that these results allow for considerably improving the empirical efficiency of the proposed SAT-based counterexample-guided abstraction refinement algorithms for acceptance in incomplete AFs for problem variants with complexity beyond NP.
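    To make the acceptance problem concrete, a brute-force Python sketch of possible credulous acceptance under admissible semantics in an incomplete AF with uncertain attacks follows; it enumerates completions and candidate sets explicitly, whereas the article's procedures are SAT-based, so the toy framework and function names here are illustrative assumptions only.

        # Brute-force sketch of "possible credulous acceptance" under admissible
        # semantics: the target argument is accepted if some completion (choice
        # of uncertain attacks to include) admits an admissible set containing it.
        # Exponential and illustrative only; the article uses SAT-based CEGAR.
        from itertools import chain, combinations

        def powerset(xs):
            xs = list(xs)
            return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

        def admissible(args, attacks, S):
            S = set(S)
            if any((a, b) in attacks for a in S for b in S):   # conflict-free
                return False
            for s in S:                                        # every member defended
                for b in args:
                    if (b, s) in attacks and not any((c, b) in attacks for c in S):
                        return False
            return True

        def possibly_credulous(args, certain_att, uncertain_att, target):
            for extra in powerset(uncertain_att):              # choose a completion
                attacks = set(certain_att) | set(extra)
                if any(target in S and admissible(args, attacks, S)
                       for S in powerset(args)):
                    return True
            return False

        # Toy incomplete AF: b attacks a for certain; the attack (c, b) is uncertain.
        print(possibly_credulous({"a", "b", "c"}, {("b", "a")}, {("c", "b")}, "a"))  # True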

    Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

    The FTMP architecture is a high-reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP operate in tight synchronism with one another, and hardware fault detection and fault masking are provided transparently to the software. Operating system design and user software design are thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of the fault-handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine, and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.
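    A short Python sketch of the fault-masking idea follows: replicated elements run in lockstep and a majority vote over their outputs masks a single faulty element; the triplex values below are illustrative, not FTMP data.

        # Sketch of hardware fault masking by majority voting over the outputs
        # of redundant, synchronized processing elements.
        from collections import Counter

        def vote(replica_outputs):
            """Return the majority value, or None if no majority exists."""
            value, count = Counter(replica_outputs).most_common(1)[0]
            return value if count > len(replica_outputs) // 2 else None

        # One replica produces a faulty result; the vote masks it transparently.
        print(vote([42, 42, 7]))   # -> 42
        print(vote([1, 2, 3]))     # -> None (no majority: uncorrectable)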

    Advanced flight control system study

    A fly-by-wire flight control system architecture designed for high reliability includes spare sensor and computer elements to permit safe dispatch with failed elements, thereby reducing unscheduled maintenance. A methodology capable of demonstrating that the architecture achieves the predicted performance characteristics consists of a hierarchy of activities, ranging from analytical calculations of system reliability and formal methods of software verification to iron-bird testing followed by flight evaluation. Interfacing this architecture to the Lockheed S-3A aircraft for flight test is discussed. This testbed vehicle can be expanded to support flight experiments in advanced aerodynamics, electromechanical actuators, secondary power systems, flight management, new displays, and air traffic control concepts.
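    To illustrate the analytical reliability calculations mentioned above, a back-of-the-envelope Python sketch follows; the k-of-n binomial model and the per-flight reliability figures are assumptions for illustration, not values from the study.

        # Probability that at least k of n identical elements, each with
        # per-flight reliability r, remain operative (binomial tail sum).
        from math import comb

        def k_of_n_reliability(k, n, r):
            return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

        # Illustrative comparison: a quadruplex channel needing any 2 of 4
        # elements versus a simplex channel with a single element.
        print(k_of_n_reliability(2, 4, 0.999))   # ~0.999999996
        print(k_of_n_reliability(1, 1, 0.999))   # 0.999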

    Software defect prediction: do different classifiers find the same defects?

    During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
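    A Python sketch of the kind of comparison described above follows, using scikit-learn with a synthetic dataset standing in for the NASA, open-source, and commercial data; DecisionTreeClassifier substitutes for RPart (an R package), so the setup is illustrative rather than a reproduction of the study.

        # Train several classifiers on the same (synthetic) defect data, capture
        # per-classifier confusion matrices, and count defects found by one
        # classifier but missed by another (the "unique subset" effect).
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.naive_bayes import GaussianNB
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.svm import SVC
        from sklearn.metrics import confusion_matrix

        # Imbalanced synthetic data as a placeholder for real defect datasets.
        X, y = make_classification(n_samples=500, n_features=20,
                                   weights=[0.8], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        classifiers = {
            "RandomForest": RandomForestClassifier(random_state=0),
            "NaiveBayes": GaussianNB(),
            "DecisionTree": DecisionTreeClassifier(random_state=0),  # RPart analogue
            "SVM": SVC(),
        }
        predictions = {}
        for name, clf in classifiers.items():
            predictions[name] = clf.fit(X_tr, y_tr).predict(X_te)
            print(name, confusion_matrix(y_te, predictions[name]), sep="\n")

        # True defects found by Random Forest but missed by SVM.
        rf, svm = predictions["RandomForest"], predictions["SVM"]
        rf_only = sum(1 for t, p, s in zip(y_te, rf, svm) if t == 1 and p == 1 and s == 0)
        print("defects found by RF but missed by SVM:", rf_only)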