587 research outputs found

    Classification of Explainable Artificial Intelligence Methods through Their Output Formats

    Machine and deep learning have proven their utility for generating data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension: the output format. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”, “explainable machine learning” and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliographies of these articles. The addition of the explanation-format dimension makes the proposed classification system a practical tool for scholars, supporting them in selecting the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, hence the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the explanation formats and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields and the new regulation.
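
    The output-format dimension is easiest to see with the same model explained two ways: once as numeric feature attributions and once as textual rules. Below is a minimal, illustrative sketch using scikit-learn on an arbitrary dataset; it is not one of the reviewed XAI methods, only a contrast between two explanation formats.

```python
# Minimal illustration of two XAI output formats for the same model:
# numeric feature attributions vs. human-readable rules.
# Assumes scikit-learn is available; dataset and model are arbitrary choices.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Numeric output format: one importance score per input feature.
numeric = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in zip(feature_names, numeric.importances_mean):
    print(f"{name}: {score:.3f}")

# Rule-based (textual) output format: the same model rendered as if/then rules.
print(export_text(model, feature_names=feature_names))
```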

    xxAI - Beyond Explainable AI

    This is an open access book. Statistical machine learning (ML) has triggered a renaissance of artificial intelligence (AI). While the most successful ML models, including Deep Neural Networks (DNN), have developed better predictivity, they have become increasingly complex, at the expense of human interpretability (correlation vs. causality). The field of explainable AI (xAI) has emerged with the goal of creating tools and models that are both predictive and interpretable and understandable for humans. Explainable AI is receiving huge interest in the machine learning and AI research communities, across academia, industry, and government, and there is now an excellent opportunity to push towards successful explainable AI applications. This volume will help the research community to accelerate this process, to promote a more systematic use of explainable AI to improve models in diverse applications, and ultimately to better understand how current explainable AI methods need to be improved and what kind of theory of explainable AI is needed. After overviews of current methods and challenges, the editors include chapters that describe new developments in explainable AI. The contributions are from leading researchers in the field, drawn from both academia and industry, and many of the chapters take a clear interdisciplinary approach to problem-solving. The concepts discussed include explainability, causability, and AI interfaces with humans, and the applications include image processing, natural language, law, fairness, and climate science.

    Syntactic inductive biases for deep learning methods

    The debate between connectionism and symbolism is one of the major forces that drive the development of Artificial Intelligence. Deep learning and theoretical linguistics are the most representative fields of study for the two schools, respectively. While the deep learning method has made impressive breakthroughs and has become the major reason behind the recent AI prosperity in industry and academia, linguistics and symbolism still hold important ground, including reasoning, interpretability and reliability. In this thesis, we try to build a connection between the two schools by introducing syntactic inductive biases for deep learning models. We propose two families of inductive biases, one for constituency structure and another for dependency structure. The constituency inductive bias encourages deep learning models to use different units (or neurons) to separately process long-term and short-term information. This separation provides a way for deep learning models to build latent hierarchical representations from sequential inputs, in which a higher-level representation is composed of, and can be decomposed into, a series of lower-level representations. For example, without knowing the ground-truth structure, our proposed model learns to process logical expressions by composing representations of variables and operators into representations of expressions according to their syntactic structure. On the other hand, the dependency inductive bias encourages models to find the latent relations between entities in the input sequence. For natural language, the latent relations are usually modeled as a directed dependency graph, where a word has exactly one parent node and zero or more child nodes. After applying this constraint to a transformer-like model, we find that the model is capable of inducing directed graphs that are close to human expert annotations, and that it also outperforms the standard transformer model on different tasks. We believe that these experimental results demonstrate an interesting alternative for the future development of deep learning models.
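
    A minimal sketch of the dependency inductive bias described above, assuming PyTorch: an attention head whose rows are read as a distribution over candidate parents, so each token receives information from (approximately) one parent. This is an illustrative simplification, not the exact architecture from the thesis.

```python
# Illustrative "single parent" attention head: each token produces a
# distribution over possible parents and gathers information from them.
# A sketch of the dependency inductive bias, not the thesis's exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleParentAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):  # x: (batch, seq, dim)
        scores = self.q(x) @ self.k(x).transpose(-2, -1) * self.scale
        # A token cannot be its own parent.
        eye = torch.eye(x.size(1), dtype=torch.bool, device=x.device)
        scores = scores.masked_fill(eye, float("-inf"))
        parent = F.softmax(scores, dim=-1)  # each row: distribution over candidate parents
        # Values gathered from (soft) parents, plus the induced directed graph.
        return parent @ self.v(x), parent.argmax(-1)

x = torch.randn(2, 5, 16)
out, parents = SingleParentAttention(16)(x)
print(parents)  # predicted parent index for each of the 5 tokens
```

    Taking the argmax over each row yields an induced directed graph that can be compared against expert dependency annotations, which is the kind of evaluation the abstract describes.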

    Human reasoning and cognitive science

    In the late summer of 1998, the authors, a cognitive scientist and a logician, started talking about the relevance of modern mathematical logic to the study of human reasoning, and we have been talking ever since. This book is an interim report of that conversation. It argues that results such as those on the Wason selection task, purportedly showing the irrelevance of formal logic to actual human reasoning, have been widely misinterpreted, mainly because the picture of logic current in psychology and cognitive science is completely mistaken. We aim to give the reader a more accurate picture of mathematical logic and, in doing so, hope to show that logic, properly conceived, is still a very helpful tool in cognitive science. The main thrust of the book is therefore constructive. We give a number of examples in which logical theorizing helps in understanding and modeling observed behavior in reasoning tasks, deviations of that behavior in a psychiatric disorder (autism), and even the roots of that behavior in the evolution of the brain.
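
    For readers unfamiliar with the Wason selection task mentioned above: given the rule “if a card has a vowel on one side, it has an even number on the other”, classical logic says to turn exactly the cards that could falsify the rule. A small sketch of that normative answer, using the standard E, K, 4, 7 card faces:

```python
# Which visible card faces must be turned to test "if vowel then even number"?
# Under the material conditional, only cards that could yield (P and not-Q) matter.
def must_turn(face):
    is_vowel = face.isalpha() and face.lower() in "aeiou"
    is_odd_number = face.isdigit() and int(face) % 2 == 1
    return is_vowel or is_odd_number  # potential counterexamples: P-face or not-Q-face

cards = ["E", "K", "4", "7"]
print([c for c in cards if must_turn(c)])  # ['E', '7'] -- the classic normative answer
```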

    Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

    The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, and sometimes even better than, the original dense networks. Sparsity promises to reduce the memory footprint of regular networks to fit mobile devices, as well as to shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial on sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation and the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
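
    The removal of elements that the survey covers can be illustrated with the simplest baseline, unstructured magnitude pruning: zero out the smallest-magnitude weights up to a target sparsity and keep masks so they stay zero during fine-tuning. A rough sketch assuming PyTorch; the survey's structured, hardware-aware and training-time methods are considerably more involved.

```python
# Unstructured magnitude pruning: zero the globally smallest |w| until a
# target sparsity is reached, and return masks to keep them zero during
# subsequent fine-tuning. A sketch, not a production sparsification pipeline.
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float):
    weights = [m.weight for m in model.modules()
               if isinstance(m, (nn.Linear, nn.Conv2d))]
    all_scores = torch.cat([w.detach().abs().flatten() for w in weights])
    k = int(sparsity * all_scores.numel())
    threshold = torch.kthvalue(all_scores, k).values if k > 0 else all_scores.min() - 1
    masks = []
    with torch.no_grad():
        for w in weights:
            mask = (w.abs() > threshold).float()
            w.mul_(mask)        # remove the pruned connections
            masks.append(mask)  # reapply after each optimizer step to keep them zero
    return masks

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = magnitude_prune(model, sparsity=0.9)
print(sum(int(m.sum()) for m in masks), "weights remain non-zero")
```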

    Attention is more than prediction precision [Commentary on target article]

    A cornerstone of the target article is that, in a predictive coding framework, attention can be modelled by weighting prediction error with a measure of precision. We argue that this is not a complete explanation, especially in the light of event-related potential (ERP) data showing large evoked responses for frequently presented target stimuli, which are thus predicted.
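
    The precision-weighting the commentary refers to can be written in one line: the prediction error is scaled by precision (inverse variance), so the same raw error drives larger updates when precision, i.e. attention, is high. A toy numeric sketch with arbitrary values:

```python
# Precision-weighted prediction error, the quantity the commentary discusses:
# weighted_error = precision * (observation - prediction), precision = 1 / variance.
# Toy numbers only; no claim about the target article's full model.
import numpy as np

prediction = np.array([1.0, 1.0, 1.0])
observation = np.array([1.2, 1.2, 1.2])
variance = np.array([0.1, 1.0, 10.0])  # low variance = high precision = "attended"

precision = 1.0 / variance
weighted_error = precision * (observation - prediction)
print(weighted_error)  # the same raw error counts far more when precision is high
```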

    Comparing Learned Representations between Unpruned and Pruned Deep Convolutional Neural Networks

    While deep neural networks have shown impressive performance in computer vision tasks, natural language processing, and other domains, the sizes and inference times of these models can often prevent them from being used on resource-constrained systems. Furthermore, as these networks grow larger in size and complexity, it can become even harder to understand the learned representations of the input data that these networks form through training. These issues of growing network size, increasing complexity and runtime, and ambiguity in the understanding of internal representations serve as guiding points for this work. In this thesis, we create a neural network that is capable of predicting up to three path waypoints given an input image. This network will be used in conjunction with other networks to help guide an autonomous robotic vehicle. Since this neural network will be deployed to an embedded system, it is important that our network is efficient. As such, we use a network compression technique known as L1-norm pruning to reduce the size of the network and speed up the inference time, while retaining similar loss. Furthermore, we investigate the effects that pruning has on the internal learned representations of models by comparing unpruned and pruned network layers using projection-weighted canonical correlation analysis (PWCCA). Our results show that for deep convolutional neural networks (CNNs), PWCCA similarity scores between early convolutional layers start low and then gradually increase towards the final layers of the network, with some peaks in the intermediate layers. We also show that for our deep CNN, linear layers at the end of the network also exhibit very high similarity, serving to guide the dissimilar representations from intermediate convolutional layers to a common representation that yields similar network performance between unpruned and pruned networks.
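
    The compression technique named above, L1-norm pruning, ranks each convolutional filter by the L1 norm of its weights and keeps only the largest ones. A minimal sketch assuming PyTorch; the waypoint-prediction network, retraining, and the PWCCA comparison of representations are omitted.

```python
# L1-norm filter pruning for a conv layer: keep the filters whose weights
# have the largest L1 norm, and rebuild a thinner layer from them.
# Illustrative sketch only; it does not reproduce the thesis's waypoint network.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float) -> nn.Conv2d:
    # L1 norm of each output filter: sum of |w| over (in_channels, kH, kW).
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep = torch.topk(norms, n_keep).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
thin = prune_conv_filters(conv, keep_ratio=0.5)
print(conv.weight.shape, "->", thin.weight.shape)  # 64 filters -> 32 filters
```

    In a full network the next layer's input channels must also be reduced to match the removed filters, which is part of why comparing unpruned and pruned representations layer by layer (e.g. with PWCCA) is informative.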
