Gradient-Based Competitive Learning: Theory
Deep learning has recently been used to extract the relevant features for representing input data, even in the unsupervised setting. However, state-of-the-art techniques focus mostly on algorithmic efficiency and accuracy rather than on mimicking the input manifold. By contrast, competitive learning is a powerful tool for replicating the topology of the input distribution. It is cognitively and biologically inspired, as it is founded on Hebbian learning, a neuropsychological theory claiming that neurons can increase their specialization by competing for the right to respond to, and represent, a subset of the input data. This paper introduces a novel perspective by combining these two techniques: unsupervised gradient-based learning and competitive learning. The theory is based on the intuition that neural networks can learn topological structures by working directly on the transpose of the input matrix. To this end, the vanilla competitive layer and its dual are presented. The former is representative of a standard competitive layer for deep clustering, while the latter is trained on the transposed matrix. The equivalence of the two layers is proven extensively, both theoretically and experimentally, but the dual competitive layer has better properties. Unlike the vanilla layer, it directly outputs the prototypes of the input data while still allowing learning by backpropagation. More importantly, this paper proves theoretically that the dual layer is better suited for handling high-dimensional data (e.g., in biological applications), because the estimation of its weights is driven by a constraining subspace that does not depend on the input dimensionality, but only on the dataset cardinality. In summary, this paper introduces a novel approach for unsupervised gradient-based competitive learning.
This approach is very promising, both for small datasets of high-dimensional data and for better exploiting the advantages of a deep architecture: the dual layer integrates seamlessly with the deep layers. A theoretical justification is also given through an analysis of the gradient flow of both the vanilla and the dual layers.
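The vanilla/dual contrast can be illustrated with a toy sketch. All names and the update rule here are illustrative assumptions, not the paper's actual formulation: the vanilla layer learns k prototypes as a k × d weight matrix, while the dual layer parameterizes the prototypes as combinations of the n input samples, so its learnable weights are k × n and independent of the input dimensionality d.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples in d dimensions (d >> n to mimic the high-dimensional case).
n, d, k = 30, 200, 3
X = rng.normal(size=(n, d))

def vanilla_step(W, X, lr=0.1):
    """One winner-take-all update: each prototype (row of W, shape k x d)
    moves toward the mean of the samples it wins."""
    dists = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)  # (n, k)
    winners = dists.argmin(1)
    for j in range(len(W)):
        won = X[winners == j]
        if len(won):
            W[j] += lr * (won.mean(0) - W[j])
    return W

def dual_step(A, X, lr=0.1):
    """Dual layer: prototypes are parameterized as W = A @ X, so the learned
    weights A (shape k x n) live in a subspace whose size depends on the
    dataset cardinality n, not on the input dimensionality d."""
    W = A @ X
    dists = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)
    winners = dists.argmin(1)
    for j in range(len(A)):
        won = winners == j
        if won.any():
            # Coefficient vector whose combination with X is the mean of
            # the samples won by prototype j: the same target as above,
            # expressed in sample space.
            target = np.zeros(len(X))
            target[won] = 1.0 / won.sum()
            A[j] += lr * (target - A[j])
    return A

W = rng.normal(size=(k, d))          # vanilla: k x d parameters
A = rng.normal(size=(k, n)) * 0.01   # dual:    k x n parameters
for _ in range(50):
    W = vanilla_step(W, X)
    A = dual_step(A, X)
```

For d ≫ n, the dual parameterization (k × n) is far smaller than the vanilla one (k × d), which mirrors the claim that the constraining subspace depends only on the dataset cardinality.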
Concept-based Explainable Artificial Intelligence: A Survey
The field of explainable artificial intelligence emerged in response to the growing need for more transparent and reliable models. However, using raw features to provide explanations has been disputed in several recent works, which advocate for more user-understandable explanations. To address this issue, a wide range of papers proposing Concept-based eXplainable Artificial Intelligence (C-XAI) methods have appeared in recent years. Nevertheless, a unified categorization and a precise definition of the field are still missing. This paper fills the gap by offering a thorough review of C-XAI approaches. We define and identify different types of concepts and explanations. We provide a taxonomy identifying nine categories and propose guidelines for selecting a suitable category based on the development context. Additionally, we report common evaluation strategies, including the metrics, human evaluations, and datasets employed, aiming to assist the development of future methods. We believe this survey will serve researchers, practitioners, and domain experts in comprehending and advancing this innovative field.
Encryption-agnostic classifiers of traffic originators and their application to anomaly detection
This paper presents an approach that leverages classical machine learning techniques to identify the originating tools from sniffed packets, for both clear-text and encrypted traffic. This research aims to overcome the limitations that the widespread adoption of encrypted communications poses to security monitoring systems. By training three distinct classifiers, this paper shows that it is possible to detect, with excellent accuracy, the category of tools that generated the analyzed traffic (e.g., browsers vs. network stress tools), the actual tools (e.g., Firefox vs. Chrome vs. Edge), and the individual tool versions (e.g., Chrome 48 vs. Chrome 68). The paper provides hints that the classifiers are helpful for early detection of Distributed Denial of Service (DDoS) attacks, duplication of entire websites, and identification of sudden changes in users' behavior, which might be the consequence of malware infection or data exfiltration.
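To illustrate why payload-agnostic features allow classification even under encryption, the sketch below trains a trivial nearest-centroid classifier on synthetic per-flow statistics (packet sizes and inter-arrival times). The feature set, the distributions, and the classifier are hypothetical stand-ins, not the paper's three classifiers or its dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_flows(n, size_mu, iat_mu):
    """Synthetic per-flow features: first 5 packet sizes and 5 inter-arrival
    times. No payload content is used, so the same features exist for
    encrypted traffic."""
    sizes = rng.normal(size_mu, 50.0, size=(n, 5))
    iats = rng.exponential(iat_mu, size=(n, 5))
    return np.hstack([sizes, iats])

browser = make_flows(200, size_mu=800, iat_mu=0.05)    # browser-like flows
stress = make_flows(200, size_mu=120, iat_mu=0.001)    # stress-tool-like flows

# Nearest-centroid classifier: learn one centroid per tool category
# from the first 150 flows of each class.
centroids = {"browser": browser[:150].mean(0), "stress": stress[:150].mean(0)}

def predict(flow):
    return min(centroids, key=lambda k: np.linalg.norm(flow - centroids[k]))

test = [(f, "browser") for f in browser[150:]] + [(f, "stress") for f in stress[150:]]
acc = sum(predict(f) == label for f, label in test) / len(test)
```

Even this trivial model separates the two synthetic categories, because tool fingerprints live in traffic shape (sizes, timing) rather than in packet contents.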
Topological Gradient-based Competitive Learning
Topological learning is a wide research area aiming at uncovering the mutual spatial relationships between the elements of a set. Some of the most common and oldest approaches involve the use of unsupervised competitive neural networks. However, these methods are not based on gradient optimization, which has been proven to provide striking results in feature extraction, even in unsupervised learning. Unfortunately, by focusing mostly on algorithmic efficiency and accuracy, deep clustering techniques are composed of overly complex feature extractors, while using trivial algorithms in their top layer. The aim of this work is to present a novel comprehensive theory that aspires to bridge competitive learning with gradient-based learning, thus allowing the use of extremely powerful deep neural networks for feature extraction and projection, combined with the remarkable flexibility and expressiveness of competitive learning. In this paper we fully demonstrate the theoretical equivalence of two novel gradient-based competitive layers. Preliminary experiments show how the dual approach, trained on the transpose of the input matrix, i.e., Xᵀ, leads to a faster convergence rate and higher training accuracy in both low- and high-dimensional scenarios.
Entropy-Based Logic Explanations of Neural Networks
Explainable artificial intelligence has rapidly emerged since lawmakers have started requiring interpretable models for safety-critical domains. Concept-based neural networks have arisen as explainable-by-design methods, as they leverage human-understandable symbols (i.e., concepts) to predict class memberships. However, most of these approaches focus on identifying the most relevant concepts but do not provide concise, formal explanations of how such concepts are leveraged by the classifier to make predictions. In this paper, we propose a novel end-to-end differentiable approach enabling the extraction of logic explanations from neural networks using the formalism of First-Order Logic. The method relies on an entropy-based criterion which automatically identifies the most relevant concepts. We consider four different case studies to demonstrate that: (i) this entropy-based criterion enables the distillation of concise logic explanations in safety-critical domains, from clinical data to computer vision; (ii) the proposed approach outperforms state-of-the-art white-box models in terms of classification accuracy and matches the performance of black-box models.
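One plausible reading of such an entropy-based criterion — a sketch under our own assumptions, not necessarily the paper's exact formulation — is to map learnable per-concept scores through a softmax and measure the entropy of the resulting relevance distribution, so that training can penalize spread-out attention and concentrate it on few concepts, yielding concise rules.

```python
import numpy as np

def entropy_relevance(gamma):
    """Softmax over learnable per-concept scores gives a relevance
    distribution alpha; its Shannon entropy H measures how spread the
    attention over concepts is (low H = few dominant concepts)."""
    a = np.exp(gamma - gamma.max())        # numerically stable softmax
    alpha = a / a.sum()
    H = -(alpha * np.log(alpha + 1e-12)).sum()
    return alpha, H

# Scores concentrated on one concept -> low entropy -> concise explanation.
sharp = np.array([4.0, -2.0, -2.0, -2.0])
# Uniform scores -> maximal entropy -> every concept kept, verbose explanation.
flat = np.zeros(4)

alpha_sharp, H_sharp = entropy_relevance(sharp)
alpha_flat, H_flat = entropy_relevance(flat)
```

Adding H as a regularization term to the task loss would then push the network toward sparse concept usage during training.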
Interpretable Neural-Symbolic Concept Reasoning
Deep learning methods are highly accurate, yet their opaque decision process prevents them from earning full human trust. Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts. However, state-of-the-art concept-based models rely on high-dimensional concept embedding representations which lack a clear semantic meaning, thus calling into question the interpretability of their decision process. To overcome this limitation, we propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings. In DCR, neural networks do not make task predictions directly; instead, they build syntactic rule structures using concept embeddings. DCR then executes these rules on meaningful concept truth degrees to provide a final interpretable and semantically consistent prediction in a differentiable manner. Our experiments show that DCR: (i) improves up to +25% w.r.t. state-of-the-art interpretable concept-based models on challenging benchmarks; (ii) discovers meaningful logic rules matching known ground truths even in the absence of concept supervision during training; and (iii) facilitates the generation of counterfactual examples, providing the learnt rules as guidance.
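The rule-execution step described above can be sketched in miniature (the gating scheme and the product t-norm below are our illustrative assumptions, not DCR's exact operators): given concept truth degrees in [0, 1], a soft rule keeps or negates each concept and evaluates a differentiable fuzzy AND over the relevant literals only.

```python
import numpy as np

def execute_rule(truths, polarity, relevance):
    """Evaluate a soft logic rule on concept truth degrees in [0, 1].
    polarity[i] = 1 keeps concept i, 0 negates it; relevance[i] in [0, 1]
    gates whether the literal enters the conjunction at all. Irrelevant
    literals are forced to 1 so they do not affect the fuzzy AND."""
    literals = polarity * truths + (1 - polarity) * (1 - truths)
    gated = relevance * literals + (1 - relevance)
    return gated.prod()  # product t-norm as a differentiable AND

truths = np.array([0.9, 0.1, 0.8])    # e.g. degrees for [round, red, small]
polarity = np.array([1.0, 0.0, 1.0])  # rule template: round AND (NOT red) AND small
relevance = np.array([1.0, 1.0, 0.0]) # third concept dropped from the rule

score = execute_rule(truths, polarity, relevance)  # 0.9 * (1 - 0.1) = 0.81
```

Because every step is a smooth function of the inputs, such a rule evaluation can sit inside a network and be trained end-to-end, while the learned polarity/relevance patterns read out directly as logic rules.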