Search CORE

113 research outputs found

A linear approach for sparse coding by a two-layer neural network

Author: Montalto Alessandro
Prevete Roberto
Tessitore Giovanni
Publication venue
Publication date: 01/01/2015
Field of study

Many approaches to transform classification problems from non-linear to linear by feature transformation have been recently presented in the literature. These notably include sparse coding methods and deep neural networks. However, many of these approaches require the repeated application of a learning process upon the presentation of unseen data input vectors, or else involve the use of large numbers of parameters and hyper-parameters, which must be chosen through cross-validation, thus increasing running time dramatically. In this paper, we propose and experimentally investigate a new approach for the purpose of overcoming limitations of both kinds. The proposed approach makes use of a linear auto-associative network (called SCNN) with just one hidden layer. The combination of this architecture with a specific error function to be minimized enables one to learn a linear encoder computing a sparse code which turns out to be as similar as possible to the sparse coding that one obtains by re-training the neural network. Importantly, the linearity of SCNN and the choice of the error function allow one to achieve reduced running time in the learning phase. The proposed architecture is evaluated on the basis of two standard machine learning tasks. Its performances are compared with those of recently proposed non-linear auto-associative neural networks. The overall results suggest that linear encoders can be profitably used to obtain sparse data representations in the context of machine learning problems, provided that an appropriate error function is used during the learning phase

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

A survey on modern trainable activation functions

Author: Apicella Andrea
Donnarumma Francesco
Isgrò Francesco
Prevete Roberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021
Field of study

In neural networks literature, there is a strong interest in identifying and defining activation functions which can improve neural network performance. In recent years there has been a renovated interest of the scientific community in investigating activation functions which can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation function have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion on the use of the term "activation function" in literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive proprieties of recent and past models, and discuss main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions and some simple local rule that constraints the corresponding weight layers.Comment: Published in "Neural Networks" journal (Elsevier

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

Author: Apicella Andrea
Isgrò Francesco
Prevete Roberto
Publication venue
Publication date: 18/11/2023
Field of study

In the context of classification problems, Deep Learning (DL) approaches represent state of art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks. The basic idea is that each hidden neural layer accomplishes a data transformation which is expected to make the data representation "somewhat more linearly separable" than the previous one to obtain a final data representation which is as linearly separable as possible. However, determining the appropriate neural network parameters that can perform these transformations is a critical problem. In this paper, we investigate the impact on deep network classifier performances of a training approach favouring solutions where data representations at the hidden layers have a higher degree of linear separability between the classes with respect to standard methods. To this aim, we propose a neural network architecture which induces an error function involving the outputs of all the network layers. Although similar approaches have already been partially discussed in the past literature, here we propose a new architecture with a novel error function and an extensive experimental analysis. This experimental analysis was made in the context of image classification tasks considering four widely used datasets. The results show that our approach improves the accuracy on the test set in all the considered cases.Comment: Paper accepted on Pattern Recognition Letters journal in Open Access with doi https://www.sciencedirect.com/science/article/pii/S0167865523003264 . Please refer to the published versio

arXiv.org e-Print Archive

Neural Networks with Non-Uniform Embedding and Explicit Validation Phase to Assess Granger Causality

Author: Faes Luca
Marinazzo Daniele
Montalto Alessandro
Prevete Roberto
Stramaglia Sebastiano
Tessitore Giovanni
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

A challenging problem when studying a dynamical system is to find the interdependencies among its individual components. Several algorithms have been proposed to detect directed dynamical influences between time series. Two of the most used approaches are a model-free one (transfer entropy) and a model-based one (Granger causality). Several pitfalls are related to the presence or absence of assumptions in modeling the relevant features of the data. We tried to overcome those pitfalls using a neural network approach in which a model is built without any a priori assumptions. In this sense this method can be seen as a bridge between model-free and model-based approaches. The experiments performed will show that the method presented in this work can detect the correct dynamical information flows occurring in a system of time series. Additionally we adopt a non-uniform embedding framework according to which only the past states that actually help the prediction are entered into the model, improving the prediction and avoiding the risk of overfitting. This method also leads to a further improvement with respect to traditional Granger causality approaches when redundant variables (i.e. variables sharing the same information about the future of the system) are involved. Neural networks are also able to recognize dynamics in data sets completely different from the ones used during the training phase

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Ghent University Academic Bibliography

Archivio istituzionale della ricerca - Università di Bari

Archivio istituzionale della ricerca - Università di Palermo

Toward the application of XAI methods in EEG-based systems

Author: Apicella Andrea
Isgrò Francesco
Pollastro Andrea
Prevete Roberto
Publication venue
Publication date: 05/11/2022
Field of study

An interesting case of the well-known Dataset Shift Problem is the classification of Electroencephalogram (EEG) signals in the context of Brain-Computer Interface (BCI). The non-stationarity of EEG signals can lead to poor generalisation performance in BCI classification systems used in different sessions, also from the same subject. In this paper, we start from the hypothesis that the Dataset Shift problem can be alleviated by exploiting suitable eXplainable Artificial Intelligence (XAI) methods to locate and transform the relevant characteristics of the input for the goal of classification. In particular, we focus on an experimental analysis of explanations produced by several XAI methods on an ML system trained on a typical EEG dataset for emotion recognition. Results show that many relevant components found by XAI methods are shared across the sessions and can be used to build a system able to generalise better. However, relevant components of the input signal also appear to be highly dependent on the input itself.Comment: Accepted to be presented at XAI.it 2022 - Italian Workshop on Explainable Artificial Intelligenc

arXiv.org e-Print Archive

Middle-Level Features for the Explanation of Classification Systems by Sparse Dictionary Methods.

Author: Andrea Apicella
Francesco Isgrò
Guglielmo Tamburrini
Roberto Prevete
Publication venue
Publication date: 14/07/2020
Field of study

Machine learning (ML) systems are affected by a pervasive lack of transparency. The eXplainable Artificial Intelligence (XAI) research area addresses this problem and the related issue of explaining the behavior of ML systems in terms that are understandable to human beings. In many explanation of XAI approaches, the output of ML systems are explained in terms of low-level features of their inputs. However, these approaches leave a substantive explanatory burden with human users, insofar as the latter are required to map low-level properties into more salient and readily understandable parts of the input. To alleviate this cognitive burden, an alternative model-agnostic framework is proposed here. This framework is instantiated to address explanation problems in the context of ML image classification systems, without relying on pixel relevance maps and other low-level features of the input. More specifically, one obtains sets of middle-level properties of classification inputs that are perceptually salient by applying sparse dictionary learning techniques. These middle-level properties are used as building blocks for explanations of image classifications. The achieved explanations are parsimonious, for their reliance on a limited set of middle-level image properties. And they can be contrastive, because the set of middle-level image properties can be used to explain why the system advanced the proposed classification over other antagonist classifications. In view of its model-agnostic character, the proposed framework is adaptable to a variety of other ML systems and explanation problems

Open Access Repository

On The Effects Of Data Normalisation For Domain Adaptation On EEG Data

Author: Apicella Andrea
Isgrò Francesco
Pollastro Andrea
Prevete Roberto
Publication venue: 'Elsevier BV'
Publication date: 10/07/2023
Field of study

In the Machine Learning (ML) literature, a well-known problem is the Dataset Shift problem where, differently from the ML standard hypothesis, the data in the training and test sets can follow different probability distributions, leading ML systems toward poor generalisation performances. This problem is intensely felt in the Brain-Computer Interface (BCI) context, where bio-signals as Electroencephalographic (EEG) are often used. In fact, EEG signals are highly non-stationary both over time and between different subjects. To overcome this problem, several proposed solutions are based on recent transfer learning approaches such as Domain Adaption (DA). In several cases, however, the actual causes of the improvements remain ambiguous. This paper focuses on the impact of data normalisation, or standardisation strategies applied together with DA methods. In particular, using \textit{SEED}, \textit{DEAP}, and \textit{BCI Competition IV 2a} EEG datasets, we experimentally evaluated the impact of different normalization strategies applied with and without several well-known DA methods, comparing the obtained performances. It results that the choice of the normalisation strategy plays a key role on the classifier performances in DA scenarios, and interestingly, in several cases, the use of only an appropriate normalisation schema outperforms the DA technique.Comment: Published in its final version on Engineering Applications of Artificial Intelligence (EAAI) https://doi.org/10.1016/j.engappai.2023.10620

arXiv.org e-Print Archive

Semi-supervised detection of structural damage using Variational Autoencoder and a One-Class Support Vector Machine

Author: Bilotta Antonio
Pollastro Andrea
Prevete Roberto
Testa Giusiana
Publication venue
Publication date: 24/03/2023
Field of study

In recent years, Artificial Neural Networks (ANNs) have been introduced in Structural Health Monitoring (SHM) systems. A semi-supervised method with a data-driven approach allows the ANN training on data acquired from an undamaged structural condition to detect structural damages. In standard approaches, after the training stage, a decision rule is manually defined to detect anomalous data. However, this process could be made automatic using machine learning methods, whom performances are maximised using hyperparameter optimization techniques. The paper proposes a semi-supervised method with a data-driven approach to detect structural anomalies. The methodology consists of: (i) a Variational Autoencoder (VAE) to approximate undamaged data distribution and (ii) a One-Class Support Vector Machine (OC-SVM) to discriminate different health conditions using damage sensitive features extracted from VAE's signal reconstruction. The method is applied to a scale steel structure that was tested in nine damage's scenarios by IASC-ASCE Structural Health Monitoring Task Group

arXiv.org e-Print Archive