9 research outputs found
DeepRICH: Learning Deeply Cherenkov Detectors
Imaging Cherenkov detectors are widely used for particle identification
(PID) in nuclear and particle physics experiments, where developing fast
reconstruction algorithms is becoming of paramount importance to allow for near
real-time calibration and data quality control, as well as to speed up offline
analysis of large amounts of data. In this paper we present DeepRICH, a novel
deep learning algorithm for fast reconstruction which can be applied to
different imaging Cherenkov detectors. The core of our architecture is a
generative model which leverages a custom Variational Auto-encoder (VAE)
combined with Maximum Mean Discrepancy (MMD), with a Convolutional Neural Network
(CNN) extracting features from the space of the latent variables for
classification. A thorough comparison with the simulation/reconstruction
package FastDIRC is discussed in the text. DeepRICH has the advantage of bypassing
low-level details needed to build a likelihood, allowing for a significant
improvement in computation time at potentially the same reconstruction
performance as other established reconstruction algorithms. In the conclusions,
we address the implications and potential of this work, discussing
possible future extensions and generalization. Comment: 14 pages, 9 figures, preprin
Pixle: a fast and effective black-box attack based on rearranging pixels
Recent research has found that neural networks are vulnerable to several types of adversarial attacks, in which input samples are modified so that the model misclassifies them. In this paper we focus on black-box adversarial attacks, which can be performed without knowing the inner structure of the attacked model or its training procedure, and we propose a novel attack that successfully attacks a high percentage of samples by rearranging a small number of pixels within the attacked image. We demonstrate that our attack works on a large number of datasets and models, that it requires a small number of iterations, and that the distance between the original sample and the adversarial one is negligible to the human eye.
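The following Python sketch illustrates the general idea of a black-box attack that only rearranges pixels. It is not the Pixle algorithm itself: it is a simplified random search that swaps pairs of pixels and keeps a swap only when the queried confidence in the true class drops. The classifier `toy_model`, the function `swap_attack`, and all parameters are hypothetical names introduced for illustration.

```python
import random


def toy_model(image):
    """Hypothetical black-box classifier returning [p_class0, p_class1].

    The probability of class 1 is the share of total brightness that
    lies in the left half of the image.
    """
    width = len(image[0])
    left = sum(sum(row[: width // 2]) for row in image)
    total = sum(sum(row) for row in image) or 1.0
    p1 = left / total
    return [1.0 - p1, p1]


def swap_attack(image, true_label, predict, max_queries=500, seed=0):
    """Random-search attack that only swaps pixel positions.

    Pixel values are never altered, only rearranged. The model is used
    purely as a black box through `predict`.
    """
    rng = random.Random(seed)
    adv = [row[:] for row in image]  # work on a copy of the input
    height, width = len(adv), len(adv[0])
    best = predict(adv)[true_label]
    queries = 1
    # Binary toy case: the sample is misclassified once p(true) < 0.5.
    while best >= 0.5 and queries < max_queries:
        r1, c1 = rng.randrange(height), rng.randrange(width)
        r2, c2 = rng.randrange(height), rng.randrange(width)
        adv[r1][c1], adv[r2][c2] = adv[r2][c2], adv[r1][c1]
        score = predict(adv)[true_label]
        queries += 1
        if score < best:
            best = score  # keep the swap
        else:
            adv[r1][c1], adv[r2][c2] = adv[r2][c2], adv[r1][c1]  # undo it
    return adv, queries, best < 0.5


# Made-up 4x4 image whose bright pixels all sit in the left half (class 1).
image = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
adversarial, n_queries, success = swap_attack(image, 1, toy_model)
```

Because the attack only permutes pixels, the multiset of pixel values in `adversarial` is identical to that of `image`; only their positions differ.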
Bayesian Neural Networks With Maximum Mean Discrepancy Regularization
Bayesian Neural Networks (BNNs) are trained to optimize an entire
distribution over their weights instead of a single set, having significant
advantages in terms of, e.g., interpretability, multi-task learning, and
calibration. Because of the intractability of the resulting optimization
problem, most BNNs are either sampled through Monte Carlo methods, or trained
by minimizing a suitable Evidence Lower BOund (ELBO) on a variational
approximation. In this paper, we propose a variant of the latter, wherein we
replace the Kullback-Leibler divergence in the ELBO term with a Maximum Mean
Discrepancy (MMD) estimator, inspired by recent work in variational inference.
After motivating our proposal based on the properties of the MMD term, we
proceed to show a number of empirical advantages of the proposed formulation
over the state-of-the-art. In particular, our BNNs achieve higher accuracy on
multiple benchmarks, including several image classification tasks. In addition,
they are more robust to the selection of a prior over the weights, and they are
better calibrated. As a second contribution, we provide a new formulation for
estimating the uncertainty on a given prediction, showing it performs in a more
robust fashion against adversarial attacks and the injection of noise over
their inputs, compared to more classical criteria such as the differential
entropy.
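As a rough illustration of the quantity that replaces the Kullback-Leibler term, the sketch below computes the standard biased estimate of the squared MMD between two sets of samples. The Gaussian RBF kernel, the bandwidth, and the sample sizes are assumptions made here; the abstract does not state which kernel or estimator the authors use, and the function names are hypothetical.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel between two vectors given as lists of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy usage: samples standing in for weights drawn from a variational
# posterior and from a prior (both made up for this example).
rng = random.Random(0)
posterior_samples = [[rng.gauss(0.5, 1.0) for _ in range(2)] for _ in range(100)]
prior_samples = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(100)]
penalty = mmd_squared(posterior_samples, prior_samples)
```

In an objective of this kind, `penalty` would take the place of the divergence between the variational posterior and the prior. Unlike the Kullback-Leibler divergence, it is computed from samples alone.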
Continual Learning with Invertible Generative Models
Catastrophic forgetting (CF) happens whenever a neural network overwrites
past knowledge while being trained on new tasks. Common techniques to handle CF
include regularization of the weights (using, e.g., their importance on past
tasks), and rehearsal strategies, where the network is constantly re-trained on
past data. Generative models have also been applied for the latter, in order to
have endless sources of data. In this paper, we propose a novel method that
combines the strengths of regularization and generative-based rehearsal
approaches. Our generative model consists of a normalizing flow (NF), a
probabilistic and invertible neural network, trained on the internal embeddings
of the network. By keeping a single NF throughout the training process, we show
that our memory overhead remains constant. In addition, exploiting the
invertibility of the NF, we propose a simple approach to regularize the
network's embeddings with respect to past tasks. We show that our method
performs favorably with respect to state-of-the-art approaches in the
literature, with bounded computational power and memory overheads. Comment: arXiv admin note: substantial text overlap with arXiv:2007.0244
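The invertibility that the method exploits can be illustrated with a single affine coupling layer, a common building block of normalizing flows. This is a minimal sketch under assumptions: the abstract does not say which flow architecture is used, and the conditioner `scale_and_shift` is a fixed hypothetical function standing in for a small neural network.

```python
import math


def scale_and_shift(x1):
    """Hypothetical conditioner; in a real flow this is a trained network."""
    s = [math.tanh(0.5 * v) for v in x1]
    t = [0.3 * v - 0.1 for v in x1]
    return s, t


def coupling_forward(x):
    """Map an embedding x (even length) to latent z and return log|det J|."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale_and_shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    log_det = sum(s)  # the Jacobian is triangular with diagonal exp(s)
    return x1 + y2, log_det


def coupling_inverse(z):
    """Exactly invert coupling_forward, recovering the original embedding."""
    half = len(z) // 2
    z1, z2 = z[:half], z[half:]
    s, t = scale_and_shift(z1)  # z1 equals x1, so s and t are recomputed exactly
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(z2, s, t)]
    return z1 + x2


embedding = [0.2, -1.3, 0.7, 2.0]
latent, log_det = coupling_forward(embedding)
recovered = coupling_inverse(latent)
```

Because the first half of the vector passes through unchanged, the inverse can recompute the scale and shift without storing anything, which is what makes sampling latents and mapping them back to embeddings cheap.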
NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks
Early Exit Neural Networks (EENNs) endow a standard Deep Neural Network (DNN)
with Early Exit Classifiers (EECs), to provide predictions at intermediate
points of the processing when enough confidence in classification is achieved.
This leads to many benefits in terms of effectiveness and efficiency.
Currently, the design of EENNs is carried out manually by experts, a complex
and time-consuming task that requires accounting for many aspects, including
the correct placement, the thresholding, and the computational overhead of the
EECs. For this reason, research is exploring the use of Neural Architecture
Search (NAS) to automate the design of EENNs. Currently, few comprehensive
NAS solutions for EENNs have been proposed in the literature, and a fully
automated, joint design strategy taking into consideration both the backbone
and the EECs remains an open problem. To this end, this work presents Neural
Architecture Search for Hardware Constrained Early Exit Neural Networks
(NACHOS), the first NAS framework for the design of optimal EENNs satisfying
constraints on the accuracy and the number of Multiply and Accumulate (MAC)
operations performed by the EENNs at inference time. In particular, this
provides the joint design of backbone and EECs to select a set of admissible
(i.e., respecting the constraints) Pareto Optimal Solutions in terms of the best
tradeoff between accuracy and the number of MACs. The results show that the
models designed by NACHOS are competitive with the state-of-the-art EENNs.
Additionally, this work investigates the effectiveness of two novel
regularization terms designed for the optimization of the auxiliary classifiers
of the EEN
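The selection step described above, keeping admissible candidates and then extracting the Pareto-optimal ones, can be sketched as follows. This covers only the final filtering, not the search itself, and is not taken from the NACHOS implementation; the candidate records, field names, and numbers are made up for illustration.

```python
def pareto_front(candidates, min_accuracy, max_macs):
    """Return admissible candidates that no other admissible candidate dominates.

    Accuracy is maximized and the number of MACs is minimized.
    """
    admissible = [
        c for c in candidates
        if c["accuracy"] >= min_accuracy and c["macs"] <= max_macs
    ]

    def dominates(a, b):
        # a is at least as good as b on both objectives and strictly better on one.
        no_worse = a["accuracy"] >= b["accuracy"] and a["macs"] <= b["macs"]
        better = a["accuracy"] > b["accuracy"] or a["macs"] < b["macs"]
        return no_worse and better

    return [c for c in admissible if not any(dominates(o, c) for o in admissible)]


# Hypothetical search results (values invented for this example).
candidates = [
    {"name": "A", "accuracy": 0.90, "macs": 5.0e6},
    {"name": "B", "accuracy": 0.88, "macs": 3.0e6},
    {"name": "C", "accuracy": 0.85, "macs": 4.0e6},  # dominated by B
    {"name": "D", "accuracy": 0.95, "macs": 9.0e6},  # violates the MAC budget
    {"name": "E", "accuracy": 0.70, "macs": 1.0e6},  # violates the accuracy floor
]
front = pareto_front(candidates, min_accuracy=0.80, max_macs=8.0e6)
front_names = [c["name"] for c in front]
```

Applying the constraints before the dominance check matters: candidate D would otherwise sit on the front despite exceeding the MAC budget.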
Avalanche: An end-to-end library for continual learning
Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
A Probabilistic Re-Intepretation of Confidence Scores in Multi-Exit Models
In this paper, we propose a new approach to train a deep neural network with multiple intermediate auxiliary classifiers branching from it. These ‘multi-exit’ models can be used to reduce inference time by exiting early at an intermediate branch if the confidence of its prediction is higher than a threshold. They rely on the assumption that not all samples require the same amount of processing to yield a good prediction. We propose a way to jointly train all the branches of a multi-exit model without hyper-parameters, by weighting the predictions from each branch with a trained confidence score. Each confidence score is an approximation of the real one produced by the branch, and it is calculated and regularized while training the rest of the model. We evaluate our proposal on a set of image classification benchmarks, using different neural models and early-exit stopping criteria.
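The early-exit inference rule mentioned above can be sketched in a few lines of Python. This example uses the maximum softmax probability as the confidence measure, which is only one possible stopping criterion and is not the trained confidence score proposed in the paper; the function names and the threshold value are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(branch_logits, threshold=0.9):
    """Return (predicted_class, exit_index) for one sample.

    `branch_logits` yields the logits of each exit in order. It may be a
    generator, so later branches are never computed once an exit fires.
    If no exit is confident enough, the last one is used.
    """
    prediction, exit_index = None, -1
    for exit_index, logits in enumerate(branch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        prediction = probs.index(confidence)
        if confidence >= threshold:
            break
    return prediction, exit_index


# An "easy" sample leaves at the first exit, a "hard" one at the last.
easy = early_exit_predict([[0.1, 5.0, 0.2], [0.0, 6.0, 0.0]])
hard = early_exit_predict([[0.5, 0.4, 0.3], [1.0, 0.2, 0.1], [0.1, 0.2, 4.0]])
```

The threshold trades accuracy against computation: lowering it makes more samples leave at early branches.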
Efficient Continual Learning in Neural Networks with Embedding Regularization
Continual learning of deep neural networks is a key requirement for scaling them up to more complex application scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have either progressively increased the size of the networks or tried to regularize the network behavior to equalize it with respect to previously observed tasks. In the latter case, it is essential to understand what type of information best represents this past behavior. Common techniques include regularizing the past outputs, gradients, or individual weights. In this work, we propose a new, relatively simple and efficient method to perform continual learning by regularizing instead the network's internal embeddings. To make the approach scalable, we also propose a dynamic sampling strategy to reduce the memory footprint of the required external storage. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, while requiring significantly less space in memory and computational time. In addition, inspired by recent works, we evaluate the impact of selecting a more flexible model for the activation functions inside the network, evaluating the impact of catastrophic forgetting on the activation functions themselves.
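A minimal sketch of what regularizing the internal embeddings could look like: the loss on the new task is augmented with a penalty on how far the current embeddings of stored past-task samples have drifted from the embeddings saved when those tasks were learned. The squared Euclidean distance, the `strength` weight, and the function names are assumptions; the abstract does not specify the exact form of the penalty.

```python
def embedding_penalty(current, stored):
    """Mean squared Euclidean distance between current and stored embeddings.

    Both arguments are lists of equal-length vectors, one per past-task sample.
    """
    total = 0.0
    for cur, old in zip(current, stored):
        total += sum((c - o) ** 2 for c, o in zip(cur, old))
    return total / len(current)


def regularized_loss(task_loss, current, stored, strength=1.0):
    """New-task loss plus the weighted embedding drift on past-task samples."""
    return task_loss + strength * embedding_penalty(current, stored)


# Made-up embeddings: the first vector has drifted, the second has not.
stored = [[1.0, 0.0], [0.0, 1.0]]
drifted = [[1.0, 1.0], [0.0, 1.0]]
loss = regularized_loss(0.3, drifted, stored, strength=2.0)
```

Only the embeddings of a subset of past samples need to be stored, which is where a sampling strategy such as the one mentioned in the abstract can reduce the memory footprint.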