Machine and deep learning techniques in heavy-ion collisions with ALICE
In recent years, machine learning tools have been successfully applied to
a wealth of problems in high-energy physics. A typical example is the
classification of physics objects. Supervised machine learning methods allow
for significant improvements in classification problems by taking into account
observable correlations and by learning the optimal selection from examples,
e.g. from Monte Carlo simulations. Even more promising is the use of deep
learning techniques. Methods such as deep convolutional networks may be able to
capture features in low-level parameters that are not exploited by conventional
cut-based methods.
These ideas could be particularly beneficial for measurements in heavy-ion
collisions, because of the very large multiplicities. Indeed, machine learning
methods potentially perform much better in systems with a large number of
degrees of freedom compared to cut-based methods. Moreover, many key heavy-ion
observables are most interesting at low transverse momentum where the
underlying event is dominant and the signal-to-noise ratio is quite low.
In this work, recent developments in machine- and deep-learning applications
in heavy-ion collisions with ALICE are presented, with a focus on a deep
learning-based b-jet tagging approach and the measurement of low-mass
dielectrons. While the b-jet tagger is based on a mixture of shallow
fully-connected and deep convolutional networks, the low-mass dielectron
measurement uses gradient boosting and shallow neural networks. Both methods
are very promising compared to default cut-based methods.
Comment: 7 pages, 5 figures, EPS HEP 2017 proceedings
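The cut-replacement idea described above can be illustrated with a small self-contained sketch (synthetic data and scikit-learn, not the actual ALICE analysis): a gradient-boosted classifier learns a selection from several correlated observables in a background-dominated sample, where a rectangular cut on each observable individually would be less efficient.

```python
# Illustrative sketch (synthetic data, NOT the ALICE analysis): a
# gradient-boosted classifier separating a small "signal" from a dominant
# "background", mimicking a low signal-to-noise selection problem.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic observables: background (label 0) and a shifted signal (label 1).
n_bkg, n_sig = 9000, 1000                  # 10:1 background-to-signal ratio
X_bkg = rng.normal(0.0, 1.0, size=(n_bkg, 4))
X_sig = rng.normal(0.7, 1.0, size=(n_sig, 4))
X = np.vstack([X_bkg, X_sig])
y = np.concatenate([np.zeros(n_bkg), np.ones(n_sig)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(n_estimators=100, max_depth=3, random_state=0)
clf.fit(X_tr, y_tr)

# The classifier score combines all four observables and their correlations,
# replacing a rectangular cut on each observable separately.
score = clf.score(X_te, y_te)
print(f"test accuracy: {score:.3f}")
```

Because the classifier exploits correlations across all observables at once, it can retain signal that any single-variable cut would reject.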
Unsupervised segmentation of submarine recordings
The thesis focuses on the unsupervised segmentation of submarine recordings collected by the Norwegian Polar Institute (NPI) using hydrophones. These recordings contain sounds from various marine mammal species, along with other phenomena such as vessel engines, seismic activity, and moving sea ice. Because the data are only sparsely labeled, a supervised learning approach is not feasible, necessitating the application of unsupervised learning techniques to uncover underlying patterns and structures.
The thesis begins by providing essential background theory, including the Fourier transform, spectrograms, and an introduction to clustering algorithms. It then covers the forward pass, parameter updates, and backpropagation in neural networks. Key components of neural networks and machine learning, such as different layers, activation functions, and loss functions, are also explained.
Furthermore, well-known deep learning architectures, namely convolutional neural networks, autoencoders, and recurrent neural networks, are introduced, along with the Temporal Neighborhood Coding method, which encodes the underlying states of multivariate, non-stationary time series, and a clustering module that integrates a Gaussian mixture model into the loss function of a deep autoencoder.
The data and proposed methods are then presented, followed by experimental results evaluating the performance of the two architectures for segmenting both a simulated dataset and the spectrogram of the submarine recordings. The thesis concludes with a discussion of the results and future directions, highlighting the promising outcomes of the segmentation methods while emphasizing the need for additional information about the data to further enhance model performance, and for further evaluation of the methods' performance.
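The core pipeline the thesis describes, a spectrogram followed by clustering of time frames, can be sketched as follows. This is an illustrative stand-in: a synthetic two-tone signal replaces the NPI hydrophone data, and plain k-means replaces the deep clustering modules.

```python
# Minimal sketch of unsupervised spectrogram segmentation (illustrative,
# not the thesis code): compute a spectrogram of a synthetic signal whose
# frequency changes halfway through, then cluster the spectrogram columns
# with k-means so each time frame receives a segment label.
import numpy as np
from scipy.signal import spectrogram
from sklearn.cluster import KMeans

fs = 8000                                      # sample rate (Hz), assumed
t = np.arange(0, 2.0, 1.0 / fs)
# Two acoustic "states": a 400 Hz tone followed by a 1500 Hz tone.
x = np.where(t < 1.0,
             np.sin(2 * np.pi * 400 * t),
             np.sin(2 * np.pi * 1500 * t))

f, times, Sxx = spectrogram(x, fs=fs, nperseg=256)

# Each column of Sxx is the spectrum of one time frame; cluster the frames.
frames = np.log(Sxx + 1e-12).T                 # log power, (n_frames, n_freqs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(frames)

# Frames before and after t = 1 s should fall into different clusters.
print(labels[:5], labels[-5:])
```

With real recordings the number of acoustic states is unknown, which is one motivation for the more flexible Gaussian-mixture and temporal-coding approaches the thesis studies.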
Mixture of Expert/Imitator Networks: Scalable Semi-supervised Learning Framework
The current success of deep neural networks (DNNs) in an increasingly broad
range of tasks involving artificial intelligence strongly depends on the
quality and quantity of labeled training data. In general, the scarcity of
labeled data, which is often observed in many natural language processing
tasks, is one of the most important issues to be addressed. Semi-supervised
learning (SSL) is a promising approach to overcoming this issue by
incorporating a large amount of unlabeled data. In this paper, we propose a
novel scalable method of SSL for text classification tasks. The unique property
of our method, Mixture of Expert/Imitator Networks, is that imitator networks
learn to "imitate" the estimated label distribution of the expert network over
the unlabeled data, which potentially contributes a set of features for the
classification. Our experiments demonstrate that the proposed method
consistently improves the performance of several types of baseline DNNs. We
also demonstrate that our method exhibits a "more data, better performance"
property, with promising scalability in the amount of unlabeled data.
Comment: Accepted by AAAI 201
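The imitation idea can be sketched with a minimal stand-in. All model choices here are illustrative assumptions (simple scikit-learn models instead of the paper's DNN architecture): an "expert" is trained on the scarce labels, and an "imitator" learns to reproduce the expert's predicted label distribution on unlabeled data.

```python
# Hedged sketch of the expert/imitator idea (not the paper's architecture).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPRegressor

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_lab, y_lab = X[:100], y[:100]        # small labeled set
X_unlab = X[100:]                      # large unlabeled pool

# Expert: a supervised model trained on the labeled data only.
expert = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

# Imitator: regresses the expert's soft label distribution on unlabeled data.
soft_targets = expert.predict_proba(X_unlab)[:, 1]
imitator = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=0).fit(X_unlab, soft_targets)

# The imitator's output can then serve as an extra input feature for the
# final classifier, which is how the unlabeled data contributes.
agreement = np.mean((imitator.predict(X_unlab) > 0.5) ==
                    (soft_targets > 0.5))
print(f"imitator/expert agreement: {agreement:.2f}")
```

The scalability property follows from the fact that only the cheap imitators, not the expert, need to touch the full unlabeled pool.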
Context-aware Learning for Generative Models
This work studies the class of algorithms for learning with side information that emerges by extending generative models with embedded context-related variables. Using finite mixture models (FMMs) as the prototypical Bayesian network, we show that maximum-likelihood estimation (MLE) of parameters through expectation-maximization (EM) improves over the regular unsupervised case and can approach the performance of supervised learning, despite the absence of any explicit ground-truth data labeling. By direct application of the missing information principle (MIP), the algorithms' performance is proven to range between the conventional supervised and unsupervised MLE extremes, in proportion to the information content of the contextual assistance provided. The benefits include higher estimation precision, smaller standard errors, faster convergence rates, and improved classification accuracy or regression fitness, demonstrated in various scenarios while also highlighting important properties of, and differences among, the outlined situations. Applicability is showcased with three real-world unsupervised classification scenarios employing Gaussian mixture models. Importantly, we exemplify the natural extension of this methodology to any type of generative model by deriving an equivalent context-aware algorithm for variational autoencoders (VAEs), thus broadening the spectrum of applicability to unsupervised deep learning with artificial neural networks. The latter is contrasted with a neural-symbolic algorithm exploiting side information.
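One concrete form such contextual assistance can take is a partially labeled data set. The following is a minimal sketch, an assumption for illustration rather than the paper's exact algorithm, of EM for a two-component Gaussian mixture in which the E-step is clamped to the known labels wherever side information exists, so the estimate interpolates between unsupervised and supervised MLE.

```python
# Sketch of context-aware EM for a 1-D Gaussian mixture (illustrative;
# side information is assumed to be known labels for ~20% of the points).
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from two Gaussian components centered at -2 and +2.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
true_z = np.repeat([0, 1], 200)
labeled = rng.random(400) < 0.2          # mask of points carrying a label

def norm_pdf(v, mu, sigma):
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

pi = np.array([0.5, 0.5])
mu = np.array([-0.5, 0.5])               # deliberately poor initialization
sigma = np.array([1.0, 1.0])

for _ in range(50):
    # E-step: posterior responsibilities under the current parameters ...
    lik = np.stack([norm_pdf(x, mu[k], sigma[k]) for k in range(2)], axis=1)
    r = pi * lik
    r /= r.sum(axis=1, keepdims=True)
    # ... overridden by the known label wherever side information exists.
    r[labeled] = np.eye(2)[true_z[labeled]]
    # M-step: standard weighted maximum-likelihood updates.
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

print(mu)
```

With no labeled points this reduces to ordinary unsupervised EM; with all points labeled it reduces to supervised MLE, matching the interpolation the abstract describes.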
Truncated Variational EM for Semi-Supervised Neural Simpletrons
Inference and learning for probabilistic generative networks are often very
challenging, which typically prevents scaling to networks as large as those
used for deep discriminative approaches. To obtain efficiently trainable, large-scale
and well performing generative networks for semi-supervised learning, we here
combine two recent developments: a neural network reformulation of hierarchical
Poisson mixtures (Neural Simpletrons), and a novel truncated variational EM
approach (TV-EM). TV-EM provides theoretical guarantees for learning in
generative networks, and its application to Neural Simpletrons results in
particularly compact, yet approximately optimal, modifications of learning
equations. On standard benchmarks, we empirically find that learning converges
in fewer EM iterations, that the complexity per EM iteration is reduced, and
that final likelihood values are higher on average. For the task of
classification on data sets with few labels, these learning improvements
result in consistently lower error rates compared to applications without
truncation. Experiments on the MNIST data set allow for comparison to
standard and state-of-the-art models in the semi-supervised setting. Further
experiments on the NIST SD19 data set show the scalability of the approach when
a wealth of additional unlabeled data is available.
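The truncation at the heart of TV-EM, restricting each data point's posterior to its K best states rather than all C of them, can be sketched as follows. The log-likelihoods here are random stand-ins, not the Neural Simpletron model.

```python
# Sketch of a truncated E-step (illustrative): keep only the K most likely
# components per data point and renormalize, reducing the E-step cost from
# O(C) to roughly O(K) per point downstream.
import numpy as np

rng = np.random.default_rng(0)
C, K, N = 100, 5, 50                   # components, truncation size, points
log_lik = rng.normal(size=(N, C))      # stand-in per-component log-likelihoods

# Full posterior (softmax over components) for comparison.
full = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
full /= full.sum(axis=1, keepdims=True)

# Truncated posterior: zero outside each point's top-K components.
top_k = np.argpartition(log_lik, -K, axis=1)[:, -K:]
trunc = np.zeros_like(full)
rows = np.arange(N)[:, None]
trunc[rows, top_k] = full[rows, top_k]
trunc /= trunc.sum(axis=1, keepdims=True)

# Posterior mass the truncation retains per data point.
mass = full[rows, top_k].sum(axis=1)
print(f"mean posterior mass kept by top-{K}: {mass.mean():.2f}")
```

In a trained model the posterior is typically far more peaked than these random stand-ins, so the retained mass is correspondingly higher, which is why the truncated updates remain approximately optimal.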