
    Machine and deep learning techniques in heavy-ion collisions with ALICE

    Over the past few years, machine learning tools have been successfully applied to a wealth of problems in high-energy physics. A typical example is the classification of physics objects. Supervised machine learning methods allow for significant improvements in classification problems by taking into account observable correlations and by learning the optimal selection from examples, e.g. from Monte Carlo simulations. Even more promising is the use of deep learning techniques. Methods like deep convolutional networks may be able to capture features from low-level parameters that are not exploited by default cut-based methods. These ideas could be particularly beneficial for measurements in heavy-ion collisions because of the very large multiplicities. Indeed, machine learning methods potentially perform much better than cut-based methods in systems with a large number of degrees of freedom. Moreover, many key heavy-ion observables are most interesting at low transverse momentum, where the underlying event is dominant and the signal-to-noise ratio is quite low. In this work, recent developments of machine- and deep-learning applications in heavy-ion collisions with ALICE will be presented, with a focus on a deep learning-based b-jet tagging approach and the measurement of low-mass dielectrons. While the b-jet tagger is based on a mixture of shallow fully-connected and deep convolutional networks, the low-mass dielectron measurement uses gradient boosting and shallow neural networks. Both methods are very promising compared to default cut-based methods.
    Comment: 7 pages, 5 figures, EPS HEP 2017 proceedings
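
    To make the convolutional approach concrete, the following is a minimal PyTorch sketch of a CNN classifier over jet "images", i.e. pT-weighted 2D histograms in the eta-phi plane. The input shape, network size, and labels are illustrative assumptions, not the ALICE b-jet tagger itself.

```python
# Minimal sketch (not the ALICE implementation): a small convolutional
# network that classifies jet "images" -- 2D (eta, phi) histograms of
# track pT -- as b-jet vs. light-jet. Shapes and labels are illustrative.
import torch
import torch.nn as nn

class JetImageTagger(nn.Module):
    def __init__(self, n_pixels=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        flat = 32 * (n_pixels // 4) ** 2
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 64), nn.ReLU(),
            nn.Linear(64, 2),            # logits: (light-jet, b-jet)
        )

    def forward(self, x):                # x: (batch, 1, n_pixels, n_pixels)
        return self.head(self.conv(x))

# Toy usage with random "jet images"; real inputs would come from
# simulated or reconstructed events.
model = JetImageTagger()
images = torch.rand(8, 1, 32, 32)        # pT-weighted (eta, phi) histograms
labels = torch.randint(0, 2, (8,))       # 1 = b-jet (illustrative)
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
```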

    Unsupervised segmentation of submarine recordings

    The thesis focuses on the unsupervised segmentation of submarine recordings collected by the Norwegian Polar Institute (NPI) using hydrophones. These recordings consist of various mammal species, along with other phenomena such as vessel engines, seismic activity, and moving sea ice. With only sparse labeling of the data, a supervised learning approach is not feasible, necessitating unsupervised learning techniques to uncover the underlying patterns and structures. The thesis begins by providing essential background theory, including the Fourier transform, spectrograms, and an introduction to clustering algorithms. It then covers the forward pass, parameter updates, and backpropagation of neural networks. Key components of neural networks and machine learning, such as different layers, activation functions, and loss functions, are also explained. Furthermore, well-known deep learning architectures, namely convolutional neural networks, autoencoders, and recurrent neural networks, are introduced, along with the Temporal Neighborhood Coding method, which encodes the underlying states of multivariate, non-stationary time series, and a clustering module that integrates a Gaussian mixture model into the loss function of a deep autoencoder. The data and proposed methods are presented, followed by experimental results evaluating the performance of the two architectures for segmenting both a simulated dataset and the spectrogram of the submarine recordings. The thesis concludes with a discussion of the results and future directions, highlighting the promising outcomes of the segmentation methods while emphasizing the need for additional information about the data to further enhance model performance, and for further evaluation of the methods' performance.
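
    To make the segmentation idea concrete, below is a minimal sketch that computes a log-spectrogram of a recording and clusters its time frames with a Gaussian mixture model; contiguous runs of one cluster label form segments. The sampling rate, window parameters, and number of clusters are illustrative assumptions, and the sketch is not the thesis pipeline (which uses Temporal Neighborhood Coding and deep autoencoders).

```python
# A minimal sketch of the unsupervised-segmentation idea: turn a
# recording into a spectrogram and cluster its time frames with a
# Gaussian mixture model. All parameters here are illustrative.
import numpy as np
from scipy.signal import spectrogram
from sklearn.mixture import GaussianMixture

fs = 16_000                                  # assumed sampling rate [Hz]
signal = np.random.randn(fs * 10)            # stand-in for a hydrophone recording

# Frames of the log-spectrogram serve as per-time-step feature vectors.
f, t, sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=256)
features = np.log(sxx + 1e-10).T             # shape: (n_frames, n_freq_bins)

# Each frame is assigned to one of k latent "states" (e.g. mammal call,
# vessel noise, sea ice); contiguous runs of one label form segments.
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
labels = gmm.fit_predict(features)
boundaries = np.flatnonzero(np.diff(labels)) + 1   # segment change points
print(labels[:20], boundaries[:5])
```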

    Mixture of Expert/Imitator Networks: Scalable Semi-supervised Learning Framework

    The current success of deep neural networks (DNNs) in an increasingly broad range of tasks involving artificial intelligence strongly depends on the quality and quantity of labeled training data. In general, the scarcity of labeled data, which is often observed in many natural language processing tasks, is one of the most important issues to be addressed. Semi-supervised learning (SSL) is a promising approach to overcoming this issue by incorporating a large amount of unlabeled data. In this paper, we propose a novel scalable method of SSL for text classification tasks. The unique property of our method, Mixture of Expert/Imitator Networks, is that imitator networks learn to "imitate" the estimated label distribution of the expert network over the unlabeled data, which potentially contributes a set of features for the classification. Our experiments demonstrate that the proposed method consistently improves the performance of several types of baseline DNNs. We also demonstrate that our method has the "more data, better performance" property, with promising scalability to the amount of unlabeled data.
    Comment: Accepted by AAAI 2019
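
    The core training signal can be sketched as follows: the imitator minimizes the KL divergence between the expert's predicted label distribution and its own output on unlabeled examples. The toy MLPs and bag-of-words inputs below are assumptions for illustration; the paper's actual architectures and training procedure differ.

```python
# A hedged sketch of the imitation objective, not the authors' code:
# an imitator network is fit so its output distribution matches the
# expert's predicted label distribution on unlabeled text. Both
# networks here are toy MLPs over bag-of-words vectors (an assumption).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, n_classes = 1000, 5
expert = nn.Sequential(nn.Linear(vocab, 128), nn.ReLU(), nn.Linear(128, n_classes))
imitator = nn.Sequential(nn.Linear(vocab, 32), nn.ReLU(), nn.Linear(32, n_classes))
opt = torch.optim.Adam(imitator.parameters(), lr=1e-3)

unlabeled = torch.rand(64, vocab)            # stand-in for unlabeled documents

with torch.no_grad():                        # expert provides soft targets only
    target = F.softmax(expert(unlabeled), dim=-1)

# KL(expert || imitator): the imitator "imitates" the expert's
# estimated label distribution over the unlabeled batch.
log_pred = F.log_softmax(imitator(unlabeled), dim=-1)
loss = F.kl_div(log_pred, target, reduction="batchmean")
opt.zero_grad(); loss.backward(); opt.step()
```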

    Context-aware Learning for Generative Models

    This work studies the class of algorithms for learning with side information that emerges by extending generative models with embedded context-related variables. Using finite mixture models (FMMs) as the prototypical Bayesian network, we show that maximum-likelihood estimation (MLE) of parameters through expectation-maximization (EM) improves over the regular unsupervised case and can approach the performance of supervised learning, despite the absence of any explicit ground-truth data labeling. By direct application of the missing information principle (MIP), the algorithms' performance is proven to range between the conventional supervised and unsupervised MLE extremes, in proportion to the information content of the contextual assistance provided. The benefits include higher estimation precision, smaller standard errors, faster convergence rates, and improved classification accuracy or regression fitness, shown in various scenarios, while also highlighting important properties of and differences among the outlined situations. Applicability is showcased with three real-world unsupervised classification scenarios employing Gaussian mixture models. Importantly, we exemplify the natural extension of this methodology to any type of generative model by deriving an equivalent context-aware algorithm for variational autoencoders (VAEs), thus broadening the spectrum of applicability to unsupervised deep learning with artificial neural networks. The latter is contrasted with a neural-symbolic algorithm exploiting side information.
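
    One concrete reading of such context variables is a per-sample mask that restricts which mixture components may have generated a point. The sketch below runs EM for a 1-D Gaussian mixture under such a mask; it illustrates the general idea, not the paper's algorithm. With fully informative masks it reduces to supervised MLE, with uninformative ones to ordinary unsupervised EM.

```python
# Illustrative sketch (not the paper's algorithm): EM for a 1-D Gaussian
# mixture in which a context variable restricts, per sample, the set of
# admissible components.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# Context mask: row i marks which components sample i may belong to;
# here a quarter of each group carries exact side information (one-hot rows).
mask = np.ones((400, 2))
mask[:100] = [1, 0]
mask[200:300] = [0, 1]

mu, sigma, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibilities, zeroed for components the context rules out
    dens = pi * norm.pdf(x[:, None], mu, sigma) * mask
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted MLE updates
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / nk.sum()
print(mu, sigma, pi)
```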

    Truncated Variational EM for Semi-Supervised Neural Simpletrons

    Inference and learning for probabilistic generative networks are often very challenging, which typically prevents scaling to networks as large as those used in deep discriminative approaches. To obtain efficiently trainable, large-scale, and well-performing generative networks for semi-supervised learning, we here combine two recent developments: a neural network reformulation of hierarchical Poisson mixtures (Neural Simpletrons), and a novel truncated variational EM approach (TV-EM). TV-EM provides theoretical guarantees for learning in generative networks, and its application to Neural Simpletrons results in particularly compact, yet approximately optimal, modifications of the learning equations. Applied to standard benchmarks, we empirically find that learning converges in fewer EM iterations, that the complexity per EM iteration is reduced, and that final likelihood values are higher on average. For the task of classification on data sets with few labels, the learning improvements result in consistently lower error rates compared to applications without truncation. Experiments on the MNIST data set allow for comparison to standard and state-of-the-art models in the semi-supervised setting. Further experiments on the NIST SD19 data set show the scalability of the approach when a large amount of additional unlabeled data is available.
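
    The essence of the truncation can be sketched independently of Neural Simpletrons: in the E-step, each data point keeps only its K best-matching hidden states and renormalizes the posterior over that set, reducing the per-point cost from all C states to K. The shapes and the choice of K below are illustrative, not the paper's learning equations.

```python
# A minimal sketch of the truncation idea behind TV-EM: each data point
# keeps only its K best-matching hidden states in the E-step and
# renormalizes over that set; all other posterior entries are zero.
import numpy as np

def truncated_posteriors(log_joint, K=3):
    """log_joint: (n_points, C) unnormalized log p(c, x_n); keep top-K."""
    n, C = log_joint.shape
    top = np.argpartition(-log_joint, K - 1, axis=1)[:, :K]   # best K states
    post = np.zeros((n, C))
    rows = np.arange(n)[:, None]
    kept = log_joint[rows, top]
    kept = np.exp(kept - kept.max(axis=1, keepdims=True))     # stable softmax
    post[rows, top] = kept / kept.sum(axis=1, keepdims=True)  # renormalize
    return post

log_joint = np.random.randn(5, 10)      # toy: 5 points, 10 hidden states
print(truncated_posteriors(log_joint, K=3).round(2))
```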