Identifying Heavy-Flavor Jets Using Vectors of Locally Aggregated Descriptors
Jets of collimated particles serve a multitude of purposes in high energy
collisions. Recently, studies of jet interaction with the quark-gluon plasma
(QGP) created in high energy heavy ion collisions are of growing interest,
particularly towards understanding partonic energy loss in the QGP medium and
its related modifications of the jet shower and fragmentation. Since the QGP is
a colored medium, the extent of jet quenching and, consequently, the transport
properties of the medium are expected to be sensitive to fundamental properties
of the jets such as the flavor of the parton that initiates the jet.
Identifying the jet flavor enables an extraction of the mass dependence in
jet-QGP interactions. We present a novel approach to tagging heavy-flavor jets
at collider experiments utilizing the information contained within jet
constituents via the JetVLAD model architecture. We show the
performance of this model in proton-proton collisions at center of mass energy
GeV as characterized by common metrics and showcase its
ability to extract a high-purity heavy-flavor jet sample at various jet momenta
and realistic production cross-sections, including a brief discussion of the
impact of out-of-time pile-up. Such studies open new opportunities for future
high purity heavy-flavor measurements at jet energies accessible at current and
future collider experiments.
Comment: 18 pages, 6 figures and 3 tables. Accepted by JINS
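The abstract above does not spell out how a vector of locally aggregated descriptors (VLAD) is built, so the following is a minimal sketch of the classic hard-assignment VLAD aggregation only. It is not the JetVLAD architecture itself, which may differ (for example by using learned, soft cluster assignments). The function name `vlad`, the two-dimensional constituent features, and the cluster centers are all hypothetical and chosen purely for illustration.

```python
import math


def vlad(descriptors, centers):
    """Classic hard-assignment VLAD (illustrative sketch, not JetVLAD).

    Each descriptor is assigned to its nearest center, the residuals
    (descriptor minus center) are summed per center, and the concatenated
    result is L2-normalized.
    """
    n_centers = len(centers)
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in range(n_centers)]

    for vec in descriptors:
        # Index of the nearest center by squared Euclidean distance.
        nearest = min(
            range(n_centers),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centers[c])),
        )
        for j in range(dim):
            residual_sums[nearest][j] += vec[j] - centers[nearest][j]

    # Flatten to one fixed-length vector of size n_centers * dim.
    flat = [value for row in residual_sums for value in row]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat


# Hypothetical per-constituent features and cluster centers.
toy_centers = [(0.0, 0.0), (1.0, 1.0)]
toy_constituents = [(0.1, 0.0), (0.9, 1.2), (1.0, 0.8)]
jet_descriptor = vlad(toy_constituents, toy_centers)
```

The useful property for jets is that the output length depends only on the number of centers and the feature dimension, so jets with different numbers of constituents map to vectors of the same size.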
Classification without labels: Learning from mixed samples in high energy physics
Modern machine learning techniques can be used to construct powerful models
for difficult collider physics problems. In many applications, however, these
models are trained on imperfect simulations due to a lack of truth-level
information in the data, which risks the model learning artifacts of the
simulation. In this paper, we introduce the paradigm of classification without
labels (CWoLa) in which a classifier is trained to distinguish statistical
mixtures of classes, which are common in collider physics. Crucially, neither
individual labels nor class proportions are required, yet we prove that the
optimal classifier in the CWoLa paradigm is also the optimal classifier in the
traditional fully-supervised case where all label information is available.
After demonstrating the power of this method in an analytical toy example, we
consider a realistic benchmark for collider physics: distinguishing quark-
versus gluon-initiated jets using mixed quark/gluon training samples. More
generally, CWoLa can be applied to any classification problem where labels or
class proportions are unknown or simulations are unreliable, but statistical
mixtures of the classes are available.
Comment: 18 pages, 5 figures; v2: intro extended and references added; v3:
additional discussion to match JHEP version
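The optimality claim in the abstract above can be checked numerically in a small toy, under assumptions that are ours and not the paper's: one observable, Gaussian signal and background densities, and two mixed samples with signal fractions 0.8 and 0.2. For mixtures M1 and M2 with signal fractions f1 > f2, the mixed-sample likelihood ratio is (f1 L + 1 - f1) / (f2 L + 1 - f2), where L is the pure signal-to-background likelihood ratio; this is monotonically increasing in L, so both ratios rank events identically. The sketch below verifies that ordering on a grid of points.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Normal probability density."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical toy densities (assumptions for illustration only).
def p_signal(x):
    return gaussian_pdf(x, 1.0, 1.0)


def p_background(x):
    return gaussian_pdf(x, -1.0, 1.0)


def mixture_pdf(x, signal_fraction):
    """Density of a sample containing the given fraction of signal."""
    return signal_fraction * p_signal(x) + (1.0 - signal_fraction) * p_background(x)


def pure_ratio(x):
    """Likelihood ratio available only with full label information."""
    return p_signal(x) / p_background(x)


def mixed_ratio(x, f1, f2):
    """Likelihood ratio between two mixed samples; needs no per-event labels."""
    return mixture_pdf(x, f1) / mixture_pdf(x, f2)


def ranking(values):
    """Indices of the values sorted from smallest to largest."""
    return sorted(range(len(values)), key=values.__getitem__)


xs = [-3.0 + 0.1 * i for i in range(61)]
pure = [pure_ratio(x) for x in xs]
mixed = [mixed_ratio(x, 0.8, 0.2) for x in xs]

# Identical rankings mean any threshold on one ratio corresponds to a
# threshold on the other, i.e. the two define the same family of classifiers.
same_ordering = ranking(pure) == ranking(mixed)
```

Note that the fractions 0.8 and 0.2 enter only through the requirement that they differ; swapping them reverses the ordering rather than destroying it.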
Energy flow polynomials: A complete linear basis for jet substructure
We introduce the energy flow polynomials: a complete set of jet substructure
observables which form a discrete linear basis for all infrared- and
collinear-safe observables. Energy flow polynomials are multiparticle energy
correlators with specific angular structures that are a direct consequence of
infrared and collinear safety. We establish a powerful graph-theoretic
representation of the energy flow polynomials which allows us to design
efficient algorithms for their computation. Many common jet observables are
exact linear combinations of energy flow polynomials, and we demonstrate the
linear spanning nature of the energy flow basis by performing regression for
several common jet observables. Using linear classification with energy flow
polynomials, we achieve excellent performance on three representative jet
tagging problems: quark/gluon discrimination, boosted W tagging, and boosted
top tagging. The energy flow basis provides a systematic framework for complete
investigations of jet substructure using linear methods.
Comment: 41+15 pages, 13 figures, 5 tables; v2: updated to match JHEP version
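To make the graph picture in the abstract above concrete, here is a brute-force sketch of an energy correlator indexed by a multigraph: every graph vertex is summed over all particles, each vertex contributes that particle's energy fraction, and each edge contributes the angular distance between the two particles assigned to its endpoints. The choices below are assumptions for illustration: transverse-momentum fractions as the energy weights, rapidity-azimuth distance with exponent 1 as the angular measure, and the names `efp`, `energy_fractions`, and `pairwise_angles`. The naive sum costs O(M^N) for M particles and N vertices; it is not the efficient algorithm the paper refers to.

```python
import itertools
import math


def energy_fractions(particles):
    """z_i = pT_i / sum(pT) for particles given as (pT, rapidity, phi)."""
    total_pt = sum(pt for pt, _, _ in particles)
    return [pt / total_pt for pt, _, _ in particles]


def pairwise_angles(particles):
    """theta_ij = sqrt(dy^2 + dphi^2), with dphi wrapped into [0, pi]."""
    n = len(particles)
    theta = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        dy = particles[i][1] - particles[j][1]
        dphi = abs(particles[i][2] - particles[j][2]) % (2.0 * math.pi)
        if dphi > math.pi:
            dphi = 2.0 * math.pi - dphi
        theta[i][j] = theta[j][i] = math.hypot(dy, dphi)
    return theta


def efp(edges, n_vertices, z, theta):
    """Naive energy correlator for a multigraph given as a list of edges."""
    total = 0.0
    for assignment in itertools.product(range(len(z)), repeat=n_vertices):
        term = 1.0
        for particle in assignment:  # one energy fraction per vertex
            term *= z[particle]
        for k, l in edges:  # one angular factor per edge
            term *= theta[assignment[k]][assignment[l]]
        total += term
    return total


# Hypothetical three-particle jet: (pT, rapidity, phi).
jet = [(50.0, 0.0, 0.0), (30.0, 0.1, -0.2), (20.0, -0.3, 0.1)]
z = energy_fractions(jet)
theta = pairwise_angles(jet)

line = efp([(0, 1)], 2, z, theta)           # single edge
wedge = efp([(0, 1), (1, 2)], 3, z, theta)  # two edges sharing a vertex
```

A disconnected graph factorizes into the product of its connected components, which is one quick sanity check on any implementation.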
Learning to Classify from Impure Samples with High-Dimensional Data
A persistent challenge in practical classification tasks is that labeled
training sets are not always available. In particle physics, this challenge is
surmounted by the use of simulations. These simulations accurately reproduce
most features of data, but cannot be trusted to capture all of the complex
correlations exploitable by modern machine learning methods. Recent work in
weakly supervised learning has shown that simple, low-dimensional classifiers
can be trained using only the impure mixtures present in data. Here, we
demonstrate that complex, high-dimensional classifiers can also be trained on
impure mixtures using weak supervision techniques, with performance comparable
to what could be achieved with pure samples. Using weak supervision will
therefore allow us to avoid relying exclusively on simulations for
high-dimensional classification. This work opens the door to a new regime
whereby complex models are trained directly on data, providing direct access to
probe the underlying physics.
Comment: 6 pages, 2 tables, 2 figures. v2: updated to match PRD version
Machine and deep learning techniques in heavy-ion collisions with ALICE
In recent years, machine learning tools have been successfully applied to
a wealth of problems in high-energy physics. A typical example is the
classification of physics objects. Supervised machine learning methods allow
for significant improvements in classification problems by taking into account
observable correlations and by learning the optimal selection from examples,
e.g. from Monte Carlo simulations. Even more promising is the use of deep
learning techniques. Methods like deep convolutional networks might be able to
capture features from low-level parameters that are not exploited by default
cut-based methods.
These ideas could be particularly beneficial for measurements in heavy-ion
collisions, because of the very large multiplicities. Indeed, machine learning
methods potentially perform much better in systems with a large number of
degrees of freedom compared to cut-based methods. Moreover, many key heavy-ion
observables are most interesting at low transverse momentum where the
underlying event is dominant and the signal-to-noise ratio is quite low.
In this work, recent developments of machine and deep learning applications
in heavy-ion collisions with ALICE will be presented, with focus on a deep
learning-based b-jet tagging approach and the measurement of low-mass
dielectrons. While the b-jet tagger is based on a mixture of shallow
fully-connected and deep convolutional networks, the low-mass dielectron
measurement uses gradient boosting and shallow neural networks. Both methods
are very promising compared to default cut-based methods.
Comment: 7 pages, 5 figures, EPS HEP 2017 proceedings