    Identifying Heavy-Flavor Jets Using Vectors of Locally Aggregated Descriptors

    Jets of collimated particles serve a multitude of purposes in high energy collisions. Recently, studies of jet interaction with the quark-gluon plasma (QGP) created in high energy heavy ion collisions are of growing interest, particularly towards understanding partonic energy loss in the QGP medium and its related modifications of the jet shower and fragmentation. Since the QGP is a colored medium, the extent of jet quenching and, consequently, the transport properties of the medium are expected to be sensitive to fundamental properties of the jets such as the flavor of the parton that initiates the jet. Identifying the jet flavor enables an extraction of the mass dependence in jet-QGP interactions. We present a novel approach to tagging heavy-flavor jets at collider experiments utilizing the information contained within jet constituents via the JetVLAD model architecture. We show the performance of this model in proton-proton collisions at center-of-mass energy $\sqrt{s} = 200$ GeV as characterized by common metrics and showcase its ability to extract a high-purity heavy-flavor jet sample at various jet momenta and realistic production cross-sections, including a brief discussion on the impact of out-of-time pile-up. Such studies open new opportunities for future high-purity heavy-flavor measurements at jet energies accessible at current and future collider experiments.
    Comment: 18 pages, 6 figures and 3 tables. Accepted by JINST

    Classification without labels: Learning from mixed samples in high energy physics

    Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.
    Comment: 18 pages, 5 figures; v2: intro extended and references added; v3: additional discussion to match JHEP version
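
    The CWoLa idea described in this abstract can be illustrated with a small numerical sketch. It is not the paper's code: the one-dimensional Gaussian "signal" and "background", the signal fractions 0.7 and 0.3, the plain logistic regression, and all function names are assumptions chosen for illustration. The classifier only ever sees which mixed sample an event came from, and is then evaluated on pure samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(n, signal_fraction):
    """Draw n events: signal ~ N(+1, 1) with probability signal_fraction,
    otherwise background ~ N(-1, 1). (Toy distributions, assumed here.)"""
    is_signal = rng.random(n) < signal_fraction
    return np.where(is_signal, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))

def auc(scores_sig, scores_bkg):
    """Area under the ROC curve from the Mann-Whitney rank statistic."""
    scores = np.concatenate([scores_sig, scores_bkg])
    ranks = scores.argsort().argsort()  # 0-based ranks
    n_s, n_b = len(scores_sig), len(scores_bkg)
    return (ranks[:n_s].sum() - n_s * (n_s - 1) / 2) / (n_s * n_b)

# Two mixed samples with different (unknown to the classifier) signal fractions.
x1 = sample_mixture(20000, 0.7)
x2 = sample_mixture(20000, 0.3)
x_train = np.concatenate([x1, x2])
y_train = np.concatenate([np.ones_like(x1), np.zeros_like(x2)])  # mixture labels only

# Logistic regression trained by gradient descent to separate mixture 1 from mixture 2.
w, b = 0.0, 0.0
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(w * x_train + b)))
    w -= 0.5 * np.mean((p - y_train) * x_train)
    b -= 0.5 * np.mean(p - y_train)

# Evaluate the mixture-trained classifier on pure signal versus pure background.
pure_sig = rng.normal(1.0, 1.0, 5000)
pure_bkg = rng.normal(-1.0, 1.0, 5000)
auc_value = auc(w * pure_sig + b, w * pure_bkg + b)
print(f"weight = {w:.3f}, AUC on pure samples = {auc_value:.3f}")
```

    In this toy setup the learned score increases monotonically with x, as the signal-to-background likelihood ratio does, so the mixture-trained classifier orders pure events the same way a fully supervised one would.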

    Energy flow polynomials: A complete linear basis for jet substructure

    We introduce the energy flow polynomials: a complete set of jet substructure observables which form a discrete linear basis for all infrared- and collinear-safe observables. Energy flow polynomials are multiparticle energy correlators with specific angular structures that are a direct consequence of infrared and collinear safety. We establish a powerful graph-theoretic representation of the energy flow polynomials which allows us to design efficient algorithms for their computation. Many common jet observables are exact linear combinations of energy flow polynomials, and we demonstrate the linear spanning nature of the energy flow basis by performing regression for several common jet observables. Using linear classification with energy flow polynomials, we achieve excellent performance on three representative jet tagging problems: quark/gluon discrimination, boosted W tagging, and boosted top tagging. The energy flow basis provides a systematic framework for complete investigations of jet substructure using linear methods.
    Comment: 41+15 pages, 13 figures, 5 tables; v2: updated to match JHEP version
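
    The graph-based energy correlators mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each graph vertex carries a momentum fraction z_i = pT_i / sum(pT), each edge carries a pairwise angular distance theta_ij taken here as the rapidity-azimuth distance raised to a power beta, and the observable is the sum over all particle assignments. The function names and the brute-force einsum evaluation are hypothetical choices for clarity.

```python
import string
import numpy as np

def pairwise_angles(y, phi, beta=1.0):
    """Rapidity-azimuth distances theta_ij**beta (an assumed angular measure)."""
    dy = y[:, None] - y[None, :]
    # Wrap azimuthal differences into [-pi, pi).
    dphi = (phi[:, None] - phi[None, :] + np.pi) % (2 * np.pi) - np.pi
    return (dy**2 + dphi**2) ** (beta / 2)

def energy_correlator(pt, y, phi, edges, beta=1.0):
    """Sum over particle assignments of prod(z at each vertex) * prod(theta on each edge).

    edges is a list of (k, l) vertex pairs describing the graph.
    """
    z = pt / pt.sum()
    theta = pairwise_angles(y, phi, beta)
    n_vertices = 1 + max(max(k, l) for k, l in edges)
    letters = string.ascii_lowercase
    # One einsum index per vertex, one theta factor per edge, summed to a scalar.
    terms = [letters[v] for v in range(n_vertices)]
    terms += [letters[k] + letters[l] for k, l in edges]
    operands = [z] * n_vertices + [theta] * len(edges)
    return float(np.einsum(",".join(terms) + "->", *operands))

# Two equal-pT particles separated by 0.4 in azimuth.
pt = np.array([10.0, 10.0])
y = np.array([0.0, 0.0])
phi = np.array([0.0, 0.4])

line = energy_correlator(pt, y, phi, edges=[(0, 1)])              # single edge
wedge = energy_correlator(pt, y, phi, edges=[(0, 1), (1, 2)])      # two edges sharing a vertex
triangle = energy_correlator(pt, y, phi, edges=[(0, 1), (1, 2), (0, 2)])
print(line, wedge, triangle)
```

    The direct einsum costs on the order of M^N operations for M particles and N vertices, so it is only suitable for small graphs and small jets; the abstract's point is that the graph structure allows more efficient algorithms than this naive sum.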

    Learning to Classify from Impure Samples with High-Dimensional Data

    A persistent challenge in practical classification tasks is that labeled training sets are not always available. In particle physics, this challenge is surmounted by the use of simulations. These simulations accurately reproduce most features of data, but cannot be trusted to capture all of the complex correlations exploitable by modern machine learning methods. Recent work in weakly supervised learning has shown that simple, low-dimensional classifiers can be trained using only the impure mixtures present in data. Here, we demonstrate that complex, high-dimensional classifiers can also be trained on impure mixtures using weak supervision techniques, with performance comparable to what could be achieved with pure samples. Using weak supervision will therefore allow us to avoid relying exclusively on simulations for high-dimensional classification. This work opens the door to a new regime whereby complex models are trained directly on data, providing direct access to probe the underlying physics.
    Comment: 6 pages, 2 tables, 2 figures. v2: updated to match PRD version

    Machine and deep learning techniques in heavy-ion collisions with ALICE

    Over the last few years, machine learning tools have been successfully applied to a wealth of problems in high-energy physics. A typical example is the classification of physics objects. Supervised machine learning methods allow for significant improvements in classification problems by taking into account observable correlations and by learning the optimal selection from examples, e.g. from Monte Carlo simulations. Even more promising is the use of deep learning techniques. Methods like deep convolutional networks might be able to capture features from low-level parameters that are not exploited by default cut-based methods. These ideas could be particularly beneficial for measurements in heavy-ion collisions, because of the very large multiplicities. Indeed, machine learning methods potentially perform much better in systems with a large number of degrees of freedom compared to cut-based methods. Moreover, many key heavy-ion observables are most interesting at low transverse momentum, where the underlying event is dominant and the signal-to-noise ratio is quite low. In this work, recent developments of machine and deep learning applications in heavy-ion collisions with ALICE will be presented, with a focus on a deep learning-based b-jet tagging approach and the measurement of low-mass dielectrons. While the b-jet tagger is based on a mixture of shallow fully-connected and deep convolutional networks, the low-mass dielectron measurement uses gradient boosting and shallow neural networks. Both methods are very promising compared to default cut-based methods.
    Comment: 7 pages, 5 figures, EPS HEP 2017 proceedings