12 research outputs found

Efficiently Moving Instead of Reweighting Collider Events with Machine Learning

Authors: Mastandrea, Radha; Nachman, Benjamin
Publication date: 12/12/2022

There are many cases in collider physics and elsewhere where a calibration dataset is used to predict the known physics and/or noise of a target region of phase space. This calibration dataset usually cannot be used out-of-the-box but must be tweaked, often with conditional importance weights, to be maximally realistic. Using resonant anomaly detection as an example, we compare a number of alternative approaches based on transporting events with normalizing flows instead of reweighting them. We find that the accuracy of the morphed calibration dataset depends on the degree to which the transport task is set up to carry out optimal transport, which motivates future research into this area.

Comment: 7 pages, 3 figures. Presented at the Machine Learning and the Physical Sciences Workshop at the 36th conference on Neural Information Processing Systems (NeurIPS

arXiv.org e-Print Archive

FETA: Flow-Enhanced Transportation for Anomaly Detection

Authors: Golling, Tobias; Klein, Samuel; Mastandrea, Radha; Nachman, Benjamin
Publication venue: American Physical Society (APS)
Publication date: 21/12/2022

Resonant anomaly detection is a promising framework for model-independent searches for new particles. Weakly supervised resonant anomaly detection methods compare data with a potential signal against a template of the Standard Model (SM) background inferred from sideband regions. We propose a means to generate this background template that uses a flow-based model to create a mapping between high-fidelity SM simulations and the data. The flow is trained in sideband regions with the signal region blinded, and the flow is conditioned on the resonant feature (mass) such that it can be interpolated into the signal region. To illustrate this approach, we use simulated collisions from the Large Hadron Collider (LHC) Olympics Dataset. We find that our flow-constructed background method has competitive sensitivity with other recent proposals and can therefore provide complementary information to improve future searches.

Comment: 13 pages, 11 figures

arXiv.org e-Print Archive

Archive ouverte UNIGE

Flows for Flows: Morphing one Dataset into another with Maximum Likelihood Estimation

Authors: Golling, Tobias; Klein, Samuel; Mastandrea, Radha; Nachman, Benjamin; Raine, John Andrew
Publication date: 12/09/2023

Many components of data analysis in high energy physics and beyond require morphing one dataset into another. This is commonly solved via reweighting, but there are many advantages of preserving weights and shifting the data points instead. Normalizing flows are machine learning models with impressive precision on a variety of particle physics tasks. Naively, normalizing flows cannot be used for morphing because they require knowledge of the probability density of the starting dataset. In most cases in particle physics, we can generate more examples, but we do not know densities explicitly. We propose a protocol called flows for flows for training normalizing flows to morph one dataset into another even if the underlying probability density of neither dataset is known explicitly. This enables a morphing strategy trained with maximum likelihood estimation, a setup that has been shown to be highly effective in related tasks. We study variations on this protocol to explore how far the data points are moved to statistically match the two datasets. Furthermore, we show how to condition the learned flows on particular features in order to create a morphing function for every value of the conditioning feature. For illustration, we demonstrate flows for flows for toy examples as well as a collider physics example involving dijet events.

Comment: 15 pages, 17 figures. This work is a merger of arXiv:2211.02487 and arXiv:2212.0615
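
The contrast between reweighting and moving data points drawn in the abstract above can be illustrated with a one-dimensional toy. The sketch below is not the flows for flows protocol: it assumes two Gaussian densities that are known in closed form (the abstract stresses that real densities usually are not), and it replaces the learned normalizing flow with the exact monotonic map between two 1D Gaussians. All names and parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical toy setup: both densities are known here, unlike in the abstract.
SRC_MEAN, SRC_STD = 0.0, 1.0   # "starting" dataset
TGT_MEAN, TGT_STD = 1.0, 0.5   # dataset we want to match

rng = random.Random(0)
source = [rng.gauss(SRC_MEAN, SRC_STD) for _ in range(20000)]

# Reweighting: points stay where they are and receive the weight
# w(x) = p_target(x) / p_source(x).
weights = [
    gaussian_pdf(x, TGT_MEAN, TGT_STD) / gaussian_pdf(x, SRC_MEAN, SRC_STD)
    for x in source
]
weighted_mean = sum(w * x for w, x in zip(weights, source)) / sum(weights)
# Kish effective sample size: non-uniform weights reduce statistical power.
effective_size = sum(weights) ** 2 / sum(w * w for w in weights)

# Moving: every point keeps unit weight and is shifted by the monotonic
# map between the two Gaussians (a stand-in for a trained flow).
moved = [TGT_MEAN + (TGT_STD / SRC_STD) * (x - SRC_MEAN) for x in source]
moved_mean = sum(moved) / len(moved)

print(f"weighted mean {weighted_mean:.3f}, moved mean {moved_mean:.3f}")
print(f"effective sample size {effective_size:.0f} of {len(source)}")
```

Both approaches should recover the target mean in this toy, but only the reweighted sample pays for it with a reduced effective sample size, which is one of the advantages of preserving weights that the abstract alludes to.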

arXiv.org e-Print Archive

Exploring the Space of Jets with CMS Open Data

Authors: Komiske, Patrick T.; Mastandrea, Radha; Metodiev, Eric M.; Naik, Preksha; Thaler, Jesse
Publication venue: American Physical Society (APS)
Publication date: 06/02/2020

We explore the metric space of jets using public collider data from the CMS experiment. Starting from 2.3/fb of 7 TeV proton-proton collisions collected at the Large Hadron Collider in 2011, we isolate a sample of 1,690,984 central jets with transverse momentum above 375 GeV. To validate the performance of the CMS detector in reconstructing the energy flow of jets, we compare the CMS Open Data to corresponding simulated data samples for a variety of jet kinematic and substructure observables. Even without detector unfolding, we find very good agreement for track-based observables after using charged hadron subtraction to mitigate the impact of pileup. We perform a range of novel analyses, using the "energy mover's distance" (EMD) to measure the pairwise difference between jet energy flows. The EMD allows us to quantify the impact of detector effects, visualize the metric space of jets, extract correlation dimensions, and identify the most and least typical jet configurations. To facilitate future jet studies with CMS Open Data, we make our datasets and analysis code available, amounting to around two gigabytes of distilled data and one hundred gigabytes of simulation files.

Comment: 37 pages, 25 figures, 5 tables; v2: updated to match PRD version; code available at https://energyflow.networ

arXiv.org e-Print Archive

DSpace@MIT

The Interplay of Machine Learning-based Resonant Anomaly Detection Methods

Authors: Golling, Tobias; Kasieczka, Gregor; Krause, Claudius; Mastandrea, Radha; Nachman, Benjamin; Raine, John Andrew; Sengupta, Debajyoti; Shih, David; Sommerhalder, Manuel
Publication date: 20/07/2023

Machine learning-based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, there has not yet been a study of the methods' complementarity. To this end, we address two questions. First, in the absence of any signal, do different methods pick the same events as signal-like? If not, then we can significantly reduce the false-positive rate by comparing different methods on the same dataset. Second, if there is a signal, are different methods fully correlated? Even if their maximum performance is the same, since we do not know how much signal is present, it may be beneficial to combine approaches. Using the Large Hadron Collider (LHC) Olympics dataset, we provide quantitative answers to these questions. We find that there are significant gains possible by combining multiple methods, which will strengthen the search program at the LHC and beyond.

Comment: 23 pages, 17 figures

arXiv.org e-Print Archive

Self-supervised Anomaly Detection for New Physics

Authors: Dillon, Barry M.; Mastandrea, Radha; Nachman, Benjamin
Publication date: 20/05/2022

We investigate a method of model-agnostic anomaly detection through studying jets, collimated sprays of particles produced in high-energy collisions. We train a transformer neural network to encode simulated QCD "event space" dijets into a low-dimensional "latent space" representation. We optimize the network using the self-supervised contrastive loss, which encourages the preservation of known physical symmetries of the dijets. We then train a binary classifier to discriminate a BSM resonant dijet signal from a QCD dijet background both in the event space and the latent space representations. We find the classifier performances on the event and latent spaces to be comparable. We finally perform an anomaly detection search using a weakly supervised bump hunt on the latent space dijets, finding again a comparable performance to a search run on the physical space dijets. This opens the door to using low-dimensional latent representations as a computationally efficient space for resonant anomaly detection in generic particle collision events.

Comment: 12 pages, 11 figures

arXiv.org e-Print Archive

eScholarship - University of California

Flow-enhanced transportation for anomaly detection

Authors: Golling, Tobias; Klein, Samuel; Mastandrea, Radha; Nachman, Benjamin
Publication venue: eScholarship, University of California
Publication date: 01/05/2023

eScholarship - University of California

Morphing one dataset into another with maximum likelihood estimation

Authors: Golling, Tobias; Klein, Samuel; Mastandrea, Radha; Nachman, Benjamin; Raine, John Andrew
Publication venue: eScholarship, University of California
Publication date: 01/11/2023

eScholarship - University of California

Anomaly detection under coordinate transformations

Authors: Kasieczka, Gregor; Mastandrea, Radha; Mikuni, Vinicius; Nachman, Benjamin; Pettee, Mariel; Shih, David
Publication venue: eScholarship, University of California
Publication date: 01/01/2023

There is a growing need for machine-learning-based anomaly detection strategies to broaden the search for beyond-the-Standard-Model physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density. It is a well-known fact that probability densities are not invariant under coordinate transformations, so the sensitivity can depend on the initial choice of coordinates. The broader machine learning community has recently connected coordinate sensitivity with anomaly detection, and our goal is to bring awareness of this issue to the growing high-energy physics literature on anomaly detection. In addition to analytical explanations, we provide numerical examples from simple random variables and from the LHC Olympics dataset that show how using probability density as an anomaly score can lead to events being classified as anomalous or not depending on the coordinate frame.
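
The statement that probability densities are not invariant under coordinate transformations can be checked with a small numerical example. The sketch below is an illustration under assumed choices, not an example taken from the paper: it uses a standard normal variable x, the hypothetical change of coordinates y = exp(x), and two hand-picked events. The density in the new coordinate follows from the change-of-variables formula p_y(y) = p_x(log y) / y.

```python
import math


def normal_pdf(x):
    """Density of a standard normal variable x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lognormal_pdf(y):
    """Density of y = exp(x) for standard normal x.

    Change of variables: p_y(y) = p_x(log y) * |dx/dy| = p_x(log y) / y.
    """
    return normal_pdf(math.log(y)) / y


# Two hypothetical events, specified in the x coordinate.
events_x = {"a": 0.0, "b": -1.5}

# Score each event by its probability density in each coordinate system.
density_x = {name: normal_pdf(x) for name, x in events_x.items()}
density_y = {name: lognormal_pdf(math.exp(x)) for name, x in events_x.items()}

# The "most anomalous" event is the one with the lowest density.
most_anomalous_in_x = min(density_x, key=density_x.get)
most_anomalous_in_y = min(density_y, key=density_y.get)

print("lowest density in x:", most_anomalous_in_x)
print("lowest density in y:", most_anomalous_in_y)
```

In this toy the same two events swap their anomaly ranking when the coordinate changes, because the Jacobian factor 1/y inflates the density of events at small y.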

eScholarship - University of California

Anomaly Detection under Coordinate Transformations

Authors: Kasieczka, Gregor; Mastandrea, Radha; Mikuni, Vinicius; Nachman, Benjamin; Pettee, Mariel; Shih, David
Publication venue: eScholarship, University of California
Publication date: 13/09/2022

There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density. It is a well-known fact that probability densities are not invariant under coordinate transformations, so the sensitivity can depend on the initial choice of coordinates. The broader machine learning community has recently connected coordinate sensitivity with anomaly detection, and our goal is to bring awareness of this issue to the growing high-energy physics literature on anomaly detection. In addition to analytical explanations, we provide numerical examples from simple random variables and from the LHC Olympics Dataset that show how using probability density as an anomaly score can lead to events being classified as anomalous or not depending on the coordinate frame.

eScholarship - University of California