Performance and optimization of support vector machines in high-energy physics classification problems
In this paper we promote the use of Support Vector Machines (SVM) as a
machine learning tool for searches in high-energy physics. As an example of a
new-physics search we discuss the popular case of Supersymmetry at the Large
Hadron Collider. We demonstrate that the SVM is a valuable tool and show that
an automated discovery-significance-based optimization of the SVM
hyper-parameters is a highly efficient way to prepare an SVM for such
applications. A new C++ LIBSVM interface called SVM-HINT has been developed and
is available on GitHub.
Comment: 20 pages, 6 figures
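The idea of tuning hyper-parameters against a discovery significance, rather than against classification accuracy, can be sketched as follows. This is a minimal illustration and not the SVM-HINT implementation: the Asimov formula for the significance, the grid of `C` and `gamma` values, and the `toy_evaluate` stand-in for training and testing an SVM are all assumptions made for the example.

```python
import math
from itertools import product


def asimov_significance(s, b):
    """Approximate median discovery significance for s signal and b background events.

    Assumption: the Asimov formula is used here; the paper may define
    its significance differently.
    """
    if s <= 0 or b <= 0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def scan_hyperparameters(c_values, gamma_values, evaluate):
    """Grid scan returning (best significance, C, gamma).

    `evaluate(C, gamma)` is a hypothetical callback that would train a
    classifier and return the expected (signal, background) yields
    passing its selection.
    """
    best = None
    for c, gamma in product(c_values, gamma_values):
        s, b = evaluate(c, gamma)
        z = asimov_significance(s, b)
        if best is None or z > best[0]:
            best = (z, c, gamma)
    return best


def toy_evaluate(c, gamma):
    # Toy stand-in for an SVM training run: the signal yield peaks at
    # C = 10, gamma = 0.1 while the background yield stays fixed.
    distance = abs(math.log10(c) - 1.0) + abs(math.log10(gamma) + 1.0)
    return 50.0 / (1.0 + distance), 100.0


best = scan_hyperparameters([1, 10, 100], [0.01, 0.1, 1.0], toy_evaluate)
print(best)
```

In a real analysis the callback would wrap the actual training and evaluation on held-out simulated events; the scan itself only needs the resulting yields.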
What is the Machine Learning?
Applications of machine learning tools to problems of physical interest are
often criticized for producing sensitivity at the expense of transparency. To
address this concern, we explore a data planing procedure for identifying
combinations of variables -- aided by physical intuition -- that can
discriminate signal from background. Weights are introduced to smooth away the
features in one or more chosen variables. New networks are then trained on this modified
data. Observed decreases in sensitivity diagnose the variable's discriminating
power. Planing also allows the investigation of the linear versus non-linear
nature of the boundaries between signal and background. We demonstrate the
efficacy of this approach using a toy example, followed by an application to an
idealized heavy resonance scenario at the Large Hadron Collider. By unpacking
the information being utilized by these algorithms, this method puts in context
what it means for a machine to learn.
Comment: 6 pages, 3 figures. Version published in PRD, discussion added
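The weighting step described in this abstract can be illustrated with a short sketch. It assumes that the weights are the inverse of a binned density in the chosen variable, computed separately for each sample, so that the weighted distribution becomes flat; the function names, the binning, and the toy data are hypothetical and not taken from the paper.

```python
import random


def bin_index(value, lo, hi, n_bins):
    """Index of the histogram bin containing `value`, clamped to the valid range."""
    width = (hi - lo) / n_bins
    return max(0, min(int((value - lo) / width), n_bins - 1))


def planing_weights(values, lo, hi, n_bins):
    """Per-event weights equal to the inverse of the event's bin count."""
    counts = [0] * n_bins
    for v in values:
        counts[bin_index(v, lo, hi, n_bins)] += 1
    return [1.0 / counts[bin_index(v, lo, hi, n_bins)] for v in values]


def weighted_histogram(values, weights, lo, hi, n_bins):
    """Sum of weights in each bin."""
    totals = [0.0] * n_bins
    for v, w in zip(values, weights):
        totals[bin_index(v, lo, hi, n_bins)] += w
    return totals


random.seed(0)
# Toy "signal" sample with a peaked distribution in a single variable.
signal = [random.triangular(0.0, 1.0, 0.8) for _ in range(1000)]
weights = planing_weights(signal, 0.0, 1.0, 10)
flat = weighted_histogram(signal, weights, 0.0, 1.0, 10)
print(flat)  # every populated bin carries the same total weight
```

Applying the same procedure to the background sample removes the shape difference in that variable, so a network trained on the weighted events can no longer use it to discriminate.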
Applying Rule Ensembles to the Search for Super-Symmetry at the Large Hadron Collider
In this note we give an example application of a recently presented
predictive learning method called Rule Ensembles. The application we present is
the search for super-symmetric particles at the Large Hadron Collider. In
particular, we consider the problem of separating the background coming from
top quark production from the signal of super-symmetric particles. The method
is based on an expansion of base learners, each learner being a rule, i.e. a
combination of cuts in the variable space describing signal and background.
These rules are generated from an ensemble of decision trees. One of the
results of the method is a set of rules (cuts) ordered according to their
importance, which gives useful tools for diagnosis of the model. We also
compare the method to a number of other multivariate methods, in particular
Artificial Neural Networks, the likelihood method and the recently presented
boosted decision tree method. We find better performance of Rule Ensembles in
all cases. For example, for a given significance, the amount of data needed to
claim SUSY discovery could be reduced by 15% using Rule Ensembles as compared
to using a likelihood method.
Comment: 24 pages, 7 figures, replaced to match version accepted for
publication in JHEP
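A rule in the sense used above, a conjunction of cuts, and a score built as a weighted sum of the rules that fire can be sketched as follows. The variable names (`missing_et`, `n_jets`, `lepton_pt`), the cut values, and the coefficients are invented for illustration, and ranking by absolute coefficient is a simplification of the importance measure used by the method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cut:
    """A one-dimensional cut: low <= event[variable] < high."""
    variable: str
    low: float = float("-inf")
    high: float = float("inf")

    def passes(self, event):
        return self.low <= event[self.variable] < self.high


@dataclass(frozen=True)
class Rule:
    """A conjunction of cuts with a fitted coefficient."""
    cuts: tuple
    coefficient: float

    def fires(self, event):
        return all(cut.passes(event) for cut in self.cuts)


def ensemble_score(event, rules, offset=0.0):
    """Offset plus the coefficients of all rules that fire on the event."""
    return offset + sum(rule.coefficient for rule in rules if rule.fires(event))


def rank_rules(rules):
    """Order rules by absolute coefficient (a simplified importance proxy)."""
    return sorted(rules, key=lambda rule: abs(rule.coefficient), reverse=True)


# Hypothetical rules and event, for illustration only.
rules = [
    Rule((Cut("lepton_pt", high=20.0),), -0.5),
    Rule((Cut("missing_et", low=200.0), Cut("n_jets", low=4)), 1.5),
]
event = {"missing_et": 250.0, "n_jets": 5, "lepton_pt": 35.0}

score = ensemble_score(event, rules)
ranked = rank_rules(rules)
print(score, [rule.coefficient for rule in ranked])
```

In the method itself the rules are extracted from an ensemble of decision trees and the coefficients are fitted to data; here both are fixed by hand to keep the example short.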