Applying Rule Ensembles to the Search for Super-Symmetry at the Large Hadron Collider

Author: A.J. Smola
ALICE collaboration
ATLAS collaboration
B.P. Roe
CMS collaboration
Fredrik Tegenfeldt
H.B. Prosper
H.P. Nilles
J.H. Friedman
Jan Conrad
L. Breiman
LHCb collaboration
O. Bruning
Y. Freund
Publication venue: 'IOP Publishing'
Publication date: 10/05/2006

In this note we give an example application of a recently presented predictive learning method called Rule Ensembles. The application we present is the search for super-symmetric particles at the Large Hadron Collider. In particular, we consider the problem of separating the background coming from top quark production from the signal of super-symmetric particles. The method is based on an expansion of base learners, each learner being a rule, i.e. a combination of cuts in the variable space describing signal and background. These rules are generated from an ensemble of decision trees. One of the results of the method is a set of rules (cuts) ordered according to their importance, which provides useful tools for diagnosing the model. We also compare the method to a number of other multivariate methods, in particular Artificial Neural Networks, the likelihood method and the recently presented boosted decision tree method. We find better performance of Rule Ensembles in all cases. For example, for a given significance, the amount of data needed to claim SUSY discovery could be reduced by 15% using Rule Ensembles as compared to using a likelihood method.

Comment: 24 pages, 7 figures, replaced to match version accepted for publication in JHE
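The abstract describes each base learner as a rule, meaning a combination of cuts, with the model built as an expansion over such rules. A minimal Python sketch of that idea follows. The variable names (`missing_et`, `n_jets`, `lepton_pt`), the cut values and the coefficients are all hypothetical, and ranking rules by coefficient magnitude is an assumption made for illustration; the abstract does not state which importance measure the paper uses.

```python
from dataclasses import dataclass
from typing import Dict, List

Event = Dict[str, float]


@dataclass
class Cut:
    """A single cut: lower <= event[variable] < upper."""
    variable: str
    lower: float = float("-inf")
    upper: float = float("inf")

    def passes(self, event: Event) -> bool:
        return self.lower <= event[self.variable] < self.upper


@dataclass
class Rule:
    """A rule is a conjunction of cuts with a fitted coefficient."""
    cuts: List[Cut]
    coefficient: float

    def fires(self, event: Event) -> bool:
        return all(cut.passes(event) for cut in self.cuts)


def score(event: Event, rules: List[Rule], offset: float = 0.0) -> float:
    """Linear expansion: offset plus the coefficients of all rules that fire."""
    return offset + sum(r.coefficient for r in rules if r.fires(event))


def ranked(rules: List[Rule]) -> List[Rule]:
    """Order rules by |coefficient| (an assumed stand-in for importance)."""
    return sorted(rules, key=lambda r: abs(r.coefficient), reverse=True)


# Hypothetical rules; none of these values come from the paper.
rules = [
    Rule([Cut("lepton_pt", upper=20.0)], coefficient=-0.5),
    Rule([Cut("missing_et", lower=100.0), Cut("n_jets", lower=4.0)], coefficient=1.5),
]

signal_like = {"missing_et": 150.0, "n_jets": 5.0, "lepton_pt": 35.0}
background_like = {"missing_et": 50.0, "n_jets": 2.0, "lepton_pt": 10.0}
```

Here `score(signal_like, rules)` sums only the second rule's coefficient, while `score(background_like, rules)` sums only the first. In a real analysis the cuts would come from the decision trees and the coefficients from a fit, neither of which is reproduced here.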

arXiv.org e-Print Archive

Crossref

CERN Document Server

Popular Ensemble Methods: An Empirical Study

Author: Maclin R.
Opitz D.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011

An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging (Breiman, 1996c) and Boosting (Freund and Schapire, 1996; Schapire, 1990) are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithm. Our results clearly indicate a number of conclusions. First, while Bagging is almost always more accurate than a single classifier, it is sometimes much less accurate than Boosting. On the other hand, Boosting can create ensembles that are less accurate than a single classifier -- especially when using neural networks. Analysis indicates that the performance of the Boosting methods is dependent on the characteristics of the data set being examined. In fact, further results show that Boosting ensembles may overfit noisy data sets, thus decreasing their performance. Finally, consistent with previous studies, our work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.
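To make the idea of combining individually trained classifiers concrete, here is a minimal sketch of Bagging only (Boosting is not shown). It assumes a toy one-dimensional data set and uses one-cut "stump" classifiers in place of the neural networks and decision trees evaluated in the paper; the function names and data are hypothetical.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-cut classifier that predicts 1 when x > threshold.

    Returns the candidate threshold with the fewest training errors.
    """
    candidates = [float("-inf")] + sorted(x for x, _ in sample)

    def errors(threshold):
        return sum((x > threshold) != bool(y) for x, y in sample)

    return min(candidates, key=errors)


def bagging(data, n_models, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [
        train_stump([rng.choice(data) for _ in data])
        for _ in range(n_models)
    ]


def predict(thresholds, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Toy data: label 0 for x in 0..9, label 1 for x in 10..19.
data = [(float(x), 0) for x in range(10)] + [(float(x), 1) for x in range(10, 20)]
ensemble = bagging(data, n_models=25)
```

Calling `predict(ensemble, 15.0)` returns the majority label among the 25 stumps. An odd ensemble size is used so that a two-class vote cannot tie.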

arXiv.org e-Print Archive

Crossref

Neurons and symbols: a manifesto

Author: Garcez A.
Publication venue
Publication date: 01/07/2010

We discuss the purpose of neural-symbolic integration, including its principles, mechanisms and applications. We outline a cognitive computational model for neural-symbolic integration, position the model in the broader context of multi-agent systems, machine learning and automated reasoning, and list some of the challenges for the area of neural-symbolic computation to achieve the promise of effective integration of robust learning and expressive reasoning under uncertainty.

City Research Online

Formal Verification of Input-Output Mappings of Tree Ensembles

Author: Nadjm-Tehrani Simin
Törnblom John
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

Recent advances in machine learning and artificial intelligence are now being considered in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that their systems are safe to operate when machine learning algorithms are used to implement their software. In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments. This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20. Our work also studies the limitations of the method with high-dimensional data and preliminarily investigates the trade-off between a large number of trees and the time taken for verification.
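The notion of extracting equivalence classes and checking an input-output requirement can be illustrated with a brute-force sketch. This is not VoTE's algorithm; it assumes a hypothetical tuple encoding of trees, an ensemble output equal to the sum of the tree outputs, and half-open input intervals (low, high] as a simplification. Each equivalence class is a box of inputs that all reach the same leaf in every tree, so the output is constant on it.

```python
def leaf_boxes(tree, box):
    """Yield (box, value) for every leaf reachable inside `box`.

    A leaf is a float; an inner node is (feature, threshold, left, right),
    where `left` is taken when x[feature] <= threshold.
    A box maps feature -> (low, high), meaning low < x <= high.
    """
    if not isinstance(tree, tuple):
        yield box, tree
        return
    feature, threshold, left, right = tree
    low, high = box[feature]
    if low < threshold:   # part of the box satisfies x <= threshold
        yield from leaf_boxes(left, {**box, feature: (low, min(high, threshold))})
    if threshold < high:  # part of the box satisfies x > threshold
        yield from leaf_boxes(right, {**box, feature: (max(low, threshold), high)})


def intersect(a, b):
    """Intersect two boxes; return None when the intersection is empty."""
    out = {}
    for feature in a:
        low = max(a[feature][0], b[feature][0])
        high = min(a[feature][1], b[feature][1])
        if low >= high:
            return None
        out[feature] = (low, high)
    return out


def equivalence_classes(trees, domain):
    """Return (box, output) pairs on which the summed output is constant."""
    classes = [(domain, 0.0)]
    for tree in trees:
        refined = []
        for box, total in classes:
            for leaf_box, value in leaf_boxes(tree, domain):
                merged = intersect(box, leaf_box)
                if merged is not None:
                    refined.append((merged, total + value))
        classes = refined
    return classes


def violations(trees, domain, lower, upper):
    """Boxes whose output breaks the requirement lower <= output <= upper."""
    return [(box, out) for box, out in equivalence_classes(trees, domain)
            if not lower <= out <= upper]


# Hypothetical two-tree ensemble over two features on (0, 1] x (0, 1].
trees = [(0, 0.5, 1.0, 2.0), (1, 0.3, (0, 0.2, 0.0, 0.5), 1.0)]
domain = {0: (0.0, 1.0), 1: (0.0, 1.0)}
classes = equivalence_classes(trees, domain)
```

An empty result from `violations(trees, domain, 1.0, 3.0)` means the requirement holds on the whole domain; a non-empty result lists counterexample boxes. The number of classes can grow quickly with the number of trees and features, which is consistent with the scalability limits the abstract mentions.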

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line