Applying Rule Ensembles to the Search for Super-Symmetry at the Large Hadron Collider
In this note we give an example application of a recently presented
predictive learning method called Rule Ensembles. The application we present is
the search for super-symmetric particles at the Large Hadron Collider. In
particular, we consider the problem of separating the background coming from
top quark production from the signal of super-symmetric particles. The method
is based on an expansion of base learners, each learner being a rule, i.e. a
combination of cuts in the variable space describing signal and background.
These rules are generated from an ensemble of decision trees. One of the
results of the method is a set of rules (cuts) ordered according to their
importance, which gives useful tools for diagnosis of the model. We also
compare the method to a number of other multivariate methods, in particular
Artificial Neural Networks, the likelihood method and the recently presented
boosted decision tree method. We find better performance of Rule Ensembles in
all cases. For example, for a given significance, the amount of data needed to
claim SUSY discovery could be reduced by 15% using Rule Ensembles as compared
to using a likelihood method.
Comment: 24 pages, 7 figures, replaced to match version accepted for
publication in JHEP
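
The idea of a rule as a combination of cuts, and of ranking rules by importance, can be illustrated with a small sketch. This is not the authors' implementation: the variable names (`missing_et`, `n_jets`, `lepton_pt`), the cut values, the coefficients and the four events are all made up for illustration, and the rules are written by hand rather than extracted from an ensemble of decision trees. The importance measure used here, |a_k| * sqrt(s_k * (1 - s_k)) with s_k the fraction of events passing rule k, is the one commonly quoted for rule ensembles and is assumed rather than taken from the abstract.

```python
from math import sqrt

INF = float("inf")


def make_rule(cuts):
    """Build a rule from (variable, lower, upper) cuts; it returns 1.0 only if all cuts pass."""
    def rule(event):
        return 1.0 if all(lo <= event[var] < hi for var, lo, hi in cuts) else 0.0
    return rule


# Hypothetical rules; in the method described above they would come from decision trees.
rules = {
    "high_met": make_rule([("missing_et", 150.0, INF)]),
    "high_met_many_jets": make_rule([("missing_et", 100.0, INF), ("n_jets", 4, INF)]),
    "soft_lepton": make_rule([("lepton_pt", 0.0, 20.0)]),
}

# Hypothetical fitted coefficients a_k and offset a_0 (not results from the paper).
coefficients = {"high_met": 1.2, "high_met_many_jets": 0.8, "soft_lepton": -0.5}
offset = -0.6


def score(event):
    """Linear expansion in the rules: a_0 + sum_k a_k * r_k(event)."""
    return offset + sum(coefficients[name] * rule(event) for name, rule in rules.items())


def rule_importance(events):
    """Rank rules by |a_k| * sqrt(s_k * (1 - s_k)), where s_k is the rule's support."""
    ranking = {}
    for name, rule in rules.items():
        support = sum(rule(e) for e in events) / len(events)
        ranking[name] = abs(coefficients[name]) * sqrt(support * (1.0 - support))
    return sorted(ranking.items(), key=lambda item: item[1], reverse=True)


# Made-up events, used only to exercise the functions.
events = [
    {"missing_et": 220.0, "n_jets": 5, "lepton_pt": 45.0},
    {"missing_et": 40.0, "n_jets": 2, "lepton_pt": 12.0},
    {"missing_et": 120.0, "n_jets": 4, "lepton_pt": 30.0},
    {"missing_et": 60.0, "n_jets": 3, "lepton_pt": 55.0},
]

if __name__ == "__main__":
    for event in events:
        print(round(score(event), 3))
    for name, importance in rule_importance(events):
        print(name, round(importance, 3))
```

A higher score marks an event as more signal-like, and the ranked list is the kind of diagnostic output the abstract refers to.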
Identifying Real Estate Opportunities using Machine Learning
The real estate market is exposed to many fluctuations in prices because of
existing correlations with many variables, some of which cannot be controlled
or might even be unknown. Housing prices can increase rapidly (or in some
cases, also drop very fast), yet the numerous listings available online where
houses are sold or rented are not likely to be updated that often. In some
cases, individuals interested in selling a house (or apartment) might include
it in some online listing, and forget about updating the price. In other cases,
some individuals might be interested in deliberately setting a price below the
market price in order to sell the home faster, for various reasons. In this
paper, we aim at developing a machine learning application that identifies
opportunities in the real estate market in real time, i.e., houses that are
listed with a price substantially below the market price. This program can be
useful for investors interested in the housing market. We have focused on a use
case considering real estate assets located in the Salamanca district in Madrid
(Spain) and listed in the most relevant Spanish online site for home sales and
rentals. The application is formally implemented as a regression problem that
tries to estimate the market price of a house given features retrieved from
public online listings. For building this application, we have performed a
feature engineering stage in order to discover relevant features that allow
for attaining a high predictive performance. Several machine learning
algorithms have been tested, including regression trees, k-nearest neighbors,
support vector machines and neural networks, identifying advantages and
drawbacks of each of them.
Comment: 24 pages, 13 figures, 5 tables
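
The regression framing can be sketched as follows: estimate a market price from listing features, then flag listings whose asking price falls well below the estimate. This is a minimal illustration, not the paper's pipeline. It uses a hand-written k-nearest-neighbors average (one of the algorithm families the abstract mentions); the features (`area_m2`, `rooms`), the distance scaling, the 20% discount threshold and all listing data are hypothetical.

```python
from math import sqrt


def knn_price(listing, reference, k=3):
    """Estimate a market price as the mean price of the k most similar reference listings."""
    def distance(a, b):
        # Hypothetical scaling: 10 m2 of area counts as much as one room.
        return sqrt(((a["area_m2"] - b["area_m2"]) / 10.0) ** 2
                    + (a["rooms"] - b["rooms"]) ** 2)

    nearest = sorted(reference, key=lambda r: distance(listing, r))[:k]
    return sum(r["price"] for r in nearest) / len(nearest)


def find_opportunities(candidates, reference, discount=0.20, k=3):
    """Return (id, estimate) for listings priced more than `discount` below the estimate."""
    flagged = []
    for listing in candidates:
        estimate = knn_price(listing, reference, k)
        if listing["price"] < (1.0 - discount) * estimate:
            flagged.append((listing["id"], estimate))
    return flagged


# Made-up reference listings (prices in euros), for illustration only.
reference = [
    {"area_m2": 80, "rooms": 2, "price": 400_000},
    {"area_m2": 85, "rooms": 2, "price": 420_000},
    {"area_m2": 90, "rooms": 3, "price": 450_000},
    {"area_m2": 150, "rooms": 4, "price": 800_000},
    {"area_m2": 160, "rooms": 5, "price": 850_000},
]

# Made-up new listings to screen.
candidates = [
    {"id": "A", "area_m2": 82, "rooms": 2, "price": 300_000},
    {"id": "B", "area_m2": 155, "rooms": 4, "price": 790_000},
]

if __name__ == "__main__":
    for listing_id, estimate in find_opportunities(candidates, reference):
        print(listing_id, round(estimate))
```

In practice the threshold trades off missed opportunities against false alarms, and the estimate is only as good as the features behind it, which is why the abstract stresses the feature engineering stage.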