PDE-Foam - a probability-density estimation method using self-adapting phase-space binning
Probability Density Estimation (PDE) is a multivariate discrimination
technique based on sampling signal and background densities defined by event
samples from data or Monte-Carlo (MC) simulations in a multi-dimensional phase
space. In this paper, we present a modification of the PDE method that uses a
self-adapting binning method to divide the multi-dimensional phase space into a
finite number of hyper-rectangles (cells). The binning algorithm adjusts the
size and position of a predefined number of cells inside the multi-dimensional
phase space, minimising the variance of the signal and background densities
inside the cells. The implementation of the binning algorithm PDE-Foam is based
on the MC event-generation package Foam. We present performance results for
representative examples (toy models) and discuss the dependence of the obtained
results on the choice of parameters. The new PDE-Foam shows improved
classification capability for small training samples and reduced classification
time compared to the original PDE method based on range searching.
Comment: 19 pages, 11 figures; replaced with revised version accepted for
publication in NIM A and corrected typos in description of Fig. 7 and
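To make the cell-based discrimination idea concrete, here is a minimal sketch in Python. It is not the PDE-Foam algorithm: it swaps the self-adapting foam for a fixed uniform grid, and it assumes the commonly used per-cell discriminant D = n_sig / (n_sig + n_bkg). All function names, the grid layout, and the 0.5 fallback for empty cells are hypothetical choices made for illustration.

```python
from collections import defaultdict


def cell_index(point, lo, hi, bins):
    """Map a point to the index tuple of the uniform grid cell containing it."""
    idx = []
    for x, a, b in zip(point, lo, hi):
        i = int((x - a) / (b - a) * bins)
        idx.append(min(max(i, 0), bins - 1))  # clamp points on or beyond the edges
    return tuple(idx)


def fill_cells(signal, background, lo, hi, bins):
    """Count training events per cell as [n_signal, n_background]."""
    counts = defaultdict(lambda: [0, 0])
    for p in signal:
        counts[cell_index(p, lo, hi, bins)][0] += 1
    for p in background:
        counts[cell_index(p, lo, hi, bins)][1] += 1
    return counts


def discriminant(point, counts, lo, hi, bins):
    """Return n_sig / (n_sig + n_bkg) for the cell holding `point`.

    Empty cells return 0.5 (an assumption: no information either way).
    """
    key = cell_index(point, lo, hi, bins)
    if key not in counts:
        return 0.5
    n_sig, n_bkg = counts[key]
    total = n_sig + n_bkg
    return n_sig / total if total else 0.5


# Toy training samples in the unit square (illustrative values only).
signal = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)]
background = [(0.8, 0.8), (0.7, 0.9), (0.15, 0.1)]
lo, hi, bins = (0.0, 0.0), (1.0, 1.0), 2
counts = fill_cells(signal, background, lo, hi, bins)
```

Classifying an event then costs one cell lookup rather than a search over the training sample, which is the kind of saving the abstract attributes to binning; a self-adapting scheme would additionally reshape the cells to reduce the density variance inside each one.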
Effective Discriminative Feature Selection with Non-trivial Solutions
Feature selection and feature transformation, the two main ways to reduce
dimensionality, are often presented separately. In this paper, a feature
selection method is proposed by combining the popular transformation based
dimensionality reduction method Linear Discriminant Analysis (LDA) and sparsity
regularization. We impose row sparsity on the transformation matrix of LDA
through $\ell_{2,1}$-norm regularization to achieve feature selection, and
the resultant formulation optimizes for selecting the most discriminative
features and removing the redundant ones simultaneously. The formulation is
extended to the $\ell_{2,p}$-norm regularized case, which is more likely to
offer better sparsity when $0 < p < 1$. Thus the formulation is a better
approximation to the feature selection problem. An efficient algorithm is
developed to solve the $\ell_{2,p}$-norm based optimization problem and it is
proved that the algorithm converges when $0 < p \le 2$. Systematic experiments
are conducted to understand the behavior of the proposed method. Promising
experimental results on various types of real-world data sets demonstrate the
effectiveness of our algorithm.
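As a rough illustration of how row sparsity in a transformation matrix translates into feature selection, the sketch below computes a row-wise mixed norm and ranks features by the Euclidean norm of their rows. It assumes the standard definition $\|W\|_{2,p} = (\sum_i \|w^i\|_2^p)^{1/p}$, where $w^i$ is the i-th row of W; it does not implement the paper's optimization algorithm, and the function names and the example matrix are hypothetical.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per input feature)."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]


def l2p_norm(W, p=1.0):
    """Mixed l2,p value: (sum_i ||w^i||_2^p)^(1/p). p = 1 gives the l2,1 norm."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(r ** p for r in row_norms(W)) ** (1.0 / p)


def select_features(W, k):
    """Indices of the k rows with the largest Euclidean norm, in ascending order."""
    norms = row_norms(W)
    ranked = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical 3-feature, 2-component transformation matrix.
# The all-zero second row means feature 1 contributes nothing to the projection.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0]]
```

A regularizer of this form penalizes whole rows at once, so driving a row to zero removes the corresponding feature from every projected component; ranking by row norm is one simple way to read off the selected features afterwards.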
How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging
We present the results of applying new object classification techniques to
difference images in the context of the Nearby Supernova Factory supernova
search. Most current supernova searches subtract reference images from new
images, identify objects in these difference images, and apply simple threshold
cuts on parameters such as statistical significance, shape, and motion to
reject objects such as cosmic rays, asteroids, and subtraction artifacts.
Although most static objects subtract cleanly, even a very low false positive
detection rate can lead to hundreds of non-supernova candidates which must be
vetted by human inspection before triggering additional followup. In comparison
to simple threshold cuts, more sophisticated methods such as Boosted Decision
Trees, Random Forests, and Support Vector Machines provide dramatically better
object discrimination. At the Nearby Supernova Factory, we reduced the number
of non-supernova candidates by a factor of 10 while increasing our supernova
identification efficiency. Methods such as these will be crucial for
maintaining a reasonable false positive rate in the automated transient alert
pipelines of upcoming projects such as PanSTARRS and LSST.
Comment: 25 pages; 6 figures; submitted to Ap