An Interval-based Multiobjective Approach to Feature Subset Selection Using Joint Modeling of Objectives and Variables
This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, both for evaluating feature subsets and as the objectives of the problem. A characteristic of these objective functions is that their values are noisy, and this noise must be handled appropriately during optimization. Our proposed algorithm consists of two major techniques that are specially designed for the feature subset selection problem. The first is a solution ranking method based on interval values, which handles the noise in the objectives. The second is a model estimation method that learns a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, l1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm obtains comparable or better performance on the tested datasets.
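To make the idea of ranking solutions by interval-valued objectives concrete, the following is a minimal sketch and not the ranking method proposed in the paper. It assumes each noisy objective (to be maximized) has already been summarized as a (lower, upper) interval, for example from cross-validation folds, and it uses a simple "certain dominance" rule. The names `certainly_dominates` and `nondominated` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: each objective value is summarized as a (lower, upper) interval.
Interval = Tuple[float, float]


def certainly_dominates(a: Sequence[Interval], b: Sequence[Interval]) -> bool:
    """Return True if a beats b even in the worst case (maximization).

    a's lower bound must be at least b's upper bound in every objective,
    and strictly greater in at least one objective.
    """
    no_worse = all(a_lo >= b_hi for (a_lo, _), (_, b_hi) in zip(a, b))
    strictly_better = any(a_lo > b_hi for (a_lo, _), (_, b_hi) in zip(a, b))
    return no_worse and strictly_better


def nondominated(solutions: List[Sequence[Interval]]) -> List[int]:
    """Return indices of solutions that no other solution certainly dominates."""
    return [
        i
        for i, s in enumerate(solutions)
        if not any(
            certainly_dominates(t, s)
            for j, t in enumerate(solutions)
            if j != i
        )
    ]


# Three candidate feature subsets, two objectives each (illustrative numbers).
example = [
    [(0.80, 0.90), (0.70, 0.80)],
    [(0.50, 0.60), (0.40, 0.50)],
    [(0.75, 0.85), (0.60, 0.75)],
]
front = nondominated(example)
```

In this illustrative data the second solution is certainly dominated, while the first and third overlap in their intervals and so both remain in the front. Overlapping intervals are exactly where a noise-aware ranking has to make a choice that a point-valued ranking would hide.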
Maximum Volume Subset Selection for Anchored Boxes
Let B be a set of n axis-parallel boxes in d dimensions such that each box has one corner at the origin and the opposite corner in the positive quadrant, and let k be a positive integer. We study the problem of selecting k boxes in B that maximize the volume of the union of the selected boxes. The research is motivated by applications in skyline queries for databases and in multicriteria optimization, where the problem is known as the hypervolume subset selection problem. It is known that the problem can be solved in polynomial time in the plane, while the best known algorithms in any dimension d>2 enumerate all size-k subsets. We show that:
* The problem is NP-hard already in 3 dimensions.
* In 3 dimensions, we break the enumeration of all size-k subsets by providing an n^O(sqrt(k)) algorithm.
* For any constant dimension d, we give an efficient polynomial-time approximation scheme.
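The baseline mentioned above, enumerating all size-k subsets, can be sketched as follows. This is an illustrative brute-force version, not one of the algorithms from the paper. It relies on the fact that the intersection of anchored boxes is again an anchored box whose far corner is the coordinate-wise minimum, so the union volume follows from inclusion-exclusion. The names `union_volume` and `best_subset` are hypothetical, and the approach is only practical for small n and k.

```python
from itertools import combinations
from math import prod
from typing import List, Sequence, Tuple

# A box anchored at the origin is described by its far corner.
Box = Sequence[float]


def union_volume(boxes: List[Box]) -> float:
    """Volume of the union of anchored boxes via inclusion-exclusion."""
    total = 0.0
    for r in range(1, len(boxes) + 1):
        sign = 1.0 if r % 2 == 1 else -1.0
        for group in combinations(boxes, r):
            # The intersection of anchored boxes has the coordinate-wise
            # minimum of their far corners as its own far corner.
            corner = [min(coords) for coords in zip(*group)]
            total += sign * prod(corner)
    return total


def best_subset(boxes: List[Box], k: int) -> Tuple[Tuple[int, ...], float]:
    """Try every size-k subset and return (indices, volume) of the best one."""
    best_idx = max(
        combinations(range(len(boxes)), k),
        key=lambda idx: union_volume([boxes[i] for i in idx]),
    )
    return best_idx, union_volume([boxes[i] for i in best_idx])


# Illustrative 2-D instance with three boxes, choosing k = 2.
boxes = [(1.0, 6.0), (3.0, 3.0), (5.0, 1.0)]
chosen, volume = best_subset(boxes, 2)
```

The enumeration visits all C(n, k) subsets, which is the cost that the n^O(sqrt(k)) result in 3 dimensions improves on.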
Model-independent determination of the strong phase difference between and amplitudes
For the first time, the strong phase difference between and amplitudes is determined in bins of the decay phase space. The measurement uses of collision data that is taken at the resonance and collected by the CLEO-c experiment. The measurement is important for the determination of the CP-violating phase in (and similar) decays, where the meson (which represents a superposition of and ) subsequently decays to . To obtain optimal sensitivity to , the phase space of the decay is divided into bins based on a recent amplitude model of the decay. Although an amplitude model is used to define the bins, the measurements obtained are model-independent. The CP-even fraction of the decay is determined to be , where the uncertainties are statistical and systematic, respectively. Using simulated decays, it is estimated that by the end of the current LHC run, the LHCb experiment could determine from this decay mode with an uncertainty of , where the first uncertainty is statistical based on estimated LHCb event yields, and the second is due to the uncertainties on the parameters determined in this paper.