    An Interval-based Multiobjective Approach to Feature Subset Selection Using Joint Modeling of Objectives and Variables

    This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluating feature subsets and as the objectives of the problem. A characteristic of these objective functions is that their values are noisy, and this noise should be handled appropriately during optimization. Our proposed algorithm consists of two major techniques designed specifically for the feature subset selection problem. The first is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second is a model estimation method for learning a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, l1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm obtains comparable or better performance on the tested datasets.
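As a rough illustration of the six objectives named in the abstract above, the following sketch computes them for a binary classifier from true labels and predicted probabilities. The function name `evaluate_objectives`, the 0.5 decision threshold, and the zero fallback for undefined ratios are assumptions made for this example, not details taken from the paper.

```python
def evaluate_objectives(y_true, y_prob, threshold=0.5):
    """Compute six evaluation measures for binary labels (0/1) and predicted probabilities.

    Hypothetical helper for illustration; the threshold and zero fallbacks are assumptions.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    # Brier score: mean squared difference between probability and outcome (lower is better).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    # AUC as the probability that a random positive outranks a random negative;
    # ties count one half. Quadratic time, which is fine for a small sketch.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else 0.0

    return {
        "auc": auc,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "brier": brier,
    }


scores = evaluate_objectives([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

Because these values are estimated from finite data (for example, across cross-validation folds), repeated evaluations of the same feature subset will generally differ, which is the noise the abstract refers to.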

    Maximum Volume Subset Selection for Anchored Boxes

    Let B be a set of n axis-parallel boxes in d-dimensions such that each box has a corner at the origin and the other corner in the positive quadrant, and let k be a positive integer. We study the problem of selecting k boxes in B that maximize the volume of the union of the selected boxes. The research is motivated by applications in skyline queries for databases and in multicriteria optimization, where the problem is known as the hypervolume subset selection problem. It is known that the problem can be solved in polynomial time in the plane, while the best known algorithms in any dimension d>2 enumerate all size-k subsets. We show that:
    * The problem is NP-hard already in 3 dimensions.
    * In 3 dimensions, we break the enumeration of all size-k subsets, by providing an n^O(sqrt(k)) algorithm.
    * For any constant dimension d, we give an efficient polynomial-time approximation scheme.
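To make the problem statement above concrete, here is a minimal sketch of the enumeration baseline the abstract mentions: it tries every size-k subset and keeps the one whose union has the largest volume. It is not the paper's algorithm. The function names `union_volume` and `best_k_subset` are hypothetical, and the union volume is computed by inclusion-exclusion, which relies on the fact that the intersection of boxes anchored at the origin is again an anchored box with the coordinate-wise minimum corner.

```python
from itertools import combinations
from math import prod


def union_volume(boxes):
    """Volume of the union of origin-anchored boxes, each given by its positive corner.

    Uses inclusion-exclusion, so it is exponential in len(boxes); intended for small inputs.
    """
    total = 0.0
    for r in range(1, len(boxes) + 1):
        sign = 1 if r % 2 == 1 else -1
        for subset in combinations(boxes, r):
            # Intersection of anchored boxes: take the minimum along each coordinate.
            corner = [min(coords) for coords in zip(*subset)]
            total += sign * prod(corner)
    return total


def best_k_subset(boxes, k):
    """Brute force: enumerate all size-k subsets and return one of maximum union volume."""
    return max(combinations(boxes, k), key=union_volume)


boxes = [(5, 1), (3, 3), (1, 4)]
chosen = best_k_subset(boxes, 2)
```

The enumeration visits all size-k subsets, which is exactly the cost the abstract's 3-dimensional result improves on.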

    Model-independent determination of the strong phase difference between $D^0$ and $\bar{D}^0 \to \pi^+\pi^-\pi^+\pi^-$ amplitudes

    For the first time, the strong phase difference between $D^0$ and $\bar{D}^0 \to \pi^+\pi^-\pi^+\pi^-$ amplitudes is determined in bins of the decay phase space. The measurement uses $818\,\mathrm{pb}^{-1}$ of $e^+e^-$ collision data that is taken at the $\psi(3770)$ resonance and collected by the CLEO-c experiment. The measurement is important for the determination of the $CP$-violating phase $\gamma$ in $B^{\pm} \to D K^{\pm}$ (and similar) decays, where the $D$ meson (which represents a superposition of $D^0$ and $\bar{D}^0$) subsequently decays to $\pi^+\pi^-\pi^+\pi^-$. To obtain optimal sensitivity to $\gamma$, the phase space of the $D \to \pi^+\pi^-\pi^+\pi^-$ decay is divided into bins based on a recent amplitude model of the decay. Although an amplitude model is used to define the bins, the measurements obtained are model-independent. The $CP$-even fraction of the $D \to \pi^+\pi^-\pi^+\pi^-$ decay is determined to be $F_{+}^{4\pi} = 0.769 \pm 0.021 \pm 0.010$, where the uncertainties are statistical and systematic, respectively. Using simulated $B^{\pm} \to D K^{\pm}$, $D \to \pi^+\pi^-\pi^+\pi^-$ decays, it is estimated that by the end of the current LHC run, the LHCb experiment could determine $\gamma$ from this decay mode with an uncertainty of $(\pm 10 \pm 7)^\circ$, where the first uncertainty is statistical based on estimated LHCb event yields, and the second is due to the uncertainties on the parameters determined in this paper.