
    A Unified Mathematical Programming Formulation for the Discriminant Problem

    In recent years, much research has been done on the application of mathematical programming (MP) techniques to the discriminant problem. While very promising results have been obtained, many of these techniques are plagued by a number of problems associated with the model formulation, including unbounded, improper, and unacceptable solutions, as well as solution instability under linear transformations of the data. Some have attempted to prevent these problems by suggesting overly complex formulations which can be difficult to solve. Others have suggested formulations which solve certain problems but create new ones. In this paper we develop a simple MP model which unifies many features of previous formulations and appears to avoid these solution problems. This approach also incorporates a classification gap, as often encountered in the related statistical techniques.
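The solution problems above arise in simple deviation-based formulations; a minimal sketch of one such model, a minimize-sum-of-deviations (MSD) formulation with an explicit classification gap, can make the setting concrete. This is an illustrative model from the same family, not the paper's unified formulation; the data, gap value, and variable layout are assumptions:

```python
import numpy as np
from scipy.optimize import linprog

# Two tiny synthetic groups in 2-D (illustrative data, not from the paper).
A = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0]])   # group 1
B = np.array([[0.0, 0.5], [1.0, 0.0], [0.5, 1.0]])   # group 2
gap = 1.0  # classification gap between the two half-spaces

# Decision variables: w (2), b (1), nonnegative deviations d (6).
# Objective: minimize the total deviation sum(d).
n_a, n_b, p = len(A), len(B), 2
n_dev = n_a + n_b
c = np.concatenate([np.zeros(p + 1), np.ones(n_dev)])

# Constraints (linprog uses A_ub @ x <= b_ub):
#   group 1:  w.x + b + d_i >= gap   ->  -w.x - b - d_i <= -gap
#   group 2:  w.x + b - d_j <= -gap
A_ub = np.zeros((n_dev, p + 1 + n_dev))
b_ub = np.empty(n_dev)
for i, x in enumerate(A):
    A_ub[i, :p] = -x
    A_ub[i, p] = -1.0
    A_ub[i, p + 1 + i] = -1.0
    b_ub[i] = -gap
for j, x in enumerate(B):
    r = n_a + j
    A_ub[r, :p] = x
    A_ub[r, p] = 1.0
    A_ub[r, p + 1 + r] = -1.0
    b_ub[r] = -gap

bounds = [(None, None)] * (p + 1) + [(0, None)] * n_dev  # w, b free; d >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
w, b = res.x[:p], res.x[p]
print("w =", w, "b =", b, "total deviation =", res.fun)
```

With separable data such as this, the optimal total deviation is zero; the unbounded and trivial-solution pathologies the abstract mentions appear when the gap normalization is dropped or chosen poorly.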

    Mathematical Programming Formulations for Two-group Classification with Binary Variables

    In this paper, we introduce a nonparametric mathematical programming (MP) approach for solving the binary variable classification problem. In practice, there is substantial interest in the binary variable classification problem. For instance, medical diagnoses are often based on the presence or absence of relevant symptoms, and binary variable classification has long been used as a means to predict (diagnose) the nature of the medical condition of patients. Our research is motivated by the fact that none of the existing statistical methods for binary variable classification -- parametric and nonparametric alike -- are fully satisfactory. The general class of MP classification methods facilitates a geometric interpretation, and MP-based classification rules have intuitive appeal because of their potentially robust properties. These intuitive arguments appear to have merit, and a number of research studies have confirmed that MP methods can indeed yield effective classification rules under certain non-normal data conditions, for instance if the data set is outlier-contaminated or highly skewed. However, the MP-based approach in general lacks a probabilistic foundation, so its classification performance can only be assessed on an ad hoc basis. Our proposed nonparametric mixed integer programming (MIP) formulation for the binary variable classification problem not only has a geometric interpretation, but also is consistent with the Bayes decision theoretic approach. Therefore, our proposed formulation possesses a strong probabilistic foundation. We also introduce a linear programming (LP) formulation which parallels the concepts underlying the MIP formulation, but does not possess the decision theoretic justification. 
An additional advantage of both our LP and MIP formulations is that, because the attribute variables are binary, the training sample observations can be partitioned into multinomial cells, allowing for a substantial reduction in the number of binary and deviational variables, so that our formulation can be used to analyze training samples of almost any size. We illustrate our formulations using an example problem, and use three real data sets to compare their classification performance with a variety of parametric and nonparametric statistical methods. For each of these data sets, our proposed formulation yields the minimum possible number of misclassifications, using both the resubstitution and the leave-one-out method.
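The multinomial-cell reduction can be sketched independently of the MP machinery: with binary attributes, identical attribute patterns collapse into cells, and a minimum-misclassification rule assigns each cell to its majority group. This is an illustrative sketch of the cell-partition idea under equal misclassification costs, not the authors' MIP or LP formulation; the sample data are assumed:

```python
from collections import defaultdict

# Tiny illustrative training sample: binary attribute vectors with group labels.
sample = [
    ((1, 0, 1), 1), ((1, 0, 1), 1), ((1, 0, 1), 2),
    ((0, 1, 0), 2), ((0, 1, 0), 2), ((1, 1, 1), 1),
]

# Collapse observations into multinomial cells: one pair of group counts per
# distinct binary pattern, instead of one deviational variable per observation.
cells = defaultdict(lambda: [0, 0])
for pattern, group in sample:
    cells[pattern][group - 1] += 1

# Minimum-misclassification rule per cell: assign the cell to its majority
# group (a Bayes rule under equal priors and misclassification costs).
rule = {pat: 1 if n1 >= n2 else 2 for pat, (n1, n2) in cells.items()}
errors = sum(min(n1, n2) for n1, n2 in cells.values())
print(rule, "misclassifications:", errors)
```

The number of cells is bounded by the number of distinct patterns, not the sample size, which is what allows very large training samples to be handled.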

    Motivations and experiences of UK students studying abroad

    This report summarises the findings of research aimed at improving understanding of the motivations behind the international diploma mobility of UK students.

    An Efficient Mixed Integer Programming Algorithm for Minimizing the Training Sample Misclassification Cost in Two-group Classification

    In this paper, we introduce the Divide and Conquer (D&C) algorithm, a computationally efficient algorithm for determining classification rules which minimize the training sample misclassification cost in two-group classification. This classification rule can be derived using mixed integer programming (MIP) techniques. However, it is well-documented that the complexity of MIP-based classification problems grows exponentially as a function of the size of the training sample and the number of attributes describing the observations, requiring special-purpose algorithms to solve even small size problems within a reasonable computational time. The D&C algorithm derives its name from the fact that it relies, among other things, on partitioning the problem into smaller, more easily handled subproblems, rendering it substantially faster than previously proposed algorithms. The D&C algorithm solves the problem to the exact optimal solution (i.e., it is not a heuristic that approximates the solution), and allows for the analysis of much larger training samples than previous methods. For instance, our computational experiments indicate that, on average, the D&C algorithm solves problems with 2 attributes and 500 observations more than 3 times faster, and problems with 5 attributes and 100 observations over 50 times faster than Soltysik and Yarnold's software, which may be the fastest existing algorithm. We believe that the D&C algorithm contributes significantly to the field of classification analysis, because it substantially widens the array of data sets that can be analyzed meaningfully using methods which require MIP techniques, in particular methods which seek to minimize the misclassification cost in the training sample. The programs implementing the D&C algorithm are available from the authors upon request.
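The objective the D&C algorithm optimizes, the minimum training-sample misclassification cost of a linear classification rule, can be computed exactly on a tiny two-attribute example by brute-force enumeration. This sketch is not the D&C algorithm itself; it relies on the 2-D observation that an optimal line can be rotated until it is parallel to a segment between two observations (assuming generic data), so enumerating pair directions with a threshold sweep is exhaustive up to ties. The data and unit costs are assumptions:

```python
import itertools
import numpy as np

# Tiny 2-attribute sample (illustrative, unit misclassification costs);
# observation [2.5, 2.5] makes the two groups linearly non-separable.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.5, 2.5],
              [2.0, 2.0], [3.0, 2.0], [2.0, 3.0]])
y = np.array([-1, -1, -1, -1, 1, 1, 1])
n = len(y)

def best_cost_along(w):
    """Fewest training errors over all thresholds and both orientations."""
    scores = X @ w
    proj = np.sort(scores)
    cuts = np.concatenate([[proj[0] - 1.0],
                           (proj[:-1] + proj[1:]) / 2.0,
                           [proj[-1] + 1.0]])
    best = n
    for b in cuts:
        errs = np.sum(np.where(scores > b, 1, -1) != y)
        best = min(best, errs, n - errs)   # n - errs: flipped orientation
    return best

best = n
for i, j in itertools.combinations(range(n), 2):
    d = X[j] - X[i]
    w = np.array([-d[1], d[0]])            # normal perpendicular to segment i-j
    best = min(best, best_cost_along(w))
print("minimum training misclassification cost:", best)   # prints 1 here
```

Enumeration like this scales combinatorially, which is exactly why the MIP formulation, and fast exact algorithms such as D&C for solving it, matter for realistic sample sizes.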

    Stochastic Judgments in the AHP: The Measurement of Rank Reversal Probabilities

    Recently, the issue of rank reversal of alternatives in the Analytic Hierarchy Process (AHP) has captured the attention of a number of researchers. Most of the research on rank reversal has addressed the case where the pairwise comparisons of the alternatives are represented by single values, focusing on mathematical properties inherent to the AHP methodology that can lead to rank reversal if a new alternative is added or an existing one is deleted. A second situation, completely unrelated to the mathematical foundations of the AHP, in which rank reversal can occur is the case where the pairwise judgments are stochastic, rather than single values. If the relative preference ratings are uncertain, one has judgment intervals, and as a consequence there is a possibility that the rankings resulting from an AHP analysis are reversed, i.e., incorrect. It is important for the modeler and decision maker alike to be aware of the likelihood that this situation of rank reversal will occur. In this paper, we introduce methods for assessing the relative preference of the alternatives in terms of their rankings, if the pairwise comparisons of the alternatives are stochastic. We develop multivariate statistical techniques to obtain point estimates and confidence intervals of the rank reversal probabilities, and show how simulation experiments can be used as an effective and accurate tool for analyzing the stability of the preference rankings under uncertainty. This information about the extent to which the ranking of the alternatives is sensitive to the stochastic nature of the pairwise judgments is valuable input to the decision making process, much like variability and confidence intervals are crucial tools for statistical inference. Although the focus of our analysis is on stochastic preference judgments, our sampling method for estimating rank reversal probabilities can be extended to the case of non-stochastic imprecise fuzzy judgments. 
We provide simulation experiments and numerical examples comparing our method with that proposed previously by Saaty and Vargas (1987) for imprecise interval judgments.
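The simulation idea can be sketched as follows: sample the uncertain pairwise judgments from their intervals, compute priorities for each draw, and estimate the probability that the resulting ranking differs from a baseline ranking, together with a binomial confidence interval. The intervals, the uniform sampling model, and the row-geometric-mean prioritization below are illustrative assumptions, not the authors' multivariate estimators:

```python
import numpy as np

rng = np.random.default_rng(0)

# Interval pairwise judgments for 3 alternatives (upper triangle only);
# entry (i, j) is sampled uniformly in [lo, hi], reciprocal fills (j, i).
intervals = {(0, 1): (1.5, 3.0), (0, 2): (2.0, 5.0), (1, 2): (0.8, 2.0)}

def priorities(A):
    """Row geometric mean, a standard approximation to the AHP eigenvector."""
    g = np.prod(A, axis=1) ** (1.0 / A.shape[0])
    return g / g.sum()

def sample_ranking():
    A = np.ones((3, 3))
    for (i, j), (lo, hi) in intervals.items():
        a = rng.uniform(lo, hi)
        A[i, j], A[j, i] = a, 1.0 / a
    return tuple(np.argsort(-priorities(A)))

# Baseline ranking from the interval midpoints.
A0 = np.ones((3, 3))
for (i, j), (lo, hi) in intervals.items():
    m = (lo + hi) / 2.0
    A0[i, j], A0[j, i] = m, 1.0 / m
base = tuple(np.argsort(-priorities(A0)))

# Monte Carlo point estimate and 95% binomial CI for P(rank reversal).
N = 20000
rev = sum(sample_ranking() != base for _ in range(N))
p = rev / N
half = 1.96 * np.sqrt(p * (1 - p) / N)
print(f"P(rank reversal) ~= {p:.3f} +/- {half:.3f}")
```

Here the top alternative is stable under any draw from the intervals, so all reversals involve the second and third alternatives; narrowing the (1, 2) interval drives the estimated probability toward zero.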

    Second Order Mathematical Programming Formulations for Discriminant Analysis

    This paper introduces a nonparametric mathematical programming (MP)-based formulation for solving the classification problem in discriminant analysis. This method differs from previously proposed MP-based models in that, even though the final discriminant function is linear in terms of the parameters to be estimated, the formulation is quadratic in terms of the predictor (attribute) variables. By including second order (i.e., quadratic and cross-product) terms of the attribute variables, the model is similar in concept to the usual treatment of multiple predictor variables in statistical methods such as Fisher's linear discriminant analysis, and allows an analysis of how including nonlinear and interaction terms affects the predictive ability of the estimated classification function. Using simulation experiments involving data conditions for which nonlinear classifiers are appropriate, the classificatory performance of this class of second order MP models is compared with that of existing statistical (linear and quadratic) and first order MP-based formulations. The results of these experiments show that the proposed formulation appears to be a very attractive alternative to previously introduced linear and quadratic statistical and linear MP-based classification methods.
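The second-order construction amounts to augmenting the attribute vector with quadratic and cross-product terms before estimating a discriminant that remains linear in its parameters. The helper below is a hypothetical sketch of that expansion, not the paper's formulation:

```python
import numpy as np

def second_order_features(X):
    """Augment attributes with quadratic and cross-product terms.

    A discriminant fitted on the augmented columns stays linear in the
    parameters to be estimated, but is quadratic in the original attributes.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    cols = [X, X ** 2]                                 # first-order + squares
    for i in range(p):
        for j in range(i + 1, p):
            cols.append((X[:, i] * X[:, j])[:, None])  # cross-products
    return np.hstack(cols)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Z = second_order_features(X)
print(Z)   # columns: x1, x2, x1^2, x2^2, x1*x2
```

Any first-order MP formulation can then be applied to the augmented matrix unchanged, which is what makes the comparison against linear and quadratic statistical classifiers direct.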