
    Nonlinear Multiregressions Based on Choquet Integral for Data with both Numerical and Categorical Attributes.

    Based on generalized Choquet integrals with respect to signed fuzzy measures, a nonlinear multiregression model can be established that captures the interaction among predictive attributes toward the objective attribute. In this model, some predictive attributes are numerical while the others are categorical. A numericalization technique projects each state of a categorical attribute that has more than two states into a multi-dimensional space optimally through a genetic algorithm, in which some regression coefficients are determined from data. To reduce the complexity of the genetic algorithm, the other regression coefficients, such as the values of the signed fuzzy measure, are determined by an algebraic method. In conclusion, this paper improves on previous related work in several aspects: (1) it replaces the generalized fuzzy measure with a signed fuzzy measure, so that the regression can more appropriately describe the relation between the objective attribute and the predictive attributes; (2) it reduces the complexity of the genetic algorithm used to search for the optimal estimate of the regression coefficients by taking some of the unknown regression coefficients, namely the values of the signed fuzzy measure, out of the chromosome; (3) it optimally projects the states of the categorical attribute(s) into a partially ordered space, instead of a totally ordered space as in the previous work, to numericalize categorical attributes that have more than two states.
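
As an illustration of the integral this abstract builds on, the following is a minimal sketch of the standard discrete Choquet integral with respect to a set function that may take negative values. It is not the paper's regression model: the function name `choquet_integral`, the dictionary representation of the measure, and the toy values are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Sequence


def choquet_integral(values: Sequence[float],
                     mu: Dict[FrozenSet[int], float]) -> float:
    """Discrete Choquet integral of `values` with respect to `mu`.

    `mu` maps each needed subset of attribute indices to a real number
    (hypothetical representation; a signed fuzzy measure may be negative).
    """
    # Sort attribute indices by their values, ascending.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0  # convention: the value "before" the smallest one is 0
    for rank, idx in enumerate(order):
        # Set of attributes whose value is at least values[idx].
        upper_set = frozenset(order[rank:])
        total += (values[idx] - previous) * mu[upper_set]
        previous = values[idx]
    return total


# Toy signed measure on two attributes (illustrative numbers only).
signed_mu = {
    frozenset({0}): 0.5,
    frozenset({1}): -0.25,
    frozenset({0, 1}): 1.0,
}

result_a = choquet_integral([2.0, 4.0], signed_mu)  # 2*1.0 + 2*(-0.25)
result_b = choquet_integral([4.0, 2.0], signed_mu)  # 2*1.0 + 2*0.5

# With an additive measure the integral reduces to a weighted sum.
additive_mu = {
    frozenset({0}): 0.5,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}
result_c = choquet_integral([2.0, 4.0], additive_mu)
```

Because the measure is not additive, swapping the two input values changes the result, which is how the integral can express interaction among attributes; with the additive measure the same call returns the ordinary weighted sum.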

    Revisiting the optimal linear income tax with categorical transfers

    This work was supported by AXA Research Fund. When individuals differ in both productivity and some categorical attribute, optimal linear/piecewise-linear tax expressions are written to capture cases where it is suboptimal to eliminate inequality in the average social marginal value of income between categorical groups. Simulations provide examples.

    CUBOS: An Internal Cluster Validity Index for Categorical Data

    An internal cluster validity index is a powerful tool for evaluating clustering performance. Studying internal cluster validity indices for categorical data has been challenging because of the difficulty of measuring distance between categorical attribute values. While some efforts have been made, they ignore the relationship between different categorical attribute values and the detailed distribution information between data objects. To solve these problems, we propose a novel index called Categorical data cluster Utility Based On Silhouette (CUBOS). Specifically, we first clarify the superiority of the Silhouette index paradigm in exploring the details of clustering results. Then, we propose the Improved Distance metric for Categorical data (IDC), inspired by Category Distance, to measure distance between categorical data exactly. Finally, the Silhouette index paradigm and IDC are combined to construct CUBOS, which can overcome the aforementioned shortcomings and produce more accurate evaluation results than other baselines, as shown by the experimental results on several UCI datasets.
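
To make the Silhouette paradigm concrete, here is a minimal sketch of a Silhouette score computed over categorical records. It is not CUBOS: the abstract does not define IDC, so a normalized Hamming distance is swapped in as a stand-in, and the function names and toy records are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Record = Tuple[str, ...]


def hamming(a: Record, b: Record) -> float:
    """Fraction of attributes on which two records differ.

    Stand-in distance only; the IDC metric from the abstract is not used.
    """
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def silhouette(points: List[Record], labels: List[int]) -> float:
    """Mean Silhouette score; assumes at least two distinct cluster labels."""
    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the point's own cluster.
        own = [hamming(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean([hamming(p, q) for j, q in enumerate(points)
                  if labels[j] == c])
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


records = [
    ("red", "small"), ("red", "small"),
    ("blue", "large"), ("blue", "large"),
]
good_score = silhouette(records, [0, 0, 1, 1])  # identical records grouped
bad_score = silhouette(records, [0, 1, 0, 1])   # identical records split
```

The grouping that keeps identical records together scores higher than the one that splits them, which is the behaviour an internal validity index is meant to reward.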