
    Supervised learning using a symmetric bilinear form for record linkage

    Record linkage is used to link records in two different files that correspond to the same individuals. Such algorithms are used for database integration. In data privacy, they are used to evaluate the disclosure risk of a protected data set by linking records that belong to the same individual: the degree of success when linking the original (unprotected) data with the protected data gives an estimate of the disclosure risk. In this paper we propose a new parameterized aggregation operator and a supervised learning method for disclosure risk assessment. The parameterized operator is a symmetric bilinear form, and the supervised learning method is formalized as an optimization problem whose objective is to find the values of the aggregation parameters that maximize the number of re-identifications (correct links). We evaluate and compare our proposal with non-parameterized variants of record linkage, such as those using the Mahalanobis distance and the Euclidean distance (one of the most widely used approaches for this purpose). We also compare it with previously presented parameterized aggregation operators for record linkage, such as the weighted mean and the Choquet integral. These comparisons show that the proposed aggregation operator outperforms, or at least matches, the other parameterized operators. Finally, we study the conditions the optimization problem must satisfy for the described aggregation functions to be metrics.
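
    As an illustration of the linkage step, the following sketch (Python; the record arrays and function names are illustrative, and the learning of the parameter matrix P, which is the paper's optimization problem, is omitted) scores record pairs with a symmetric bilinear form on the difference vector and counts re-identifications. With P the identity it reduces to the squared Euclidean distance; with P an inverse covariance matrix, to the squared Mahalanobis distance.

    import numpy as np

    def bilinear_distance(x, y, P):
        """Distance between records x and y under the symmetric bilinear
        form d(x, y) = (x - y)^T P (x - y)."""
        d = x - y
        return d @ P @ d

    def count_correct_links(original, protected, P):
        """Link each original record to its nearest protected record and
        count re-identifications, assuming row i of both files describes
        the same individual."""
        hits = 0
        for i, x in enumerate(original):
            dists = [bilinear_distance(x, y, P) for y in protected]
            if int(np.argmin(dists)) == i:
                hits += 1
        return hits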

    Using Choquet integrals for kNN approximation and classification

    k-nearest neighbors (kNN) is a popular method for function approximation and classification. One drawback of this method is that the nearest neighbors may all be located on one side of the query point x. The alternative natural-neighbors method is expensive for more than three variables. In this paper we propose the use of the discrete Choquet integral for combining the values of the nearest neighbors, so that redundant information is canceled out. We design a fuzzy measure, based on the locations of the nearest neighbors, that favors neighbors located all around x.
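
    A minimal sketch of the aggregation step (Python; the paper's location-based fuzzy measure is not reproduced here, so mu is left as a caller-supplied function on index sets):

    import numpy as np

    def choquet_integral(values, mu):
        """Discrete Choquet integral of `values` w.r.t. a fuzzy measure.

        `mu` maps a frozenset of neighbour indices to [0, 1], with
        mu(empty set) = 0 and mu(all indices) = 1. Values are processed
        in ascending order; each increment is weighted by the measure
        of the indices whose value is still at least as large.
        """
        order = np.argsort(values)
        total, prev = 0.0, 0.0
        active = set(range(len(values)))
        for idx in order:
            total += (values[idx] - prev) * mu(frozenset(active))
            prev = values[idx]
            active.remove(idx)
        return total

    With an additive measure such as mu = lambda S: len(S) / k, the integral telescopes to the plain kNN average; a measure that discounts clustered neighbours is what cancels out their redundant information.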

    A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package

    The application of multi-attribute utility theory whose aggregation process is based on the Choquet integral requires the prior identification of a capacity. The main approaches to capacity identification proposed in the literature are reviewed, and their advantages and drawbacks are discussed. All the reviewed methods have been implemented within the Kappalab R package, and their application is illustrated on a detailed example.
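
    Kappalab itself is an R package; as a rough Python analogue of one of the reviewed approaches (least-squares identification), the sketch below exploits the linearity of the Choquet integral in the capacity and enforces the monotonicity constraints with a generic solver. All names and the solver setup are illustrative, not Kappalab's API.

    import itertools
    import numpy as np
    from scipy.optimize import minimize

    def choquet_coeffs(x):
        """Coefficients c[S] such that Choquet(x; mu) = sum_S c[S] * mu(S),
        using the linearity of the integral in the capacity."""
        order = np.argsort(x)
        coeffs, prev = {}, 0.0
        active = set(range(len(x)))
        for idx in order:
            coeffs[frozenset(active)] = x[idx] - prev
            prev = x[idx]
            active.remove(idx)
        return coeffs

    def identify_capacity(X, y):
        """Fit mu(S) for nonempty proper S by least squares, with
        mu(empty set) = 0 and mu(full set) = 1 held fixed."""
        n = X.shape[1]
        subsets = [frozenset(s) for r in range(1, n)
                   for s in itertools.combinations(range(n), r)]
        pos = {S: k for k, S in enumerate(subsets)}
        full = frozenset(range(n))

        def predict(m, x):
            return sum(c if S == full else c * m[pos[S]]
                       for S, c in choquet_coeffs(x).items())

        def loss(m):
            return sum((predict(m, x) - t) ** 2 for x, t in zip(X, y))

        # monotonicity: mu(S) <= mu(S + {i}) for every i not in S
        cons = []
        for S in subsets:
            for i in set(range(n)) - S:
                T = S | {i}
                if T == full:
                    cons.append({'type': 'ineq',
                                 'fun': lambda m, a=pos[S]: 1.0 - m[a]})
                else:
                    cons.append({'type': 'ineq',
                                 'fun': lambda m, a=pos[S], b=pos[T]: m[b] - m[a]})
        m0 = np.array([len(S) / n for S in subsets])  # additive start
        res = minimize(loss, m0, bounds=[(0, 1)] * len(subsets),
                       constraints=cons)
        return {S: res.x[pos[S]] for S in subsets}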

    Using Non-Additive Measure for Optimization-Based Nonlinear Classification

    Over the past few decades, numerous optimization-based methods have been proposed for solving the classification problem in data mining. Classic optimization-based methods do not consider attribute interactions in classification. Thus, a novel learning machine is needed to provide a better understanding of the nature of classification when the interactions among the contributions from various attributes cannot be ignored. These interactions can be described by a non-additive measure, while the Choquet integral can serve as the mathematical tool for aggregating the values of the attributes with the corresponding values of the non-additive measure. As the main part of this research, a new nonlinear classification method with non-additive measures is proposed. Experimental results show that applying non-additive measures to classic optimization-based models improves classification robustness and accuracy compared with some popular classification methods. In addition, motivated by the well-known support vector machine approach, we transform the primal optimization-based nonlinear classification model with the signed non-additive measure into its dual form by applying Lagrangian optimization theory and Wolfe's dual programming theory. As a result, the 2^n - 1 parameters of the signed non-additive measure can be approximated by m (the number of records) Lagrangian multipliers, obtained from the necessary optimality conditions of the primal classification problem. This method of parameter approximation is a breakthrough for determining a non-additive measure in practice when only a relatively small number of training cases is available (m << 2^n). Furthermore, the kernel-based learning method allows the nonlinear classifiers to achieve better classification accuracy. This research produces practically deliverable nonlinear models with a non-additive measure for the classification problem in data mining when interactions among attributes are considered.
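
    In the dual form, the decision function takes the familiar kernel-classifier shape sketched below (Python; the specific kernel induced by the signed non-additive measure is not reproduced, so a generic RBF kernel stands in for it, and all names are illustrative). The point of the transformation is visible in the signature: m multipliers, one per training record, replace the 2^n - 1 measure values.

    import numpy as np

    def rbf_kernel(a, b, gamma=1.0):
        """Stand-in kernel for the sketch."""
        return np.exp(-gamma * np.sum((a - b) ** 2))

    def dual_decision(x, train_X, train_y, alphas, b, kernel=rbf_kernel):
        """SVM-style dual classifier: one Lagrange multiplier alpha_i
        per training record instead of one parameter per subset of
        attributes."""
        s = sum(a * t * kernel(xi, x)
                for a, t, xi in zip(alphas, train_y, train_X))
        return np.sign(s + b)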

    On Global Conservation Laws at Null Infinity

    The "standard" expressions for the total energy, linear momentum, and angular momentum of asymptotically flat Bondi metrics at null infinity can also be obtained from differential conservation laws on asymptotically flat backgrounds, derived from a quadratic Lagrangian density by methods currently used in classical field theory. It is thus a matter of taste and convenience whether or not to use a reference spacetime in defining these globally conserved quantities. Backgrounds lead to Noether conserved currents; the use of backgrounds is in line with classical views on conservation laws. Moreover, the conserved quantities are in principle explicitly related to the sources of gravity through Einstein's equations, while the standard definitions are not. The relations depend, however, on a rule for mapping spacetimes onto backgrounds.
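
    Schematically (a generic Noether-current form shown for orientation only, not the paper's specific expressions), a differential conservation law and the associated global quantity at a cut S of null infinity read

    \partial_\mu \left( \sqrt{-g}\, J^\mu[\xi] \right) = 0,
    \qquad
    Q[\xi] = \oint_S \sqrt{-g}\, J^\mu[\xi] \, dS_\mu ,

    where \xi is a symmetry generator of the background: time translations give the energy, spatial translations the linear momentum, and rotations the angular momentum.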

    Modelling multi-scale microstructures with combined Boolean random sets: A practical contribution

    Boolean random sets are versatile tools for matching the morphological and topological properties of real structures of materials and particulate systems. Moreover, they can be combined in any number of ways, through intersection and union, to produce an even wider range of structures covering multiple scales of microstructure. Based on the well-established theory of Boolean random sets, this work provides scientists and engineers with simple and readily applicable results for matching combinations of Boolean random sets to observed microstructures. Once calibrated, such models yield straightforward three-dimensional simulations of materials, a powerful aid for investigating microstructure-property relationships. Application of the proposed results to a real case yields convincing realisations of the observed microstructure in two and three dimensions.
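
    A minimal simulation sketch (Python; the grid size, intensities, and disk radii are arbitrary illustration values): two Boolean models of disks at different scales, combined by union on a binary grid. The union of independent Boolean models is again a Boolean model, which is what keeps the combined structure tractable to calibrate.

    import numpy as np

    rng = np.random.default_rng(0)

    def boolean_disks(shape, intensity, radius):
        """Boolean random set on a grid: Poisson-distributed germ
        points, each dilated by a disk ('grain') of the given radius."""
        h, w = shape
        mask = np.zeros(shape, dtype=bool)
        ys, xs = np.mgrid[0:h, 0:w]
        n = rng.poisson(intensity * h * w)     # Poisson number of germs
        for cy, cx in zip(rng.uniform(0, h, n), rng.uniform(0, w, n)):
            mask |= (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
        return mask

    # coarse sparse grains plus fine dense grains, combined by union
    coarse = boolean_disks((256, 256), intensity=2e-4, radius=20)
    fine = boolean_disks((256, 256), intensity=3e-3, radius=5)
    combined = coarse | fine   # use `coarse & fine` for intersection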