14 research outputs found

    Clustering an interval data set : are the main partitions similar to a priori partition?

    This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In this paper we compare the best partitions of data units (cities), obtained from different algorithms of Ascendant Hierarchical Cluster Analysis (AHCA) applied to a well-known data set from the symbolic data analysis literature (the “city temperature interval data set”), with an a priori partition of the cities given by a panel of human observers. The AHCA was based on the weighted generalised affinity coefficient with equal weights, and on the probabilistic coefficient associated with the asymptotic standardized weighted generalized affinity coefficient by the method of Wald and Wolfowitz. These similarity coefficients between elements were combined with three aggregation criteria: one classical, Single Linkage (SL), and two probabilistic, AV1 and AVB, the latter two within the scope of the VL methodology. The partitions were evaluated, in order to find the one that best fits the underlying data, using validation measures based on the similarity matrices. Overall, our methods gave satisfactory results, with the best partitions being quite close to (or even coinciding with) the a priori partition provided by the panel of human observers.

    Méthode d'auto-fuzzyfication par analyse des typicalités sur des lots de données réduits (Automatic fuzzification method using typicality analysis on reduced data sets)

    This article presents an automatic fuzzification method for a classifier based on fuzzy linguistic rules. It relies on analysing the typicality scores of the attributes that characterise the patterns to be classified. The proposed method is applied to colour recognition on square-edged sawn boards ("avivés"). Because a fuzzy classifier is not easy for non-experts to use, industrialising such a method requires simplifying the tuning phases. Moreover, the specific application setting of this study makes only a small amount of data available for the training phase. Attribute typicality scores have the advantage of discriminating the value ranges associated with each output colour class. Studying the correlations between these typicalities improves the fuzzification of the parameters, and tests on "industrial" data sets show an increase in the recognition rate. These rates are compared with those obtained from an evenly distributed fuzzification. In addition, the number of fuzzy rules generated in the model decreases, which reduces processing times in generalisation.

    Pixon-Based Image Segmentation

    Interval valued symbolic representation of writer dependent features for online signature verification

    This work focusses on exploiting the notion of writer dependent parameters for online signature verification. Writer dependent parameters, namely features, decision threshold and feature dimension, are exploited for effective verification. For each writer, a subset of the original set of features is selected using different filter based feature selection criteria. This is in contrast to writer independent approaches, which work on a common set of features for all writers. Once the features for each writer are selected, they are represented in the form of an interval valued symbolic feature vector. The number of features and the decision threshold to be used for each writer during verification are decided based on the equal error rate (EER) estimated with only the signatures considered for training the system. To demonstrate the effectiveness of the proposed approach, extensive experiments are conducted on both the MCYT (DB1) and MCYT (DB2) benchmark online signature datasets, consisting of signatures of 100 and 330 individuals respectively, using the available 100 global parametric features. © 2017 Elsevier Lt
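
The idea of an interval valued symbolic feature vector with a per-writer threshold can be illustrated with a minimal sketch. The abstract does not say how the intervals are built or how a test signature is scored, so everything here is an assumption made for illustration: each interval is taken as mean ± one population standard deviation of the training values, and a signature is accepted when enough of its features fall inside the writer's intervals. The names `interval_representation`, `acceptance_count` and `verify` are hypothetical.

```python
from statistics import mean, pstdev


def interval_representation(samples):
    """Build one (low, high) interval per feature from a writer's training samples.

    Assumption: interval = mean -/+ one population standard deviation.
    """
    columns = list(zip(*samples))  # one tuple of values per feature
    return [(mean(c) - pstdev(c), mean(c) + pstdev(c)) for c in columns]


def acceptance_count(intervals, query):
    """Count how many features of `query` lie inside the writer's intervals."""
    return sum(low <= x <= high for (low, high), x in zip(intervals, query))


def verify(intervals, query, threshold):
    """Accept the query when at least `threshold` features are inside (assumed rule)."""
    return acceptance_count(intervals, query) >= threshold


# Toy training set: three signatures, three features each (made-up numbers).
training = [[1.0, 10.0, 5.0], [3.0, 12.0, 5.0], [2.0, 11.0, 5.0]]
intervals = interval_representation(training)
print(acceptance_count(intervals, [2.0, 11.5, 7.0]))
```

In a writer dependent setting, both the feature subset and `threshold` would be chosen separately for each writer, for example from the EER on the training signatures as the abstract describes.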

    Fuzzy Rule Iterative Feature Selection (FRIFS) with Respect to the Choquet Integral Apply to Fabric Defect Recognition

    ISBN 0.7803.9489.5. This paper proposes an iterative method for selecting suitable features in an industrial fabric defect recognition context. It combines a global feature selection method based on the Choquet integral with a fuzzy linguistic rule classifier. The experimental study shows the desired behaviour of this approach: the number of features decreases while the recognition rate increases. As a result, the number of generated fuzzy rules is reduced.

    Fuzzy clustering of spatial interval-valued data

    In this paper, two fuzzy clustering methods for spatial interval-valued data are proposed, namely fuzzy C-Medoids clustering of spatial interval-valued data with and without entropy regularization. Both methods are based on the Partitioning Around Medoids (PAM) algorithm and inherit its great advantage of yielding non-fictitious representative units for each cluster. In both methods, the units are endowed with a contiguity relation, represented by a symmetric binary matrix. This can be understood either as contiguity in a physical space or as a more abstract notion of contiguity. The performance of the methods is assessed by simulation, testing the methods with different contiguity matrices associated with natural clusters of units. To show the effectiveness of the methods in empirical studies, three applications are presented: the clustering of municipalities based on interval-valued pollutant levels, the clustering of European fact-checkers based on interval-valued data on the average number of impressions received by their tweets, and the clustering of the residential zones of the city of Rome based on the interval of price values.
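
As a rough illustration of what entropy regularization does to fuzzy memberships, the sketch below computes the membership degrees of one interval-valued unit for a fixed set of medoids. It is not the authors' algorithm: the spatial contiguity term is left out entirely, the interval distance (squared difference of centres plus squared difference of radii) is an assumed choice, and the names `interval_distance`, `entropy_memberships` and the parameter `p` are hypothetical. The membership formula used is the standard closed form for entropy-regularized fuzzy clustering, a normalised exponential of the negative distances.

```python
import math


def interval_distance(a, b):
    """Assumed distance between intervals (low, high): centres and radii, squared."""
    centre_a, radius_a = (a[0] + a[1]) / 2, (a[1] - a[0]) / 2
    centre_b, radius_b = (b[0] + b[1]) / 2, (b[1] - b[0]) / 2
    return (centre_a - centre_b) ** 2 + (radius_a - radius_b) ** 2


def entropy_memberships(unit, medoids, p):
    """Memberships of `unit` in each cluster: exp(-d / p), normalised to sum to 1.

    `p` > 0 is the weight of the entropy term; larger values give fuzzier memberships.
    """
    weights = [math.exp(-interval_distance(unit, m) / p) for m in medoids]
    total = sum(weights)
    return [w / total for w in weights]


# Two medoids (observed units, not averages) and a unit halfway between them.
medoids = [(0.0, 2.0), (10.0, 12.0)]
print(entropy_memberships((5.0, 7.0), medoids, p=1.0))
print(entropy_memberships((0.0, 2.0), medoids, p=1.0))
```

A unit equidistant from both medoids is shared equally between the clusters, while a unit that coincides with a medoid belongs almost entirely to that medoid's cluster.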

    3rd Workshop in Symbolic Data Analysis: book of abstracts

    This workshop is the third regular meeting of researchers interested in Symbolic Data Analysis. The main aim of the event is to bring people together and encourage the exchange of ideas from the different fields that contribute to Symbolic Data Analysis, among them Mathematics, Statistics, Computer Science, Engineering and Economics.