5 research outputs found

    Chapter Profiling visitors of a national park in Italy through unsupervised classification of mixed data

    Cluster analysis has long been an effective tool for analysing data, and disciplines such as marketing, psychology and computer science, to mention just a few, have taken advantage of it over time. Traditionally, clustering algorithms handle only numerical or only categorical data at a time. In this work, instead, we analyse a dataset composed of mixed data, namely both numerical and categorical variables. More precisely, we focus on profiling visitors of the National Park of Majella in the Abruzzo region of Italy, whose observations are characterized by variables such as gender, age, profession, expectations and satisfaction rate on park services. Applying a standard clustering procedure would be inappropriate in this case. We therefore propose an unsupervised classification of mixed data, a procedure capable of processing numerical and categorical variables simultaneously and extracting valuable information. Our application shows how cluster analysis for mixed data can uncover particularly informative patterns, laying the groundwork for accurate customer profiling as the starting point for a detailed marketing analysis.
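
The abstract does not name the dissimilarity it uses, so the following is only a minimal sketch of one common choice for mixed data, a Gower-style distance: numerical differences are scaled by the variable's range, and categorical variables contribute 0 on a match and 1 on a mismatch. The visitor records, variable order and ranges are hypothetical illustrations, not values from the chapter.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Gower-style distance between two mixed records.

    ranges[i] is the observed range of numerical variable i,
    or None when variable i is categorical.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical variable: simple matching (0 if equal, 1 otherwise).
            total += 0.0 if x == y else 1.0
        else:
            # Numerical variable: absolute difference scaled to [0, 1].
            total += abs(x - y) / r if r > 0 else 0.0
    return total / len(ranges)


# Hypothetical records: (age, gender, satisfaction score on a 1-5 scale).
ranges = [50.0, None, 4.0]
v1 = (30, "F", 4)
v2 = (40, "F", 2)
print(gower_distance(v1, v2, ranges))
```

A matrix of such pairwise distances can then be passed to any clustering method that accepts precomputed dissimilarities, such as hierarchical clustering or partitioning around medoids.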

    An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets

    Clustering has been widely used in different fields of science, technology, social science, and so forth. In the real world, both numeric and categorical features are usually used to describe data objects, yet many clustering methods can process only datasets that are either numeric or categorical. Recently, algorithms that can handle mixed data clustering problems have been developed. The affinity propagation (AP) algorithm is an exemplar-based clustering method that has demonstrated good performance on a wide variety of datasets. However, it has limitations in processing mixed datasets. In this paper, we propose a novel similarity measure for mixed-type datasets and an adaptive AP clustering algorithm to cluster them. Several real-world datasets are studied to evaluate the performance of the proposed algorithm. Comparisons with other clustering algorithms demonstrate that the proposed method works well not only on mixed datasets but also on purely numeric and purely categorical datasets.
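
Affinity propagation takes as input a matrix of pairwise similarities s(i, k), with "preference" values on the diagonal that influence how many exemplars emerge. The sketch below builds such an input for mixed records. It is not the paper's novel similarity measure, which the abstract does not specify. It assumes a simple stand-in (negative squared numeric difference plus a unit penalty per categorical mismatch) and the common default of setting every preference to the median similarity. The function names and toy records are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Set


def mixed_similarity(a: Sequence, b: Sequence, numeric_idx: Set[int]) -> float:
    """Negative mixed dissimilarity: larger (closer to 0) means more similar."""
    d = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_idx:
            d += (x - y) ** 2          # numeric part: squared difference
        else:
            d += 0.0 if x == y else 1.0  # categorical part: mismatch penalty
    return -d


def similarity_matrix(data: Sequence[Sequence], numeric_idx: Set[int]) -> List[List[float]]:
    """Pairwise similarities with the median similarity as shared preference."""
    n = len(data)
    s = [[mixed_similarity(data[i], data[k], numeric_idx) for k in range(n)]
         for i in range(n)]
    off_diag = [s[i][k] for i in range(n) for k in range(n) if i != k]
    pref = median(off_diag)
    for k in range(n):
        s[k][k] = pref  # diagonal holds the exemplar preference
    return s


# Hypothetical records: one numeric feature (already scaled) and one category.
records = [(0.1, "a"), (0.2, "a"), (0.9, "b")]
S = similarity_matrix(records, numeric_idx={0})
```

The resulting matrix `S` could be handed to any affinity propagation implementation that accepts precomputed similarities.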

    Unsupervised Optimal Discriminant Vector Based Feature Selection Method

    An efficient unsupervised feature selection method based on the unsupervised optimal discriminant vector is developed to find the important features without using class labels. Features are ranked according to an importance measure based on the unsupervised optimal discriminant vector, in the following steps. First, the fuzzy Fisher criterion is adopted as the objective function to derive the optimal discriminant vector in an unsupervised manner. Second, a feature importance measure based on the elements of the unsupervised optimal discriminant vector is defined to determine the importance of each feature. Features with low importance are removed from the feature subset. Experiments on UCI dataset and fault diagnosis are carried out to show that the proposed method is very efficient and able to deliver reliable results.

    ASA 2021 Statistics and Information Systems for Policy Evaluation

    This book includes 25 peer-reviewed short papers submitted to the Scientific Opening Conference titled “Statistics and Information Systems for Policy Evaluation”, aimed at promoting new statistical methods and applications for the evaluation of policies and organized by the Association for Applied Statistics (ASA) and the Department of Statistics, Computer Science, Applications DiSIA “G. Parenti” of the University of Florence, jointly with the partners AICQ (Italian Association for Quality Culture), AICQ-CN (Italian Association for Quality Culture North and Centre of Italy), AISS (Italian Academy for Six Sigma), ASSIRM (Italian Association for Marketing, Social and Opinion Research), Comune di Firenze, the SIS – Italian Statistical Society, Regione Toscana and Valmon – Evaluation & Monitoring

    Unsupervised feature selection using a neuro-fuzzy approach

    A neuro-fuzzy methodology is described which involves connectionist minimization of a fuzzy feature evaluation index with unsupervised training. The concept of a flexible membership function incorporating weighted distance is introduced in the evaluation index to make the modeling of clusters more appropriate. A set of optimal weighting coefficients, expressed in terms of network parameters and representing individual feature importance, is obtained through connectionist minimization. In addition, the investigation includes the development of another algorithm for ranking different feature subsets using the same fuzzy evaluation index without neural networks. Results demonstrating the effectiveness of the algorithms on various real-life data are provided.
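
The abstract does not give the functional form of the membership function, so the sketch below is only an assumed illustration of the idea: a per-feature weight scales each coordinate before the distance is computed, and the membership of a pair of points decays linearly from 1 to 0 as that weighted distance approaches a reference distance. The linear decay, the parameter `d_ref` and both function names are assumptions, not details from the paper.

```python
import math
from typing import Sequence


def weighted_distance(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Euclidean distance after scaling each feature by its weight."""
    return math.sqrt(sum((wi * (xi - yi)) ** 2 for xi, yi, wi in zip(x, y, w)))


def pair_membership(
    x: Sequence[float],
    y: Sequence[float],
    w: Sequence[float],
    d_ref: float,
) -> float:
    """Degree to which x and y belong together (assumed linear decay).

    Returns 1 for coincident points and 0 once the weighted
    distance reaches or exceeds d_ref.
    """
    d = weighted_distance(x, y, w)
    return max(0.0, 1.0 - d / d_ref)


# Setting a weight to 0 removes that feature's influence on the membership.
p, q = (0.0, 0.0), (3.0, 4.0)
print(pair_membership(p, q, w=(1.0, 1.0), d_ref=10.0))
print(pair_membership(p, q, w=(1.0, 0.0), d_ref=10.0))
```

Under this assumed form, lowering a feature's weight shrinks its contribution to the distance, which is one way weights can come to express individual feature importance.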