7 research outputs found

Clustering Algorithms For High Dimensional Data – A Survey Of Issues And Existing Approaches

Author: Babu B. Hari
Chandra N. Subash
Gopal T. Venu
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 05/09/2020

Clustering is one of the most prominent data mining techniques for grouping data into clusters based on distance measures. With the growth of high-dimensional data such as microarray gene expression data, clustering runs into a problem: the similarity between objects in the full-dimensional space is often invalid because the data are of different types. Grouping high-dimensional data into clusters is therefore less accurate, and may fall short of expectations, when the dimensionality of the dataset is high. The problem is now attracting considerable attention in research and development. To improve the performance of clustering on high-dimensional data, issues such as dimensionality reduction, redundancy elimination, subspace clustering, co-clustering, and data labeling for clusters need to be analyzed and improved. In this paper, we present a brief comparison of existing algorithms that focus mainly on clustering high-dimensional data.

Interscience Research Network

Auto Insurance Business Analytics Approach for Customer Segmentation Using Multiple Mixed-Type Data Clustering Algorithms

Author: Kai Zhuang
Sen Wu
Xiaonan Gao
Publication venue: Mechanical Engineering Faculty in Slavonski Brod
Publication date: 01/01/2018

Customer segmentation is critical for auto insurance companies to gain competitive advantage by mining useful customer-related information. While some efforts have been made to use customer segmentation to support auto insurance decision-making, the segmentation results tend to be affected by the characteristics of the algorithm used and lack validation across multiple algorithms. To this end, we propose an auto insurance business analytics approach that segments customers using three mixed-type data clustering algorithms: k-prototypes, improved k-prototypes, and similarity-based agglomerative clustering. The customer segmentation results of these algorithms can complement and reinforce each other and reveal as much information as possible to support decision-making. To confirm its practical value, the proposed approach extracts seven rules for an auto insurance company that may help the company make customer-related decisions and develop insurance products.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Rule Extraction on Numeric Datasets Using Hyper-rectangles

Author: de Giusti Armando Eduardo
Hasperué Waldo
Lanzarini Laura Cristina
Publication venue: Canadian Center of Science and Education
Publication date: 01/01/2012

When there is a need to understand the data stored in a database, one of the main requirements is being able to extract knowledge in the form of rules. Classification strategies allow extracting rules almost naturally. In this paper, a new classification strategy is presented that uses hyper-rectangles as data descriptors to achieve a model that allows extracting knowledge in the form of classification rules. The participation of an expert for training the model is discussed. Finally, the results obtained using the databases from the UCI repository are presented and compared with other existing classification models, showing that the algorithm presented requires fewer computational resources and achieves the same accuracy level and number of extracted rules.

Affiliation: Hasperué, Waldo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina
Affiliation: Lanzarini, Laura Cristina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina
Affiliation: de Giusti, Armando Eduardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentin

CiteSeerX

CONICET Digital

CLUIN – A new method for extracting rules for large databases

Author: Corbalán Leonardo César
Hasperué Waldo
Publication date: 01/10/2012

Servicio de Difusión de la Creación Intelectual

Computer Science & Technology Series : XVIII Argentine Congress of Computer Science. Selected papers

Author: De Giusti Armando
Pesado Patricia
Simari Guillermo
Publication venue: Universidad Nacional de La Plata
Publication date: 07/03/2017

CACIC’12 was the eighteenth Congress in the CACIC series. It was organized by the School of Computer Science and Engineering at the Universidad Nacional del Sur. The Congress included 13 Workshops with 178 accepted papers, 5 Conferences, 2 invited tutorials, various meetings related to Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2012 was organized following the traditional Congress format, with 13 Workshops covering a range of areas of Computer Science research. Each topic was supervised by a committee of 3-5 chairs from different Universities. The call for papers attracted a total of 302 submissions. An average of 2.5 review reports were collected for each paper, for a grand total of 752 review reports that involved about 410 different reviewers. A total of 178 full papers, involving 496 authors and 83 Universities, were accepted and 27 of them were selected for this book.

Red de Universidades con Carreras en Informática (RedUNCI

Servicio de Difusión de la Creación Intelectual

On Data Labeling for Clustering Categorical Data

Author: Chen H.-L.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/07/2011

National Taiwan University Repository

On Data Labeling for Clustering Categorical Data

Author: Hung-Leng Chen
Kun-Ta Chuang
Ming-Syan Chen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref