Research of Clustering Analysis for Functional Data Based on Adaptive Weighting

Abstract

Traditional clustering analysis, designed for finite-dimensional discrete data, cannot be applied directly to the classification of functional data. Focusing on the sparsity and infinite dimensionality of functional data, and after a thorough comparison of the strengths and weaknesses of existing functional clustering methods, this paper reconstructs the weighted principal component distance as a measure of functional similarity, based on differences in the information content of the clustering variables, and proposes an adaptive weighting clustering analysis for functional data. Compared with existing functional clustering algorithms, the new method has two core advantages: (1) the adaptively weighted distance function reflects the differences in classification efficiency among the clustering variables, and it has a sufficient theoretical basis to ensure its necessity and objective rationality; (2) the clustering of infinite-dimensional continuous functions is achieved by clustering finite-dimensional discrete data, which significantly reduces computational cost. Empirical tests show that the new method markedly improves classification accuracy and effectively resolves the failure of traditional clustering algorithms in extreme cases, which makes it flexible and broadly applicable to complex functional data classification problems.

Funding: Major Project of the National Social Science Fund of China (13&ZD148); Program for New Century Excellent Talents in University, Ministry of Education (NCET-12-0955); National Natural Science Foundation of China (71071153
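The general idea described in the abstract, clustering curves through a weighted distance on a finite number of principal component scores, can be illustrated with a minimal sketch. The abstract does not state the paper's weighting formula, so the weight update used here (each component's share of between-cluster variance, renormalized to sum to one) is a hypothetical stand-in and not the authors' method. The function names `fpc_scores`, `farthest_point_init`, and `adaptive_weighted_kmeans`, the k-means style loop, and the synthetic curves are likewise assumptions made for illustration.

```python
import numpy as np


def fpc_scores(curves, n_components):
    """Scores of discretized curves (n_curves x n_grid) on the leading principal components."""
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the discretized principal component directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def farthest_point_init(scores, k):
    """Deterministic start: curve 0, then repeatedly the curve farthest from the chosen centers."""
    idx = [0]
    for _ in range(k - 1):
        d2 = ((scores[:, None, :] - scores[idx][None, :, :]) ** 2).sum(axis=2)
        idx.append(int(d2.min(axis=1).argmax()))
    return scores[idx].copy()


def adaptive_weighted_kmeans(scores, k, n_iter=20):
    """k-means on scores with per-component weights re-estimated at every iteration."""
    n, q = scores.shape
    weights = np.full(q, 1.0 / q)  # start from equal weights
    centers = farthest_point_init(scores, k)
    total = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    for _ in range(n_iter):
        # Weighted squared distance from every curve to every center.
        diff2 = (scores[:, None, :] - centers[None, :, :]) ** 2
        labels = (diff2 * weights).sum(axis=2).argmin(axis=1)
        for j in range(k):
            members = scores[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        # ASSUMPTION (not from the paper): weight each component by its
        # share of between-cluster variance, then renormalize.
        within = ((scores - centers[labels]) ** 2).sum(axis=0)
        ratio = np.clip(1.0 - within / np.maximum(total, 1e-12), 1e-6, None)
        weights = ratio / ratio.sum()
    return labels, weights


# Synthetic example: two groups of noisy curves that differ by a level shift.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
base = np.sin(2 * np.pi * t)
group_a = base + 0.1 * rng.standard_normal((20, 50))
group_b = base + 2.0 + 0.1 * rng.standard_normal((20, 50))
curves = np.vstack([group_a, group_b])

scores = fpc_scores(curves, n_components=3)
labels, weights = adaptive_weighted_kmeans(scores, k=2)
```

In this sketch each curve is reduced to three scores, so the clustering step works on a 40 x 3 matrix rather than on the full discretized curves, which is the source of the computational saving the abstract refers to. The weights are meant to let components that separate the groups count more in the distance than components that carry mostly noise.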
