3 research outputs found

    Nonsmooth optimization models and algorithms for data clustering and visualization

    Cluster analysis deals with the problem of organizing a collection of patterns into clusters based on a similarity measure. Various distance functions can be used to define this measure. Clustering problems with the similarity measure defined by the squared Euclidean distance have been studied extensively over the last five decades. However, problems with other Minkowski norms have attracted significantly less attention. The use of different similarity measures may help to identify different cluster structures in a data set. This in turn may help to significantly improve the decision-making process. High-dimensional data visualization is another important task in the field of data mining and pattern recognition. To date, principal component analysis and self-organizing maps have been used to solve such problems. In this thesis we develop algorithms for solving clustering problems in large data sets using various similarity measures. Such similarity measures are based on the squared L…
    Doctor of Philosophy
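
    To illustrate the remark that different similarity measures can reveal different cluster structures, here is a minimal sketch of nearest-center assignment under different Minkowski norms. It is not the thesis's algorithm; the function names `minkowski` and `assign` and the sample points are hypothetical and chosen only for illustration.

```python
def minkowski(x, y, p):
    """Minkowski (L_p) distance between two equal-length sequences, for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def assign(points, centers, p):
    """For each point, return the index of its nearest center under the L_p distance."""
    return [
        min(range(len(centers)), key=lambda j: minkowski(pt, centers[j], p))
        for pt in points
    ]


# Illustrative data (an assumption, not taken from the thesis).
points = [(0.0, 0.0)]
centers = [(3.0, 3.0), (5.0, 0.0)]

# Under L1 the point is closer to the second center (5 < 6);
# under L2 it is closer to the first (about 4.24 < 5).
l1_labels = assign(points, centers, p=1)
l2_labels = assign(points, centers, p=2)
```

    The same point changes cluster when only the norm changes, which is the effect the abstract alludes to.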

    Self-organizing topological tree for online vector quantization and data clustering

    The self-organizing maps (SOMs) introduced by Kohonen implement two important operations: vector quantization (VQ) and a topology-preserving mapping. In this paper, an online self-organizing topological tree (SOTT) with faster learning is proposed. A new learning rule delivers efficiency and topology preservation superior to those of other SOM structures. The computational complexity of the proposed SOTT is O(log N) rather than O(N) as for the basic SOM. The experimental results demonstrate that the reconstruction performance of SOTT is comparable to that of the full-search SOM, and that its computation time is much shorter than that of the full-search SOM and other vector quantizers. In addition, SOTT delivers hierarchical mapping of codevectors together with progressive transmission and decoding, properties that are rarely supported by other vector quantizers at the same time. To circumvent the shortcomings in clustering performance of classical partition clustering algorithms, a hybrid clustering algorithm that fully exploits the online learning and multiresolution characteristics of SOTT is devised. A new linkage metric is proposed which can be updated online to accelerate the time-consuming agglomerative hierarchical clustering stage. Besides the enhanced clustering performance, due to the online learning capability, the memory requirement of the proposed SOTT hybrid clustering algorithm is independent of the size of the data set, making it attractive for large databases.
    Published version
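
    The O(log N) versus O(N) contrast can be sketched with a generic tree-structured codebook search. This is not the SOTT learning rule from the paper; it only shows, under the assumption of a balanced binary tree whose internal nodes hold the mean of their children, why descending a tree needs fewer distance evaluations than a full search. All names (`Node`, `build`, `tree_search`, `full_search`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    code: Tuple[float, ...]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sqdist(x, y):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def build(leaves):
    """Build a balanced binary tree; each internal node stores the mean of its two children."""
    if len(leaves) == 1:
        return Node(tuple(leaves[0]))
    mid = len(leaves) // 2
    left, right = build(leaves[:mid]), build(leaves[mid:])
    code = tuple((a + b) / 2 for a, b in zip(left.code, right.code))
    return Node(code, left, right)


def full_search(x, codebook):
    """Compare x with every codevector: N distance evaluations."""
    return min(codebook, key=lambda c: sqdist(x, c)), len(codebook)


def tree_search(x, root):
    """Greedy descent: two distance evaluations per level, about 2*log2(N) in total."""
    node, evals = root, 0
    while node.left is not None and node.right is not None:
        evals += 2
        if sqdist(x, node.left.code) <= sqdist(x, node.right.code):
            node = node.left
        else:
            node = node.right
    return node.code, evals


# Illustrative one-dimensional codebook with 8 codevectors (an assumption).
codebook = [(float(i),) for i in range(8)]
root = build(codebook)
tree_result = tree_search((5.2,), root)
full_result = full_search((5.2,), codebook)
```

    Greedy descent is not guaranteed to return the true nearest codevector in general, which is consistent with the abstract describing reconstruction performance as comparable to, rather than identical with, full search.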

    Self-Organizing Topological Tree for Online Vector Quantization and Data Clustering
