    A Universal Similarity Model for Transactional Data Clustering

    Data mining methods extract hidden knowledge from large databases. Data partitioning methods group related data values, so that similar values fall in the same cluster. The K-means and Partitioning Around Medoids (PAM) clustering algorithms are used to cluster numerical data, with distance measures estimating transaction similarity. Data partitioning solutions are identified using cluster ensemble models. The ensemble information matrix presents only cluster-data point relations, so ensemble-based clustering techniques produce the final data partition from incomplete information. A link-based approach improves the conventional matrix by discovering unknown entries through the similarity between clusters in an ensemble; a link-based algorithm performs the underlying similarity assessment. Pairwise similarity and binary cluster-association matrices summarize the underlying ensemble information. A weighted bipartite graph is formulated from the refined matrix, and a graph partitioning technique is applied to it. The Particle Swarm Optimization (PSO) clustering algorithm, an optimization-based clustering scheme, is integrated with the cluster ensemble model. The system supports clustering of binary, categorical and continuous data. The attribute connectivity analysis is optimized for all attributes, and the refined cluster-association matrix (RM) is updated with all attribute relationships.
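
    The two matrices named in this abstract can be illustrated with a minimal sketch. It assumes each base clustering is given as a list of integer labels, one per data point; the function names and the toy labelings are hypothetical and are not taken from the paper. The sketch builds only the conventional (unrefined) matrices and does not implement the link-based refinement or the graph partitioning step.

```python
def binary_association_matrix(labelings):
    """Build the binary cluster-association matrix of an ensemble.

    Rows are data points; columns are the clusters of every base
    clustering. An entry is 1 if the point belongs to that cluster,
    otherwise 0.
    """
    n_points = len(labelings[0])
    # One column per (clustering index, cluster label) pair.
    columns = []
    for m, labels in enumerate(labelings):
        for c in sorted(set(labels)):
            columns.append((m, c))
    matrix = [
        [1 if labelings[m][i] == c else 0 for (m, c) in columns]
        for i in range(n_points)
    ]
    return matrix, columns


def pairwise_similarity(labelings):
    """Fraction of base clusterings that put points i and j together."""
    n_points = len(labelings[0])
    n_clusterings = len(labelings)
    sim = [[0.0] * n_points for _ in range(n_points)]
    for i in range(n_points):
        for j in range(n_points):
            agree = sum(1 for labels in labelings if labels[i] == labels[j])
            sim[i][j] = agree / n_clusterings
    return sim


# Hypothetical ensemble: three base clusterings of four data points.
labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

bm, cols = binary_association_matrix(labelings)
sim = pairwise_similarity(labelings)
```

    In this toy ensemble, every zero in `bm` says only that a point was not assigned to a cluster; it carries no information about how close the point is to that cluster. That appears to be the kind of gap the abstract describes as "unknown entries" to be filled by the link-based refinement.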

    Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensemble: A Survey

    Data analysis plays a prominent role in interpreting various phenomena. Data mining is the process of hypothesizing useful knowledge from extensive data. Based on classical statistical prototypes, the data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation with little or no prior knowledge, consists of research and development across a wide variety of communities. Cluster ensembles are a melange of individual solutions obtained from different clusterings, combined to produce a final quality clustering, which is required in wider applications. The method arises from the aim of increasing robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles. The survey analyzes the various techniques and cluster ensemble methods.

    Construction and Refinement of Coarse-Grained Models

    A general scheme, which includes the construction of coarse-grained (CG) models, weighted ensemble dynamics (WED) simulations and cluster analyses (CA) of stable states, is presented to detect dynamical and thermodynamical properties in complex systems. In the scheme, CG models are efficiently and accurately optimized based on a directed distance from the original to the CG systems, which is estimated from ensemble means of many independent observables in the two systems. Furthermore, WED independently generates multiple short molecular dynamics trajectories in the original systems. The initial conformations of the trajectories are constructed from equilibrium conformations in the CG models, and the weights of the trajectories can be estimated from the trajectories themselves to generate complete equilibrium samples in the original systems. CA calculates the directed distances among the trajectories and groups their initial conformations into clusters, which correspond to stable states in the original systems, so that transition dynamics can be detected without requiring a priori knowledge of the states. Comment: 4 pages, no figure