A Universal Similarity Model for Transactional Data Clustering
Data mining methods are used to extract hidden knowledge from large databases. Data partitioning methods group related data values, placing similar values in the same cluster. K-means and Partitioning Around Medoids (PAM) clustering algorithms are used to cluster numerical data. Distance measures are used to estimate transaction similarity. Data partitioning solutions are identified using cluster ensemble models. The ensemble information matrix presents only cluster-data point relations, so ensemble-based clustering techniques produce the final data partition from incomplete information. The link-based approach improves the conventional matrix by discovering unknown entries through the similarity between clusters in an ensemble. A link-based algorithm is used for the underlying similarity assessment. Pairwise similarity and binary cluster-association matrices summarize the underlying ensemble information. A weighted bipartite graph is formulated from the refined matrix, and a graph partitioning technique is applied to it. The Particle Swarm Optimization (PSO) clustering algorithm is an optimization-based clustering scheme. It is integrated with the cluster ensemble model. Binary, categorical and continuous data clustering is supported in the system. The attribute connectivity analysis is optimized for all attributes. The refined cluster-association matrix (RM) is updated with all attribute relationships.
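The two summary matrices named in this abstract can be illustrated with a minimal Python sketch. It assumes each base clustering is given as a list of integer labels, one per data point. The function names and the toy `partitions` data are hypothetical and not taken from the paper. The sketch covers only the conventional (unrefined) matrices, not the link-based refinement, the bipartite graph partitioning or the PSO step.

```python
def binary_association(partitions):
    """Binary cluster-association matrix.

    Rows are data points; columns are all clusters of all base clusterings.
    An entry is 1 if the point belongs to that cluster, else 0.
    """
    n = len(partitions[0])
    # One column per (clustering index, cluster label) pair.
    columns = [(m, c)
               for m, labels in enumerate(partitions)
               for c in sorted(set(labels))]
    matrix = [[1 if partitions[m][i] == c else 0 for (m, c) in columns]
              for i in range(n)]
    return columns, matrix


def pairwise_similarity(partitions):
    """Pairwise similarity (co-association) matrix.

    Entry (i, j) is the fraction of base clusterings that place
    points i and j in the same cluster.
    """
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(1 for labels in partitions if labels[i] == labels[j]) / m
             for j in range(n)]
            for i in range(n)]


# Toy ensemble (illustrative data): three base clusterings of four points.
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

columns, assoc = binary_association(partitions)
sim = pairwise_similarity(partitions)
```

In the binary matrix every row sums to the number of base clusterings, and all cross-cluster information is absent. That is the sense in which the abstract calls the conventional matrix incomplete.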
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensemble: A Survey
Data analysis plays a prominent role in interpreting various phenomena. Data mining is the process of deriving useful knowledge from extensive data. Building on classical statistical models, data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation carried out with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles combine individual solutions obtained from different clusterings to produce a final, higher-quality clustering, which is required in a wide range of applications. The method arose from the aim of increasing robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles. The survey analyzes the various techniques and cluster ensemble methods.
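As a rough illustration of what a consensus function does, the sketch below merges points that a majority of base clusterings place together and then reads off the connected groups. This is one simple co-association-based scheme, chosen here for brevity. The abstract does not say which consensus functions the survey covers, and the function name, the `threshold` parameter and the sample data are all assumptions.

```python
def consensus_by_threshold(partitions, threshold=0.5):
    """Combine base clusterings into one partition.

    Two points are linked when the fraction of base clusterings that
    co-cluster them exceeds `threshold`; linked groups become the
    consensus clusters (tracked with a small union-find structure).
    """
    n = len(partitions[0])
    m = len(partitions)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            agree = sum(1 for labels in partitions if labels[i] == labels[j])
            if agree / m > threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel roots as consecutive integers in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Illustrative ensemble: three base clusterings of four points.
base_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
consensus = consensus_by_threshold(base_clusterings)
```

The base clusterings use different label names for the same groups (the third one swaps 0 and 1), which is why the sketch compares co-membership rather than raw labels.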
Construction and Refinement of Coarse-Grained Models
A general scheme, which includes constructions of coarse-grained (CG) models,
weighted ensemble dynamics (WED) simulations and cluster analyses (CA) of
stable states, is presented to detect dynamical and thermodynamical properties
in complex systems. In the scheme, CG models are efficiently and accurately
optimized based on a directed distance from original to CG systems, which is
estimated from ensemble means of many independent observables in the two systems.
Furthermore, WED independently generates multiple short molecular dynamics
trajectories in original systems. The initial conformations of the trajectories
are constructed from equilibrium conformations in CG models, and the weights of
the trajectories can be estimated from the trajectories themselves when
generating complete equilibrium samples in the original systems. CA calculates
the directed distances among the trajectories and groups their initial
conformations into some clusters, which correspond to stable states in the
original systems, so that transition dynamics can be detected without requiring
a priori knowledge of the states.
Comment: 4 pages, no figure