18 research outputs found

    High throughput powder diffraction: II Applications of clustering methods and multivariate data analysis

    In high-throughput crystallography it is possible to accumulate over 1000 powder diffraction patterns on a series of related compounds, often polymorphs. We present a method that can analyse such data, automatically sort the patterns into related clusters or classes, characterise each cluster, and identify any unusual samples containing, for example, unknown or unexpected polymorphs. Mixtures may be analysed quantitatively if a database of pure phases is available. A key component of the method is a set of visualisation tools based on dendrograms, cluster analysis, pie charts, principal-component-based score plots and metric multidimensional scaling. Applications to pharmaceutical data and to inorganic compounds are presented. The procedures have been incorporated into the commercial PolySNAP software.

    Book reviews


    SmallSteps : an adaptive distance-based clustering algorithm

    In this article we propose a new distance-based clustering algorithm. Distance-based clustering methods operate on data sets in similarity space, where the similarities or dissimilarities between the objects are given by a matrix. These algorithms have at least O(n^2) time complexity, where n is the number of objects. One of the latest distance-based methods is Chameleon which, in practice, works well only on larger data sets and fails on relatively small ones. This is at odds with the fact that O(n^2) time complexity makes distance-based algorithms unsuitable for huge data sets. We therefore developed a new distance-based method (SmallSteps) that can also handle relatively small numbers of objects. Our solution looks for connected graphs whose edges have a maximum weight computed on the environments of the objects. The method can detect clusters of different shapes, sizes or densities, can automatically determine the number of clusters, and has a special ability to divide clusters into subclusters.
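To make the setting concrete, here is a minimal sketch of clustering directly from a dissimilarity matrix by taking connected components of a threshold graph. It is not the SmallSteps algorithm: the function name `threshold_components` and the single global `max_edge` threshold are assumptions made for illustration, whereas the abstract describes a maximum edge weight computed adaptively from each object's environment.

```python
from collections import deque

def threshold_components(dist, max_edge):
    """Label objects by connected components of the graph that links
    i and j whenever dist[i][j] <= max_edge.

    Assumption: one fixed global threshold, not an adaptive per-object one.
    """
    n = len(dist)
    labels = [-1] * n          # -1 means "not yet assigned"
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            # Scanning a full matrix row per object gives the O(n^2) cost.
            for j in range(n):
                if labels[j] == -1 and dist[i][j] <= max_edge:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels

# Toy one-dimensional data turned into a dissimilarity matrix.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
dist = [[abs(a - b) for b in points] for a in points]
labels = threshold_components(dist, max_edge=0.5)
```

The number of clusters falls out of the graph structure rather than being supplied up front, which is the property the abstract highlights for graph-based methods of this kind.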

    An Algorithm for Detecting the Principal Allotment among Fuzzy Clusters and Its Application as a Technique of Reduction of Analyzed Features Space Dimensionality

    This paper describes a modification of a possibilistic clustering method based on the concept of allotment among fuzzy clusters. The basic ideas of the method are considered and the concept of a principal allotment among fuzzy clusters is introduced. The paper outlines the algorithm for detecting the principal allotment. The proposed algorithm is applied to Tamura's portrait data, and the experimental results are compared with those of the basic version of the algorithm and of the NERFCM algorithm. A methodology for applying the algorithm to the dimensionality reduction problem is outlined and illustrated on Anderson's Iris data, in comparison with the result of principal component analysis. Preliminary conclusions are also formulated.

    A "fuzzy" multicriteria model for the evaluation of urban redevelopment interventions

    Multicriteria techniques (Hwang C.L. and Yoon K., 1981; Nijkamp P. and Voogd H., 1989; Rizzo F., 1990) are well suited to the multidimensional character of evaluating urban redevelopment plans and projects. A plurality of objectives arising from concerns of different kinds (economic, social, ethical, ecological) must be considered, and these techniques allow a broad representation of the socio-economic, institutional and environmental framework within which the public body will have to take the decision to intervene. In urban redevelopment operations, multicriteria analysis enters a process in which the definition of objectives and actions by the public administration is followed by the preparation of the projects that are the subject of the evaluations required to compare and choose the alternative to be implemented.

    Clustering uncertain data using Voronoi diagrams and R-tree index

    We study the problem of clustering uncertain objects whose locations are described by probability density functions (pdfs). We show that the UK-means algorithm, which generalizes the k-means algorithm to handle uncertain objects, is very inefficient. The inefficiency comes from the fact that UK-means computes expected distances (EDs) between objects and cluster representatives. For arbitrary pdfs, expected distances are computed by numerical integration, which is costly. We propose pruning techniques based on Voronoi diagrams to reduce the number of expected distance calculations. These techniques are analytically proven to be more effective than the basic bounding-box-based technique previously known in the literature. We then introduce an R-tree index to organize the uncertain objects so as to reduce pruning overheads. We conduct experiments to evaluate the effectiveness of our novel techniques. We show that our techniques are additive and, when used in combination, significantly outperform previously known methods. © 2006 IEEE.
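The baseline that this abstract improves on can be sketched as follows, assuming each uncertain object is confined to an axis-aligned bounding box. Because all probability mass lies inside the box, the expected distance to a representative is bounded by the minimum and maximum distance from the box to that representative, so any representative whose minimum distance exceeds another's maximum distance can be skipped without integration. The function names and the sample-average estimate of the expected distance are illustrative assumptions; the Voronoi-based pruning and the R-tree index from the paper are not shown.

```python
import math

def min_max_dist(box_lo, box_hi, c):
    """Smallest and largest Euclidean distance from point c to an axis-aligned box."""
    near_sq = 0.0
    far_sq = 0.0
    for lo, hi, x in zip(box_lo, box_hi, c):
        d_near = max(lo - x, 0.0, x - hi)        # 0 when x lies inside [lo, hi]
        d_far = max(abs(x - lo), abs(x - hi))    # distance to the farther face
        near_sq += d_near ** 2
        far_sq += d_far ** 2
    return math.sqrt(near_sq), math.sqrt(far_sq)

def surviving_candidates(box_lo, box_hi, centers):
    """Indices of representatives that bounding-box pruning cannot rule out."""
    bounds = [min_max_dist(box_lo, box_hi, c) for c in centers]
    best_upper = min(upper for _, upper in bounds)
    return [j for j, (lower, _) in enumerate(bounds) if lower <= best_upper]

def expected_distance(samples, c):
    """Sample-average estimate of the expected distance (assumption:
    the pdf is represented by equally weighted sample points)."""
    return sum(math.dist(s, c) for s in samples) / len(samples)

centers = [(0.5, 0.5), (10.0, 10.0), (1.2, 0.5)]
candidates = surviving_candidates((0.0, 0.0), (1.0, 1.0), centers)
```

Only the surviving candidates need the costly expected-distance computation; here the distant representative at (10, 10) is pruned from bounds alone.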

    Fuzzy clustering with spatial-temporal information

    Clustering geographical units based on a set of quantitative features observed on several occasions requires dealing with the complexity of both spatial and temporal information. In particular, one should consider (1) the spatial nature of the units to be clustered, (2) the characteristics of the space of multivariate time trajectories, and (3) the uncertainty related to the assignment of a geographical unit to a given cluster on the basis of the above complex features. This paper discusses a novel spatially constrained multivariate time series clustering method for units characterised by different levels of spatial proximity. In particular, the Fuzzy Partitioning Around Medoids algorithm with the Dynamic Time Warping dissimilarity measure and spatial penalization terms is applied to classify multivariate spatial-temporal series. The clustering method is presented theoretically and discussed using both simulated and real data, highlighting its main features, in particular its capability of embedding different levels of proximity among units and its ability to handle time series of different lengths.
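The reason Dynamic Time Warping copes with series of different lengths can be seen in a minimal sketch of the standard dynamic-programming recurrence. This version assumes univariate series and an absolute-difference local cost, both chosen for brevity; the abstract concerns multivariate series, and the fuzzy medoid step and spatial penalization terms are not shown. The name `dtw_distance` is illustrative.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping dissimilarity between two numeric sequences.

    Assumptions: univariate input, absolute difference as local cost,
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Series of different lengths that trace the same shape.
short_series = [1.0, 2.0, 3.0]
long_series = [1.0, 2.0, 2.0, 3.0]
d = dtw_distance(short_series, long_series)
```

Because one point may be matched to several points of the other series, the repeated value in the longer series adds no cost, which a point-by-point Euclidean comparison could not express.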

    The Double Galaxy Cluster Abell 2465 I. Basic Properties: Optical Imaging and Spectroscopy

    Optical imaging and spectroscopic observations of the z = 0.245 double galaxy cluster Abell 2465 are described. This object appears to be undergoing a major merger. It is a double X-ray source and is detected in the radio at 1.4 GHz. This paper investigates signatures of the interaction of the two components. Redshifts were measured to determine the velocity dispersions and virial radii of each component. The technique of fuzzy clustering was used to assign membership weights to the galaxies in each clump. Using redshifts of 93 cluster members within 1.4 Mpc of the subcluster centres, the virial masses and anisotropy parameters are derived. 37% of the spectroscopically observed galaxies show emission lines and are predominantly star forming in the diagnostic diagram. No strong AGN sources were found. The emission line galaxies tend to lie between the two cluster centres, with more near the SW clump. The luminosity functions of the two subclusters differ. The NE component is similar to many rich clusters, while the SW component has more faint galaxies. The NE clump's light profile follows a single NFW profile with c = 10, while the SW clump is better fit with an extended outer region and a compact inner core, consistent with available X-ray data indicating that the SW clump has a cooling core. The observed differences and properties of the two components of Abell 2465 are interpreted as having been caused by a collision 2-4 Gyr ago, after which the components moved apart and are now near their apocentres, although the start of a merger remains a possibility. The number of emission line galaxies gives weight to the idea that galaxy cluster collisions trigger star formation. Comment: 21 pages, 18 figures. Replaced typos, mostly in references. To appear in MNRAS. Accepted 2010 December 16; received 2010 December 15; in original form 2010 November 0