
    Methods of Hierarchical Clustering

    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations available in R and other software environments. We look at hierarchical self-organizing maps and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally, we describe a recently developed, very efficient (linear-time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm.
    Comment: 21 pages, 2 figures, 1 table, 69 references
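    As an illustration of the agglomerative approach the survey covers (a sketch, not code from the paper itself; the data here are hypothetical), SciPy's Ward linkage builds the dendrogram bottom-up and a flat clustering can then be cut from the tree:

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Hypothetical data: two loose 2-D groups
    rng = np.random.default_rng(0)
    pts = np.vstack([rng.normal(0, 0.5, (10, 2)),
                     rng.normal(5, 0.5, (10, 2))])

    # Agglomerative step: merge the closest pair of clusters repeatedly
    # (Ward criterion); Z encodes the full merge tree (dendrogram)
    Z = linkage(pts, method="ward")

    # Cut the tree into two flat clusters
    labels = fcluster(Z, t=2, criterion="maxclust")
    print(sorted(set(labels)))
    ```

    For n points the linkage matrix has n − 1 rows, one per merge; cutting at different heights yields the nested partitions that make the clustering hierarchical.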

    Characterisation of Condition Monitoring Information for Diagnosis and Prognosis using Advanced Statistical Models

    This research focuses on the classification of categorical events using advanced statistical models, primarily to detect and identify individual component faults and deviations from normal, healthy operation of reciprocating compressors. Effective condition monitoring ensures optimal efficiency and reliability while maintaining the highest possible safety standards and reducing the costs and inconvenience caused by impaired performance. Variability in operating conditions is revealed by examining vibration signals recorded at strategic points in the process; analysis of these signals informs expectations about the tolerable degree of imperfection in specific components. Isolating inherent process variability from extraneous variability affords a reliable means of ascertaining system health and functionality, and vibration envelope spectra offer highly responsive model parameters for diagnostic purposes. This thesis examines novel approaches to alleviating the computational burden of large-scale data analysis by investigating the potential input variables. Three methods are investigated. Method one employs multivariate variable clustering to ascertain homogeneity among the input variables: a series of heterogeneous groups is formed, from each of which explanatory input variables are selected. Method two applies data-reduction techniques as an alternative means of constructing predictive classifiers: a reduced number of reconstructed explanatory variables provides enhanced modelling capability and ensures algorithmic convergence. The final novel approach combines both methods with wavelet data-compression techniques, simplifying the number of input parameters and the volume of each signal while retaining the information crucial to classification performance.
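    The second method, data reduction before classification, can be sketched as follows (a minimal illustration with synthetic data, not the thesis's actual pipeline; the feature counts and labels are hypothetical). Principal components stand in for the "reconstructed explanatory variables" that feed the classifier:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical stand-in for envelope-spectrum features:
    # 200 observations x 500 spectral bins
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 500))
    y = (X[:, :5].sum(axis=1) > 0).astype(int)  # synthetic fault label

    # Reduce 500 bins to 20 reconstructed explanatory variables,
    # then fit a classifier on the reduced representation
    model = make_pipeline(PCA(n_components=20), LogisticRegression())
    model.fit(X, y)
    print(model.score(X, y))
    ```

    The reduction step shrinks the design matrix the classifier must invert, which is the computational relief the abstract describes; any low-variance directions discarded by PCA are the price paid for that relief.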

    A new framework for clustering

    The difficulty of clustering and the variety of clustering methods suggest the need for a theoretical study of clustering. Using the idea of a standard statistical framework, we propose a new framework for clustering. For a well-defined clustering goal, we assume that the data to be clustered come from an underlying distribution, and we aim to find a high-density cluster tree. We regard this tree as a parameter of interest for the underlying distribution. However, it is not obvious how to determine a connected subset of a discrete distribution whose support lies in a Euclidean space. Building a cluster tree for such a distribution is an open problem and presents interesting conceptual and computational challenges. We solve this problem using graph-based approaches and further parameterize clustering using the high-density cluster tree and its extension. Motivated by the connection between clustering outcomes and graphs, we propose a graph family framework, which plays an important role in our clustering framework. A direct application of the graph family framework is a new cluster-tree distance measure. This distance measure can be written as an inner product or kernel, which enables our clustering framework to perform statistical assessment of clustering via simulation. Other applications, such as a method for integrating partitions into a cluster tree and methods for cluster-tree averaging and bagging, are also derived from the graph family framework.
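    The graph-based idea behind a high-density cluster tree can be sketched in a few lines (an illustrative approximation under assumed choices, not the paper's construction: a crude kNN density estimate, a median-density level set, and the kNN graph; the data are hypothetical). Clusters at one density level are the connected components of the graph restricted to points above that level:

    ```python
    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import connected_components

    # Hypothetical sample: two well-separated Gaussian blobs
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(-3, 0.5, (50, 2)),
                   rng.normal(3, 0.5, (50, 2))])

    tree = cKDTree(X)
    k = 5
    dist, idx = tree.query(X, k=k + 1)  # k nearest neighbours (plus self)
    density = 1.0 / dist[:, -1]         # crude kNN density estimate

    # One level set: keep points above the median density, link each kept
    # point to its kept neighbours, and read off connected components
    level = np.median(density)
    keep = np.where(density >= level)[0]
    pos = {p: i for i, p in enumerate(keep)}
    rows, cols = [], []
    for p in keep:
        for q in idx[p, 1:]:
            if q in pos:
                rows.append(pos[p])
                cols.append(pos[q])
    g = csr_matrix((np.ones(len(rows)), (rows, cols)),
                   shape=(len(keep), len(keep)))
    n_comp, labels = connected_components(g, directed=False)
    print(n_comp)
    ```

    Sweeping the level from high to low and recording how components split and merge yields the nested structure that the cluster tree records.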

    Cluster analysis applied to success/failure in mathematics

    According to [Mirkin B., 1996], classification is an existing or ideal grouping of those that are alike (or similar) and a separation of those that are dissimilar, the purpose of classification being: (1) to form and acquire knowledge, (2) to analyse the structure of the phenomenon, and (3) to relate different aspects of the phenomenon in question to one another. In studying success/failure in mathematics, our objectives implicitly involve "classifying" students according to the factors expected to determine their mathematics results. We turn to classification again when we seek to establish the types of factors that determine those results. The objectives of cluster analysis are: (1) to analyse the structure of the data; (2) to verify and relate aspects of the data to one another; and (3) to aid in the design of the classification. We believed that this exploratory data-analysis technique could be a very powerful tool for studying success/failure in mathematics in basic education. The work developed in this dissertation shows that cluster analysis adequately answers the questions that arise when one attempts to situate success/failure in mathematics in its social and pedagogical context.
    Rita Vasconcelo