
448 research outputs found

Partition Around Medoids Clustering on the Intel Xeon Phi Many-Core Coprocessor

Author: Rechkalov T. V.
Publication venue: Уральский федеральный университет
Publication date: 01/01/2015

Abstract. The paper addresses the problem of implementing the Partition Around Medoids (PAM) clustering algorithm for the Intel Many Integrated Core architecture. PAM is a form of the well-known k-medoids clustering algorithm and is applied in various subject domains, e.g. bioinformatics, text analysis, and intelligent transportation systems. An optimized version of PAM for the Intel Xeon Phi coprocessor is introduced, which uses OpenMP parallelization, loop vectorization, tiling, and efficient distance matrix computation for the Euclidean metric. Experimental results for different data sets confirm the efficiency of the proposed algorithm.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Effective Spell Checking Methods Using Clustering Algorithms

Author: Cordeiro De Amorim Renato
Zampieri Marcos
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2013

This paper presents a novel approach to spell checking using dictionary clustering. The main goal is to reduce the number of times distances have to be calculated when finding target words for misspellings. The method is unsupervised and combines the application of anomalous pattern initialization and partition around medoids (PAM). To evaluate the method, we used an English misspelling list compiled using real examples extracted from the Birkbeck spelling error corpus.

CiteSeerX

University of Hertfordshire Research Archive

MACOC: a medoid-based ACO clustering algorithm

Author: A.P. Dempster
C. Blum
D. Martens
E. Hruschka
F. Otero
F. Wilcoxon
F.O. França de
M. Borrotti
O.M. Jafar
R. Parpinelli
S. Schaeffer
X. Zhang
Y. Kao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The application of ACO-based algorithms in data mining has been growing over the last few years, and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent works concerning unsupervised learning have focused on clustering, showing the great potential of ACO-based techniques. This work presents an ACO-based clustering algorithm inspired by the ACO Clustering (ACOC) algorithm. The proposed approach restructures ACOC from a centroid-based technique to a medoid-based technique, where the properties of the search space are not necessarily known. Instead, it relies only on information about the distances among data points. The new algorithm, called MACOC, has been compared against well-known algorithms (K-means and Partition Around Medoids) and with ACOC. The experiments measure the accuracy of the algorithm on both synthetic datasets and real-world datasets extracted from the UCI Machine Learning Repository.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Kent Academic Repository

Biblos-e Archivo

Modeling Wheezing Spells Identifies Phenotypes with Different Outcomes and Genetic Associates

Author: Cucco Alex
Curtin John
Custovic Adnan
Fontanella Sara
Granell Raquel
Haider Sadia
Publication venue: 'American Thoracic Society'
Publication date: 20/01/2022

Explore Bristol Research

Automatic Dimension Selection for a Non-negative Factorization Approach to Clustering Multiple Random Graphs

Author: Lee Nam H.
Park Youngser
Priebe Carey E.
Rosen Michael
Wang I-Jeng
Publication date: 09/09/2014

We consider the problem of grouping multiple graphs into several clusters using singular value thresholding and non-negative factorization. We derive a model selection information criterion to estimate the number of clusters. We demonstrate our approach using the "Swimmer" data set as well as a simulated data set, and compare its performance with two standard clustering algorithms. Comment: This paper has been withdrawn by the author due to a newer version with overlapping content.

arXiv.org e-Print Archive

CiteSeerX

Deep Gaussian Mixture Models

Author: McLachlan Geoffrey J.
Viroli Cinzia
Publication date: 18/11/2017

Deep learning is a hierarchical inference method formed by multiple successive layers of learning, able to describe complex relationships more efficiently. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables where, at each layer, the variables follow a mixture of Gaussian distributions. Thus, the deep mixture model consists of a set of nested mixtures of linear models, which globally provide a nonlinear model able to describe the data in a very flexible way. In order to avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture, thus resulting in deep mixtures of factor analysers. Comment: 19 pages, 4 figures

arXiv.org e-Print Archive

University of Queensland eSpace

Modeling Wheezing Spells Identifies Phenotypes with Different Outcomes and Genetic Associates

Author: Arshad Syed Hasan
Cucco Alex
Cullinan Paul
Curtin John
Custovic Adnan
Devereux Graham
Fontanella Sara
Granell Raquel
Haider Sadia
Holloway John W.
Murray Clare S.
Roberts Graham
Simpson Angela
Turner Stephen
Publication venue: 'American Thoracic Society'
Publication date: 20/01/2022

Funding Information: Supported by the UK Medical Research Council (UK MRC) Programme grant MR/S025340/1 and grants G0601361 and MR/K002449/1. R.G. is in part funded through Wellcome Trust Strategic Award 108818/15/Z. The UK MRC and Wellcome (grant ref: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC (Avon Longitudinal Study of Parents and Children). MAAS (Manchester Asthma and Allergy Study) was supported by the Asthma UK Grants No 301 (1995–1998), No 362 (1998–2001), No 01/012 (2001–2004), No 04/014 (2004–2007), British Medical Association James Trust (2005), and the JP Moulton Charitable Foundation (2004–2016), the North West Lung Centre Charity (1997–current), and the UK MRC grant MR/L012693/1 (2014–2018). Acknowledgment: This article is dedicated to the memory of our wonderful colleague and friend Prof. John Henderson (1958–2019), whose contribution to the understanding of the heterogeneity of childhood asthma cannot be overstated. Rainbow chasers and UNICORN riders forever.

Aberdeen University Research

Southampton (e-Prints Soton)

LSTM Online Archive

The University of Manchester - Institutional Repository

A New Partitioning Around Medoids Algorithm

Author: Bryan Jennifer
Pollard Katherine S.
van der Laan Mark J.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/02/2002

Kaufman & Rousseeuw (1990) proposed a clustering algorithm, Partitioning Around Medoids (PAM), which maps a distance matrix into a specified number of clusters. A particularly nice property is that PAM allows clustering with respect to any specified distance metric. In addition, the medoids are robust representations of the cluster centers, which is particularly important in the common context that many elements do not belong well to any cluster. Based on our experience in clustering gene expression data, we have noticed that PAM does have problems recognizing relatively small clusters in situations where good partitions around medoids clearly exist. In this note, we propose to partition around medoids by maximizing a criterion, "Average Silhouette", defined by Kaufman & Rousseeuw. We also propose a fast-to-compute approximation of "Average Silhouette". We implement these two new partitioning around medoids algorithms and illustrate their performance relative to existing partitioning methods in simulations.

Collection Of Biostatistics Research Archive

Disrupted Modularity and Local Connectivity of Brain Functional Networks in Childhood-Onset Schizophrenia

Author: Aaron F Alexander-Bloch
David Meunier
Edward T Bullmore
Francois Lalonde
Jay Giedd
Liv Clasen
Nitin Gogtay
Rasmus Birn
Rhoshel Lenroot
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2010

Modularity is a fundamental concept in systems neuroscience, referring to the formation of local cliques or modules of densely intra-connected nodes that are sparsely inter-connected with nodes in other modules. Topological modularity of brain functional networks can quantify theoretically anticipated abnormality of brain network community structure – so-called dysmodularity – in developmental disorders such as childhood-onset schizophrenia (COS). We used graph theory to investigate the topology of networks derived from resting-state fMRI data on 13 COS patients and 19 healthy volunteers. We measured functional connectivity between each pair of 100 regional nodes, focusing on wavelet correlation in the frequency interval 0.05–0.1 Hz, then applied global and local thresholding rules to construct graphs from each individual association matrix over the full range of possible connection densities. We show how local thresholding based on the minimum spanning tree facilitates group comparisons of networks by forcing the connectedness of sparse graphs. Threshold-dependent graph theoretical results are compatible with the results of a k-means unsupervised learning algorithm and a multi-resolution (spin glass) approach to modularity, both of which also find community structure but do not require thresholding of the association matrix. In general, modularity of brain functional networks was significantly reduced in COS, due to a relatively reduced density of intra-modular connections between neighboring regions. Other network measures of local organization, such as clustering, were also decreased, while complementary measures of global efficiency and robustness were increased in the COS group. The group differences in complex network properties were mirrored by differences in simpler statistical properties of the data, such as the variability of the global time series and the internal homogeneity of the time series within anatomical regions of interest.

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector