Search CORE

50,857 research outputs found

A Survey on Soft Subspace Clustering

Author: Choi Kup-Sze
Deng Zhaohong
Jiang Yizhang
Wang Jun
Wang Shitong
Publication venue: 'Elsevier BV'
Publication date: 07/04/2016
Field of study

Subspace clustering (SC) is a promising clustering technology to identify clusters based on their associations with subspaces in high dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and well accepted by the scientific community, SSC algorithms are relatively new but gaining more attention in recent years due to better adaptability. In the paper, a comprehensive survey on existing SSC algorithms and the recent development are presented. The SSC algorithms are classified systematically into three main categories, namely, conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201

arXiv.org e-Print Archive

The Hong Kong Polytechnic University Pao Yue-kong Library

PolyU Institutional Repository

Soft topographic map for clustering and classification of bacteria

Author: G.M. Garrity
H. Klock
I.T. Joliffe
J.D. Thompson
J.E. Clarridge III
K. Rose
M. Drancourt
M. Drancourt
M. Drancourt
M. Remm
P. Rice
S. Altschul
S. Dubnov
S. Kumar
S.B. Needleman
S.P. Luttrell
T. Graepel
T. Graepel
T. Hofmann
T. Hofmann
T. Kohonen
T. Kohonen
T.H. Jukes
W.S. Torgerson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well known DNA alignement algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaprotebacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database

Central Archive at the University of Reading

CiteSeerX

Crossref

Archivio istituzionale della ricerca - Università di Palermo

Regulatory motif discovery using a population clustering evolutionary algorithm

Author: Lones Michael A.
Tyrrell Andy M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2007
Field of study

This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences

White Rose Research Online