7 research outputs found

Author index

Publication venue: Published by Elsevier B.V.

ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time

Author: Altschul; Bespamyatnikh; Borneman; Caporaso; Cole; Cormen; Dethlefsen; Eckburg; Edgar; Eisen; Fabrice; Franti; Fred; Huber; Huse; Kleinberg; Krznaric; Li; Murtagh; Needleman; Peterson; Quince; Sait; Schloss; Sogin; Sun; Turnbaugh; Wang; Ward; White; Yanagisawa; Yijun Sun; Yunpeng Cai; Zhang
Publication venue: Oxford University Press

Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree built on a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits quasilinear time and space complexity comparable to that of greedy heuristic clustering algorithms, while achieving accuracy similar to that of the standard hierarchical clustering algorithm.
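The idea of summarizing a cluster as a probabilistic sequence can be illustrated with a minimal sketch. This is not the ESPRIT-Tree implementation: it assumes the sequences are already aligned to equal length, and the names `profile` and `profile_distance`, the five-symbol alphabet, and the mismatch-probability distance are all assumptions made for illustration. The alignment operations described in the abstract are not reproduced here.

```python
from collections import Counter

# Hypothetical alphabet: four nucleotides plus a gap symbol.
ALPHABET = "ACGT-"

def profile(aligned_seqs):
    """Per-column symbol frequencies for equal-length, pre-aligned sequences."""
    n = len(aligned_seqs)
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned_seqs)
        columns.append({c: counts.get(c, 0) / n for c in ALPHABET})
    return columns

def profile_distance(p, q):
    """Mean probability that symbols drawn from the two profiles differ."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    mismatch = [1.0 - sum(a[c] * b[c] for c in ALPHABET) for a, b in zip(p, q)]
    return sum(mismatch) / len(mismatch)

cluster_a = profile(["ACGT", "ACGA"])  # toy cluster of two sequences
cluster_b = profile(["ACGT"])          # toy cluster of one sequence
print(profile_distance(cluster_a, cluster_b))
```

Comparing two profiles costs time proportional to the sequence length, regardless of how many sequences each cluster contains, which is the motivation for avoiding all pairwise distances between cluster members.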

Crossref

PubMed Central

Master index

Publication venue: Published by Elsevier B.V.

Elsevier - Publisher Connector

Robust identification of Parkinson's disease subtypes using radiomics and hybrid machine learning

Author: Hajianfar Ghasem; Rahmim Arman; Saberi Abdollah; Salmanpour Mohammad R; Shamsaei Mojtaba; Soltanian-Zadeh Hamid
Publication venue: Henry Ford Health Scholarly Commons
Publication date: 01/02/2021

OBJECTIVES: It is important to subdivide Parkinson's disease (PD) into subtypes, enabling potentially earlier disease recognition and tailored treatment strategies. We aimed to identify reproducible PD subtypes robust to variations in the number of patients and features.
METHODS: We applied multiple feature-reduction and cluster-analysis methods to cross-sectional and timeless data, extracted from longitudinal datasets (years 0, 1, 2 & 4; Parkinson's Progressive Marker Initiative; 885 PD/163 healthy-control visits; 35 datasets with combinations of non-imaging, conventional-imaging, and radiomics features from DAT-SPECT images). Hybrid machine-learning systems were constructed invoking 16 feature-reduction algorithms, 8 clustering algorithms, and 16 classifiers (C-index clustering evaluation used on each trajectory). We subsequently performed: i) identification of optimal subtypes, ii) multiple independent tests to assess reproducibility, iii) further confirmation by a statistical approach, iv) test of reproducibility to the size of the samples.
RESULTS: When using no radiomics features, the clusters were not robust to variations in features, whereas utilizing radiomics information enabled consistent generation of clusters through ensemble analysis of trajectories. We arrived at 3 distinct subtypes, confirmed using the training and testing process of k-means, as well as Hotelling's T² test. The 3 identified PD subtypes were 1) mild; 2) intermediate; and 3) severe, especially in terms of dopaminergic deficit (imaging), with some escalating motor and non-motor manifestations.
CONCLUSION: Appropriate hybrid systems and independent statistical tests enable robust identification of 3 distinct PD subtypes. This was assisted by utilizing radiomics features from SPECT images (segmented using MRI). The PD subtypes provided were robust to the number of subjects and features.
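The "training and testing process of k-means" mentioned in the abstract can be pictured with a small sketch: fit centroids on a training set, then assign held-out points to the nearest centroid. This is an assumption-laden illustration, not the authors' pipeline. The function names, the one-dimensional toy scores, and the fixed initial centroids are all hypothetical, and no feature reduction or classifier stage is included.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm starting from caller-supplied initial centroids."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Hypothetical 1-D "severity scores"; not data from the study.
train = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]
test = [(9.0,), (19.0,)]
centroids = kmeans(train, init_centroids=[(0.0,), (10.0,), (20.0,)])
print(centroids)
print([nearest(p, centroids) for p in test])
```

If the held-out points fall into the same clusters across repeated splits, that is one informal sign that the subtypes are reproducible.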

Henry Ford Health System Scholarly Commons

Novel concepts for lipid identification from shotgun mass spectra using a customized query language

Author: Herzog Ronny
Publication date: 30/05/2012

Lipids are the main component of semipermeable cell membranes and are linked to several important physiological processes. Shotgun lipidomics relies on the direct infusion of total lipid extracts from cells, tissues or organisms into the mass spectrometer and is a powerful tool to elucidate their molecular composition. Despite the technical advances in modern mass spectrometry, the currently available software underperforms in several aspects of the lipidomics pipeline. This thesis addresses these issues by presenting a new concept for lipid identification using a customized query language for mass spectra in combination with efficient spectra alignment algorithms, which are implemented in the open source kit “LipidXplorer”.

Technische Universität Dresden: Qucosa

Optimal algorithms for complete linkage clustering in d dimensions

Author: Krznaric Drago; Levcopoulos Christos
Publication venue: Elsevier BV
Publication date: 01/01/2002

It is shown that the complete linkage clustering of n points in R^d, where d ≥ 1 is a constant, can be computed in optimal O(n log n) time and linear space under the L_1 and L_∞ metrics. Furthermore, for every other fixed L_t metric, it is shown that it can be approximated within an arbitrarily small constant factor in O(n log n) time and linear space.
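For readers unfamiliar with complete linkage, the following sketch shows what is being computed, using the L_1 metric. It is a brute-force baseline written for clarity, not the O(n log n) algorithm of the paper, and it runs far slower on large inputs. The function names and the stopping rule (merge until k clusters remain) are assumptions chosen for illustration.

```python
def l1(a, b):
    """L_1 (Manhattan) distance between two points given as tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def complete_linkage(points, k):
    """Merge clusters greedily until k remain; returns lists of point indices."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(l1(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]

pts = [(0, 0), (0, 1), (5, 5), (6, 5), (20, 20)]
print(complete_linkage(pts, 3))
print(complete_linkage(pts, 2))
```

Each merge step scans every pair of clusters and every pair of their members, which is the cost the paper's result avoids.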

Lund University Publications

Elsevier - Publisher Connector

Optimal algorithms for complete linkage clustering in d dimensions

Author: D. Defays; F. Aurenhammer; F. Murtagh; F. P. Preparata; G. N. Lance; J. C. Gower; R. R. Sokal; T. Kurita; W. H. E. Day; X. Li
Publication venue: Springer Science and Business Media LLC

Crossref