5,866 research outputs found

Techniques for clustering gene expression data

Author: Crane Martin
Doolan Padraig
Kerr Gráinne
Ruskin Heather J.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Adaptive Evolutionary Clustering

Author: AC Harvey
Alfred O. Hero III
DJ Fenn
GW Milligan
H Lütkepohl
H Ning
HW Kuhn
J Schäfer
J Shi
Kevin S. Xu
M Charikar
Mark Kliger
N Eagle
O Ledoit
PJ Mucha
S Haykin
S Tadepalli
T Hastie
T Yang
TW Anderson
U Luxburg von
Y Chen
Y Chi
YR Lin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a static clustering method. In this paper, we introduce a different approach to evolutionary clustering by accurately tracking the time-varying proximities between objects followed by static clustering. We present an evolutionary clustering framework that adaptively estimates the optimal smoothing parameter using shrinkage estimation, a statistical approach that improves a naive estimate using additional information. The proposed framework can be used to extend a variety of static clustering algorithms, including hierarchical, k-means, and spectral clustering, into evolutionary clustering algorithms. Experiments on synthetic and real data sets indicate that the proposed framework outperforms static clustering and existing evolutionary clustering algorithms in many scenarios.

Comment: To appear in Data Mining and Knowledge Discovery, MATLAB toolbox available at http://tbayes.eecs.umich.edu/xukevin/affec
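
The idea of tracking time-varying proximities before clustering can be illustrated with a minimal sketch. It assumes the smoothed estimate at each step is a convex combination of the previous estimate and the current proximity matrix, and it uses a fixed weight `alpha` where the paper estimates the weight adaptively by shrinkage. The function names `smooth_proximities` and `track` are hypothetical, and this is not the authors' MATLAB toolbox.

```python
def smooth_proximities(prev_estimate, current, alpha):
    """Blend the previous smoothed proximity matrix with the current one.

    alpha is a fixed smoothing weight in [0, 1] (an assumption; the paper
    estimates it adaptively at each time step).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_estimate, current)
    ]


def track(proximity_sequence, alpha):
    """Return the smoothed proximity matrix for every time step."""
    estimates = [proximity_sequence[0]]  # no history at the first step
    for current in proximity_sequence[1:]:
        estimates.append(smooth_proximities(estimates[-1], current, alpha))
    return estimates
```

Any static clustering method could then be run on each smoothed matrix returned by `track`, which is how a static algorithm would be turned into an evolutionary one under these assumptions.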

arXiv.org e-Print Archive

CiteSeerX

Crossref

Methods for fast and reliable clustering

Author: Kärkkäinen Ismo
Publication venue: University of Joensuu

UEF Electronic Publications

On Randomly Projected Hierarchical Clustering with Guarantees

Author: Schneider Johannes
Vlachos Michail
Publication date: 22/01/2014

Hierarchical clustering (HC) algorithms are generally limited to small data instances due to their runtime costs. Here we mitigate this shortcoming and explore fast HC algorithms based on random projections for single (SLC) and average (ALC) linkage clustering as well as for the minimum spanning tree problem (MST). We present a thorough adaptive analysis of our algorithms that improve prior work from $O(N^2)$ by up to a factor of $N/(\log N)^2$ for a dataset of $N$ points in Euclidean space. The algorithms maintain, with arbitrarily high probability, the outcome of hierarchical clustering as well as the worst-case running-time guarantees. We also present parameter-free instances of our algorithms.

Comment: This version contains the conference paper "On Randomly Projected Hierarchical Clustering with Guarantees", SIAM International Conference on Data Mining (SDM), 2014 and, additionally, proofs omitted in the conference version
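
As a rough illustration of the building block named in the abstract, the sketch below projects points onto a random line and splits them at a random position along it. It is a generic random-projection split under assumed names (`random_unit_vector`, `project`, `split_by_projection`), not the authors' algorithm, and it carries none of their guarantees.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly at random by normalising Gaussian samples."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(points, direction):
    """Return the scalar projection of every point onto the direction."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def split_by_projection(points, rng):
    """Split point indices into two groups at a random threshold on a random line."""
    direction = random_unit_vector(len(points[0]), rng)
    proj = project(points, direction)
    threshold = rng.uniform(min(proj), max(proj))
    left = [i for i, v in enumerate(proj) if v <= threshold]
    right = [i for i, v in enumerate(proj) if v > threshold]
    return left, right
```

Because the direction has unit length, the distance between two projected points never exceeds their Euclidean distance, so nearby points tend to land on the same side of a split. Applying such splits recursively would confine distance computations to small groups, which is one plausible way to avoid comparing all pairs.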

arXiv.org e-Print Archive

Crossref

Spatial analysis for the distribution of cells in tissue sections

Author: Xu Ruibin
Publication venue: Lunds universitet/Institutionen för naturgeografi och ekosystemvetenskap
Publication date: 01/01/2014

Spatial analysis plays an essential role in data mining and is applied in a considerable number of fields. Because of this broad applicability, interdisciplinary problems are becoming more prevalent. Spatial analysis aims to explore the underlying patterns in data. In this project, we apply methods normally used for spatial problems to investigate how cells are distributed in infected tissue sections and whether clusters exist among the cells. The cells neighboring the viruses are of particular interest. The data were provided by the Medetect Company in the form of 2-dimensional point data. We first adopted two common families of spatial analysis methods: clustering methods and proximity methods. In addition, a method for constructing a 2-dimensional hull was developed in order to delineate the compartments in tissue sections. A binomial test was conducted to evaluate the results. Clusters were detected among the cells, and immune cells accumulate around the viruses. We also found different patterns near to and far from viruses. This study implies that the cells interact with each other and therefore exhibit spatial patterns. However, our analyses are restricted to the plane rather than 3-dimensional space. In further study, the spatial analysis could be carried out in three dimensions.

It has become popular to use heuristic or existing methods to make new findings and explain puzzling phenomena in other subjects. It is also known that everything in nature is related. In this sense, we can assume that the overall distribution of objects depends on the locations of individuals. My work attempts to explore this spatial relationship among cells. In my project, the relationships between individual cells or groups of cells are of interest. Our data are presented as a point cloud. The questions are whether any groups exist among these points and whether the viruses have neighbors. The methods fall into three categories. The first groups similar objects together, where similar objects are those that are close to each other. The second analyzes the degree of closeness between objects and looks for the neighbors of viruses. The last draws the border of a point cloud, much like constructing the boundary of a district. For each method, we carried out corresponding case studies. Since similar objects can be grouped together, it is interesting to look into the details of each group and see which objects in a group are similar. In particular, the different types of cells in the same group can be checked and studied. In the closeness analysis, we found that some cells are indeed closer to each other. The constructed border helps reveal the shape of the point clouds. It can be concluded that a spatial relationship does exist among the cells. Groups of cells can be identified to a large extent, and at a local level one type of cell may be more attracted to certain other cells. However, this study is carried out in 2D space and neglects the real shape of cells, which have height. This could be a more interesting topic in the future.
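
Drawing the border of a 2-dimensional point cloud can be sketched with a convex hull, the simplest kind of hull. The abstract does not describe the hull method developed in the thesis, so the code below substitutes the standard monotone chain algorithm purely as an illustration. The function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

For example, `convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])` returns the four corners and drops the interior point `(1, 1)`. A convex hull cannot follow indentations in a compartment boundary, so real tissue outlines may call for a concave variant.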

Automated video-based analysis of human operators in mixed-model assembly work stations

Author: Bauters Karel
Publication date: 25/06/2021

Ghent University Academic Bibliography