2,595 research outputs found

A survey of kernel and spectral methods for clustering

Author: Aizerman
Aronszajn
Belkin
Bengio
Bezdek
Bishop
Burges
Camastra
Chan
Chen
Chiang
Cortes
Cristianini
Cristianini
Dhillon
Dhillon
Donath
Duda
Fiedler
Fisher
Francesco Camastra
Francesco Masulli
Gersho
Girolami
Golub
Have
Horn
Huber
Hur
Jain
Kernighan
Kluger
Kohonen
Kohonen
Krishnapuram
Krishnapuram
Kulis
Lee
Leski
Linde
Lloyd
Martinetz
Maurizio Filippone
Mercer
Müller
Ng
Ritter
Rose
Roth
Roweis
Saitoh
Schölkopf
Schölkopf
Shi
Sigillito
Sneath
Stefano Rovetta
Tax
Vapnik
von Luxburg
Ward
Weston
Wolberg
Xu
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Clustering algorithms are a useful tool for exploring data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
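As a rough illustration of what a "kernel version" of K-means can look like, the sketch below runs the usual assign-and-update loop using only a kernel matrix, so that cluster means live implicitly in feature space. It is a minimal sketch and not the survey's own formulation: the Gaussian kernel, the function names `rbf_kernel` and `kernel_kmeans`, and the choice to pass initial labels explicitly are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, init_labels, n_iter=50):
    """Hard kernel K-means on a precomputed kernel matrix K (illustrative helper)."""
    labels = np.asarray(init_labels).copy()
    n = K.shape[0]
    n_clusters = int(labels.max()) + 1
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            m = int(mask.sum())
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels
```

The distance to each cluster mean is expanded entirely in terms of kernel entries, which is why no explicit feature map is needed. Like ordinary K-means, the loop only reaches a local optimum, so the result depends on the initial labels.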

CiteSeerX

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Crossref

Enlighten

Archivio istituzionale della ricerca - Università di Genova

White Rose Research Online

Structure in the 3D Galaxy Distribution: I. Methods and Example Results

Author: Abazajian
Abazajian
Adelman-McCarthy
Andersen
Barrow
Blanton
Blanton
Blanton
Bok
Cappellari
Choi
Connolly
Cowan
Croft
Daley
Daley
de Berg
de Vaucouleurs
de Vaucouleurs
DeSieno
Doroshkevich
Efstathiou
Einasto
Gazis
Gomez
Gott
Gray
Gray
Hogg
Holmberg
Hubble
Hubble
Icke
Ikeuchi
Ivezić
Ivezić
Jackson
Jeffrey D. Scargle
Kim
Kohonen
Krzewina
Kutoyants
M. J. Way
Martinez
Melnyk
Merényi
Messier
Moore
Neyman
Neyman
Okabe
P. R. Gazis
Papoulis
Paredes
Pearson
Peebles
Preparata
Ramella
Reiz
Ritter
Saslaw
Scargle
Scargle
Schaap
Schaap
Schlegel
Shandarin
Shane
Silverman
Slezak
Snyder
Soares-Santos
Sousbie
Sousbie
Stein
Stoyan
Strauss
Szapudi
Totsuji
Ueda
van de Weygaert
van de Weygaert
van de Weygaert
Wright
York
Zehavi
Zehavi
Zel'dovich
Zhang
Zwicky
Publication venue: 'IOP Publishing'
Publication date: 02/12/2010

Three methods for detecting and characterizing structure in point data, such as that generated by redshift surveys, are described: classification using self-organizing maps, segmentation using Bayesian blocks, and density estimation using adaptive kernels. The first two methods are new, and allow detection and characterization of structures of arbitrary shape and at a wide range of spatial scales. These methods should elucidate not only clusters, but also the more distributed, wide-ranging filaments and sheets, and further allow the possibility of detecting and characterizing an even broader class of shapes. The methods are demonstrated and compared in application to three data sets: a carefully selected volume-limited sample from the Sloan Digital Sky Survey redshift data, a similarly selected sample from the Millennium Simulation, and a set of points independently drawn from a uniform probability distribution -- a so-called Poisson distribution. We demonstrate a few of the many ways in which these methods elucidate large scale structure in the distribution of galaxies in the nearby Universe. Comment: Re-posted after referee corrections along with partially re-written introduction. 80 pages, 31 figures, ApJ in Press. For full sized figures please download from: http://astrophysics.arc.nasa.gov/~mway/lss1.pd

arXiv.org e-Print Archive

Crossref

Clustering comparison of point processes with applications to random geometric models

Author: Błaszczyszyn Bartłomiej
Yogeshwaran D.
Publication venue
Publication date: 01/01/2014

In this chapter we review some examples, methods, and recent results involving comparison of clustering properties of point processes. Our approach is founded on some basic observations allowing us to consider void probabilities and moment measures as two complementary tools for capturing clustering phenomena in point processes. As might be expected, smaller values of these characteristics indicate less clustering. Also, various global and local functionals of random geometric models driven by point processes admit more or less explicit bounds involving void probabilities and moment measures, thus aiding the study of the impact of clustering of the underlying point process. When stronger tools are needed, directional convex ordering of point processes happens to be an appropriate choice, as well as the notion of (positive or negative) association, when comparison to the Poisson point process is considered. We explain the relations between these tools and provide examples of point processes admitting them. Furthermore, we sketch some recent results obtained using the aforementioned comparison tools, regarding percolation and coverage properties of the Boolean model, the SINR model, subgraph counts in random geometric graphs, and more generally, U-statistics of point processes. We also mention some results on Betti numbers for Čech and Vietoris-Rips random complexes generated by stationary point processes. A general observation is that many of the results derived previously for the Poisson point process generalise to some "sub-Poisson" processes, defined as those clustering less than the Poisson process in the sense of void probabilities and moment measures, negative association or dcx-ordering. Comment: 44 pages, 4 figures

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Towards Stratification Learning through Homology Inference

Author: Bendich Paul
Mukherjee Sayan
Wang Bei
Publication venue
Publication date: 01/01/2010

A topological approach to stratification learning is developed for point cloud data drawn from a stratified space. Given such data, our objective is to infer which points belong to the same strata. First we define a multi-scale notion of a stratified space, giving a stratification for each radius level. We then use methods derived from kernel and cokernel persistent homology to cluster the data points into different strata, and we prove a result which guarantees the correctness of our clustering, given certain topological conditions; some geometric intuition for these topological conditions is also provided. Our correctness result is then given a probabilistic flavor: we give bounds on the minimum number of sample points required to infer, with probability, which points belong to the same strata. Finally, we give an explicit algorithm for the clustering, prove its correctness, and apply it to some simulated data. Comment: 48 pages

arXiv.org e-Print Archive

CiteSeerX

Statistical properties of determinantal point processes in high-dimensional Euclidean spaces

Author: Antonello Scardicchio
Chase E. Zachary
F. Mezzadri
F. Schwabl
L. Devroye
M. L. Mehta
S. Torquato
S. Torquato
Salvatore Torquato
Publication venue: 'American Physical Society (APS)'
Publication date: 27/10/2008

The goal of this paper is to quantitatively describe some statistical properties of higher-dimensional determinantal point processes with a primary focus on the nearest-neighbor distribution functions. Toward this end, we express these functions as determinants of $N\times N$ matrices and then extrapolate to $N\to\infty$. This formulation allows for a quick and accurate numerical evaluation of these quantities for point processes in Euclidean spaces of dimension $d$. We also implement an algorithm due to Hough et al. \cite{hough2006dpa} for generating configurations of determinantal point processes in arbitrary Euclidean spaces, and we utilize this algorithm in conjunction with the aforementioned numerical results to characterize the statistical properties of what we call the Fermi-sphere point process for $d = 1$ to 4. This homogeneous, isotropic determinantal point process, discussed also in a companion paper \cite{ToScZa08}, is the high-dimensional generalization of the distribution of eigenvalues on the unit circle of a random matrix from the circular unitary ensemble (CUE). In addition to the nearest-neighbor probability distribution, we are able to calculate Voronoi cells and nearest-neighbor extrema statistics for the Fermi-sphere point process and discuss these as the dimension $d$ is varied. The results in this paper accompany and complement analytical properties of higher-dimensional determinantal point processes developed in \cite{ToScZa08}. Comment: 42 pages, 17 figures
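For the one-dimensional case mentioned in the abstract, eigenvalues of a CUE matrix on the unit circle, a small numerical experiment can make the nearest-neighbor statistics concrete. The sketch below is not the Hough et al. algorithm the paper implements. It instead draws a Haar-distributed unitary matrix with the standard QR-with-phase-correction recipe and measures angular gaps between eigenvalues. The function names `sample_cue` and `nearest_neighbor_gaps` are assumptions made for this example.

```python
import numpy as np

def sample_cue(n, rng):
    """Draw an n x n Haar-distributed unitary matrix (QR of a complex Ginibre matrix)."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    # Rescale each column by the phase of the matching diagonal entry of R;
    # without this correction the QR output is not Haar distributed.
    return q * (d / np.abs(d))

def nearest_neighbor_gaps(u):
    """Return (consecutive angular gaps, nearest-neighbor gap per eigenvalue)."""
    theta = np.sort(np.angle(np.linalg.eigvals(u)))
    gaps = np.diff(theta)
    wrap = theta[0] + 2.0 * np.pi - theta[-1]  # gap across the branch cut
    gaps = np.append(gaps, wrap)
    # gaps[i] is the gap to the right of point i; the left gap is gaps[i - 1].
    nearest = np.minimum(gaps, np.roll(gaps, 1))
    return gaps, nearest
```

Collecting `nearest` over many independent draws gives an empirical nearest-neighbor distribution for a finite matrix size, which can then be compared against analytical or extrapolated results.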

arXiv.org e-Print Archive

Crossref

A Comparative Study of Density Field Estimation for Galaxies: New Insights into the Evolution of Galaxies with Environment in COSMOS out to z~3

Author: Aragon-Calvo Miguel
Darvish Behnam
Mobasher Bahram
Scoville Nicholas
Sobral David
Publication venue: 'IOP Publishing'
Publication date: 26/03/2015

It is well-known that galaxy environment has a fundamental effect in shaping its properties. We study the environmental effects on galaxy evolution, with an emphasis on the environment defined as the local number density of galaxies. The density field is estimated with different estimators (weighted adaptive kernel smoothing, 10$^{th}$ and 5$^{th}$ nearest neighbors, Voronoi and Delaunay tessellation) for a $K_{s} < 24$ sample of $\sim$190,000 galaxies in the COSMOS field at $0.1 < z < 3.1$. The performance of each estimator is evaluated with extensive simulations. We show that overall, there is a good agreement between the estimated density fields using different methods over $\sim$2 dex in overdensity values. However, our simulations show that adaptive kernel and Voronoi tessellation outperform other methods. Using the Voronoi tessellation method, we assign surface densities to a mass complete sample of quiescent and star-forming galaxies out to z $\sim$ 3. We show that at a fixed stellar mass, the median color of quiescent galaxies does not depend on their host environment out to z $\sim$ 3. We find that the number and stellar mass density of massive ($> 10^{11}\,M_{\odot}$) star-forming galaxies have not significantly changed since z $\sim$ 3, regardless of their environment. However, for massive quiescent systems at lower redshifts (z $\lesssim$ 1.3), we find a significant evolution in the number and stellar mass densities in denser environments compared to lower density regions. Our results suggest that the relation between stellar mass and local density is more fundamental than the color-density relation and that environment plays a significant role in quenching star formation activity in galaxies at z $\lesssim$ 1. Comment: 20 pages, 11 figures, main figures 4,5,8 and 1
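To make the nearest-neighbor estimators mentioned in the abstract concrete, the sketch below computes a k-th nearest neighbor surface density for points in a plane, using the convention that the density is k divided by the area of the circle reaching the k-th neighbor. This normalization and the function name `knn_surface_density` are assumptions for illustration. The paper may define or weight its estimators differently, and the brute-force distance matrix used here only suits small samples.

```python
import numpy as np

def knn_surface_density(points, k=10):
    """Estimate the 2D density at each point as k / (pi * r_k^2),
    where r_k is the distance to its k-th nearest neighbor."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbor.
    r_k = np.sort(dist, axis=1)[:, k]
    return k / (np.pi * r_k ** 2)
```

On a finite sample, points near the boundary have neighbors on one side only, so their estimated density is biased low. This is one reason density estimators are usually checked against simulations, as the abstract describes.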

arXiv.org e-Print Archive

Caltech Authors

Lancaster E-Prints