A Tutorial on Spectral Clustering
In recent years, spectral clustering has become one of the most popular
modern clustering algorithms. It is simple to implement, can be solved
efficiently by standard linear algebra software, and very often outperforms
traditional clustering algorithms such as the k-means algorithm. At first
glance, spectral clustering appears slightly mysterious, and it is not obvious
why it works at all or what it really does. The goal of this tutorial
is to give some intuition on those questions. We describe different graph
Laplacians and their basic properties, present the most common spectral
clustering algorithms, and derive those algorithms from scratch by several
different approaches. Advantages and disadvantages of the different spectral
clustering algorithms are discussed.
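As a rough illustration of the kind of algorithm the tutorial describes, the sketch below builds a fully connected Gaussian similarity graph, forms the unnormalized graph Laplacian L = D - W, and clusters the rows of the first k eigenvectors with k-means. The function names, the `sigma` parameter, the toy data and the small k-means routine are assumptions made for this example; they are not taken from the tutorial.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Fully connected graph with w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def kmeans(Y, k, n_iter=50):
    """Lloyd iterations with a deterministic farthest-point initialisation."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.sum((Y[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    W = gaussian_similarity(X, sigma)
    L = np.diag(W.sum(axis=1)) - W   # unnormalized Laplacian L = D - W
    _, vecs = np.linalg.eigh(L)      # eigenvalues returned in ascending order
    U = vecs[:, :k]                  # eigenvectors of the k smallest eigenvalues
    return kmeans(U, k)              # cluster the rows of U

# Toy data (assumed): two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels = spectral_clustering(X, k=2)
print(labels)
```

A dense eigendecomposition is used only because the toy problem is tiny; for large graphs a sparse eigensolver would be the more natural choice.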
Image Segmentation by Making Use of Spectral Clustering
The goal of this bachelor's thesis is to verify the possibilities of image segmentation using the spectral decomposition of the Laplacian matrix of the graph (grid) formed by the image, followed by clustering. In the thesis, do the following:
1) Get acquainted with the technique of spectral clustering (e.g. from the paper "A Tutorial on Spectral Clustering" by Ulrike von Luxburg).
2) Implement spectral clustering and verify it experimentally for image segmentation. Pay attention to an efficient method of computing the eigenvalues and eigenvectors of the Laplacian matrix (e.g. the ARPACK library can be used), as well as to the selection of a suitable clustering method.
3) Extend the method implemented in the previous step to diffusion spectral clustering (Gaura, Sojka: Diffusion Spectral Clustering for Image Segmentation).
4) Test the behaviour of the methods and summarise the results.
The software should be written in C/C++.
470 - Katedra aplikované matematiky (Department of Applied Mathematics)
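Step 2 of the assignment can be sketched as follows, in Python rather than the C/C++ the assignment asks for. The sketch builds a 4-neighbour grid graph over the pixels, weights each edge by intensity similarity, and computes the two smallest eigenpairs of the sparse Laplacian with `scipy.sparse.linalg.eigsh`, which wraps ARPACK. The weight function, the `sigma` value, the toy image and the sign-based split into two segments are assumptions chosen for illustration; the thesis itself may use different choices.

```python
import numpy as np
from scipy.sparse import coo_matrix, diags
from scipy.sparse.linalg import eigsh

def grid_laplacian(img, sigma=0.5):
    """Sparse Laplacian of the 4-neighbour pixel grid with Gaussian intensity weights."""
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)
    flat = img.ravel()
    rows, cols, vals = [], [], []
    # Horizontal neighbour pairs, then vertical neighbour pairs.
    for a, b in ((idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])):
        a, b = a.ravel(), b.ravel()
        wgt = np.exp(-(flat[a] - flat[b]) ** 2 / (2.0 * sigma ** 2))
        rows += [a, b]
        cols += [b, a]
        vals += [wgt, wgt]  # symmetric weights
    W = coo_matrix(
        (np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
        shape=(h * w, h * w),
    ).tocsr()
    D = diags(np.asarray(W.sum(axis=1)).ravel())
    return (D - W).tocsc()

# Toy image (assumed): dark left half, bright right half.
img = np.zeros((6, 10))
img[:, 5:] = 1.0

L = grid_laplacian(img)
# Shift-invert around a small negative shift keeps the factorised matrix
# nonsingular while targeting the eigenvalues closest to zero.
vals, vecs = eigsh(L, k=2, sigma=-1e-3, which="LM")
fiedler = vecs[:, np.argsort(vals)[1]]  # eigenvector of the second-smallest eigenvalue
segments = (fiedler > 0).astype(int).reshape(img.shape)
print(segments)
```

Thresholding the second eigenvector at zero is the simplest way to obtain two segments; for more segments, one would typically run a clustering method such as k-means on several eigenvectors.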
Computer-aided Land Cover Mapping Protocol
The purpose of the resource is to produce a land cover type map from the digital file of a Landsat satellite image using MultiSpec software. Educational levels: Middle school, High school
htsint: a Python library for sequencing pipelines that combines data through gene set generation
Background: Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses.
Results: We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages.
Conclusion: The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint
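The step from functional distances to functional modules can be illustrated with a generic sketch that does not use the htsint API. It converts a distance matrix into affinities, forms the symmetric normalized Laplacian, and applies the eigengap heuristic to suggest how many modules to look for. The six hypothetical genes, the distance values, the `sigma` parameter and all function names are assumptions made for this example.

```python
import numpy as np

def affinity_from_distance(D, sigma):
    """Gaussian affinities from pairwise distances; the diagonal is set to zero."""
    A = np.exp(-(D ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A

def normalized_laplacian_eigenvalues(A):
    """Ascending eigenvalues of L_sym = I - D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.linalg.eigvalsh(L_sym)

def eigengap_k(eigvals, k_max):
    """Number of clusters suggested by the largest gap among the first k_max + 1 eigenvalues."""
    gaps = np.diff(eigvals[: k_max + 1])
    return int(np.argmax(gaps)) + 1

# Hypothetical gene space: six genes in three groups of two,
# close within a group (0.1) and far apart between groups (1.0).
groups = np.array([0, 0, 1, 1, 2, 2])
D = np.where(groups[:, None] == groups[None, :], 0.1, 1.0)
np.fill_diagonal(D, 0.0)

A = affinity_from_distance(D, sigma=0.3)
eigvals = normalized_laplacian_eigenvalues(A)
k = eigengap_k(eigvals, k_max=5)
print(k)
```

Once a value of k has been chosen, the genes would be partitioned by clustering the rows of the first k eigenvectors.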
Mining a medieval social network by kernel SOM and related methods
This paper briefly presents several ways to understand the organization of a
large social network (several hundreds of persons). We compare approaches
coming from data mining for clustering the vertices of a graph (spectral
clustering, self-organizing algorithms, ...) and provide methods for
representing the graph from these analyses. All these methods are illustrated
on a medieval social network, and the way they can help to understand its
organization is underlined.
Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects
Perceiving the surrounding environment in terms of objects is useful for any
general purpose intelligent agent. In this paper, we investigate a fundamental
mechanism making object perception possible, namely the identification of
spatio-temporally invariant structures in the sensorimotor experience of an
agent. We take inspiration from the Sensorimotor Contingencies Theory to define
a computational model of this mechanism through a sensorimotor, unsupervised
and predictive approach. Our model is based on processing the unsupervised
interaction of an artificial agent with its environment. We show how
spatio-temporally invariant structures in the environment induce regularities
in the sensorimotor experience of an agent, and how this agent, while building
a predictive model of its sensorimotor experience, can capture them as densely
connected subgraphs in a graph of sensory states connected by motor commands.
Our approach is focused on elementary mechanisms, and is illustrated with a set
of simple experiments in which an agent interacts with an environment. We show
how the agent can build an internal model of moving but spatio-temporally
invariant structures by performing a Spectral Clustering of the graph modeling
its overall sensorimotor experiences. We systematically examine properties of
the model, shedding light more globally on the specificities of the paradigm
with respect to methods based on the supervised processing of collections of
static images.
Comment: 24 pages, 10 figures, published in Frontiers Robotics and A
Analyzing and clustering neural data
This thesis aims to analyze neural data as part of an overall effort by the Charles Stark Draper Laboratory to determine an underlying pattern in brain activity in healthy individuals versus patients with a degenerative brain disorder. The neural data comes from ECoG (electrocorticography) applied to either humans or primates. Each ECoG array has electrodes that measure voltage variations, which neuroscientists claim correlate with neurons transmitting signals to one another. ECoG differs from the less invasive technique of EEG (electroencephalography) in that EEG electrodes are placed on a patient's scalp, while ECoG involves drilling small holes in the skull to allow electrodes to be closer to the brain. Because of this, ECoG boasts an exceptionally high signal-to-noise ratio and less susceptibility to artifacts than EEG [6]. While wearing the ECoG caps, the patients are asked to perform a range of different tasks.
The tasks performed by patients are partitioned into different levels of mental stress, i.e. how much concentration is presumably required. The specific dataset used in this thesis is derived from cognitive behavior experiments performed on primates at MGH (Massachusetts General Hospital).
The content of this thesis can be thought of as a pipelined process. First, the data is collected from the ECoG electrodes; then the data is pre-processed via signal processing techniques; and finally the data is clustered via unsupervised learning techniques. For both the pre-processing and the clustering steps, different techniques are applied and then compared against one another. The focus of this thesis is to evaluate clustering techniques when applied to neural data.
For the pre-processing step, two types of bandpass filters, a Butterworth filter and a Chebyshev filter, were applied. For the clustering step, three techniques were applied to the data: K-means Clustering, Spectral Clustering and Self-Tuning Spectral Clustering. We conclude that for pre-processing the results from both filters are very similar and thus either filter is sufficient. For clustering, we conclude that K-means has the lowest amount of overlap between clusters. K-means is also the most time-efficient of the three techniques and is thus the ideal choice for this application.
2016-10-27T00:00:00
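The pre-processing comparison described above can be sketched with SciPy's filter design routines. The sampling rate, passband, filter order, ripple value and synthetic test signal below are all assumptions for illustration; the abstract does not state the settings used in the thesis, and the Chebyshev filter is taken here to be type I.

```python
import numpy as np
from scipy.signal import butter, cheby1, sosfiltfilt

fs = 500.0           # assumed sampling rate in Hz
band = (5.0, 30.0)   # assumed passband in Hz
order = 4            # assumed filter order

# Synthetic stand-in for one electrode: a 10 Hz component inside the
# passband plus a 60 Hz interference component outside it.
t = np.arange(0, 2.0, 1.0 / fs)
wanted = np.sin(2 * np.pi * 10.0 * t)
signal = wanted + 0.5 * np.sin(2 * np.pi * 60.0 * t)

# Design both bandpass filters as second-order sections for numerical stability.
sos_butter = butter(order, band, btype="bandpass", fs=fs, output="sos")
sos_cheby = cheby1(order, 1.0, band, btype="bandpass", fs=fs, output="sos")  # 1 dB ripple

# Zero-phase filtering (forward and backward pass).
out_butter = sosfiltfilt(sos_butter, signal)
out_cheby = sosfiltfilt(sos_cheby, signal)

# Correlation between the two filtered signals as a crude similarity measure.
similarity = np.corrcoef(out_butter, out_cheby)[0, 1]
print(f"correlation between filter outputs: {similarity:.3f}")
```

On real recordings the comparison would be repeated per electrode and per task, with the filtered signals passed on to the clustering step.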