16,687 research outputs found

    A Tutorial on Spectral Clustering

    In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
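
    To make the pipeline described in this abstract concrete, here is a minimal sketch of unnormalized spectral clustering: build a similarity graph, form the graph Laplacian L = D - W, take the eigenvectors of the k smallest eigenvalues, and run k-means on the rows of that embedding. The function names, the Gaussian similarity with width `sigma`, and the farthest-point k-means initialisation are illustrative assumptions, not taken from the tutorial.

```python
import numpy as np


def gaussian_affinity(points, sigma=1.0):
    """Fully connected similarity graph: W[i, j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights


def unnormalized_laplacian(weights):
    """L = D - W, where D is the diagonal degree matrix."""
    return np.diag(weights.sum(axis=1)) - weights


def kmeans(rows, k, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-point initialisation."""
    centers = [rows[0]]
    for _ in range(1, k):
        d = np.min([np.sum((rows - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(rows[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.sum((rows[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            rows[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clustering(points, k, sigma=1.0):
    """Cluster points via the first k eigenvectors of the unnormalized Laplacian."""
    laplacian = unnormalized_laplacian(gaussian_affinity(points, sigma))
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    embedding = eigenvectors[:, :k]  # one row per data point
    return kmeans(embedding, k)


if __name__ == "__main__":
    pts = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
                    [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]], float)
    print(spectral_clustering(pts, k=2))
```

    A dense eigendecomposition is adequate for small examples like this; for large graphs a sparse eigensolver would normally be preferred.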

    Image Segmentation by Making Use of Spectral Clustering

    Import 05/08/2014. The goal of this bachelor's thesis is to verify the possibilities of image segmentation using the spectral decomposition of the Laplacian matrix of the graph (grid) formed by the image, followed by clustering. The assignment is as follows: 1) Get acquainted with spectral clustering (suitable material is, e.g., Ulrike von Luxburg, "A Tutorial on Spectral Clustering"). 2) Implement the spectral clustering method and verify it experimentally for image segmentation. Pay attention to an efficient procedure for computing the eigenvalues and eigenvectors of the Laplacian matrix (e.g. the ARPACK library can be used), as well as to the choice of a suitable clustering method. 3) Extend the method implemented in the previous step to so-called diffusion spectral clustering (Gaura, Sojka: Diffusion Spectral Clustering for Image Segmentation). 4) Test the behaviour of both methods and summarise the results. The software should be written in C/C++. 470 - Katedra aplikované matematiky (Department of Applied Mathematics), výborn
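
    As a rough illustration of step 2 of this assignment, the sketch below builds the Laplacian of the 4-connected pixel grid of a small grayscale image and splits the image in two by the sign of the second-smallest eigenvector (the Fiedler vector). Several things here are assumptions made for brevity: the Gaussian intensity weighting with width `sigma`, the sign-based two-way cut in place of a general clustering step, the dense NumPy eigensolver in place of ARPACK, and Python in place of the C/C++ the assignment asks for.

```python
import numpy as np


def grid_laplacian(image, sigma=0.5):
    """Unnormalized Laplacian of the 4-connected pixel grid of a grayscale image."""
    height, width = image.shape
    n = height * width
    weights = np.zeros((n, n))
    for r in range(height):
        for c in range(width):
            p = r * width + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < height and cc < width:
                    q = rr * width + cc
                    diff = image[r, c] - image[rr, cc]
                    # Similar intensities give a weight near 1, edges a small weight.
                    w = np.exp(-diff ** 2 / (2.0 * sigma ** 2))
                    weights[p, q] = weights[q, p] = w
    return np.diag(weights.sum(axis=1)) - weights


def two_way_segmentation(image, sigma=0.5):
    """Boolean mask from the sign of the Fiedler vector (mask polarity is arbitrary)."""
    laplacian = grid_laplacian(image, sigma)
    _, eigenvectors = np.linalg.eigh(laplacian)  # ascending eigenvalues
    fiedler = eigenvectors[:, 1]
    return (fiedler > 0).reshape(image.shape)


if __name__ == "__main__":
    img = np.zeros((4, 6))
    img[:, 3:] = 1.0  # dark left half, bright right half
    print(two_way_segmentation(img).astype(int))
```

    The dense matrix grows quadratically with the pixel count, which is why the assignment points to a sparse eigensolver such as ARPACK for realistic image sizes.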

    Computer-aided Land Cover Mapping Protocol

    The purpose of the resource is to produce a land cover type map from the digital file of a Landsat satellite image using MultiSpec software. Educational levels: Middle school, High school

    htsint: a Python library for sequencing pipelines that combines data through gene set generation

    Background: Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. Results: We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. Conclusion: The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint

    An introduction to statistical parametric speech synthesis

    Mining a medieval social network by kernel SOM and related methods

    This paper briefly presents several ways to understand the organization of a large social network (several hundred persons). We compare approaches from data mining for clustering the vertices of a graph (spectral clustering, self-organizing algorithms, ...) and provide methods for representing the graph based on these analyses. All these methods are illustrated on a medieval social network, and the way they can help to understand its organization is underlined.

    Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects

    Perceiving the surrounding environment in terms of objects is useful for any general purpose intelligent agent. In this paper, we investigate a fundamental mechanism making object perception possible, namely the identification of spatio-temporally invariant structures in the sensorimotor experience of an agent. We take inspiration from the Sensorimotor Contingencies Theory to define a computational model of this mechanism through a sensorimotor, unsupervised and predictive approach. Our model is based on processing the unsupervised interaction of an artificial agent with its environment. We show how spatio-temporally invariant structures in the environment induce regularities in the sensorimotor experience of an agent, and how this agent, while building a predictive model of its sensorimotor experience, can capture them as densely connected subgraphs in a graph of sensory states connected by motor commands. Our approach is focused on elementary mechanisms, and is illustrated with a set of simple experiments in which an agent interacts with an environment. We show how the agent can build an internal model of moving but spatio-temporally invariant structures by performing a Spectral Clustering of the graph modeling its overall sensorimotor experiences. We systematically examine properties of the model, shedding light more globally on the specificities of the paradigm with respect to methods based on the supervised processing of collections of static images. Comment: 24 pages, 10 figures, published in Frontiers Robotics and A

    Analyzing and clustering neural data

    This thesis aims to analyze neural data as part of an overall effort by the Charles Stark Draper Laboratory to determine an underlying pattern in brain activity in healthy individuals versus patients with a degenerative brain disorder. The neural data comes from ECoG (electrocorticography) applied to either humans or primates. Each ECoG array has electrodes that measure voltage variations, which neuroscientists claim correlate with neurons transmitting signals to one another. ECoG differs from the less invasive technique of EEG (electroencephalography) in that EEG electrodes are placed on a patient's scalp, while ECoG involves drilling small holes in the skull to allow electrodes to be closer to the brain. Because of this, ECoG boasts an exceptionally high signal-to-noise ratio and less susceptibility to artifacts than EEG [6]. While wearing the ECoG caps, the patients are asked to perform a range of different tasks. The tasks performed by patients are partitioned into different levels of mental stress, i.e. how much concentration is presumably required. The specific dataset used in this thesis is derived from cognitive behavior experiments performed on primates at MGH (Massachusetts General Hospital). The content of this thesis can be thought of as a pipelined process. First the data is collected from the ECoG electrodes, then the data is pre-processed via signal processing techniques, and finally the data is clustered via unsupervised learning techniques. For both the pre-processing and the clustering steps, different techniques are applied and then compared against one another. The focus of this thesis is to evaluate clustering techniques when applied to neural data. For the pre-processing step, two types of bandpass filters, a Butterworth filter and a Chebyshev filter, were applied. For the clustering step, three techniques were applied to the data: K-means clustering, spectral clustering and self-tuning spectral clustering. We conclude that for pre-processing the results from both filters are very similar and thus either filter is sufficient. For clustering we conclude that K-means has the lowest amount of overlap between clusters. K-means is also the most time-efficient of the three techniques and is thus the ideal choice for this application. 2016-10-27
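
    The abstract does not say how its self-tuning variant is implemented. As commonly described (following Zelnik-Manor and Perona), self-tuning spectral clustering replaces the single global kernel width with a local scale per point, taken as the distance to that point's K-th nearest neighbour. The sketch below shows only that affinity construction; the function name and the default `k_neighbor=7` are assumptions, and the thesis may have used different settings.

```python
import numpy as np


def local_scaling_affinity(points, k_neighbor=7):
    """Affinity A[i, j] = exp(-d(i, j)^2 / (sigma_i * sigma_j)) with local scales.

    sigma_i is the distance from point i to its k_neighbor-th nearest neighbour.
    Assumes the points are distinct enough that every sigma_i is positive.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbour.
    k = min(k_neighbor, len(points) - 1)
    sigma = np.sort(dists, axis=1)[:, k]
    affinity = np.exp(-dists ** 2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(affinity, 0.0)  # no self-similarity
    return affinity


if __name__ == "__main__":
    tight = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
    loose = 100.0 + 5.0 * tight  # same shape, five times the spacing
    aff = local_scaling_affinity(np.vstack([tight, loose]), k_neighbor=2)
    # Neighbouring pairs get the same affinity in both clusters despite the scale gap.
    print(aff[0, 1], aff[4, 5])
```

    The resulting matrix can be passed to the usual Laplacian and eigenvector steps in place of a fixed-width Gaussian affinity.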