13,974 research outputs found

The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

Author: A Fiannaca
A Truszkowski
A Ultsch
Alfonso Urso
Antonino Fiannaca
C Borgelt
CA Goble
D Digles
G Di Fatta
Giuseppe Di Fatta
HE Pence
J Hastings
K Wolstencroft
M Hall
Massimo La Rosa
N Belacel
P Ertl
Riccardo Rizzo
S Jupp
S Riniker
Salvatore Gaglio
T Kohonen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is the Fast Learning Self Organizing Map (FLSOM), an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools, together with its extensibility, have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
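
The abstract above builds on the Self Organizing Map. FLSOM itself is not specified here, so the following is only a minimal sketch of the standard online Kohonen SOM, not BioDICE or FLSOM code. The function names, the linear decay schedules, and the rectangular grid are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM with the classic online Kohonen update.

    Assumed schedule (not from the paper): learning rate and neighbourhood
    radius both decay linearly over the whole run.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One weight vector per map unit, initialised from random samples.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Gaussian neighbourhood measured on the 2D grid, not in data space.
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, calling best_matching_unit(weights, x) for each input gives its position on the 2D map, which is the kind of projection an interactive map view could be built on.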

Central Archive at the University of Reading

Crossref

Springer - Publisher Connector

PubMed Central

Archivio istituzionale della ricerca - Università di Palermo

Path finding on a spherical self-organizing map using distance transformations

Author: Dennesen P.J.W.
Kessels A.
Lokker L.
Ramsay G.
van den Keijbus P.
van der Ven A.
van Nieuw Amerongen A.
Veerman E.
Vlasveld M.
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2003

Spatialization methods create visualizations that allow users to analyze high-dimensional data in an intuitive manner and facilitate the extraction of meaningful information. Just as geographic maps are simplified representations of geographic spaces, these visualizations are essentially maps of abstract data spaces that are created through dimensionality reduction. While we are familiar with geographic maps for path planning/finding applications, research into using maps of high-dimensional spaces for such purposes has been largely neglected. However, the literature has shown that it is possible to use these maps to track temporal and state changes within a high-dimensional space. A popular dimensionality reduction method that produces a mapping for these purposes is the Self-Organizing Map. By using its topology-preserving capabilities with a colour-based visualization method known as the U-Matrix, state transitions can be visualized as trajectories on the resulting mapping. Through these trajectories, one can gather information on the transition path between two points in the original high-dimensional state space. This raises the interesting question of whether or not the Self-Organizing Map can be used to discover the transition path between two points in an n-dimensional space. In this thesis, we use a spherically structured Self-Organizing Map called the Geodesic Self-Organizing Map for dimensionality reduction and the creation of a topological mapping that approximates the n-dimensional space. We first present an intuitive method for a user to navigate the surface of the Geodesic SOM. A new application of the distance transformation algorithm is then proposed to compute the path between two points on the surface of the SOM, which corresponds to two points in the data space. A discussion follows on how this application could be improved using some form of surface shape analysis. The new approach presented in this thesis is then evaluated by analyzing the results of using the Geodesic SOM for manifold embedding and by carrying out data analyses using carbon dioxide emissions data.
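
The abstract above proposes using a distance transformation to find a path between two map units. The thesis works on a spherical Geodesic SOM whose details are not given here, so the following is only a rough sketch on a flat rectangular grid. It assumes each cell carries a positive cost (for example a U-Matrix height) and that a step between two adjacent cells costs the mean of their values. All names and the cost model are hypothetical.

```python
import heapq


def neighbours(r, c, rows, cols):
    """Yield the 4-connected neighbours of cell (r, c) inside the grid."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def step_cost(cost, a, b):
    """Assumed cost of moving between adjacent cells: mean of the two cell costs."""
    return 0.5 * (cost[a[0]][a[1]] + cost[b[0]][b[1]])


def distance_transform(cost, goal):
    """Cost-weighted distance from every cell to `goal` (Dijkstra wavefront)."""
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0.0
    heap = [(0.0, goal)]
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell[0]][cell[1]]:
            continue  # stale queue entry
        for nb in neighbours(cell[0], cell[1], rows, cols):
            nd = d + step_cost(cost, cell, nb)
            if nd < dist[nb[0]][nb[1]]:
                dist[nb[0]][nb[1]] = nd
                heapq.heappush(heap, (nd, nb))
    return dist


def trace_path(cost, dist, start):
    """Walk from `start` to the goal by always taking the locally optimal step."""
    rows, cols = len(cost), len(cost[0])
    path = [start]
    cell = start
    while dist[cell[0]][cell[1]] > 0.0:
        cell = min(
            neighbours(cell[0], cell[1], rows, cols),
            key=lambda nb: dist[nb[0]][nb[1]] + step_cost(cost, cell, nb),
        )
        path.append(cell)
    return path
```

With a grid whose middle column is expensive except at the bottom row, the traced path detours around the ridge instead of crossing it, which mirrors the idea of following low-distance valleys on a U-Matrix.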

Maastricht University Research Portal

VU Research Portal

Sydney eScholarship

Radboud Repository

International Migration, Integration and Social Cohesion online publications

Path finding on a spherical self-organizing map using distance transformations

Author: Bui Michael
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2008

Sydney eScholarship

A combined measure for quantifying and qualifying the topology preservation of growing self-organizing maps

Author: Arquero Hidalgo Águeda
Delgado Sanz Maria Soledad
Gonzalo Martín Consuelo
Martínez Izquierdo María Estíbaliz
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

The Self-Organizing Map (SOM) is a neural network model that performs an ordered projection of a high-dimensional input space onto a low-dimensional topological structure. The process in which such a mapping is formed is defined by the SOM algorithm, which is a competitive, unsupervised and nonparametric method, since it does not make any assumption about the input data distribution. The feature maps provided by this algorithm have been successfully applied to vector quantization, clustering and high-dimensional data visualization. However, the initialization of the network topology and the selection of the SOM training parameters are two difficult tasks because the distribution of the input signals is unknown. A misconfiguration of these parameters can generate a low-quality feature map, so it is necessary to have some measure of the degree of adaptation of the SOM network to the input data model. Topology preservation is the most common concept used to implement this measure. Several qualitative and quantitative methods have been proposed for measuring the degree of SOM topology preservation, particularly using Kohonen's model. In this work, two methods for measuring the topology preservation of the Growing Cell Structures (GCSs) model are proposed: the topographic function and the topology preserving ma
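
To make the notion of a quantitative topology preservation measure concrete, here is a small sketch of the topographic error, a commonly used measure for rectangular Kohonen maps. It is not one of the two GCS measures proposed in the paper, whose definitions are not given in this abstract. The adjacency rule (8-neighbourhood on the grid) and the function names are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def topographic_error(weights, data):
    """Fraction of samples whose two closest units are not grid neighbours.

    `weights` maps grid coordinates (row, col) to weight vectors.
    Assumption: units count as adjacent when their Chebyshev grid distance is 1.
    """
    errors = 0
    for x in data:
        ranked = sorted(weights, key=lambda u: sq_dist(weights[u], x))
        best, second = ranked[0], ranked[1]
        if max(abs(best[0] - second[0]), abs(best[1] - second[1])) > 1:
            errors += 1
    return errors / len(data)
```

A value of 0 means that, for every sample, the best and second-best matching units sit next to each other on the map; larger values indicate folds or twists in the mapping.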

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Self-Organizing Time Map: An Abstraction of Temporal Multivariate Patterns

Author: Agarwal
Andrienko
Aupetit
Back
Barreto
Bertin
Chappell
Cottrell
Deboeck
Denny
Fritzke
Guimarães
Guo
Hagenbuchner
Hammer
Harrower
Horio
Kaski
Kohonen
Koskela
Martín-del-Brío
Peter Sarlin
Sammon
Sarlin
Strickert
Vesanto
Voegtlin
Publication venue: 'Elsevier BV'
Publication date: 09/08/2012

This paper adopts and adapts Kohonen's standard Self-Organizing Map (SOM) for exploratory temporal structure analysis. The Self-Organizing Time Map (SOTM) applies SOM-type learning to one-dimensional arrays for individual time units, preserves the orientation with short-term memory and arranges the arrays in ascending order of time. The two-dimensional representation of the SOTM thus attempts twofold topology preservation, where the horizontal direction preserves time topology and the vertical direction preserves data topology. This enables discovering the occurrence and exploring the properties of temporal structural changes in data. For representing qualities and properties of SOTMs, we adapt measures and visualizations from the standard SOM paradigm, and also introduce a measure of temporal structural changes. The functioning of the SOTM, and its visualizations and quality and property measures, are illustrated on artificial toy data. The usefulness of the SOTM in a real-world setting is shown on poverty, welfare and development indicators.

arXiv.org e-Print Archive

Crossref

Somoclu: An Efficient Parallel Library for Self-Organizing Maps

Author: Gao Shi Chao
Lim Ik Soo
Wittek Peter
Zhao Li
Publication venue: 'Foundation for Open Access Statistic'
Publication date: 01/06/2017

Somoclu is a massively parallel tool, written in C++, for training self-organizing maps on large data sets. It builds on OpenMP for multicore execution, and on MPI for distributing the workload across the nodes in a cluster. It is also able to boost training by using CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R and MATLAB interfaces facilitate interactive use. Apart from fast execution, memory use is highly optimized, enabling the training of large emergent maps even on a single computer. Comment: 26 pages, 9 figures. The code is available at https://peterwittek.github.io/somoclu

arXiv.org e-Print Archive

Directory of Open Access Journals

Journal of Statistical Software

Bangor University Research Portal

Batch and median neural gas

Author: Alexander Hasenfuß
Barbara Hammer
Belkin
Blake
Borg
Bottou
Bunke
Cheng
Cottrell
Duda
Fort
Graepel
Guenter
Hammer
Heskes
Kaski
Kohonen
Lundsteen
Marie Cottrell
Martinetz
Mevissen
Murty
Ripley
Seo
Somervuo
Thomas Villmann
Villmann
Zhong
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

Neural Gas (NG) constitutes a very robust clustering algorithm for Euclidean data which does not suffer from the problem of local minima like simple vector quantization, or from topological restrictions like the self-organizing map. Based on the cost function of NG, we introduce a batch variant of NG which shows much faster convergence and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced. We prove convergence of batch and median versions of NG, SOM, and k-means in a unified formulation, and we investigate the behavior of the algorithms in several experiments. Comment: In Special Issue after WSOM 05 Conference, 5-8 September, 2005, Pari
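
As an illustration of the batch idea described above, here is a minimal sketch of a rank-based batch update for Neural Gas on vectorial data. It assumes the usual exponential rank weighting exp(-rank / lambda) and a multiplicative annealing of lambda. The schedule, the default values and the function names are assumptions, and the median variant for proximity data is not covered.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def batch_neural_gas(data, n_prototypes, epochs=30, lambda_start=None,
                     lambda_end=0.01, seed=0):
    """Batch Neural Gas: each epoch recomputes all prototypes at once.

    For every sample the prototypes are ranked by distance; prototype i is
    then set to the mean of all samples weighted by exp(-rank_i / lambda).
    """
    rng = random.Random(seed)
    dim = len(data[0])
    protos = [list(p) for p in rng.sample(data, n_prototypes)]
    if lambda_start is None:
        lambda_start = n_prototypes / 2.0  # assumed starting neighbourhood range
    for t in range(epochs):
        # Multiplicative annealing from lambda_start down to lambda_end.
        lam = lambda_start * (lambda_end / lambda_start) ** (t / max(epochs - 1, 1))
        numer = [[0.0] * dim for _ in protos]
        denom = [0.0] * len(protos)
        for x in data:
            order = sorted(range(len(protos)), key=lambda i: sq_dist(protos[i], x))
            for rank, i in enumerate(order):
                h = math.exp(-rank / lam)
                denom[i] += h
                for d in range(dim):
                    numer[i][d] += h * x[d]
        for i in range(len(protos)):
            if denom[i] > 0.0:  # keep the old prototype if all weights underflow
                protos[i] = [numer[i][d] / denom[i] for d in range(dim)]
    return protos
```

As lambda approaches zero only the closest prototype receives a noticeable weight, so the update approaches the batch k-means step, which is consistent with the unified treatment of NG and k-means mentioned in the abstract.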

arXiv.org e-Print Archive

CiteSeerX

Crossref

Publications at Bielefeld University

HAL-Paris1

eXamine: a Cytoscape app for exploring annotated modules in networks

Author: Bucur Cristina-Iulia
Dinkla Kasper
El-Kebir Mohammed
Klau Gunnar W.
Siderius Marco
Smit Martine J.
Westenberg Michel A.
Publication venue
Publication date: 01/01/2013

Background. Biological networks have growing importance for the interpretation of high-throughput "omics" data. Statistical and combinatorial methods make it possible to obtain mechanistic insights through the extraction of smaller subnetwork modules. Further enrichment analyses provide set-based annotations of these modules. Results. We present eXamine, a set-oriented visual analysis approach for annotated modules that displays set membership as contours on top of a node-link layout. Our approach extends Self Organizing Maps to simultaneously lay out nodes, links, and set contours. Conclusions. We implemented eXamine as a freely available Cytoscape app. Using eXamine, we study a module that is activated by the virally-encoded G-protein coupled receptor US28 and formulate a novel hypothesis about its functioning.

arXiv.org e-Print Archive

CiteSeerX

Repository TU/e

CWI's Institutional Repository