43,455 research outputs found

    jClust: a clustering and visualization toolbox

    jClust is a user-friendly application that provides access to a set of widely used clustering and clique-finding algorithms. The toolbox allows a range of filtering procedures to be applied and is combined with an advanced implementation of the Medusa interactive visualization module. The implemented algorithms are k-Means, Affinity propagation, Bron–Kerbosch, MULIC, Restricted neighborhood search cluster algorithm, Markov clustering and Spectral clustering, while the supported filtering procedures are haircut, outside–inside, best neighbors and density control operations. The combination of a simple input file format with a set of clustering and filtering algorithms linked to the visualization tool provides a powerful tool for data analysis and information extraction.

    Focused multidimensional scaling : interactive visualization for exploration of high-dimensional data

    Background: Visualization is an important tool for generating meaning from scientific data, but the visualization of structures in high-dimensional data (such as from high-throughput assays) presents unique challenges. Dimension reduction methods are key in solving this challenge, but these methods can be misleading, especially when apparent clustering in the dimension-reducing representation is used as the basis for reasoning about relationships within the data. Results: We present two interactive visualization tools, distnet and focusedMDS, that help in assessing the validity of a dimension-reducing plot and in interactively exploring relationships between objects in the data. The distnet tool is used to examine discrepancies between the placement of points in a two-dimensional visualization and the points' actual similarities in feature space. The focusedMDS tool is an intuitive, interactive multidimensional scaling tool for exploring the relationships of one particular data point to the others, which might be useful in a personalized medicine framework. Conclusions: We introduce here two freely available tools for visually exploring and verifying the validity of dimension-reducing visualizations and the biological information gained from them. The use of such tools can confirm that conclusions drawn from dimension-reducing visualizations are not simply artifacts of the visualization method, but are real biological insights. Peer reviewed.
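The kind of discrepancy check described for distnet can be illustrated with a small sketch. This is not the distnet implementation; the function name `misleading_pairs`, the two thresholds, and the toy data are assumptions made for illustration only. It flags pairs of points that sit close together in a 2D plot although they are far apart in the original feature space.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def misleading_pairs(features, embedding, near_2d, far_feature):
    """Return index pairs that look close in the plot but are far apart in feature space.

    Hypothetical helper: `near_2d` and `far_feature` are user-chosen thresholds.
    """
    flagged = []
    for i, j in combinations(range(len(features)), 2):
        d_plot = euclidean(embedding[i], embedding[j])
        d_feat = euclidean(features[i], features[j])
        if d_plot <= near_2d and d_feat >= far_feature:
            flagged.append((i, j))
    return flagged


# Toy data: point 2 is plotted next to points 0 and 1 but differs strongly
# from both in the third feature.
features = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.0, 5.0)]
embedding = [(0.0, 0.0), (0.1, 0.0), (0.05, 0.1)]

print(misleading_pairs(features, embedding, near_2d=0.5, far_feature=2.0))
# → [(0, 2), (1, 2)]
```

Pairs returned by such a check are the ones where apparent clustering in the plot should not be trusted without looking at the underlying distances.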

    The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

    Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: Its extensibility and the number and variety of its available tools have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.

    MarVis: a tool for clustering and visualization of metabolic biomarkers

    Background: A central goal of experimental studies in systems biology is to identify meaningful markers that are hidden within a diffuse background of data originating from large-scale analytical intensity measurements as obtained from metabolomic experiments. Intensity-based clustering is an unsupervised approach to the identification of metabolic markers based on the grouping of similar intensity profiles. A major problem of this basic approach is that in general there is no prior information about an adequate number of biologically relevant clusters. Results: We present the tool MarVis (Marker Visualization) for data mining on intensity-based profiles using one-dimensional self-organizing maps (1D-SOMs). MarVis can import and export customizable CSV (Comma Separated Values) files and provides aggregation and normalization routines for preprocessing of intensity profiles that contain repeated measurements for a number of different experimental conditions. Robust clustering is then achieved by training a 1D-SOM model, which introduces a similarity-based ordering of the intensity profiles. The ordering allows a convenient visualization of the intensity variations within the data and facilitates an interactive aggregation of clusters into larger blocks. The intensity-based visualization is combined with the presentation of additional data attributes, which can further support the analysis of experimental data. Conclusion: MarVis is a user-friendly and interactive tool for exploration of complex pattern variation in a large set of experimental intensity profiles. The application of 1D-SOMs gives a convenient overview of relevant profiles and groups of profiles. The specialized visualization effectively supports researchers in analyzing a large number of putative clusters, even though the true number of biologically meaningful groups is unknown. Although MarVis has been developed for the analysis of metabolomic data, the tool may be applied to gene expression data as well.
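To make the idea of a similarity-based ordering from a 1D-SOM concrete, here is a minimal, generic sketch. It is not the MarVis implementation: the function names, the learning-rate and neighborhood schedules, the number of units, and the toy profiles are all assumptions chosen for illustration. Each unit on a chain holds a prototype profile; training pulls the best-matching unit and its chain neighbors toward each profile, and profiles are then ordered by the index of their best-matching unit.

```python
import math
import random


def best_unit(units, profile):
    """Index of the unit whose prototype is closest to the profile."""
    return min(
        range(len(units)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(units[k], profile)),
    )


def train_1d_som(profiles, n_units=4, epochs=50, seed=0):
    """Train a chain of prototypes (assumed schedules: linear decay)."""
    rng = random.Random(seed)
    # Initialise each unit with a randomly chosen profile.
    units = [list(rng.choice(profiles)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                       # learning rate
        sigma = max(n_units / 2.0 * (1.0 - frac), 0.5)  # neighborhood width
        for p in rng.sample(profiles, len(profiles)):
            bmu = best_unit(units, p)
            for k, u in enumerate(units):
                # Gaussian neighborhood measured along the 1D chain.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(u)):
                    u[d] += lr * h * (p[d] - u[d])
    return units


def som_order(units, profiles):
    """Profile indices sorted by their best-matching unit on the chain."""
    return sorted(range(len(profiles)), key=lambda i: best_unit(units, profiles[i]))


# Toy intensity profiles over three hypothetical conditions.
profiles = [
    (1.0, 0.1, 0.0), (0.0, 0.1, 1.0), (0.9, 0.2, 0.1),
    (0.1, 0.0, 0.9), (0.5, 0.5, 0.5),
]
units = train_1d_som(profiles)
print(som_order(units, profiles))
```

Because neighboring units on the chain end up with similar prototypes, adjacent blocks in the resulting ordering can be merged interactively into larger clusters without fixing the number of clusters in advance.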

    Contextualization of topics - browsing through terms, authors, journals and cluster allocations

    This paper builds on an innovative Information Retrieval tool, Ariadne. The tool has been developed as an interactive network visualization and browsing tool for large-scale bibliographic databases. It allows users to gain insights into a topic by contextualizing a search query (Koopman et al., 2015). In this paper, we apply the Ariadne tool to a far smaller dataset of 111,616 documents in astronomy and astrophysics. Labeled as the Berlin dataset, these data have been used by several research teams to apply and later compare different clustering algorithms. The question driving this team effort is how to delineate topics. This paper contributes to this challenge in two different ways. First, we produce one of the different cluster solutions, and second, we use Ariadne (the method behind it, and the interface, called LittleAriadne) to display cluster solutions of the different group members. By providing a tool that allows the visual inspection of the similarity of article clusters produced by different algorithms, we present a complementary approach to other possible means of comparison. More particularly, we discuss how we can, with LittleAriadne, browse through the network of topical terms, authors, journals and cluster solutions in the Berlin dataset and compare cluster solutions as well as see their context. Comment: proceedings of the ISSI 2015 conference (accepted)

    Gepoclu: a software tool for identifying and analyzing gene positional clusters in large-scale gene expression analysis

    Background: The notion that genes are non-randomly organized within the chromosomes of eukaryotic organisms has recently received strong experimental support. Clusters of co-expressed and co-localized genes have been recognized as playing key roles in a number of functional pathways and adaptive responses including organism development, differentiation, disease states and aging. The identification of genes arranged in close proximity with each other within a particular temporal and spatial transcriptional program is anticipated to unravel possible functional links and reciprocal interactions. Results: We developed a novel software tool, Gepoclu (Gene Positional Clustering), that automatically selects genes based on expression values from multiple sources, including microarray, EST and qRT-PCR, and performs positional clustering. Gepoclu provides expression-based gene selection from multiple experimental sources, position-based gene clustering and cluster visualization functionalities, all as parts of the same fully integrated and interactive package. This allows rapid iteration while exploring for emergent behavior, and full programmability of the filtering and clustering steps. Conclusions: Gepoclu is a useful data-mining tool for exploring relationships among transcriptional data derived from different sources. It provides an easy interactive environment for analyzing positional clustering behavior of co-expressed genes, and at the same time it is fully programmable, so that it can be customized and extended to support specific analysis needs.
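The two steps named in this abstract, expression-based selection followed by position-based clustering, can be sketched as follows. This is an illustrative sketch only, not Gepoclu's code: the function `positional_clusters`, the tuple layout `(name, chromosome, position, expression)`, the thresholds, and the gene records are all hypothetical.

```python
from collections import defaultdict


def positional_clusters(genes, min_expr, max_gap, min_size=2):
    """Group selected genes that lie close together on the same chromosome.

    genes: iterable of (name, chromosome, position, expression) tuples.
    A gene is selected if expression >= min_expr; consecutive selected genes
    on one chromosome join the same cluster if they are at most max_gap apart.
    """
    by_chrom = defaultdict(list)
    for name, chrom, pos, expr in genes:
        if expr >= min_expr:                      # expression-based selection
            by_chrom[chrom].append((pos, name))

    clusters = []
    for chrom in sorted(by_chrom):
        run, last_pos = [], None
        for pos, name in sorted(by_chrom[chrom]):  # walk along the chromosome
            if run and pos - last_pos > max_gap:
                if len(run) >= min_size:
                    clusters.append((chrom, run))
                run = []
            run.append(name)
            last_pos = pos
        if len(run) >= min_size:
            clusters.append((chrom, run))
    return clusters


# Made-up records for illustration.
genes = [
    ("g1", "chr1", 1000, 8.2),
    ("g2", "chr1", 1500, 7.9),
    ("g3", "chr1", 9000, 9.1),
    ("g4", "chr2", 200, 8.5),
    ("g5", "chr2", 450, 1.0),
    ("g6", "chr2", 600, 8.8),
]
print(positional_clusters(genes, min_expr=5.0, max_gap=1000))
# → [('chr1', ['g1', 'g2']), ('chr2', ['g4', 'g6'])]
```

Keeping selection and clustering as separate parameters (`min_expr`, `max_gap`) mirrors the idea of iterating quickly over different filtering and clustering settings.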