5 research outputs found
Visual Encodings for Networks with Multiple Edge Types
This paper reports on a formal user study on visual encodings of networks with multiple edge types in adjacency matrices. Our tasks and conditions were inspired by real problems in computational biology. We focus on encodings in adjacency matrices, selecting four designs from a potentially huge design space of visual encodings. We then settle on three visual variables to evaluate in a crowdsourcing study with 159 participants: orientation, position and colour. The best encodings were integrated into a visual analytics tool for inferring dynamic Bayesian networks and evaluated by computational biologists for additional evidence. We found that the encodings performed differently depending on the task; however, colour was found to help in all tasks except when trying to find the edge with the largest number of edge types. Orientation generally outperformed position in all of our tasks.
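As a rough illustration of the data behind such encodings, the sketch below stores a set of edge types in each adjacency-matrix cell and looks up the edge with the largest number of edge types, one of the tasks named in the abstract. The node names, edge-type labels, and function names are hypothetical and are not taken from the study.

```python
def build_typed_adjacency(nodes, typed_edges):
    """Build an adjacency matrix whose cells hold the set of edge types."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[set() for _ in nodes] for _ in nodes]
    for source, target, edge_type in typed_edges:
        matrix[index[source]][index[target]].add(edge_type)
    return matrix


def edge_with_most_types(nodes, matrix):
    """Return (source, target, types) for the cell with the most edge types.

    Ties are resolved in favour of the first cell in row-major order.
    """
    cells = [(r, c) for r in range(len(nodes)) for c in range(len(nodes))]
    r, c = max(cells, key=lambda cell: len(matrix[cell[0]][cell[1]]))
    return nodes[r], nodes[c], matrix[r][c]


# Hypothetical example data: three nodes and two edge types.
nodes = ["g1", "g2", "g3"]
typed_edges = [
    ("g1", "g2", "activation"),
    ("g1", "g2", "inhibition"),
    ("g2", "g3", "activation"),
]
matrix = build_typed_adjacency(nodes, typed_edges)
```

A visual encoding would then map each cell's set of types to orientation, position or colour; that mapping is outside the scope of this sketch.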
MLCut: exploring Multi-Level Cuts in dendrograms for biological data
Choosing a single similarity threshold for cutting dendrograms is not sufficient for performing hierarchical clustering analysis of heterogeneous data sets. In addition, alternative automated or semi-automated methods that cut dendrograms in multiple levels make assumptions about the data in hand. In an attempt to help the user to find patterns in the data and resolve ambiguities in cluster assignments, we developed MLCut: a tool that provides visual support for exploring dendrograms of heterogeneous data sets in different levels of detail. The interactive exploration of the dendrogram is coordinated with a representation of the original data, shown as parallel coordinates. The tool supports three analysis steps. Firstly, a single-height similarity threshold can be applied using a dynamic slider to identify the main clusters. Secondly, a distinctiveness threshold can be applied using a second dynamic slider to identify “weak-edges” that indicate heterogeneity within clusters. Thirdly, the user can drill-down to further explore the dendrogram structure - always in relation to the original data - and cut the branches of the tree at multiple levels. Interactive drill-down is supported using mouse events such as hovering, pointing and clicking on elements of the dendrogram. Two prototypes of this tool have been developed in collaboration with a group of biologists for analysing their own data sets. We found that enabling the users to cut the tree at multiple levels, while viewing the effect in the original data, is a promising method for clustering which could lead to scientific discoveries.
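The first analysis step, a single-height cut, can be sketched in a few lines. This is not MLCut's implementation: it assumes single-linkage clustering with Euclidean distance, for which cutting the dendrogram at height `threshold` is equivalent to grouping points connected by chains of pairwise distances no larger than `threshold`. The function name and sample points are hypothetical.

```python
from itertools import combinations


def cut_single_linkage(points, threshold):
    """Label points by cutting a single-linkage dendrogram at `threshold`.

    Assumption: single linkage with Euclidean distance, so the flat clusters
    are the connected components of the graph joining points whose distance
    is at most `threshold`.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
        if dist <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical data: two tight pairs and one outlier.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
low_cut = cut_single_linkage(points, 2)
high_cut = cut_single_linkage(points, 10)
```

Comparing `low_cut` with `high_cut` shows how one global threshold either splits or merges groups everywhere at once, which is the limitation that cutting at multiple levels is meant to address.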
TetraploidSNPMap: Software for Linkage Analysis and QTL Mapping in Autotetraploid Populations Using SNP Dosage Data
An earlier software application of ours, TetraploidMap for Windows, enabled linkage analysis and quantitative trait locus interval mapping to be carried out in an experimental cross of an autotetraploid species, using both dominant markers such as amplified fragment length polymorphisms and codominant markers such as simple sequence repeats. The size was limited to 800 markers, and quantitative trait locus mapping was conducted for each parent separately due to the difficulties in obtaining a reliable consensus map for the two parents. Modern genotyping technologies now give rise to datasets of thousands of single nucleotide polymorphisms, and these can be scored in autotetraploid species as single nucleotide polymorphism dosages, distinguishing among the heterozygotes AAAB, AABB, and ABBB, rather than simply using the presence or absence of an allele. The dosage data are more informative about recombination and lead to higher density linkage maps. The current program, TetraploidSNPMap, makes full use of the dosage data, and has new facilities for displaying the clustering of single nucleotide polymorphisms, rapid ordering of large numbers of single nucleotide polymorphisms using a multidimensional scaling analysis, and phase calling. It also has new routines for quantitative trait locus mapping based on a hidden Markov model, which use the dosage data to model the effects of alleles from both parents simultaneously. A Windows-based interface facilitates data entry and exploration. It is distributed with a detailed user guide. TetraploidSNPMap is freely available from our GitHub repository.
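The difference between dosage scoring and presence/absence scoring can be made concrete with a small sketch. It is illustrative only and unrelated to TetraploidSNPMap's code; counting the `B` allele and the function names are assumptions.

```python
def allele_dosage(genotype, allele="B"):
    """Count copies of `allele` in a four-allele (autotetraploid) genotype."""
    if len(genotype) != 4:
        raise ValueError("expected four alleles for an autotetraploid genotype")
    return genotype.count(allele)


def allele_present(genotype, allele="B"):
    """Presence/absence scoring: True if at least one copy is present."""
    return allele in genotype


genotypes = ["AAAA", "AAAB", "AABB", "ABBB", "BBBB"]
dosages = [allele_dosage(g) for g in genotypes]
presence = [allele_present(g) for g in genotypes]
```

Presence/absence scoring gives the same value for AAAB, AABB and ABBB, whereas the dosage separates them, which is why dosage data carry more information about recombination.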
BayesPiles: Visualisation Support for Bayesian Network Structure Learning
We address the problem of exploring, combining and comparing large collections of scored, directed networks for understanding inferred Bayesian networks used in biology. In this field, heuristic algorithms explore the space of possible network solutions, sampling this space based on algorithm parameters and a network score that encodes the statistical fit to the data. The goal of the analyst is to guide the heuristic search and decide how to determine a final consensus network structure, usually by selecting the top-scoring network or constructing the consensus network from a collection of high-scoring networks. BayesPiles, our visualisation tool, helps with understanding the structure of the solution space and supporting the construction of a final consensus network that is representative of the underlying dataset. BayesPiles builds upon and extends MultiPiles to meet our domain requirements. We developed BayesPiles in conjunction with computational biologists who have used this tool on datasets used in their research. The biologists found our solution provides them with new insights and helps them achieve results that are representative of the underlying data.
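One simple way to construct a consensus network from a collection of high-scoring networks is sketched below. The abstract does not specify the rule, so this is an assumption: take the `top_k` networks by score (higher taken to be better) and keep every directed edge that appears in at least `min_fraction` of them. The function name, parameters, and sample data are hypothetical.

```python
from collections import Counter


def consensus_network(scored_networks, top_k, min_fraction):
    """Return directed edges shared by enough of the top-scoring networks.

    scored_networks: iterable of (score, edges), where edges is a set of
    (source, target) tuples. Assumption: a higher score means a better fit.
    """
    ranked = sorted(scored_networks, key=lambda item: item[0], reverse=True)
    selected = ranked[:top_k]
    # Count in how many of the selected networks each directed edge occurs.
    counts = Counter(edge for _, edges in selected for edge in set(edges))
    needed = min_fraction * len(selected)
    return {edge for edge, n in counts.items() if n >= needed}


# Hypothetical scored networks over three variables A, B and C.
networks = [
    (-10.0, {("A", "B"), ("B", "C")}),
    (-12.0, {("A", "B"), ("C", "B")}),
    (-11.0, {("A", "B"), ("B", "C")}),
    (-50.0, {("C", "A")}),
]
consensus = consensus_network(networks, top_k=3, min_fraction=0.5)
```

The choice of `top_k` and `min_fraction` changes the result, which is the kind of decision an analyst would want to inspect visually before settling on a final structure.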
MLCut: exploring Multi-Level Cuts in dendrograms for biological data
Choosing a single similarity threshold for cutting dendrograms is not sufficient for performing hierarchical clustering analysis of heterogeneous data sets. In addition, alternative automated or semi-automated methods that cut dendrograms in multiple levels make assumptions about the data in hand. In an attempt to help the user to find patterns in the data and resolve ambiguities in cluster assignments, we developed MLCut: a tool that provides visual support for exploring dendrograms of heterogeneous data sets in different levels of detail. The interactive exploration of the dendrogram is coordinated with a representation of the original data, shown as parallel coordinates. The tool supports three analysis steps. Firstly, a single-height similarity threshold can be applied using a dynamic slider to identify the main clusters. Secondly, a distinctiveness threshold can be applied using a second dynamic slider to identify “weak-edges” that indicate heterogeneity within clusters. Thirdly, the user can drill-down to further explore the dendrogram structure - always in relation to the original data - and cut the branches of the tree at multiple levels. Interactive drill-down is supported using mouse events such as hovering, pointing and clicking on elements of the dendrogram. Two prototypes of this tool have been developed in collaboration with a group of biologists for analysing their own data sets. We found that enabling the users to cut the tree at multiple levels, while viewing the effect in the original data, is a promising method for clustering which could lead to scientific discoveries.