7 research outputs found

    Multivariate relationship specification and visualization

    In this dissertation, we present a novel method for multivariate visualization that focuses on multivariate relationships within scientific datasets. Specifically, we explore the considerations of such a problem: we develop an appropriate visualization approach, provide a framework for the specification of multivariate relationships, and analyze the space of such relationships for the purpose of guiding the user toward desired visualizations. The visualization approach is derived from a point classification algorithm that summarizes many variables of a dataset into a single image via the creation of attribute subspaces. We then extend the notion of attribute subspaces to encompass multivariate relationships. In addition, we provide an unconstrained framework for the user to define such relationships. Although we intend this approach to be generally applicable, the specification of complicated relationships is a daunting task because such relationships become increasingly difficult for a user to understand and apply. For this reason, we explore this relationship space with parallel coordinates, a common information visualization technique well suited to this purpose. In manipulating this space, a user is able to discover and select both complex and logically informative relationship specifications.

    Visual analytics for relationships in scientific data

    Domain scientists hope to address grand scientific challenges by exploring the abundance of data generated and made available through modern high-throughput techniques. Typical scientific investigations can make use of novel visualization tools that enable dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating the sensitivity of key parameters. These general tools should be applicable to many disciplines: allowing biologists to develop an intuitive understanding of the structure of coexpression networks and discover genes that reside in critical positions of biological pathways, intelligence analysts to decompose social networks, and climate scientists to model and extrapolate future climate conditions. By using a graph as a universal data representation of correlation, our novel visualization tool employs several techniques that, when used in an integrated manner, provide innovative analytical capabilities. Our tool integrates techniques such as graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized B-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using real-world workflows from several large-scale studies. Parallel coordinates has proven to be a scalable visualization and navigation framework for multivariate data. However, when data with thousands of variables are at hand, we do not have a comprehensive solution for selecting the right set of variables and ordering them to uncover important or potentially insightful patterns. We present algorithms to rank axes based upon the importance of bivariate relationships among the variables and showcase the efficacy of the proposed system by demonstrating autonomous detection of patterns in a modern large-scale dataset of time-varying climate simulation.
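
The abstract above does not say how axis importance is computed. As a rough illustration only, the sketch below ranks variables by their mean absolute Pearson correlation with every other variable. That scoring rule, the helper names `pearson` and `rank_axes`, and the toy data are all assumptions made for this example, not the authors' algorithm.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def rank_axes(columns):
    """Order variable names by mean |r| with all other variables, strongest first.

    `columns` maps a variable name to its list of values (hypothetical layout).
    """
    names = list(columns)
    scores = {}
    for name in names:
        others = [o for o in names if o != name]
        if not others:
            scores[name] = 0.0
            continue
        total = sum(abs(pearson(columns[name], columns[o])) for o in others)
        scores[name] = total / len(others)
    return sorted(names, key=lambda n: scores[n], reverse=True)

# Toy data: "temp" and "pressure" move together, "noise" does not.
data = {
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pressure": [2.1, 3.9, 6.2, 8.0, 9.9],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
ranked = rank_axes(data)
print(ranked)
```

With thousands of variables, the pairwise loop grows quadratically, so a practical system would likely need a vectorized or approximate correlation computation.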

    Development of a geovisual analytics environment using parallel coordinates with applications to tropical cyclone trend analysis

    A global transformation is being fueled by unprecedented growth in the quality, quantity, and number of different parameters in environmental data through the convergence of several technological advances in data collection and modeling. Although these data hold great potential for helping us understand many complex and, in some cases, life-threatening environmental processes, our ability to generate such data is far outpacing our ability to analyze it. In particular, conventional environmental data analysis tools are inadequate for coping with the size and complexity of these data. As a result, users are forced to reduce the problem in order to adapt to the capabilities of the tools. To overcome these limitations, we must complement the power of computational methods with human knowledge, flexible thinking, imagination, and our capacity for insight by developing visual analysis tools that distill information into the actionable criteria needed for enhanced decision support. In light of said challenges, we have integrated automated statistical analysis capabilities with a highly interactive, multivariate visualization interface to produce a promising approach for visual environmental data analysis. By combining advanced interaction techniques such as dynamic axis scaling, conjunctive parallel coordinates, statistical indicators, and aerial perspective shading, we provide an enhanced variant of the classical parallel coordinates plot. Furthermore, the system facilitates statistical processes such as stepwise linear regression and correlation analysis to assist in the identification and quantification of the most significant predictors for a particular dependent variable. These capabilities are combined into a unique geovisual analytics system that is demonstrated via a pedagogical case study and three North Atlantic tropical cyclone climate studies using a systematic workflow. 
In addition to revealing several significant associations between environmental observations and tropical cyclone activity, this research corroborates the notion that enhanced parallel coordinates coupled with statistical analysis can be used for more effective knowledge discovery and confirmation in complex, real-world data sets.

    Development of a statistical method for the identification of gene-environment interactions

    In order to understand common, complex disease, it is necessary to consider not just genetic risks and environmental risks, but also the interplay between them. This thesis aims to develop methodology specifically for the detection of gene-environment interactions, both by examining the strengths and weaknesses of traditional approaches and through the development and testing of a novel statistical method. Developments in genotyping technology enable researchers to collect large volumes of polymorphisms in human genes, yet very few statistical methods are able to handle the volume, variation and complexity of these data, especially in combination with environmental risk factors. Interactions between genes and the environment are often subject to the curse of dimensionality, with each new variable increasing the potential number of interactions exponentially, leading to low power and a high false positive rate. The Mixed Tree Method (MTM) exploits the differences between environmental and genetic variables by selecting the most appropriate features from conventional methods (including recursive partitioning, random forests and logistic regression) and combining them with new comparison algorithms which rank the genetic variables by the likelihood that they interact with the environmental variable under study. Results show the MTM to be as effective as the most successful current method for identification of interactions, while maintaining a much lower false positive rate and computational burden. As the number of SNPs in the dataset increases, the advantage of MTM over other methods becomes greater, while the comparator approaches exhibit computational problems and rapidly increasing processing times. The MTM is also applied to a colorectal cancer dataset to show its use in a practical setting. Together, the results suggest that MTM could be a useful strategy for identifying gene-environment interactions in future studies of complex disease.
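
To make the curse-of-dimensionality claim concrete, the short sketch below counts the candidate interaction terms among `n_vars` variables, where an interaction is taken to be any subset of two or more variables. The function name and that definition are assumptions for illustration; the thesis may count interactions differently.

```python
from math import comb

def interaction_count(n_vars, max_order=None):
    """Number of possible interaction terms among n_vars variables.

    An interaction of order k is any subset of k variables (k >= 2).
    max_order caps the order considered; None means all orders.
    """
    top = n_vars if max_order is None else min(max_order, n_vars)
    return sum(comb(n_vars, k) for k in range(2, top + 1))

# All orders: equals 2**n - n - 1, so it roughly doubles with each new variable.
for n in (5, 10, 20):
    print(n, interaction_count(n), interaction_count(n, max_order=2))
```

Even when only pairwise terms are considered (`max_order=2`), the count grows quadratically, which is already substantial for datasets with many SNPs.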