71 research outputs found

    Animated interval scatter-plot views for the exploratory analysis of large scale microarray time-course data.

    Microarray technologies are a relatively new development that allow biologists to monitor the activity of thousands of genes (normally around 8,000) in parallel across multiple stages of a biological process. While this new perspective on biological functioning is recognised as having the potential to have a significant impact on the diagnosis, treatment, and prevention of diseases, it is only through effective analysis of the data produced that biologists can begin to unlock this potential. A significant obstacle to achieving effective analysis of microarray time-course data is the combined scale and complexity of the data, which inevitably makes it difficult to reveal certain significant patterns. In particular, less dominant patterns and, specifically, patterns that occur over smaller intervals of an experiment's overall time-frame are more difficult to find. While existing techniques are capable of finding either unexpected patterns of activity over the majority of an experiment's time-frame or expected patterns of activity over smaller intervals of the time-frame, there are no techniques, or combinations of techniques, that are suitable for finding unsuspected patterns of activity over smaller intervals. To overcome this limitation we have developed the Time-series Explorer, which specifically supports biologists in their attempts to reveal these types of pattern by allowing them to control an animated interval scatter-plot view of their data. This paper discusses aspects of the technique that make such an animated overview viable and describes the results of a user evaluation assessing the practical utility of the technique within the wider context of microarray time-series analysis as a whole.

    MaTSE: the gene expression time-series explorer.

    Background: High-throughput gene expression time-course experiments provide a perspective on biological functioning recognized as having huge value for the diagnosis, treatment, and prevention of diseases. There are, however, significant challenges to properly exploiting this data due to its massive scale and complexity. In particular, existing techniques are found to be ill suited to finding patterns of changing activity over a limited interval of an experiment's time frame. The Time-Series Explorer (TSE) was developed to overcome this limitation by allowing users to explore their data by controlling an animated scatter-plot view. MaTSE improves and extends TSE by allowing users to visualize data with missing values, cross reference multiple conditions, highlight gene groupings, and collaborate by sharing their findings. Results: MaTSE was developed using an iterative software development cycle that involved a high level of user feedback and evaluation. The resulting software combines a variety of visualization and interaction techniques which work together to allow biologists to explore their data and reveal temporal patterns of gene activity. These include a scatter-plot that can be animated to view different temporal intervals of the data, a multiple coordinated view framework to support the cross reference of multiple experimental conditions, a novel method for highlighting overlapping groups in the scatter-plot, and a pattern browser component that can be used with scatter-plot box queries to support cooperative visualization. A final evaluation demonstrated the tool's effectiveness in allowing users to find unexpected temporal patterns and the benefits of functionality such as the overlay of gene groupings and the ability to store patterns. Conclusions: We have developed a new exploratory analysis tool, MaTSE, that allows users to find unexpected patterns of temporal activity in gene expression time-series data. Overall, the study demonstrated the benefits of an iterative software development life cycle and allowed us to investigate some visualization problems that are likely to be common in the field of bioinformatics. The subjects involved in the final evaluation were positive about the potential of MaTSE to help them find unexpected patterns in their data and characterized MaTSE as an exploratory tool valuable for hypothesis generation and the creation of new biological knowledge.

    Visual Support for the Modeling and Simulation of Cell Biological Processes

    This dissertation aims at bringing information visualization closer to the demands of analytical problem solving for the specific domain of modeling and simulating cell biological systems. To this end, the main segments of visual support in the domain are identified. For one of these segments, the visual analysis of simulation data, new concepts are developed. First, this includes the visualization of simulation data in the context of data generation. Second, new multiple view techniques for large and complex simulation data are introduced.

    Reflections on QuestVis: A Visualization System for an Environmental Sustainability Model

    We present lessons learned from the iterative design of QuestVis, a visualization interface for the QUEST environmental sustainability model. The QUEST model predicts the effects of policy choices in the present using scenarios of future outcomes that consist of several hundred indicators. QuestVis treats this information as a high-dimensional dataset, and shows the relationship between input choices and output indicators using linked views and a compact multilevel browser for indicator values. A first prototype also featured an overview of the space of all possible scenarios based on dimensionality reduction, but this representation was deemed to be inappropriate for a target audience of people unfamiliar with data analysis. A second prototype with a considerably simplified and streamlined interface was created that supported comparison between multiple scenarios using a flexible approach to aggregation. However, QuestVis was not deployed because of a mismatch between the design goals of the project and the true needs of the target user community, who did not need to carry out detailed analysis of the high-dimensional dataset. We discuss this breakdown in the context of a nested model for visualization design and evaluation.

    Using machine learning to support better and intelligent visualisation for genomic data

    Massive amounts of genomic data have been created with the advent of Next Generation Sequencing technologies. Great technological advances in methods of characterising human diseases, including genetic and environmental factors, offer a great opportunity to understand these diseases and to find new diagnoses and treatments. Medical data are becoming increasingly rich and increasingly challenging to translate. Visualisation can greatly aid the processing and integration of complex data. Genomic data visual analytics is rapidly evolving alongside advances in high-throughput technologies such as Artificial Intelligence (AI) and Virtual Reality (VR). Personalised medicine requires new genomic visualisation tools that can efficiently extract knowledge from genomic data and speed up expert decisions about the best treatment for an individual patient’s needs. However, meaningful visual analysis of such large genomic data remains a serious challenge. Visualising these complex genomic data requires more than simply plotting the data; it should also lead to better decisions. Machine learning has the ability to make predictions and aid in decision-making. Machine learning and visualisation are both effective ways to deal with big data, but they serve different purposes. Machine learning applies statistical learning techniques to automatically identify patterns in data and make highly accurate predictions, while visualisation can leverage the human perceptual system to interpret and uncover hidden patterns in big data. Clinicians, experts and researchers intend to use both visualisation and machine learning to analyse their complex genomic data, but it is a serious challenge for them to understand and trust machine learning models in the medical industry. The main goal of this thesis is to study the feasibility of intelligent and interactive visualisation combined with machine learning algorithms for medical data analysis. A prototype has also been developed to illustrate the concept that visualising genomic data from childhood cancers in meaningful and dynamic ways could lead to better decisions. Machine learning algorithms are used and illustrated while visualising the cancer genomic data in order to provide highly accurate predictions. This research could open a new and exciting path to discovery for disease diagnostics and therapies.

    Supporting cognition in systems biology analysis: findings on users' processes and design implications

    Background: Current usability studies of bioinformatics tools suggest that tools for exploratory analysis support some tasks related to finding relationships of interest but not the deep causal insights necessary for formulating plausible and credible hypotheses. To better understand design requirements for gaining these causal insights in systems biology analyses, a longitudinal field study of 15 biomedical researchers was conducted. Researchers interacted with the same protein-protein interaction tools to discover possible disease mechanisms for further experimentation. Results: Findings reveal patterns in scientists' exploratory and explanatory analysis and show that the tools positively supported a number of well-structured query and analysis tasks. But for several of the scientists' more complex, higher-order ways of knowing and reasoning, the tools did not offer adequate support. Results show that, for a better fit with scientists' cognition in exploratory analysis, systems biology tools need to better match scientists' processes for validating, for making a transition from classification to model-based reasoning, and for engaging in causal mental modelling. Conclusion: As the next great frontier in bioinformatics usability, tool designs for exploratory systems biology analysis need to move beyond the successes already achieved in supporting formulaic query and analysis tasks and now reduce current mismatches with several of scientists' higher-order analytical practices. The implications of results for tool designs are discussed. http://deepblue.lib.umich.edu/bitstream/2027.42/134554/1/13009_2008_Article_29.pd

    Doctor of Philosophy

    With the ever-increasing amount of available computing resources and sensing devices, a wide variety of high-dimensional datasets are being produced in numerous fields. The complexity and increasing popularity of these data have led to new challenges and opportunities in visualization. Since most display devices are limited to communication through two-dimensional (2D) images, many visualization methods rely on 2D projections to express high-dimensional information. Such a reduction of dimension leads to an explosion in the number of 2D representations required to visualize high-dimensional spaces, each giving a glimpse of the high-dimensional information. As a result, one of the most important challenges in visualizing high-dimensional datasets is the automatic filtration and summarization of the large exploration space consisting of all 2D projections. In this dissertation, a new type of algorithm is introduced that reduces the exploration space by identifying a small set of projections that capture the intrinsic structure of high-dimensional data. In addition, a general framework for summarizing the structure of quality measures in the space of all linear 2D projections is presented. However, identifying the representative or informative projections is only part of the challenge. Due to the high-dimensional nature of these datasets, insights and conclusions based solely on 2D representations are limited and prone to error. How to interpret the inaccuracies and resolve the ambiguity in the 2D projections is the other half of the puzzle. This dissertation introduces projection distortion error measures and interactive manipulation schemes that allow the understanding of high-dimensional structures via data manipulation in 2D projections.

    Exploration space of human-data interaction

    Data is everywhere. Starting with the invention of writing, representation artifacts brought data to an observable state, which led to the natural establishment of a form of interaction between human and data. In the human-data interaction (HDI) environment, data representations and analytic systems play an intermediary role. I suggest a new definition for HDI in which this interaction is conceptualized as a communication model over a set of media. The interaction occurs through the exchange of messages originating from both human and data. The timing and content of the messages are employed to facilitate objective evaluation of the properties of the analytic system in question. To systematically investigate the complex nature of HDI, my methodology postulates the phenomenon as a high-dimensional space in which data analytic systems can be positioned based on their properties. Evaluation of the properties is performed based on solid definitions of the dimensions. I define five properties for data analytic systems, namely responsiveness, communication media level, unit task diversity, closeness factor, and progressiveness level, and demonstrate how these properties can be objectively calculated. I visually explore the HDI space by plotting the data analytic systems reported in my thesis on a two-dimensional Cartesian system whose axes are responsiveness and communication media level. Visually identifiable patterns in this plot, which I call realms, are characterized by quantitative and qualitative analysis of objective, behavioral, and subjective data collected during user interaction with the corresponding analytic system.