64 research outputs found

    MaTSE: the gene expression time-series explorer.

    Background: High-throughput gene expression time-course experiments provide a perspective on biological functioning that is recognized as having huge value for the diagnosis, treatment, and prevention of diseases. There are, however, significant challenges to properly exploiting this data due to its massive scale and complexity. In particular, existing techniques are found to be ill suited to finding patterns of changing activity over a limited interval of an experiment's time frame. The Time-Series Explorer (TSE) was developed to overcome this limitation by allowing users to explore their data by controlling an animated scatter-plot view. MaTSE improves and extends TSE by allowing users to visualize data with missing values, cross-reference multiple conditions, highlight gene groupings, and collaborate by sharing their findings. Results: MaTSE was developed using an iterative software development cycle that involved a high level of user feedback and evaluation. The resulting software combines a variety of visualization and interaction techniques that work together to allow biologists to explore their data and reveal temporal patterns of gene activity. These include a scatter-plot that can be animated to view different temporal intervals of the data, a multiple coordinated view framework to support the cross-referencing of multiple experimental conditions, a novel method for highlighting overlapping groups in the scatter-plot, and a pattern browser component that can be used with scatter-plot box queries to support cooperative visualization. A final evaluation demonstrated the tool's effectiveness in allowing users to find unexpected temporal patterns, as well as the benefits of functionality such as the overlay of gene groupings and the ability to store patterns. Conclusions: We have developed a new exploratory analysis tool, MaTSE, that allows users to find unexpected patterns of temporal activity in gene expression time-series data. Overall, the study demonstrated the benefits of an iterative software development life cycle and allowed us to investigate some visualization problems that are likely to be common in the field of bioinformatics. The subjects involved in the final evaluation were positive about the potential of MaTSE to help them find unexpected patterns in their data and characterized MaTSE as an exploratory tool valuable for hypothesis generation and the creation of new biological knowledge.

    Cognitive Foundations for Visual Analytics


    Knowledge visualization: From theory to practice

    Visualizations are known to be efficient tools that can help users analyze complex data. However, understanding the displayed data and finding underlying knowledge is still difficult. In this work, a new approach is proposed based on understanding the definition of knowledge. Although there are many definitions used in different areas, this work focuses on representing knowledge as a part of a visualization and showing the benefit of adopting knowledge representation. Specifically, this work begins with understanding interaction and reasoning in visual analytics systems; then a new definition of knowledge visualization and its underlying knowledge conversion processes are proposed. Knowledge is differentiated as either explicit or tacit. Rather than representing data directly, the approach determines the value of the explicit knowledge associated with the data based on a cost/benefit analysis. In accordance with its importance, the knowledge is displayed to help the user understand the complex data through visual analytical reasoning and discovery.

    Genome visualisation and user studies in biologist-computer interaction

    We surveyed a number of genome visualisation tools used in biomedical research. We found that none of the tools shows all the relevant data that geneticists looking for candidate disease genes would like to see. The biological researchers we collaborate with would like to view integrated data from a variety of sources and be able to see both data overviews and details. In response to this need, we developed a new visualisation tool, VisGenome, which allows users to add their own data or data downloaded from other sources, such as Ensembl. VisGenome visualises single and comparative representations of the rat, mouse, and human chromosomes, and can easily be used for other genomes. In the context of VisGenome development we made the following research contributions. We developed a new algorithm (CartoonPlus) which allows users to see different kinds of data in cartoon scaling depending on a selected basis. Also, two user studies were conducted: an initial quantitative user study and a mixed paradigm user study. The first study showed that neither Ensembl nor VisGenome fulfils all user requirements or can be regarded as user-friendly, as the users made a significant number of mistakes during data navigation. To help users navigate their data easily, we improved existing visualisation techniques in VisGenome and added a new technique, CartoonPlus. To verify whether this solution was useful, we conducted a second user study. We saw that the users became more familiar with the tool, and found new ways to use the application on its own and in connection with other tools. They frequently used CartoonPlus, which allowed them to see small regions of their data in a way that was not possible before.

    Information management applied to bioinformatics

    Bioinformatics, the discipline concerned with biological information management, is essential in the post-genome era, where the complexity of data processing allows for contemporaneous multi-level research, including research at the genome, transcriptome, proteome, and metabolome levels, and the integration of these -omic studies towards gaining an understanding of biology at the systems level. This research is also having a major impact on disease research and drug discovery, particularly through pharmacogenomics studies. In this study, innovative resources have been generated via the use of two case studies. One was of the Research & Development Genetics (RDG) department at AstraZeneca, Alderley Park, and the other was of the Pharmacogenomics Group at the Sanger Institute in Cambridge, UK. In the AstraZeneca case study, senior scientists were interviewed using semi-structured interviews to determine information behaviour through the study of scientific workflows. Document analysis was used to generate an understanding of the underpinning concepts and formed one of the sources of context-dependent information on which the interview questions were based. The objectives of the Sanger Institute case study were slightly different: interviews were carried out with eight scientists, together with the use of participant observation, to collect data to develop a database standard for one process of their pharmacogenomics workflow. The results indicated that AstraZeneca would benefit from upgrading their data management solutions in the laboratory and from developing resources for the storage of data from larger scale projects such as whole genome scans. These studies will also generate very large amounts of data, and the analysis of these will require more sophisticated statistical methods. At the Sanger Institute, a minimum information standard was reported for the manual design of primers and included in a decision-making tree developed for Polymerase Chain Reactions (PCRs). This tree also illustrates problems that can be encountered when designing primers, along with procedures that can be taken to address such issues. Source: EThOS - Electronic Theses Online Service, United Kingdom.

    Biological knowledge management and gene network analysis: a heuristic road to System Biology

    In order to understand the molecular basis of living cells and organisms, biologists over the past decades have been studying life's core molecular players: the genes. Most genes have a specific function, a role they play in the collective task of developing a cell and supporting all the aspects of keeping it alive. These genes do not perform their function randomly. Instead, after billions of years of evolution, nature's trial-and-error process, they have become parts of an utterly complex and intricate network, an interconnected mesh of genes that comprises signal detection cascades, enzymatic reactions, control mechanisms, etc. Over the past several decades, experimental molecular biologists have sought mainly to study these genes via a one-by-one approach. However, with the advent of high-throughput experimental techniques, the number-crunching power of computers, and the realisation that many biological functions are the result of interactions between genes or their proteins, the related field of Systems Biology has emerged. Here, one tries to combine the dispersed information produced by many researchers into integrated assemblies called gene networks. Our research comprises the development of two new methods for improved information integration in the field of molecular Systems Biology. The first one aims to support an approach to gaining insight into the dynamics of gene networks (the behaviour of gene activities over time), called 'modelling and simulation' of genetic regulatory networks. Our second new method approaches the problem of how to collect and manage the information necessary to compose such genetic networks in the first place, based on scattered information in a dispersed and increasingly fast-growing body of publications. These two methods form two separate parts in this thesis (chapters 2-4, and chapters 5-7). Chapter 1, section 1.3 provides an introductory, complete overview of this thesis. It is intended as a light introduction to my doctoral research, presented in an informal and entertaining way, and mainly addressed to my friends and family. It forms an introduction for the layperson to our work and the concepts that are important for this thesis. Chapters 2, 3 and 4 constitute Part 1 of this thesis. Chapter 2 gives a review of the various formalisms for modelling and simulation of gene networks, as a thorough background for our work presented in the following chapter. Chapter 3 describes SIM-plex, our new software tool that forms a bridge between a mathematical gene network modelling formalism and the biologist, who usually is more an expert in the biology behind the gene network than a mathematician can ever be. It shields off the mathematics in a new way so as to enable biologists to experiment with modelling and simulation themselves. Chapter 4 describes the various applications that SIM-plex was used for. The research described in Part 2 of this thesis, chapters 5, 6 and 7, emerged from our own need for better management of biological information. We experienced this necessity while we were building a larger genetic network for the Arabidopsis cell cycle, and it forms a general problem in biology. Chapter 5 gives a background of the currently existing methods for harvesting literature information, but comes to the conclusion that no existing automated or manual method displays sufficient potential to capture the largest part of information from literature in a structured way. In chapter 6, we describe our bold proposal of a new method to tackle this problem: MineMap, a community-based manual text-curation initiative. We describe the various aspects required to make such a project possible, based on our own experiences with our prototype application MineMap. This research is organised in a 'heuristic' way, in the sense that we built a first sketch and a working solution that also generated experiences for improvements in a next design. While chapter 6 describes our new ideas and concrete implementations in considerable detail, chapter 7 then illustrates the core concept behind MineMap.

    A functional and regulatory perspective on Arabidopsis thaliana


    Visual Support for the Modeling and Simulation of Cell Biological Processes

    This dissertation aims at bringing information visualization closer to the demands of analytical problem solving for the specific domain of modeling and simulating cell biological systems. To this end, the main segments of visual support in the domain are identified. For one of these segments, the visual analysis of simulation data, new concepts are developed. First, this includes the visualization of simulation data in the context of data generation. Second, new multiple view techniques for large and complex simulation data are introduced.

    Seventh Biennial Report : June 2003 - March 2005
