Spark: A navigational paradigm for genomic data exploration
Biologists possess the detailed knowledge critical for extracting biological insight from genome-wide data resources, and yet they are increasingly faced with nontrivial computational analysis challenges posed by genome-scale methodologies. To lower this computational barrier, particularly in the early data exploration phases, we have developed an interactive pattern discovery and visualization approach, Spark, designed with epigenomic data in mind. Here we demonstrate Spark's ability to reveal both known and novel epigenetic signatures, including a previously unappreciated binding association between the YY1 transcription factor and the corepressor CTBP2 in human embryonic stem cells
An Interactive Bio-inspired Approach to Clustering and Visualizing Datasets
In this work, we present an interactive visual clustering approach for the exploration and analysis of datasets using the computational power of Graphics Processor Units (GPUs). The visualization is based on a collective behavioral model that enables cognitive amplification of information visualization. In this way, the workload of understanding the representation of information moves from the cognitive to the perceptual system. The results enable a more intuitive, interactive approach to the discovery of knowledge. The paper illustrates this behavioral model for clustering data, and applies it to the visualization of a number of real and synthetic datasets
Knowledge in (Geo)Visualisation: The relationship between seeing and thinking
Modern research in geovisualisation has framed the discipline as a field more akin to geovisual analytics - one that places an emphasis on the human elements of exploration of data through interactive and dynamic geo-interfaces, rather than simple data representation. This rephrasing highlights the importance of cognitive aspects of human interaction with geo-based data and the interfaces designed to present them. In an attempt to provide a psychological background to the benefits of geovisual analytics, this paper will explore the role that perception has in complex problem solving and knowledge discovery, and will demonstrate that, through modern interactive technologies, (geo)visualisations augment and facilitate our natural ability to surface novel, surprising and otherwise invisible relationships between information. It will argue that it is through these novel relationships that we add to our understandings of the original information and simultaneously reveal new knowledge 'between the gaps'
Doctor of Philosophy
Recent advancements in mobile devices - such as Global Positioning System (GPS), cellular phones, car navigation system, and radio-frequency identification (RFID) - have greatly influenced the nature and volume of data about individual-based movement in space and time. Due to the prevalence of mobile devices, vast amounts of mobile objects data are being produced and stored in databases, overwhelming the capacity of traditional spatial analytical methods. There is a growing need for discovering unexpected patterns, trends, and relationships that are hidden in the massive mobile objects data. Geographic visualization (GVis) and knowledge discovery in databases (KDD) are two major research fields that are associated with knowledge discovery and construction. Their major research challenges are the integration of GVis and KDD, enhancing the ability to handle large volume mobile objects data, and high interactivity between the computer and users of GVis and KDD tools. This dissertation proposes a visualization toolkit to enable highly interactive visual data exploration for mobile objects datasets. Vector algebraic representation and online analytical processing (OLAP) are utilized for managing and querying the mobile object data to accomplish high interactivity of the visualization tool. In addition, reconstructing trajectories at user-defined levels of temporal granularity with time aggregation methods allows exploration of the individual objects at different levels of movement generality. At a given level of generality, individual paths can be combined into synthetic summary paths based on three similarity measures, namely, locational similarity, directional similarity, and geometric similarity functions. A visualization toolkit based on the space-time cube concept exploits these functionalities to create a user-interactive environment for exploring mobile objects data.
Furthermore, the characteristics of visualized trajectories are exported to be utilized for data mining, which leads to the integration of GVis and KDD. Case studies using three movement datasets (personal travel data survey in Lexington, Kentucky, wild chicken movement data in Thailand, and self-tracking data in Utah) demonstrate the potential of the system to extract meaningful patterns from the otherwise difficult to comprehend collections of space-time trajectories
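As a rough illustration of reconstructing a trajectory at a user-defined temporal granularity, the sketch below groups timestamped positions into fixed-width time buckets and averages the positions in each bucket. The abstract does not say which aggregation method the dissertation uses, so the mean, the function name `aggregate_trajectory`, and the sample track are all assumptions made for illustration only.

```python
from collections import defaultdict


def aggregate_trajectory(points, bucket_seconds):
    """Coarsen a trajectory to one point per time bucket.

    points: iterable of (t, x, y) tuples, t in seconds.
    Assumption: each bucket is summarized by the mean position;
    the source does not specify the aggregation rule.
    """
    buckets = defaultdict(list)
    for t, x, y in points:
        # Integer bucket index for this timestamp.
        buckets[int(t // bucket_seconds)].append((x, y))

    result = []
    for b in sorted(buckets):
        xs = [p[0] for p in buckets[b]]
        ys = [p[1] for p in buckets[b]]
        # Report the bucket start time with the averaged position.
        result.append((b * bucket_seconds, sum(xs) / len(xs), sum(ys) / len(ys)))
    return result


# Hypothetical track sampled every 30 seconds.
track = [(0, 0.0, 0.0), (30, 2.0, 0.0), (60, 4.0, 2.0), (90, 6.0, 4.0)]
coarse = aggregate_trajectory(track, 60)
print(coarse)
```

Raising `bucket_seconds` gives a more general view of the movement, and lowering it recovers more of the original detail.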
Knowledge-assisted ranking: A visual analytic application for sports event data
© 2016 IEEE. Organizing sports video data for performance analysis can be challenging, especially in cases involving multiple attributes and when the criteria for sorting frequently change depending on the user's task. The proposed visual analytic system enables users to specify a sort requirement in a flexible manner without depending on specific knowledge about individual sort keys. The authors use regression techniques to train different analytical models for different types of sorting requirements and use visualization to facilitate knowledge discovery at different stages of the process. They demonstrate the system with a rugby case study to find key instances for analyzing team and player performance. Organizing sports video data for performance analysis can be challenging in cases with multiple attributes, and when sorting frequently changes depending on the user's task. As this video shows, the proposed visual analytic system allows interactive data sorting and exploration
Minimizing User Effort in Large Scale Example-driven Data Exploration
Data Exploration is a key ingredient in a widely diverse set of discovery-oriented applications, including scientific computing, financial analysis, and evidence-based medicine. It refers to a series of exploratory tasks that aim to extract useful pieces of knowledge from data, and its challenge is to do so without requiring the user to specify with precision what information is being searched for. The goal of assisting users in constructing their exploratory queries effortlessly, which effectively reveals interesting data objects, has led to the development of a variety of intelligent semi-automatic approaches. Among such approaches, Example-driven Exploration is rapidly becoming an attractive choice for exploratory query formulation since it attempts to minimize the amount of prior knowledge required from the user to form an accurate exploratory query.
In particular, this dissertation focuses on interactive Example-driven Exploration, which steers the user towards discovering all data objects relevant to the users’ exploration based on their feedback on a small set of examples. Interactive Example-driven Exploration is especially beneficial for non-expert users, as it enables them to circumvent query languages by assigning relevancy to examples as a proxy for the intended exploratory analysis. However, existing interactive Example-driven Exploration systems fall short of supporting the need to perform complex explorations over large, unstructured high-dimensional data. To overcome these challenges, we have developed new methods of data reduction, example selection, data indexing, and result refinement that support practical, interactive data exploration.
The novelty of our approach is anchored on leveraging active learning and query optimization techniques that strike a balance between maximizing accuracy and minimizing user effort in providing feedback while enabling interactive performance for exploration tasks with arbitrary, large-sized datasets. Furthermore, it extends the exploration beyond the structured data by supporting a variety of high-dimensional unstructured data and enables the refinement of results when the exploration task is associated with too many relevant data objects that could be overwhelming to the user. To affirm the effectiveness of our proposed models, techniques, and algorithms, we implemented multiple prototype systems and evaluated them using real datasets. Some of them were also used in domain-specific analytics tools
The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications.
Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds.
Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets
Attribute exploration with fuzzy attributes and background knowledge
Abstract. Attribute exploration is a formal concept analytical tool for knowledge discovery by interactive determination of the implications holding between a given set of attributes. The corresponding algorithm queries the user in an efficient way about the implications between the attributes. The result of the exploration process is a representative set of examples for the entire theory and a set of implications from which all implications that hold between the considered attributes can be deduced. The method was successfully applied in different real-life applications for discrete data. In many instances, the user may know some implications before the exploration starts. These are considered as background knowledge and their usage shortens the exploration process. In this paper we show that the handling of background information can be generalised to the fuzzy setting
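To make the notion of an implication between attributes concrete, the sketch below checks whether an implication holds in a small formal context and lists the objects that refute it. It covers only the crisp (non-fuzzy) case and is not the paper's exploration algorithm; the function names and the toy context are hypothetical.

```python
def implication_holds(context, premise, conclusion):
    """Return True if every object having all premise attributes
    also has all conclusion attributes.

    context: dict mapping object name -> set of attributes (crisp case).
    """
    return all(
        conclusion <= attrs
        for attrs in context.values()
        if premise <= attrs
    )


def counterexamples(context, premise, conclusion):
    """Objects that satisfy the premise but not the conclusion."""
    return sorted(
        obj
        for obj, attrs in context.items()
        if premise <= attrs and not conclusion <= attrs
    )


# Hypothetical toy context, for illustration only.
context = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "penguin": {"swims", "lays_eggs"},
}

print(implication_holds(context, {"flies"}, {"lays_eggs"}))
print(counterexamples(context, {"swims"}, {"flies"}))
```

In attribute exploration, a user who rejects a proposed implication supplies such a counterexample, and implications known in advance can be accepted without being queried, which is how background knowledge shortens the process.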
Visualization-driven Structural and Statistical Analysis of Turbulent Flows
Knowledge extraction from data volumes of ever increasing size requires ever more flexible tools to facilitate interactive query. Interactivity enables real-time hypothesis testing and scientific discovery, but can generally not be achieved without some level of data reduction. The approach described in this paper combines multi-resolution access, region-of-interest extraction, and structure identification in order to provide interactive spatial and statistical analysis of a terascale data volume. Unique aspects of our approach include the incorporation of both local and global statistics of the flow structures, and iterative refinement facilities, which combine geometry, topology, and statistics to allow the user to effectively tailor the analysis and visualization to the science. Working together, these facilities allow a user to focus the spatial scale and domain of the analysis and perform an appropriately tailored multivariate visualization of the corresponding data. All of these ideas and algorithms are instantiated in a deployed visualization and analysis tool called VAPOR, which is in routine use by scientists internationally. In data from a 1024^3 simulation of a forced turbulent flow, VAPOR allowed us to perform a visual data exploration of the flow properties at interactive speeds, leading to the discovery of novel scientific properties of the flow, in the form of two distinct vortical structure populations. These structures would have been very difficult (if not impossible) to find with statistical overviews or other existing visualization-driven analysis approaches. This kind of intelligent, focused analysis/refinement approach will become even more important as computational science moves towards petascale applications