7 research outputs found

    Concept-driven visualization for terascale data analytics

    Over the past couple of decades, the amount of scientific data has exploded. The science community has since faced the common problem of being drowned in data yet starved of information. Identification and extraction of meaningful features from large data sets has become one of the central problems of scientific research, for simulation as well as sensor data sets. The problems at hand are manifold and need to be addressed concurrently to provide scientists with the necessary tools, methods, and systems. Firstly, the underlying data structures and data management need to be optimized for the kind of data most commonly used in scientific research, i.e. terascale time-varying, multi-dimensional, multi-variate, and potentially non-uniform grids. This implies avoiding data duplication, providing a transparent query structure, and using sophisticated underlying data structures and algorithms. Secondly, for scientific data sets, simplistic queries are not sufficient to describe subsets or features. In time-varying data sets, many features can be described as local events, i.e. spatially and temporally limited regions with characteristic properties in value space. While scientists most often know quite well what they are looking for in a data set, at times they cannot formally or definitively describe their concept to computer science experts, especially when it is based on partially substantiated knowledge. Scientists need to be able to query and extract such features or events directly, without having to rewrite their hypothesis in an inadequately simple query language. Thirdly, tools to analyze the quality and sensitivity of these event queries themselves are required. Understanding local data sensitivity is necessary for scientists to refine query parameters and produce more meaningful findings. Query sensitivity analysis can also be used to establish trends for event-driven queries, i.e. how the query sensitivity differs between locations and over a series of data sets. In this dissertation, we present an approach that applies these interdependent measures to help scientists better understand their data sets. An integrated system containing all of the above tools and system parts is presented.
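    The abstract describes event-driven feature queries as value-space conditions on spatially and temporally limited regions, combined with a sensitivity analysis of the query parameters. A minimal sketch of that idea follows; the array layout, thresholds, and the perturbation-based sensitivity measure are illustrative assumptions, not the dissertation's actual interface.

```python
# Minimal sketch of a value-space "event query" over a time-varying scalar
# field, with a naive sensitivity probe. Array shapes, variable names, and
# thresholds are illustrative assumptions, not the dissertation's actual API.
import numpy as np

def event_mask(field, t_range, bbox, lo, hi):
    """Mark cells whose value lies in [lo, hi] inside a space-time window.

    field   -- ndarray of shape (T, Z, Y, X)
    t_range -- (t0, t1) time-step window
    bbox    -- ((z0, z1), (y0, y1), (x0, x1)) spatial window
    """
    t0, t1 = t_range
    (z0, z1), (y0, y1), (x0, x1) = bbox
    mask = np.zeros(field.shape, dtype=bool)
    sub = field[t0:t1, z0:z1, y0:y1, x0:x1]
    mask[t0:t1, z0:z1, y0:y1, x0:x1] = (sub >= lo) & (sub <= hi)
    return mask

def query_sensitivity(field, t_range, bbox, lo, hi, eps=0.01):
    """Relative change in selected-cell count when the thresholds are widened by eps."""
    base = event_mask(field, t_range, bbox, lo, hi).sum()
    pert = event_mask(field, t_range, bbox, lo * (1 - eps), hi * (1 + eps)).sum()
    return (pert - base) / max(base, 1)

# Example: a synthetic 20-step, 32^3 field queried for a mid-range "event".
field = np.random.default_rng(0).normal(size=(20, 32, 32, 32))
mask = event_mask(field, (5, 15), ((0, 32), (0, 32), (8, 24)), 0.5, 1.5)
print(mask.sum(), query_sensitivity(field, (5, 15), ((0, 32), (0, 32), (8, 24)), 0.5, 1.5))
```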

    Rethinking the Delivery Architecture of Data-Intensive Visualization

    The web has transformed the way people create and consume information. However, data-intensive science applications have so far rarely been able to take full advantage of the web ecosystem. Analysis and visualization have remained close to large datasets on large servers and desktops because of the vast resources that data-intensive applications require. This hampers the accessibility and on-demand availability of data-intensive science. In this work, I propose a novel architecture for the delivery of interactive, data-intensive visualization to the web ecosystem. The proposed architecture, codenamed Fabric, keeps the server side oblivious to application logic, structuring it as a set of scalable microservices that (1) manage data and (2) compute data products. Disconnected from application logic, the services allow interactive data-intensive visualization to be simultaneously accessible to many users. Meanwhile, the client side of this architecture treats the visualization application as an interaction-in, image-out black box whose sole responsibility is to keep track of application state and map interactions into well-defined, structured visualization requests. Fabric essentially provides a separation of concerns that decouples the otherwise tightly coupled client and server seen in traditional data applications. Initial results show that, as a result, Fabric enables high scalability of audience and scientific reproducibility, and improves control and protection of data products.
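    The sketch below illustrates the interaction-in, image-out contract from the client's point of view: all application state stays on the client, and each interaction is mapped to a self-contained, structured render request sent to a stateless service. The endpoint, JSON fields, and class names are hypothetical; Fabric defines its own request protocol.

```python
# Illustrative client sketch: the client keeps all application state and turns
# each interaction into a structured, self-contained render request. Endpoint,
# field names, and the JSON schema are assumptions, not Fabric's actual protocol.
import json
import urllib.request

class VisClient:
    def __init__(self, endpoint):
        self.endpoint = endpoint
        # All application state lives on the client; the render service is stateless.
        self.state = {
            "dataset": "supernova_t120",
            "variable": "entropy",
            "camera": {"azimuth": 0.0, "elevation": 20.0, "zoom": 1.0},
            "transfer_function": "default",
            "resolution": [1024, 768],
        }

    def rotate(self, d_azimuth):
        # An interaction only mutates client-side state.
        self.state["camera"]["azimuth"] += d_azimuth

    def request_frame(self):
        """POST the full state as a render request; the reply is an image."""
        body = json.dumps({"op": "render", "params": self.state}).encode()
        req = urllib.request.Request(self.endpoint, data=body,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()  # e.g. PNG bytes produced by the render service

# Usage (hypothetical endpoint):
# client = VisClient("https://example.org/fabric/render")
# client.rotate(15.0)
# png_bytes = client.request_frame()
```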

    Distributed Data Management for Large Volume Visualization

    We propose a distributed data management scheme for large data visualization that emphasizes efficient data sharing and access. To minimize data access time and support users with a variety of local computing capabilities, we introduce an adaptive data selection method based on an Enhanced Time-Space Partitioning (ETSP) tree that assists with effective visibility culling, as well as multiresolution data selection. By traversing the tree, our data management algorithm can quickly identify the visible regions of data, and, for each region, adaptively choose the lowest resolution satisfying user-specified error tolerances. Only necessary data elements are accessed and sent to the visualization pipeline. To further address the issue of sharing large-scale data among geographically distributed collaborative teams, we have designed an infrastructure for integrating our data management technique with a distributed data storage system provided by Logistical Networking (LoN). Data sets at different resolutions are generated and uploaded to LoN for wide-area access. We describe a parallel volume rendering system that verifies the effectiveness of our data storage, selection and access scheme.
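    The selection pass described above can be pictured roughly as the traversal below: cull subtrees that are not visible, and stop descending once a node's precomputed error satisfies the user's tolerance, emitting that node's data block. The node layout and the visibility test are simplified stand-ins for the ETSP tree and its culling logic, not the paper's actual data structure.

```python
# Rough sketch of adaptive block selection: walk a space-partitioning tree,
# skip invisible subtrees, and keep the coarsest visible nodes whose error
# meets the user-specified tolerance. Node fields are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    bounds: tuple                # axis-aligned box ((xmin, xmax), (ymin, ymax), (zmin, zmax))
    error: float                 # approximation error of this node's resolution level
    data_id: str                 # handle used to fetch this block (e.g. from remote storage)
    children: List["Node"] = field(default_factory=list)

def visible(bounds, frustum) -> bool:
    # Placeholder visibility test; a real system would clip against the view frustum.
    return frustum(bounds)

def select_blocks(node: Node, frustum, tolerance: float, out: List[str]) -> None:
    """Collect the coarsest visible blocks whose error meets the tolerance."""
    if not visible(node.bounds, frustum):
        return                               # visibility culling: skip the whole subtree
    if node.error <= tolerance or not node.children:
        out.append(node.data_id)             # coarsest acceptable resolution for this region
        return
    for child in node.children:
        select_blocks(child, frustum, tolerance, out)

# Usage: blocks = []; select_blocks(root, lambda b: True, 0.05, blocks)
```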
