36 research outputs found

Informatics Approaches to Data Preservation and Analysis in Protein Electrostatics

Author: Baker Nathan A.
Dowling Chase
Gosink Luke
Pulsipher Trenton
Sansone Susanna-Assunta
Publication venue: Biophysical Society. Published by Elsevier Inc.
Publication date: 27/01/2015

Elsevier - Publisher Connector

Query-Driven Visualization of Time-Varying Adaptive Mesh Refinement Data

Author: Anderson John C.
Bethel E. Wes
Gosink Luke J.
Joy Kenneth I.
Publication venue: Lawrence Berkeley National Laboratory
Publication date: 01/08/2008

The visualization and analysis of AMR-based simulations are integral to the process of obtaining new insight in scientific research. We present a new method for performing query-driven visualization and analysis on AMR data, with specific emphasis on time-varying AMR data. Our method directly addresses the dynamic spatial and temporal properties of AMR grids, which challenge many existing visualization techniques. Further, we present the first implementation of query-driven visualization on the GPU that uses a GPU-based indexing structure both to answer queries and to use GPU memory efficiently. We apply our method to two different science domains to demonstrate its broad applicability.

UNT Digital Library

Bin-Hash Indexing: A Parallel Method for Fast Query Processing

Author: Bethel Edward W.
Bethel Edward Wes
Gosink Luke J.
Joy Kenneth I.
Owens John D.
Wu Kesheng
Publication venue: Lawrence Berkeley National Laboratory
Publication date: 27/06/2008

This paper presents a new parallel indexing data structure for answering queries. The index, called Bin-Hash, offers extremely high levels of concurrency and is therefore well suited to emerging commodity parallel processors such as multi-cores, cell processors, and general-purpose graphics processing units (GPUs). The Bin-Hash approach first bins the base data, and then partitions and separately stores the values in each bin as a perfect spatial hash table. To answer a query, we first determine whether or not a record satisfies the query conditions based on the bin boundaries. For the bins with records that cannot be resolved this way, we examine the spatial hash tables. The procedures for examining the bin numbers and the spatial hash tables offer the maximum possible level of concurrency: every record can be evaluated independently and in parallel. Additionally, our Bin-Hash procedures access much smaller amounts of data than similar parallel methods, such as the projection index. This smaller data footprint is critical for certain parallel processors, like GPUs, where memory resources are limited. To demonstrate the effectiveness of Bin-Hash, we implement it on a GPU using the data-parallel programming language CUDA. The concurrency offered by the Bin-Hash index allows us to fully utilize the GPU's massive parallelism; over 12,000 records can be evaluated simultaneously. We show that our new query processing method is an order of magnitude faster than current state-of-the-art CPU-based indexing technologies. Additionally, we compare our performance to existing GPU-based projection index strategies.
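The two-step query evaluation described in this abstract (resolve records from bin boundaries first, then check stored values only for unresolved bins) can be illustrated with a minimal serial sketch. This is not the authors' implementation: the function names, the equal-width binning, and the half-open range condition are assumptions made for illustration, and a plain Python list stands in for the perfect spatial hash table. Each loop iteration is independent of the others, which is the property that would allow one thread per record on a GPU.

```python
from bisect import bisect_right


def build_bin_index(values, num_bins):
    """Assign each record a bin number using equal-width bin boundaries.

    Hypothetical helper: the binning scheme is an assumption, not taken
    from the paper.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    # num_bins + 1 boundaries; the last one is exactly the maximum value.
    edges = [lo + i * width for i in range(num_bins)] + [hi]
    # Search only the interior boundaries so bin numbers stay in
    # [0, num_bins - 1]; the maximum value falls into the last bin.
    bins = [bisect_right(edges, v, 1, num_bins) - 1 for v in values]
    return edges, bins


def range_query(values, edges, bins, q_lo, q_hi):
    """Return indices of records with q_lo <= value < q_hi."""
    hits = []
    for i, b in enumerate(bins):
        left, right = edges[b], edges[b + 1]
        if left >= q_lo and right < q_hi:
            hits.append(i)      # bin lies fully inside the range
        elif right < q_lo or left >= q_hi:
            continue            # bin lies fully outside the range
        elif q_lo <= values[i] < q_hi:
            hits.append(i)      # boundary bin: check the stored value
    return hits
```

Only records in boundary bins require a look at the stored values; all others are resolved from their bin numbers alone, which is where the smaller data footprint mentioned in the abstract comes from.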

UNT Digital Library

Data Parallel Bin-Based Indexing for Answering Queries on Multi-Core Architectures

Author: Gosink Luke
Publication venue: eScholarship, University of California
Publication date: 13/10/2009

The multi-core trend in CPUs and general purpose graphics processing units (GPUs) offers new opportunities for the database community. The increase of cores at exponential rates is likely to affect virtually every server and client in the coming decade, and presents database management systems with a huge, compelling disruption that will radically change how processing is done. This paper presents a new parallel indexing data structure for answering queries that takes full advantage of the increasing thread-level parallelism emerging in multi-core architectures. In our approach, our Data Parallel Bin-based Index Strategy (DP-BIS) first bins the base data, and then partitions and stores the values in each bin as a separate, bin-based data cluster. In answering a query, the procedures for examining the bin numbers and the bin-based data clusters offer the maximum possible level of concurrency; each record is evaluated by a single thread and all threads are processed simultaneously in parallel. We implement and demonstrate the effectiveness of DP-BIS on two multi-core architectures: a multi-core CPU and a GPU. The concurrency afforded by DP-BIS allows us to fully utilize the thread-level parallelism provided by each architecture--for example, our GPU-based DP-BIS implementation simultaneously evaluates over 12,000 records with an equivalent number of concurrently executing threads. In comparing DP-BIS's performance across these architectures, we show that the GPU-based DP-BIS implementation requires significantly less computation time to answer a query than the CPU-based implementation. We also demonstrate in our analysis that DP-BIS provides better overall performance than the commonly utilized CPU and GPU-based projection index. 
Finally, we show that, because of its data encoding, DP-BIS accesses significantly less data than index strategies that operate solely on a column's base data; this smaller data footprint is critical for parallel processors with limited memory resources (e.g., GPUs).

Ezid

eScholarship - University of California

Query-Driven Visualization Strategies for the Analysis and Visualization of Complex Datasets

Author: Gosink Luke
Publication venue: eScholarship, University of California
Publication date: 01/01/2009

There is an urgent need in scientific communities, driven by their ability to generate ever-larger, increasingly complex data, for scalable analysis methods that rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) methods are among the small subset of techniques that are able to address both large and highly complex datasets, e.g., multivariate, multitemporal, and multiresolution representations of scalar, vector, and function field data. This dissertation presents new methods that either directly extend the utility and accelerate the performance of QDV as a whole, or enable QDV's substantial and flexible analysis strengths to be applied to new areas of scientific research. The first part of this dissertation presents a new data-parallel strategy that accelerates the most fundamental task performed by QDV: the evaluation of user-defined, ad hoc queries. The second part of this dissertation extends QDV strategies to analyze and visualize time-varying adaptive mesh refinement (AMR) data. AMR techniques are used in many scientific communities to efficiently and accurately model complex, continuous physical phenomena. By extending QDV methods to address the dynamic spatiotemporal properties of time-varying AMR data, I provide scientists with a powerful tool for visually analyzing the data generated from these important simulations. The final part of this dissertation leverages statistical analysis methods to generate deeper insight into the regions that are selected by a user's query. In this effort I introduce two new methods that increase the utility of query-driven strategies. The first strategy uses correlation fields, created between pairs of variables, in conjunction with the cumulative distribution functions (CDF) of variables expressed in a user's query. This strategy identifies important variable interactions within query regions. The second strategy forms a statistical-based segmentation within the query region to generate deeper insight into the "statistical structure" of a user's query. In this approach, segments indicate which variable contributes most to the underlying joint density distribution of the user's query. These segments, when used in conjunction with each variable's CDF, intuitively aid users in refining the constraints over the variables in their query.

eScholarship - University of California

Bayesian Model Averaging for Ensemble-Based Estimates of Solvation-Free Energies.

Author: Gosink Luke J.
Publication venue
Publication date: 06/07/2017

Ezid

Query-Driven Visualization of Time-Varying Adaptive Mesh Refinement Data

Author: Gosink Luke J.
Publication venue: eScholarship, University of California
Publication date: 06/11/2008

Ezid

eScholarship - University of California

An Application of Multivariate Statistical Analysis for Query-Driven Visualization

Author: Gosink Luke J.
Publication venue: eScholarship, University of California
Publication date: 06/10/2010

Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.

Ezid

eScholarship - University of California

Bin-Hash Indexing: A Parallel Method for Fast Query Processing

Author: Gosink Luke J.
Publication venue: eScholarship, University of California
Publication date: 20/08/2008

Ezid

eScholarship - University of California

Variable Interactions in Query Driven Visualization

Author: Anderson John C.
Bethel Wes
Gosink Luke
Joy Ken
Publication venue: eScholarship, University of California
Publication date: 01/01/2007

Our capability to generate increasingly large and more complex datasets has established the need for scalable methods that can provide insight into important variable trends. Query-driven methods are among the small subset of techniques that are able to address both large and highly complex data sets. This paper presents a new method in which coherent and meaningful visualizations are constructed to convey relational information about the trends that exist between variables in a query. Correlation fields are created between pairs of variables and used in conjunction with the cumulative distribution function of each of the query's variables to reveal, both visually and statistically, trends in variable behavior and interactions. We illustrate our concepts by discussing interactions between variables in two flame-front simulations.
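As a rough illustration of the two ingredients named in this abstract, the sketch below computes a local correlation value at each point of a one-dimensional grid and an empirical cumulative distribution function for one variable. The abstract does not say how the correlation fields are defined, so the sliding-window Pearson correlation, the `radius` parameter, and all function names here are assumptions chosen for illustration, not the paper's method.

```python
from bisect import bisect_right
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant window has no defined correlation
    return sxy / sqrt(sxx * syy)


def correlation_field(a, b, radius):
    """Correlation of variables a and b in a window around each grid point."""
    n = len(a)
    field = []
    for i in range(n):
        # Clamp the window to the grid so edge points use fewer samples.
        start, stop = max(0, i - radius), min(n, i + radius + 1)
        field.append(pearson(a[start:stop], b[start:stop]))
    return field


def empirical_cdf(samples):
    """Return a function giving the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n
```

In a query-driven setting, one might evaluate the CDF at a query's threshold to see what fraction of the data the constraint admits, and inspect the correlation field only at the grid points that satisfy the query.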

eScholarship - University of California