MolSieve: A Progressive Visual Analytics System for Molecular Dynamics Simulations
Molecular Dynamics (MD) simulations are ubiquitous in cutting-edge
physio-chemical research. They provide critical insights into how a physical
system evolves over time given a model of interatomic interactions.
Understanding a system's evolution is key to selecting the best candidates for
new drugs, materials for manufacturing, and countless other practical
applications. With today's technology, these simulations can encompass millions
of unit transitions between discrete molecular structures, spanning up to
several milliseconds of real time. Attempting to perform a brute-force analysis
with datasets of this size is not only computationally impractical, but would
also fail to shed light on the physically relevant features of the data. Moreover, there
is a need to analyze simulation ensembles in order to compare similar processes
in differing environments. These problems call for an approach that is
analytically transparent, computationally efficient, and flexible enough to
handle the variety found in materials-based research. To address these
problems, we introduce MolSieve, a progressive visual analytics system that
enables the comparison of multiple long-duration simulations. Using MolSieve,
analysts are able to quickly identify and compare regions of interest within
immense simulations through its combination of control charts, data-reduction
techniques, and highly informative visual components. A simple programming
interface is provided which allows experts to fit MolSieve to their needs. To
demonstrate the efficacy of our approach, we present two case studies of
MolSieve and report on findings from domain collaborators.
Comment: Updated references to GPCC
2023 SDSU Data Science Symposium Presentation Abstracts
This document contains abstracts for presentations and posters at the 2023 SDSU Data Science Symposium.
Molecular cavity topological representation for pattern analysis: A NLP analogy-based word2vec method
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. Cavity analysis in molecular dynamics is important for understanding molecular function. However, analyzing the dynamic patterns of molecular cavities remains a difficult task. In this paper, we propose a novel method to topologically represent molecular cavities by vectorization. First, a characterization of cavities is established through a Word2Vec model, based on an analogy between cavities and natural language processing (NLP) terms. Then, we use techniques such as dimension reduction and clustering to conduct an exploratory analysis of the vectorized molecular cavities. On a real data set, we demonstrate that our approach preserves the topological characteristics of the cavities and can find change patterns across a large number of cavities.
Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective
Given the complexity and lack of transparency in deep neural networks (DNNs),
extensive efforts have been made to make these systems more interpretable or
explain their behaviors in accessible terms. Unlike most reviews, which focus
on algorithmic and model-centric perspectives, this work takes a "data-centric"
view, examining how data collection, processing, and analysis contribute to
explainable AI (XAI). We categorize existing work into three categories subject
to their purposes: interpretations of deep models, referring to feature
attributions and reasoning processes that correlate data points with model
outputs; influences of training data, examining the impact of training data
nuances, such as data valuation and sample anomalies, on decision-making
processes; and insights of domain knowledge, discovering latent patterns and
fostering new knowledge from data and models to advance social values and
scientific discovery. Specifically, we distill XAI methodologies into data
mining operations on training and testing data across modalities, such as
images, text, and tabular data, as well as on training logs, checkpoints,
models and other DNN behavior descriptors. In this way, our study offers a
comprehensive, data-centric examination of XAI through the lens of data mining
methods and applications.