    MolSieve: A Progressive Visual Analytics System for Molecular Dynamics Simulations

    Molecular Dynamics (MD) simulations are ubiquitous in cutting-edge physio-chemical research. They provide critical insights into how a physical system evolves over time given a model of interatomic interactions. Understanding a system's evolution is key to selecting the best candidates for new drugs, materials for manufacturing, and countless other practical applications. With today's technology, these simulations can encompass millions of unit transitions between discrete molecular structures, spanning up to several milliseconds of real time. Attempting a brute-force analysis of datasets of this size is not only computationally impractical, but would also fail to shed light on the physically relevant features of the data. Moreover, there is a need to analyze simulation ensembles in order to compare similar processes in differing environments. These problems call for an approach that is analytically transparent, computationally efficient, and flexible enough to handle the variety found in materials-based research. To address these problems, we introduce MolSieve, a progressive visual analytics system that enables the comparison of multiple long-duration simulations. Using MolSieve, analysts can quickly identify and compare regions of interest within immense simulations through its combination of control charts, data-reduction techniques, and highly informative visual components. A simple programming interface is provided that allows experts to fit MolSieve to their needs. To demonstrate the efficacy of our approach, we present two case studies of MolSieve and report on findings from domain collaborators. Comment: Updated references to GPCC
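The abstract mentions control charts as one way to locate regions of interest in a long trajectory. As a rough illustration of that general idea only, the sketch below flags frames whose value falls outside a band of k standard deviations around a baseline window (a basic Shewhart-style chart). It is not MolSieve's interface: the function name `out_of_control`, the `baseline_len` and `k` parameters, and the sample `energies` series are all assumptions made for this example.

```python
from statistics import mean, pstdev


def out_of_control(values, baseline_len, k=3.0):
    """Return indices of values outside mean +/- k*sigma of a baseline window.

    Hypothetical helper for illustration; not part of MolSieve.
    """
    baseline = values[:baseline_len]
    center = mean(baseline)          # center line of the chart
    sigma = pstdev(baseline)         # population std. dev. of the baseline
    lower = center - k * sigma       # lower control limit
    upper = center + k * sigma       # upper control limit
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Made-up per-frame property (e.g. an energy-like quantity); not real data.
energies = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 3.0, 3.1, 1.0]

# Use the first six frames as the reference ("in control") window.
flagged = out_of_control(energies, baseline_len=6)
print(flagged)
```

Frames reported by such a check would be candidates for closer inspection; how the baseline window and the width `k` are chosen is a modelling decision that this sketch leaves open.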

    2023 SDSU Data Science Symposium Presentation Abstracts

    This document contains abstracts for presentations and posters at the 2023 SDSU Data Science Symposium.

    Molecular cavity topological representation for pattern analysis: A NLP analogy-based word2vec method

    © 2019 by the authors. Licensee MDPI, Basel, Switzerland. Cavity analysis in molecular dynamics is important for understanding molecular function. However, analyzing the dynamic patterns of molecular cavities remains a difficult task. In this paper, we propose a novel method to topologically represent molecular cavities by vectorization. First, a characterization of cavities is established through a Word2Vec model, based on an analogy between the cavities and natural language processing (NLP) terms. Then, we use techniques such as dimension reduction and clustering to conduct an exploratory analysis of the vectorized molecular cavities. On a real data set, we demonstrate that our approach preserves the topological characteristics of the cavities and can find change patterns across a large number of cavities.

    Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective

    Given the complexity and lack of transparency in deep neural networks (DNNs), extensive efforts have been made to make these systems more interpretable or to explain their behaviors in accessible terms. Unlike most reviews, which focus on algorithmic and model-centric perspectives, this work takes a "data-centric" view, examining how data collection, processing, and analysis contribute to explainable AI (XAI). We group existing work into three categories according to purpose: interpretations of deep models, referring to feature attributions and reasoning processes that correlate data points with model outputs; influences of training data, examining the impact of training data nuances, such as data valuation and sample anomalies, on decision-making processes; and insights of domain knowledge, discovering latent patterns and fostering new knowledge from data and models to advance social values and scientific discovery. Specifically, we distill XAI methodologies into data mining operations on training and testing data across modalities, such as images, text, and tabular data, as well as on training logs, checkpoints, models, and other DNN behavior descriptors. In this way, our study offers a comprehensive, data-centric examination of XAI through the lens of data mining methods and applications.