944 research outputs found

    DimLift: Interactive Hierarchical Data Exploration through Dimensional Bundling

    The identification of interesting patterns and relationships is essential to exploratory data analysis. This becomes increasingly difficult in high-dimensional datasets. While dimensionality reduction techniques can be utilized to reduce the analysis space, these may unintentionally bury key dimensions within a larger grouping and obfuscate meaningful patterns. With this work we introduce DimLift, a novel visual analysis method for creating and interacting with dimensional bundles. Generated through an iterative dimensionality reduction or user-driven approach, dimensional bundles are expressive groups of dimensions that contribute similarly to the variance of a dataset. Interactive exploration and reconstruction methods via a layered parallel coordinates plot allow users to lift interesting and subtle relationships to the surface, even in complex scenarios of missing and mixed data types. We exemplify the power of this technique in an expert case study on clinical cohort data alongside two additional case examples from nutrition and ecology.

    A visual analytics approach to feature discovery and subspace exploration in protein flexibility matrices

    The vast amount of information generated by domain scientists makes the transition from data to knowledge difficult and often impedes important discoveries. For example, the knowledge gained from protein flexibility data sets can speed advances in genetic therapies and drug discovery. However, these models generate so much data that large scale analysis by traditional methods is almost impossible. This hinders biomedical advances. Visual analytics is a new field that can help alleviate this problem. Visual analytics attempts to seamlessly integrate human abilities in pattern recognition, domain knowledge, and synthesis with automatic analysis techniques. I propose a novel, visual analytics pipeline and prototype which eases discovery, comparison, and exploration in the outputs of complex computational biology datasets. The approach utilizes automatic feature extraction by image segmentation to locate regions of interest in the data, visually presents the features to users in an intuitive way, and provides rich interactions for multi-resolution visual exploration. Functionality is also provided for subspace exploration based on automatic similarity calculation and comparative visualizations. The effectiveness of feature discovery and subspace exploration is shown through a user study and user scenarios. Feedback from analysts confirms the suitability of the proposed solution to domain tasks.

    Intuitive Data-Driven Visualization of Food Relatedness via t-Distributed Stochastic Neighbor Embedding

    The relationship between diet and health is important, yet difficult to study in practice. Dietary pattern analysis is one method for investigating this link; having more variety in diet tends to be beneficial and a score can be generated based on a heuristic approach to food intake habits. We aim to enhance the intuition behind these food scores by creating an intuitive data-driven visualization of food relatedness by leveraging t-distributed stochastic neighbor embedding (t-SNE). More specifically, by performing t-SNE analysis in a controlled manner to project the high-dimensional nutritional information of food items into a lower dimensional food similarity space, the natural clustering of foods based on the underlying nutritional composition becomes visually observable. The efficacy of this data-driven approach for visualizing food relatedness was investigated on a total of 8549 food item entries in the USDA food composition database, with the results showing considerable promise as a tool for gaining important nutritional insights. This is the first step toward providing a novel method to enhance dietary pattern analysis with additional context and insight into food intake habits based on the inherent nutritional content of the foods consumed.

    Dimensionality Reduction and Subspace Clustering in Mixed Reality for Condition Monitoring of High-Dimensional Production Data

    Visual analytics are becoming more and more important in the light of big data and related scenarios. Along this trend, the field of immersive analytics has been variously furthered as it is able to provide sophisticated visual data analytics on one hand, while preserving user-friendliness on the other. Furthermore, recent hardware developments like smart glasses, as well as achievements in virtual-reality applications, have fanned immersive analytic solutions. Notably, such solutions can be very effective when they are applied to high-dimensional data sets. Taking this advantage into account, the work at hand applies immersive analytics to a high-dimensional production data set in order to improve the digital support of daily work tasks. More specifically, a mixed-reality implementation is presented that shall support manufacturers as well as data scientists to comprehensively analyze machine data. As a particular goal, the prototype shall simplify the analysis of manufacturing data through the usage of dimensionality reduction effects. Therefore, five aspects are mainly reported in this paper. First, it is shown how dimensionality reduction effects can be represented by clusters. Second, it is presented how the resulting information loss of the reduction is addressed. Third, the graphical interface of the developed prototype is illustrated as it provides (1) a correlation coefficient graph, (2) a plot for the information loss, and (3) a 3D particle system. In addition, an implemented voice recognition feature of the prototype is shown, which was considered promising for selecting or deselecting the data variables users are interested in when analyzing the data. Fourth, based on a machine learning library, it is shown how the prototype reduces computational resources by the use of smart glasses. The main idea is based on a recommendation approach as well as the use of subspace clustering. Fifth, results from a practical setting are presented, in which the prototype was shown to domain experts. The latter reported that such a tool is actually helpful to analyze machine data on a daily basis. Moreover, it was reported that such a system can be used to educate machine operators more properly. As a general outcome of this work, the presented approach may constitute a helpful solution for the industry as well as other domains like medicine.