9,792 research outputs found

    LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation

    We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using topic modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods predominantly used to generate topics. One problem with methods like LDA is that the users who apply them may not understand the topics that are generated. Users may also find it difficult to search for correlated topics and correlated documents. LDAExplore tries to alleviate these problems by visualizing the topic and word distributions generated from the document corpus and allowing the user to interact with them. The system is designed for users who have minimal knowledge of LDA or topic modeling methods. To evaluate our design, we run a pilot study that uses the abstracts of 322 Information Visualization papers, where every abstract is considered a document. The generated topics are then explored by users. The results show that users are able to find correlated documents and group them based on similar topics.
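
    One way to picture "correlated documents" is to compare the per-document topic distributions that a topic model produces. The sketch below is an illustration only, not LDAExplore's method: the `doc_topics` values are made-up distributions, and cosine similarity is an assumed choice of correlation measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(doc_topics, query, k=2):
    """Return the k documents whose topic mix is closest to the query's."""
    scores = [(cosine(doc_topics[query], vec), name)
              for name, vec in doc_topics.items() if name != query]
    return [name for _, name in sorted(scores, reverse=True)[:k]]


# Hypothetical document-topic distributions (3 topics); not real LDA output.
doc_topics = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.70, 0.20, 0.10],
    "doc_c": [0.05, 0.10, 0.85],
    "doc_d": [0.10, 0.80, 0.10],
}

print(most_similar(doc_topics, "doc_a"))
```

    Here `doc_b` ranks first for `doc_a` because both documents are dominated by the first topic.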

    Visualizing Evolving Trees

    Evolving trees arise in many real-life scenarios, from computer file systems and dynamic call graphs to fake news propagation and disease spread. Most layout algorithms for static trees, however, do not work well in an evolving setting (e.g., they are not designed to be stable between time steps). Dynamic graph layout algorithms are better suited to this task, although they often introduce unnecessary edge crossings. With this in mind, we propose two methods for visualizing evolving trees that guarantee no edge crossings, while optimizing (1) desired edge length realization, (2) layout compactness, and (3) stability. We evaluate the two new methods, along with four prior approaches (two static and two dynamic), on real-world datasets using quantitative metrics: stress, desired edge length realization, layout compactness, stability, and running time. The new methods are fully functional and available on GitHub.
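
    Of the metrics listed above, stress is the easiest to sketch. The version below uses a common normalized form (squared relative error between drawn distance and hop-count distance, summed over node pairs); the paper's exact definition may differ, and the names `adj` and `pos` are hypothetical.

```python
import math
from collections import deque


def tree_distances(adj, source):
    """Hop-count distances from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def stress(adj, pos):
    """Sum over node pairs of ((drawn distance - graph distance) / graph distance)^2."""
    nodes = sorted(adj)
    total = 0.0
    for i, u in enumerate(nodes):
        d = tree_distances(adj, u)
        for v in nodes[i + 1:]:
            drawn = math.dist(pos[u], pos[v])
            total += ((drawn - d[v]) / d[v]) ** 2
    return total


# A three-node path a-b-c drawn two ways (assumed toy data).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
bent = {"a": (0, 0), "b": (1, 0), "c": (1, 1)}

print(stress(adj, straight), stress(adj, bent))
```

    The straight drawing reproduces every hop distance exactly, so its stress is zero; bending the path shortens the drawn distance between `a` and `c` and raises the stress.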

    Document Collection Visualization and Clustering Using An Atom Metaphor for Display and Interaction

    Visual data mining has proven to be of high value in exploratory data analysis and data mining because it provides intuitive feedback on data analysis and supports decision-making activities. Several visualization techniques have been developed for cluster discovery, such as Grand Tour, HD-Eye, and Star Coordinates. They are very useful tools that visualize data in 2D or 3D; however, they are not simple for untrained users. This thesis proposes a new approach to building a 3D clustering visualization system for document clustering using the k-means algorithm. A cluster is represented by a neutron (the centroid) and electrons (the documents), which are kept at a distance from the neutron by a force. Our approach employs quantified domain knowledge and explorative observation as prediction to map high-dimensional data onto 3D space, revealing the relationships among documents. Users can perform an intuitive visual assessment of the consistency of the cluster structure.
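
    The clustering step can be sketched with a plain k-means loop over 3D points. This is a minimal illustration, not the thesis's system: the points are made up, initialization simply takes the first `k` points so the run is deterministic, and the force-based placement of "electrons" around each "neutron" is not modeled.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [points[i] for i in range(k)]  # assumed init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point (document) to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, labels


# Hypothetical 3D document positions forming two well-separated groups.
points = [(0, 0, 0), (10, 10, 10), (0, 1, 0), (1, 0, 0), (10, 11, 10), (11, 10, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

    In the atom metaphor, each returned centroid would play the role of a neutron, and `math.dist(point, centroid)` gives the distance at which a document's electron sits from it.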

    Visualization of graphs and trees for software analysis

    A software architecture is an abstraction of a software system, which is indispensable for many software engineering tasks. Unfortunately, in many cases information pertaining to the software architecture is not available, outdated, or inappropriate for the task at hand. The RECONSTRUCTOR project focuses on software architecture reconstruction, i.e., obtaining architectural information from an existing system. Our research, which is part of RECONSTRUCTOR, focuses on interactive visualization and tries to answer the following question: How can users be enabled to understand the large amounts of information relevant for program understanding using visual representations? To answer this question, we have iteratively developed a number of techniques for visualizing software systems. A large number of these cases consists of hierarchically organized data, combined with adjacency relations. Examples are function calls within a hierarchically organized software system and correspondence relations between two different versions of a hierarchically organized software system. Hierarchical Edge Bundles (HEBs) are used to visualize adjacency relations in hierarchically organized data, such as the aforementioned function calls within a software system. HEBs significantly reduce visual clutter by visually bundling relations together. Massive Sequence Views (MSVs) are used in conjunction with HEBs to enable analysis of sequences of relations, such as function-call traces. HEBs are furthermore used to visually compare hierarchically organized data, e.g., two different versions of a software system. HEBs visually emphasize splits, joins, and relocations of subhierarchies and provide for interactive selection of sets of relations. Since HEBs require a hierarchy to perform the bundling, we present Force-Directed Edge Bundles (FDEBs) as an alternative to visually bundle relations together in the absence of a hierarchical component. 
    FDEBs use a self-organizing approach to bundling in which edges are modeled as flexible springs that can attract each other. As a result, visual clutter is reduced and high-level edge patterns are better visible. Finally, in all these methods, a clear depiction of the direction of edges is important. We have therefore performed a separate study in which we evaluated ten representations (including the standard arrow) for depicting directed edges in a controlled user study.
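
    The idea of edges as springs that attract each other can be sketched as follows. This is a simplified illustration and not the published FDEB algorithm: every parameter (`n_points`, `step`, `stiffness`, `min_dist`) is an assumed value, the attraction is a capped inverse-distance pull between matching subdivision points, and refinements such as edge compatibility measures are left out.

```python
import math


def subdivide(p, q, n):
    """Return n + 2 evenly spaced points from p to q, endpoints included."""
    return [(p[0] + (q[0] - p[0]) * t / (n + 1),
             p[1] + (q[1] - p[1]) * t / (n + 1)) for t in range(n + 2)]


def bundle(edges, n_points=5, iterations=200, step=0.02,
           stiffness=0.1, min_dist=0.5):
    """Iteratively move interior edge points under spring and attraction forces."""
    paths = [subdivide(p, q, n_points) for p, q in edges]
    for _ in range(iterations):
        updated = []
        for a, path in enumerate(paths):
            moved = [path[0]]  # edge endpoints stay fixed
            for i in range(1, len(path) - 1):
                x, y = path[i]
                # Spring force: pull toward both neighbours on the same edge.
                fx = stiffness * (path[i - 1][0] + path[i + 1][0] - 2 * x)
                fy = stiffness * (path[i - 1][1] + path[i + 1][1] - 2 * y)
                # Attraction: pull toward the matching point on every other edge.
                for b, other in enumerate(paths):
                    if b == a:
                        continue
                    dx, dy = other[i][0] - x, other[i][1] - y
                    dist = math.hypot(dx, dy)
                    if dist > 1e-9:
                        pull = 1.0 / max(dist, min_dist)  # capped to limit overshoot
                        fx += pull * dx / dist
                        fy += pull * dy / dist
                moved.append((x + step * fx, y + step * fy))
            moved.append(path[-1])
            updated.append(moved)
        paths = updated
    return paths


# Two parallel edges one unit apart (toy input).
paths = bundle([((0, 0), (10, 0)), ((0, 1), (10, 1))])
gap = abs(paths[1][3][1] - paths[0][3][1])
print(gap)
```

    The endpoints of both edges stay where they were, while the interior points are drawn together, so the gap between the two midpoints ends up well below the original distance of 1.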

    Mapping Topics and Topic Bursts in PNAS

    Scientific research is highly dynamic. New areas of science continually evolve; others gain or lose importance, merge, or split. Due to the steady increase in the number of scientific publications, it is hard to keep an overview of the structure and dynamic development of one's own field of science, much less all scientific domains. However, knowledge of hot topics, emergent research frontiers, or changes of focus in certain areas is a critical component of resource allocation decisions in research labs, governmental institutions, and corporations. This paper demonstrates the use of Kleinberg's burst detection algorithm, co-word occurrence analysis, and graph layout techniques to generate maps that support the identification of major research topics and trends. The approach was applied to analyze and map the complete set of papers published in the Proceedings of the National Academy of Sciences (PNAS) in the years 1982-2001. Six domain experts examined and commented on the resulting maps in an attempt to reconstruct the evolution of major research areas covered by PNAS.
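
    Co-word occurrence analysis, one of the three techniques named above, amounts to counting how often two words appear in the same document. The sketch below shows only that counting step on made-up word lists; it does not implement Kleinberg's burst detection or the graph layout, and the paper's own preprocessing is not known from this abstract.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count, for each unordered word pair, the documents containing both words."""
    counts = Counter()
    for words in documents:
        # Sorting the de-duplicated words gives each pair one canonical order.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Hypothetical keyword lists, one per paper.
docs = [
    ["protein", "gene", "expression"],
    ["gene", "expression"],
    ["protein", "structure"],
]

counts = coword_counts(docs)
print(counts.most_common(1))
```

    Pairs with high counts, such as `("expression", "gene")` here, would become the strongly weighted edges of a co-word network that a layout algorithm could then draw as a map.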