92 research outputs found

    LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation

    We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using topic modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods predominantly used to generate topics. One problem with methods like LDA is that users who apply them may not understand the topics that are generated. Users may also find it difficult to search for correlated topics and correlated documents. LDAExplore tries to alleviate these problems by visualizing topic and word distributions generated from the document corpus and allowing the user to interact with them. The system is designed for users who have minimal knowledge of LDA or topic modeling methods. To evaluate our design, we run a pilot study that uses the abstracts of 322 Information Visualization papers, where every abstract is considered a document. The topics generated are then explored by users. The results show that users are able to find correlated documents and group them based on topics that are similar.

    Visualizing Topic Models About African American Women's Experiences and Standpoints

    Keynote address at the 2016 Computational Social Science Workshop

    TOME: Interactive TOpic Model and MEtadata Visualization

    As archives are being digitized at an increasing rate, scholars will require new tools to make sense of this expanding body of material. We propose to build TOME, a tool to support the interactive exploration and visualization of text-based archives. Drawing upon the technique of topic modeling--a computational method for identifying themes that recur across a collection--TOME will visualize the topics that characterize each archive, as well as the relationships between specific topics and related metadata, such as publication date. An archive of 19th-century antislavery newspapers, characterized by diverse authors and shifting political alliances, will serve as our initial dataset; it promises to motivate new methods for visualizing topic models and extending their impact. In turn, by applying our new methods to these texts, we will illuminate how issues of gender and racial identity affect the development of political ideology in the nineteenth century and into the present day.

    A New Geometric Approach to Latent Topic Modeling and Discovery

    A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets. Comment: This paper was submitted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2013 on November 30, 201
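The extreme-point idea in this abstract can be illustrated with a small, noise-free sketch. This is not the paper's algorithm: it swaps in a plain successive-projection search, and the synthetic data, the function name `find_extreme_rows`, and all variable names are assumptions made for illustration. When each topic has a novel word, the row-normalized word-by-document frequency rows of those words are the vertices of the convex hull of all rows, so a greedy search for the largest-norm rows should recover them.

```python
import numpy as np

def find_extreme_rows(points, k):
    """Greedily pick k rows that are extreme points (successive projection)."""
    residual = np.asarray(points, dtype=float).copy()
    chosen = []
    for _ in range(k):
        # Squared Euclidean norm of every row of the current residual.
        norms = np.einsum("ij,ij->i", residual, residual)
        idx = int(np.argmax(norms))
        chosen.append(idx)
        # Project all rows onto the orthogonal complement of the chosen row.
        direction = residual[idx] / np.sqrt(norms[idx])
        residual -= np.outer(residual @ direction, direction)
    return chosen

# Hypothetical synthetic corpus: 3 topics, 12 words, 50 documents.
rng = np.random.default_rng(0)
n_topics, n_words, n_docs = 3, 12, 50

# Topic-word matrix; words 0, 1, 2 are "novel" (each occurs in one topic only).
topic_word = rng.random((n_words, n_topics))
topic_word[:n_topics] = np.eye(n_topics)
topic_word /= topic_word.sum(axis=0)  # each column is a distribution over words

# Topic weights per document, shape (n_topics, n_docs).
doc_topic = rng.dirichlet(np.ones(n_topics), size=n_docs).T

# Expected (noise-free) word-by-document frequencies, then row normalization.
word_doc = topic_word @ doc_topic
row_normalized = word_doc / word_doc.sum(axis=1, keepdims=True)

novel = sorted(find_extreme_rows(row_normalized, n_topics))
print(novel)
```

With real, finite-length documents the rows are noisy, which is why the abstract stresses robust detection and clustering of the extreme points; the sketch above skips that step entirely.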

    TopicVis: A GUI for topic-based feedback and navigation

    This paper describes a search system that includes topic model visualization to improve the user search experience. The system graphically renders the topics in a retrieved set of documents, enables a user to selectively refine search results, and allows easy navigation through information on selected topics within documents.