LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation
We present LDAExplore, a tool to visualize topic distributions in a given
document corpus that are generated using Topic Modeling methods. Latent
Dirichlet Allocation (LDA) is one of the basic methods that is predominantly
used to generate topics. One of the problems with methods like LDA is that
users who apply them may not understand the topics that are generated. Also,
users may find it difficult to search correlated topics and correlated
documents. LDAExplore tries to alleviate these problems by visualizing topic
and word distributions generated from the document corpus and allowing the user
to interact with them. The system is designed for users who have minimal
knowledge of LDA or Topic Modeling methods. To evaluate our design, we run a
pilot study that uses the abstracts of 322 Information Visualization papers,
where every abstract is considered a document. The topics generated are then
explored by users. The results show that users are able to find correlated
documents and group them based on similar topics.
Topic Similarity Networks: Visual Analytics for Large Document Sets
We investigate ways in which to improve the interpretability of LDA topic
models by better analyzing and visualizing their outputs. We focus on examining
what we refer to as topic similarity networks: graphs in which nodes represent
latent topics in text collections and links represent similarity among topics.
We describe efficient and effective approaches to both building and labeling
such networks. Visualizations of topic models based on these networks are shown
to be a powerful means of exploring, characterizing, and summarizing large
collections of unstructured text documents. They help to "tease out"
non-obvious connections among different sets of documents and provide insights
into how topics form larger themes. We demonstrate the efficacy and
practicality of these approaches through two case studies: 1) NSF grants for
basic research spanning a 14 year period and 2) the entire English portion of
Wikipedia.
Comment: 9 pages; 2014 IEEE International Conference on Big Data (IEEE BigData 2014)
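As a rough illustration of the structure this abstract describes (nodes are latent topics, links represent similarity among topics), the following is a minimal sketch, not the authors' method. It assumes each topic is given as a word-probability vector over a shared vocabulary, uses cosine similarity as the similarity measure, and keeps a link only above a fixed threshold. The function names, the toy vectors, and the threshold value are all hypothetical; the abstract does not specify any of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_topic_network(topic_word, threshold=0.5):
    """Return edges (i, j, similarity) between topics i < j.

    topic_word: list of topic vectors (word probabilities per topic).
    threshold:  assumed cutoff; pairs below it get no link.
    """
    edges = []
    for i in range(len(topic_word)):
        for j in range(i + 1, len(topic_word)):
            sim = cosine(topic_word[i], topic_word[j])
            if sim >= threshold:
                edges.append((i, j, sim))
    return edges


# Toy topic-word distributions over a 4-word vocabulary (made-up values).
topics = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.7],
]
edges = build_topic_network(topics, threshold=0.5)
for i, j, sim in edges:
    print(f"topic {i} -- topic {j}: similarity {sim:.3f}")
```

In this toy input only topics 0 and 1 share most of their probability mass, so they are the only pair linked. The all-pairs loop is quadratic in the number of topics; a collection with many topics would likely need a more efficient construction, which this sketch does not attempt.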
TopicViz: Semantic Navigation of Document Collections
When people explore and manage information, they think in terms of topics and
themes. However, the software that supports information exploration sees text
at only the surface level. In this paper we show how topic modeling -- a
technique for identifying latent themes across large collections of documents
-- can support semantic exploration. We present TopicViz, an interactive
environment for information exploration. TopicViz combines traditional search
and citation-graph functionality with a range of novel interactive
visualizations, centered around a force-directed layout that links documents to
the latent themes discovered by the topic model. We describe several use
scenarios in which TopicViz supports rapid sensemaking on large document
collections.
Visualizing Bags of Vectors
The motivation of this work is twofold: (a) to compare two different
modes of visualizing data that exists in a bag-of-vectors format, and (b) to
propose a theoretical model that supports a new mode of visualizing data. Visualizing
high dimensional data can be achieved using Minimum Volume Embedding, but the
data has to exist in a format suitable for computing similarities while
preserving local distances. This paper compares the visualizations produced by
two methods of representing data and also proposes a new method, providing
sample visualizations for it.