LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation
We present LDAExplore, a tool to visualize topic distributions in a given
document corpus that are generated using Topic Modeling methods. Latent
Dirichlet Allocation (LDA) is one of the basic methods most commonly used to
generate topics. One problem with methods like LDA is that users who apply
them may not understand the topics that are generated. Users may also find it
difficult to search for correlated topics and correlated documents. LDAExplore
tries to alleviate these problems by visualizing topic
and word distributions generated from the document corpus and allowing the user
to interact with them. The system is designed for users who have minimal
knowledge of LDA or Topic Modeling methods. To evaluate our design, we ran a
pilot study using the abstracts of 322 Information Visualization papers, where
every abstract is considered a document. The generated topics were then
explored by users. The results show that users are able to find correlated
documents and group them based on similar topics.
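The per-document topic distributions that a tool like LDAExplore visualizes come from fitting an LDA model to the corpus. As a dependency-free illustration of where those numbers come from, here is a minimal collapsed Gibbs sampler for LDA; it is a toy sketch, not the implementation used by the paper, and production work would use an optimized library such as gensim or scikit-learn.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampling for LDA on tokenized documents.

    docs: list of documents, each a list of word tokens.
    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] is the
    probability of topic k in document d, and topic_word[k][w] the
    probability of vocabulary word w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    w_id = {w: i for i, w in enumerate(vocab)}
    V, D, K = len(vocab), len(docs), n_topics

    ndk = [[0] * K for _ in range(D)]   # doc-topic counts
    nkw = [[0] * V for _ in range(K)]   # topic-word counts
    nk = [0] * K                        # tokens per topic
    z = []                              # topic assignment of every token
    for d, doc in enumerate(docs):      # random initialization
        zd = []
        for w in doc:
            t = rng.randrange(K)
            zd.append(t)
            ndk[d][t] += 1; nkw[t][w_id[w]] += 1; nk[t] += 1
        z.append(zd)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, wi = z[d][i], w_id[w]
                # remove the token's current assignment from the counts
                ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1
                # full conditional P(z = k | rest), up to a constant
                weights = [(ndk[d][k] + alpha) * (nkw[k][wi] + beta)
                           / (nk[k] + V * beta) for k in range(K)]
                r = rng.random() * sum(weights)
                acc = 0.0
                for k, wgt in enumerate(weights):
                    acc += wgt
                    if r <= acc:
                        t = k
                        break
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1

    # smoothed, normalized distributions
    doc_topic = [[(c + alpha) / (len(docs[d]) + K * alpha) for c in row]
                 for d, row in enumerate(ndk)]
    topic_word = [[(c + beta) / (nk[k] + V * beta) for c in row]
                  for k, row in enumerate(nkw)]
    return doc_topic, topic_word, vocab

# Example: three tiny "abstracts"; doc_topic is what a topic-model
# visualization tool would display per document.
docs = [["graph", "node", "edge"], ["cell", "gene", "protein"],
        ["graph", "edge", "path"]]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2, n_iter=100)
```

Each row of `doc_topic` sums to 1, so it can be read directly as the document's mixture over topics.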
Guidelines For Pursuing and Revealing Data Abstractions
Many data abstraction types, such as networks or set relationships, remain
unfamiliar to data workers beyond the visualization research community. We
conduct a survey and series of interviews about how people describe their data,
either directly or indirectly. We refer to the latter as latent data
abstractions. We conduct a Grounded Theory analysis that (1) interprets the
extent to which latent data abstractions exist, (2) reveals the far-reaching
effects that the interventionist pursuit of such abstractions can have on data
workers, (3) describes why and when data workers may resist such explorations,
and (4) suggests how to take advantage of opportunities and mitigate risks
through transparency about visualization research perspectives and agendas. We
then use the themes and codes discovered in the Grounded Theory analysis to
develop guidelines for data abstraction in visualization projects. To continue
the discussion, we make our dataset open along with a visual interface for
further exploration.
A format for phylogenetic placements
We have developed a unified format for phylogenetic placements, that is,
mappings of environmental sequence data (e.g. short reads) into a phylogenetic
tree. We are motivated to do so by the growing number of tools for computing
and post-processing phylogenetic placements, and the lack of an established
standard for storing them. The format is lightweight, versatile, extensible,
and is based on the JSON format which can be parsed by most modern programming
languages. Our format is already implemented in several tools for computing and
post-processing parsimony- and likelihood-based phylogenetic placements, and
has worked well in practice. We believe that establishing a standard format for
analyzing read placements at this early stage will lead to a more efficient
development of powerful and portable post-analysis tools for the growing
applications of phylogenetic placement.

Comment: Documents version 3 of the format.
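Because the format is plain JSON, any modern language can produce and consume it with its standard JSON library. The sketch below builds a minimal placement file in this style; the field names follow the published jplace specification, but the tree, numeric values, and read name are invented for illustration.

```python
import json

# Minimal placement document in the jplace style. The {N} tags in the
# Newick string number the tree edges that placements refer to.
placement_doc = {
    "version": 3,
    "tree": "((A:0.1{0},B:0.2{1}):0.05{2},C:0.3{3}):0{4};",
    # meaning of each column in the per-placement "p" rows
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {
            # two candidate edges for one read, with their weights
            "p": [[2, -1234.56, 0.87, 0.02, 0.01],
                  [3, -1236.10, 0.13, 0.05, 0.03]],
            "n": ["read_0001"],  # query sequence(s) this entry covers
        }
    ],
    "metadata": {"invocation": "example only"},
}

text = json.dumps(placement_doc, indent=2)  # write a placement file
parsed = json.loads(text)                   # any JSON parser reads it back
```

The `like_weight_ratio` column distributes a read's placement probability across candidate edges, so the ratios for one read sum to 1.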
A review of data visualization: opportunities in manufacturing sequence management.
Data visualization now benefits from developments in technologies that offer innovative ways of presenting complex data. These potentially have widespread application in communicating the complex information domains typical of manufacturing sequence management environments for global enterprises. In this paper the authors review the visualization functionalities, techniques and applications reported in the literature, map these to manufacturing sequence information presentation requirements, and identify the opportunities available and likely development paths. Current leading-edge practice in dynamic updating and communication with suppliers is not being exploited in manufacturing sequence management; it could provide significant benefits to manufacturing business. In the context of global manufacturing operations and broad-based user communities with differing needs served by common data sets, tool functionality is generally ahead of user application.
Cheetah Experimental Platform Web 1.0: Cleaning Pupillary Data
Recently, researchers have started using cognitive load in various settings,
e.g., educational psychology, cognitive load theory, or human-computer
interaction. Cognitive load characterizes a task's demand on the limited information
processing capacity of the brain. The widespread adoption of eye-tracking
devices led to increased attention for objectively measuring cognitive load via
pupil dilation. However, this approach requires a standardized data processing
routine to reliably measure cognitive load. This technical report presents
CEP-Web, an open-source platform providing state-of-the-art data processing
routines for cleaning pupillary data combined with a graphical user interface,
enabling the management of studies and subjects. Future developments will
include the support for analyzing the cleaned data as well as support for
Task-Evoked Pupillary Response (TEPR) studies.
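Pupil-cleaning routines of the kind described above typically flag blink samples, discard a margin around each blink (where the eyelid partially occludes the pupil), interpolate across the gaps, and smooth the result. The sketch below illustrates those standard steps; it is not CEP-Web's actual pipeline, and the function name and parameters are hypothetical.

```python
def clean_pupil_trace(samples, blink_value=0.0, pad=2, win=3):
    """Clean a pupil-diameter trace: remove blink artifacts and smooth.

    samples: list of pupil diameters; blinks recorded as blink_value.
    pad: extra samples discarded on each side of a blink.
    win: width of the moving-average smoothing window (odd).
    """
    n = len(samples)
    # 1. flag blink samples plus a padding margin around them
    bad = [False] * n
    for i, s in enumerate(samples):
        if s == blink_value:
            for j in range(max(0, i - pad), min(n, i + pad + 1)):
                bad[j] = True
    # 2. linearly interpolate across each flagged gap
    cleaned = list(samples)
    i = 0
    while i < n:
        if bad[i]:
            lo = i - 1                    # last good sample before the gap
            while i < n and bad[i]:
                i += 1
            hi = i                        # first good sample after the gap
            left = cleaned[lo] if lo >= 0 else (cleaned[hi] if hi < n else 0.0)
            right = cleaned[hi] if hi < n else left
            span = hi - lo
            for k in range(max(lo + 1, 0), hi):
                frac = (k - lo) / span
                cleaned[k] = left + frac * (right - left)
        else:
            i += 1
    # 3. moving-average smoothing to suppress measurement noise
    half = win // 2
    smoothed = []
    for i in range(n):
        window = cleaned[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

raw = [3.0, 3.1, 0.0, 0.0, 3.3, 3.2, 3.1]   # 0.0 marks a blink
result = clean_pupil_trace(raw, pad=0, win=1)
```

In practice the padding and smoothing parameters are tuned to the eye tracker's sampling rate, which is exactly the kind of choice a standardized routine pins down.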