LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation
We present LDAExplore, a tool to visualize topic distributions in a given
document corpus that are generated using Topic Modeling methods. Latent
Dirichlet Allocation (LDA) is one of the basic methods most commonly used to
generate topics. One problem with methods like LDA is that users who apply
them may not understand the topics that are generated. Users may also find it
difficult to search for correlated topics and correlated documents. LDAExplore
tries to alleviate these problems by visualizing topic
and word distributions generated from the document corpus and allowing the user
to interact with them. The system is designed for users who have minimal
knowledge of LDA or Topic Modeling methods. To evaluate our design, we ran a
pilot study using the abstracts of 322 Information Visualization papers, where
every abstract is considered a document. The generated topics were then
explored by users. The results show that users are able to find correlated
documents and group them based on similar topics.
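The per-document topic distributions that a tool like LDAExplore visualizes come from fitting an LDA model to the corpus. As a dependency-free illustration of where those numbers come from, here is a minimal collapsed Gibbs sampler for LDA; it is a toy sketch, not the implementation used by the paper, and production work would use an optimized library such as gensim or scikit-learn.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampling for LDA on tokenized documents.

    docs: list of documents, each a list of word tokens.
    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] is the
    probability of topic k in document d, and topic_word[k][w] the
    probability of vocabulary word w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    w_id = {w: i for i, w in enumerate(vocab)}
    V, D, K = len(vocab), len(docs), n_topics

    ndk = [[0] * K for _ in range(D)]   # doc-topic counts
    nkw = [[0] * V for _ in range(K)]   # topic-word counts
    nk = [0] * K                        # tokens per topic
    z = []                              # topic assignment of every token
    for d, doc in enumerate(docs):      # random initialization
        zd = []
        for w in doc:
            t = rng.randrange(K)
            zd.append(t)
            ndk[d][t] += 1; nkw[t][w_id[w]] += 1; nk[t] += 1
        z.append(zd)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, wi = z[d][i], w_id[w]
                # remove the token's current assignment from the counts
                ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1
                # full conditional P(z = k | rest), up to a constant
                weights = [(ndk[d][k] + alpha) * (nkw[k][wi] + beta)
                           / (nk[k] + V * beta) for k in range(K)]
                r = rng.random() * sum(weights)
                acc = 0.0
                for k, wgt in enumerate(weights):
                    acc += wgt
                    if r <= acc:
                        t = k
                        break
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1

    # smoothed, normalized distributions
    doc_topic = [[(c + alpha) / (len(docs[d]) + K * alpha) for c in row]
                 for d, row in enumerate(ndk)]
    topic_word = [[(c + beta) / (nk[k] + V * beta) for c in row]
                  for k, row in enumerate(nkw)]
    return doc_topic, topic_word, vocab

# Example: three tiny "abstracts"; doc_topic is what a topic-model
# visualization tool would display per document.
docs = [["graph", "node", "edge"], ["cell", "gene", "protein"],
        ["graph", "edge", "path"]]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2, n_iter=100)
```

Each row of `doc_topic` sums to 1, so it can be read directly as the document's mixture over topics.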
Guidelines For Pursuing and Revealing Data Abstractions
Many data abstraction types, such as networks or set relationships, remain
unfamiliar to data workers beyond the visualization research community. We
conduct a survey and series of interviews about how people describe their data,
either directly or indirectly. We refer to the latter as latent data
abstractions. We conduct a Grounded Theory analysis that (1) interprets the
extent to which latent data abstractions exist, (2) reveals the far-reaching
effects that the interventionist pursuit of such abstractions can have on data
workers, (3) describes why and when data workers may resist such explorations,
and (4) suggests how to take advantage of opportunities and mitigate risks
through transparency about visualization research perspectives and agendas. We
then use the themes and codes discovered in the Grounded Theory analysis to
develop guidelines for data abstraction in visualization projects. To continue
the discussion, we make our dataset open along with a visual interface for
further exploration.
A format for phylogenetic placements
We have developed a unified format for phylogenetic placements, that is,
mappings of environmental sequence data (e.g. short reads) into a phylogenetic
tree. We are motivated to do so by the growing number of tools for computing
and post-processing phylogenetic placements, and the lack of an established
standard for storing them. The format is lightweight, versatile, extensible,
and is based on the JSON format which can be parsed by most modern programming
languages. Our format is already implemented in several tools for computing and
post-processing parsimony- and likelihood-based phylogenetic placements, and
has worked well in practice. We believe that establishing a standard format for
analyzing read placements at this early stage will lead to a more efficient
development of powerful and portable post-analysis tools for the growing
applications of phylogenetic placement.

Comment: Documents version 3 of the format.
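Because the format is plain JSON, any modern language can produce and consume it with its standard JSON library. The sketch below builds a minimal placement file in this style; the field names follow the published jplace specification, but the tree, numeric values, and read name are invented for illustration.

```python
import json

# Minimal placement document in the jplace style. The {N} tags in the
# Newick string number the tree edges that placements refer to.
placement_doc = {
    "version": 3,
    "tree": "((A:0.1{0},B:0.2{1}):0.05{2},C:0.3{3}):0{4};",
    # meaning of each column in the per-placement "p" rows
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {
            # two candidate edges for one read, with their weights
            "p": [[2, -1234.56, 0.87, 0.02, 0.01],
                  [3, -1236.10, 0.13, 0.05, 0.03]],
            "n": ["read_0001"],  # query sequence(s) this entry covers
        }
    ],
    "metadata": {"invocation": "example only"},
}

text = json.dumps(placement_doc, indent=2)  # write a placement file
parsed = json.loads(text)                   # any JSON parser reads it back
```

The `like_weight_ratio` column distributes a read's placement probability across candidate edges, so the ratios for one read sum to 1.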
A review of data visualization: opportunities in manufacturing sequence management.
Data visualization now benefits from developments in technologies that offer innovative ways of presenting complex data. These potentially have widespread application in communicating the complex information domains typical of manufacturing sequence management environments for global enterprises. In this paper the authors review the visualization functionalities, techniques and applications reported in the literature, map these to manufacturing sequence information presentation requirements, and identify the opportunities available and likely development paths. Current leading-edge practice in dynamic updating and communication with suppliers is not being exploited in manufacturing sequence management; it could provide significant benefits to manufacturing business. In the context of global manufacturing operations and broad-based user communities with differing needs served by common data sets, tool functionality is generally ahead of user application.
Cheetah Experimental Platform Web 1.0: Cleaning Pupillary Data
Recently, researchers have started using cognitive load in various settings,
e.g., educational psychology, cognitive load theory, or human-computer
interaction. Cognitive load characterizes a task's demand on the limited information
processing capacity of the brain. The widespread adoption of eye-tracking
devices led to increased attention for objectively measuring cognitive load via
pupil dilation. However, this approach requires a standardized data processing
routine to reliably measure cognitive load. This technical report presents
CEP-Web, an open-source platform providing state-of-the-art data processing
routines for cleaning pupillary data combined with a graphical user interface,
enabling the management of studies and subjects. Future developments will
include the support for analyzing the cleaned data as well as support for
Task-Evoked Pupillary Response (TEPR) studies.
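Pupil-cleaning routines of the kind described above typically flag blink samples, discard a margin around each blink (where the eyelid partially occludes the pupil), interpolate across the gaps, and smooth the result. The sketch below illustrates those standard steps; it is not CEP-Web's actual pipeline, and the function name and parameters are hypothetical.

```python
def clean_pupil_trace(samples, blink_value=0.0, pad=2, win=3):
    """Clean a pupil-diameter trace: remove blink artifacts and smooth.

    samples: list of pupil diameters; blinks recorded as blink_value.
    pad: extra samples discarded on each side of a blink.
    win: width of the moving-average smoothing window (odd).
    """
    n = len(samples)
    # 1. flag blink samples plus a padding margin around them
    bad = [False] * n
    for i, s in enumerate(samples):
        if s == blink_value:
            for j in range(max(0, i - pad), min(n, i + pad + 1)):
                bad[j] = True
    # 2. linearly interpolate across each flagged gap
    cleaned = list(samples)
    i = 0
    while i < n:
        if bad[i]:
            lo = i - 1                    # last good sample before the gap
            while i < n and bad[i]:
                i += 1
            hi = i                        # first good sample after the gap
            left = cleaned[lo] if lo >= 0 else (cleaned[hi] if hi < n else 0.0)
            right = cleaned[hi] if hi < n else left
            span = hi - lo
            for k in range(max(lo + 1, 0), hi):
                frac = (k - lo) / span
                cleaned[k] = left + frac * (right - left)
        else:
            i += 1
    # 3. moving-average smoothing to suppress measurement noise
    half = win // 2
    smoothed = []
    for i in range(n):
        window = cleaned[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

raw = [3.0, 3.1, 0.0, 0.0, 3.3, 3.2, 3.1]   # 0.0 marks a blink
result = clean_pupil_trace(raw, pad=0, win=1)
```

In practice the padding and smoothing parameters are tuned to the eye tracker's sampling rate, which is exactly the kind of choice a standardized routine pins down.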