27,779 research outputs found
Contextualization of topics - browsing through terms, authors, journals and cluster allocations
This paper builds on an innovative Information Retrieval tool, Ariadne. The
tool has been developed as an interactive network visualization and browsing
tool for large-scale bibliographic databases. It basically allows to gain
insights into a topic by contextualizing a search query (Koopman et al., 2015).
In this paper, we apply the Ariadne tool to a far smaller dataset of 111,616
documents in astronomy and astrophysics. Labeled as the Berlin dataset, this
data have been used by several research teams to apply and later compare
different clustering algorithms. The quest for this team effort is how to
delineate topics. This paper contributes to this challenge in two different
ways. First, we produce one of the different cluster solution and second, we
use Ariadne (the method behind it, and the interface - called LittleAriadne) to
display cluster solutions of the different group members. By providing a tool
that allows the visual inspection of the similarity of article clusters
produced by different algorithms, we present a complementary approach to other
possible means of comparison. More particular, we discuss how we can - with
LittleAriadne - browse through the network of topical terms, authors, journals
and cluster solutions in the Berlin dataset and compare cluster solutions as
well as see their context.Comment: proceedings of the ISSI 2015 conference (accepted
Exploratory topic modeling with distributional semantics
As we continue to collect and store textual data in a multitude of domains,
we are regularly confronted with material whose largely unknown thematic
structure we want to uncover. With unsupervised, exploratory analysis, no prior
knowledge about the content is required and highly open-ended tasks can be
supported. In the past few years, probabilistic topic modeling has emerged as a
popular approach to this problem. Nevertheless, the representation of the
latent topics as aggregations of semi-coherent terms limits their
interpretability and level of detail.
This paper presents an alternative approach to topic modeling that maps
topics as a network for exploration, based on distributional semantics using
learned word vectors. From the granular level of terms and their semantic
similarity relations global topic structures emerge as clustered regions and
gradients of concepts. Moreover, the paper discusses the visual interactive
representation of the topic map, which plays an important role in supporting
its exploration.Comment: Conference: The Fourteenth International Symposium on Intelligent
Data Analysis (IDA 2015
TGVizTab: An ontology visualisation extension for Protégé
Ontologies are gaining a lot of interest and many are being developed to provide a variety of knowledge services. There is an increasing need for tools to graphically and in-teractively visualise such modelling structures to enhance their clarification, verification and analysis. Protégé 2000 is one of the most popular ontology modelling tools currently available. This paper introduces TGVizTab; a new Protégé plugin based on TouchGraph technology to graphically visualise Protégé?s ontologies
Visualising the structure of document search results: A comparison of graph theoretic approaches
This is the post-print of the article - Copyright @ 2010 Sage PublicationsPrevious work has shown that distance-similarity visualisation or ‘spatialisation’ can provide a potentially useful context in which to browse the results of a query search, enabling the user to adopt a simple local foraging or ‘cluster growing’ strategy to navigate through the retrieved document set. However, faithfully mapping feature-space models to visual space can be problematic owing to their inherent high dimensionality and non-linearity. Conventional linear approaches to dimension reduction tend to fail at this kind of task, sacrificing local structural in order to preserve a globally optimal mapping. In this paper the clustering performance of a recently proposed algorithm called isometric feature mapping (Isomap), which deals with non-linearity by transforming dissimilarities into geodesic distances, is compared to that of non-metric multidimensional scaling (MDS). Various graph pruning methods, for geodesic distance estimation, are also compared. Results show that Isomap is significantly better at preserving local structural detail than MDS, suggesting it is better suited to cluster growing and other semantic navigation tasks. Moreover, it is shown that applying a minimum-cost graph pruning criterion can provide a parameter-free alternative to the traditional K-neighbour method, resulting in spatial clustering that is equivalent to or better than that achieved using an optimal-K criterion
- …