Exploratory topic modeling with distributional semantics
As we continue to collect and store textual data in a multitude of domains,
we are regularly confronted with material whose largely unknown thematic
structure we want to uncover. With unsupervised, exploratory analysis, no prior
knowledge about the content is required and highly open-ended tasks can be
supported. In the past few years, probabilistic topic modeling has emerged as a
popular approach to this problem. Nevertheless, the representation of the
latent topics as aggregations of semi-coherent terms limits their
interpretability and level of detail.
This paper presents an alternative approach to topic modeling that maps
topics as a network for exploration, based on distributional semantics using
learned word vectors. From the granular level of terms and their semantic
similarity relations, global topic structures emerge as clustered regions and
gradients of concepts. Moreover, the paper discusses the visual interactive
representation of the topic map, which plays an important role in supporting
its exploration.
Comment: Conference: The Fourteenth International Symposium on Intelligent Data Analysis (IDA 2015)
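The idea of topics emerging as clustered regions of a term similarity network can be illustrated with a minimal sketch. This is not the paper's method: the toy vectors, the cosine threshold of 0.8, and the use of connected components as "regions" are all assumptions chosen to keep the example small.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_graph(vectors, threshold=0.8):
    """Link every pair of terms whose cosine similarity reaches the threshold."""
    graph = {term: set() for term in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def connected_regions(graph):
    """Group terms into connected components using a depth-first search."""
    seen, regions = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, region = [start], set()
        while stack:
            node = stack.pop()
            if node in region:
                continue
            region.add(node)
            stack.extend(graph[node] - region)
        seen |= region
        regions.append(region)
    return regions

# Hypothetical 3-dimensional word vectors, invented for illustration only.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "stock": [0.0, 0.1, 0.9],
    "bond": [0.1, 0.0, 0.9],
}
regions = connected_regions(similarity_graph(toy_vectors))
```

With real learned word vectors the threshold would need tuning, and a community detection algorithm would likely give more useful regions than plain connected components.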
Connection Discovery using Shared Images by Gaussian Relational Topic Model
Social graphs, representing online friendships among users, are one of the
fundamental types of data for many applications, such as recommendation,
virality prediction and marketing in social media. However, this data may be
unavailable due to the privacy concerns of users, or kept private by social
network operators, which makes such applications difficult. Inferring user
interests and discovering user connections through their shared multimedia
content has attracted more and more attention in recent years. This paper
proposes a Gaussian relational topic model for connection discovery using user
shared images in social media. The proposed model not only models user
interests as latent variables through their shared images, but also considers
the connections between users as a result of their shared images. It explicitly
relates user shared images to user connections in a hierarchical, systematic
and supervisory way and provides an end-to-end solution for the problem. This
paper also derives efficient variational inference and learning algorithms for
the posterior of the latent variables and model parameters. It is demonstrated
through experiments with over 200k images from Flickr that the proposed method
significantly outperforms the methods in previous works.
Comment: IEEE International Conference on Big Data 201
Do Engineering Students Learn Ethics From an Ethics Course?
The goal of the present research is to develop machine-assisted methods for analyzing students’ written compositions in ethics courses. As part of this research, we analyzed Social Impact Assessment (SIA) papers submitted by engineering undergraduates in a course on engineering ethics. The SIA papers required students to identify and discuss a contemporary engineering technology (e.g., autonomous tractor trailers) and to explicitly discuss the ethical issues involved in that technology. Here we describe the ability of three machine tools to discriminate between the technical and ethical portions of the SIA papers. First, using LIWC (Linguistic Inquiry and Word Count), we quantified differences in analytical thinking, expertise and self-confidence, disclosure, and affect between the technical and ethical portions of the papers. Next, we applied MEH (Meaning Extraction Helper) to examine differences in critical concepts between the two portions. Finally, we used LDA (Latent Dirichlet Allocation) to examine differences in topics between the two portions. The results of these three tests demonstrate the ability of machine-based tools to discriminate conceptual, affective, and motivational differences in the texts that students compose about engineering technology and engineering ethics. We discuss the utility of this research and its future directions.
Cockrell School of Engineering
Detecting and Explaining Causes From Text For a Time Series Event
Explaining the underlying causes or effects of events is a challenging but
valuable task. We define a novel problem of generating explanations of a time
series event by (1) searching cause and effect relationships of the time series
with textual data and (2) constructing a connecting chain between them to
generate an explanation. To detect causal features from text, we propose a
novel method based on the Granger causality of time series between features
extracted from text such as N-grams, topics, sentiments, and their composition.
The generation of the sequence of causal entities requires a commonsense
causative knowledge base with efficient reasoning. To ensure good
interpretability and appropriate lexical usage we combine symbolic and neural
representations, using a neural reasoning algorithm trained on commonsense
causal tuples to predict the next cause step. Our quantitative and human
analyses provide empirical evidence that our method successfully extracts
meaningful causality relationships between time series with textual features
and generates appropriate explanations between them.
Comment: Accepted at EMNLP 201
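The abstract relies on Granger causality between a text-derived feature series and a target time series. The sketch below shows one simple form of that test, assuming a single lag and ordinary least squares. It is not the authors' implementation, and the function names and toy series are hypothetical. It compares a model that predicts y from its own past with one that also uses the past of x, and returns the F statistic for the improvement.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def residual_sum_of_squares(rows, targets):
    """Fit least squares via the normal equations and return the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f_statistic(x, y):
    """One-lag F statistic for the hypothesis that past x helps predict y."""
    targets = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = residual_sum_of_squares(restricted, targets)
    rss_u = residual_sum_of_squares(unrestricted, targets)
    dof = len(targets) - 3  # observations minus parameters in the full model
    return (rss_r - rss_u) / (rss_u / dof)

# Toy series, invented for illustration: y follows x with a one-step delay plus small noise.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 7.0, 2.0, 8.0]
y = [0.0, 1.1, 2.9, 2.1, 4.9, 4.1, 5.9, 3.1, 6.9, 2.1]
f_stat = granger_f_statistic(x, y)
```

A large F statistic suggests that the lagged x series adds predictive power beyond the history of y alone. In practice the statistic would be compared against an F distribution, and more than one lag would usually be tested.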