3,505 research outputs found
Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time
Dynamic topic modeling facilitates the identification of topical trends over
time in temporal collections of unstructured documents. We introduce a novel
unsupervised neural dynamic topic model named as Recurrent Neural
Network-Replicated Softmax Model (RNNRSM), where the discovered topics at each
time influence the topic discovery in the subsequent time steps. We account for
the temporal ordering of documents by explicitly modeling a joint distribution
of latent topical dependencies over time, using distributional estimators with
temporal recurrent connections. Applying RNN-RSM to 19 years of articles on NLP
research, we demonstrate that compared to state-of-the art topic models, RNNRSM
shows better generalization, topic interpretation, evolution and trends. We
also introduce a metric (named as SPAN) to quantify the capability of dynamic
topic model to capture word evolution in topics over time.Comment: In Proceedings of the 16th Annual Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies (NAACL-HLT 2018
Cross-lingual RST Discourse Parsing
Discourse parsing is an integral part of understanding information flow and
argumentative structure in documents. Most previous research has focused on
inducing and evaluating models from the English RST Discourse Treebank.
However, discourse treebanks for other languages exist, including Spanish,
German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same
underlying linguistic theory, but differ slightly in the way documents are
annotated. In this paper, we present (a) a new discourse parser which is
simpler, yet competitive (significantly better on 2/3 metrics) to state of the
art for English, (b) a harmonization of discourse treebanks across languages,
enabling us to present (c) what to the best of our knowledge are the first
experiments on cross-lingual discourse parsing.Comment: To be published in EACL 2017, 13 page
- …