Using Answer Set Programming for pattern mining
Serial pattern mining consists of extracting the frequent sequential patterns
from a single sequence of itemsets. This paper explores the ability of a
declarative language, such as Answer Set Programming (ASP), to solve this problem
efficiently. We propose several ASP implementations of the frequent sequential
pattern mining task: a non-incremental and an incremental resolution. The
results show that the incremental resolution is more efficient than the
non-incremental one, but both ASP programs are less efficient than dedicated
algorithms. Nonetheless, this approach can be seen as a first step toward a
generic framework for sequential pattern mining with constraints.
Comment: Intelligence Artificielle Fondamentale (2014
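To make the task in the abstract above concrete, here is a minimal brute-force sketch in plain Python. It is not the paper's ASP encoding. The support measure (the number of start positions from which a pattern can be embedded in order), the restriction to single-item pattern elements, and the names `occurs_from`, `support`, and `frequent_patterns` are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Pattern = Tuple[str, ...]


def occurs_from(seq: List[Itemset], pattern: Pattern, start: int) -> bool:
    """True if `pattern` can be embedded with its first item at `start`
    and each following item at a strictly later position."""
    if pattern[0] not in seq[start]:
        return False
    pos = start
    for item in pattern[1:]:
        pos += 1
        # Greedy earliest match is sufficient for subsequence embedding.
        while pos < len(seq) and item not in seq[pos]:
            pos += 1
        if pos >= len(seq):
            return False
    return True


def support(seq: List[Itemset], pattern: Pattern) -> int:
    """Assumed support measure: number of start positions that embed the pattern."""
    return sum(occurs_from(seq, pattern, i) for i in range(len(seq)))


def frequent_patterns(seq: List[Itemset], min_support: int,
                      max_len: int) -> Dict[Pattern, int]:
    """Level-wise search: only frequent patterns are extended by one item.
    This pruning is sound here because appending an item can never
    increase the number of valid start positions."""
    items = sorted(set().union(*seq))
    frequent: Dict[Pattern, int] = {}
    level: List[Pattern] = [(x,) for x in items]
    for _ in range(max_len):
        next_level: List[Pattern] = []
        for pattern in level:
            count = support(seq, pattern)
            if count >= min_support:
                frequent[pattern] = count
                next_level.extend(pattern + (x,) for x in items)
        level = next_level
        if not level:
            break
    return frequent
```

Other support measures for a single sequence (for example, window-based or minimal-occurrence counts) would give different results; the one used here was picked only because it keeps the pruning argument short.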
Unsupervised Extraction of Representative Concepts from Scientific Literature
This paper studies the automated categorization and extraction of scientific
concepts from titles of scientific articles, in order to gain a deeper
understanding of their key contributions and facilitate the construction of a
generic academic knowledgebase. Towards this goal, we propose an unsupervised,
domain-independent, and scalable two-phase algorithm to type and extract key
concept mentions into aspects of interest (e.g., Techniques, Applications,
etc.). In the first phase of our algorithm we propose PhraseType, a
probabilistic generative model which exploits textual features and limited POS
tags to broadly segment text snippets into aspect-typed phrases. We extend this
model to simultaneously learn aspect-specific features and identify academic
domains in multi-domain corpora, since the two tasks mutually enhance each
other. In the second phase, we propose an approach based on adaptor grammars to
extract fine grained concept mentions from the aspect-typed phrases without the
need for any external resources or human effort, in a purely data-driven
manner. We apply our technique to study literature from diverse scientific
domains and show significant gains over state-of-the-art concept extraction
techniques. We also present a qualitative analysis of the results obtained.
Comment: Published as a conference paper at CIKM 201
Lifelong Learning CRF for Supervised Aspect Extraction
This paper makes a focused contribution to supervised aspect extraction. It
shows that if the system has performed aspect extraction from many past domains
and retained their results as knowledge, Conditional Random Fields (CRF) can
leverage this knowledge in a lifelong learning manner to extract in a new
domain markedly better than the traditional CRF without using this prior
knowledge. The key innovation is that even after CRF training, the model can
still improve its extraction with experiences in its applications.
Comment: Accepted at ACL 2017. arXiv admin note: text overlap with
arXiv:1612.0794
Sequential Importance Sampling for Online Bayesian Changepoint Detection
Online detection of abrupt changes in the parameters of a generative model for a time series is useful when modelling data in areas of application such as finance, robotics, and biometrics. We present an algorithm based on Sequential Importance Sampling which allows this problem to be solved in an online setting without relying on conjugate priors. Our results are exact and unbiased as we avoid using posterior approximations, and only rely on Monte Carlo integration when computing predictive probabilities. We apply the proposed algorithm to three example data sets. In two of the examples we compare our results to previously published analyses which used conjugate priors. In the third example we demonstrate an application where conjugate priors are not available. Avoiding conjugate priors allows a wider range of models to be considered with Bayesian changepoint detection, and additionally allows the use of arbitrary informative priors to quantify the uncertainty more flexibly.
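The abstract above mentions computing predictive probabilities by Monte Carlo integration when no conjugate prior is available. The sketch below illustrates that single ingredient for one run of data, and is not the paper's algorithm. It assumes a Gaussian likelihood with known standard deviation and a Laplace prior on the mean (a non-conjugate pairing), draws parameters from the prior, and weights them by the likelihood of the current run. The function names and all model choices are hypothetical.

```python
import math
import random
from typing import List


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sample_laplace(rng: random.Random, loc: float, scale: float) -> float:
    """The difference of two independent Exp(1) draws is Laplace(0, 1)."""
    return loc + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def predictive_probability(x_new: float, run: List[float], n_samples: int,
                           rng: random.Random, sd: float = 1.0) -> float:
    """Estimate p(x_new | run) by weighting prior samples of the mean
    with the likelihood of the observations in the current run."""
    thetas = [sample_laplace(rng, 0.0, 1.0) for _ in range(n_samples)]
    # Weight of each sample: likelihood of the run (1.0 for an empty run).
    weights = [math.prod(gaussian_pdf(x, th, sd) for x in run) for th in thetas]
    total = sum(weights)
    weighted = sum(w * gaussian_pdf(x_new, th, sd)
                   for w, th in zip(weights, thetas))
    return weighted / total
```

In an online changepoint detector, an estimate of this kind would be needed for each candidate run length at each time step. Note that the ratio form used here is a self-normalised estimator, which is consistent but not exactly unbiased for a finite number of samples, and long runs would call for log-space weights to avoid underflow.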