Unsupervised Extraction of Representative Concepts from Scientific Literature
This paper studies the automated categorization and extraction of scientific
concepts from titles of scientific articles, in order to gain a deeper
understanding of their key contributions and facilitate the construction of a
generic academic knowledgebase. Towards this goal, we propose an unsupervised,
domain-independent, and scalable two-phase algorithm to type and extract key
concept mentions into aspects of interest (e.g., Techniques, Applications,
etc.). In the first phase of our algorithm we propose PhraseType, a
probabilistic generative model which exploits textual features and limited POS
tags to broadly segment text snippets into aspect-typed phrases. We extend this
model to simultaneously learn aspect-specific features and identify academic
domains in multi-domain corpora, since the two tasks mutually enhance each
other. In the second phase, we propose an approach based on adaptor grammars to
extract fine-grained concept mentions from the aspect-typed phrases without the
need for any external resources or human effort, in a purely data-driven
manner. We apply our technique to study literature from diverse scientific
domains and show significant gains over state-of-the-art concept extraction
techniques. We also present a qualitative analysis of the results obtained.
Comment: Published as a conference paper at CIKM 201
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Most successful information extraction systems operate with access to a large
collection of documents. In this work, we explore the task of acquiring and
incorporating external evidence to improve extraction accuracy in domains where
the amount of training data is scarce. This process entails issuing search
queries, extraction from new sources and reconciliation of extracted values,
which are repeated until sufficient evidence is collected. We approach the
problem using a reinforcement learning framework where our model learns to
select optimal actions based on contextual information. We employ a deep
Q-network, trained to optimize a reward function that reflects extraction
accuracy while penalizing extra effort. Our experiments on two databases -- of
shooting incidents, and food adulteration cases -- demonstrate that our system
significantly outperforms traditional extractors and a competitive
meta-classifier baseline.
Comment: Appearing in EMNLP 2016 (12 pages incl. supplementary material)
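The reward described in this abstract (extraction accuracy, minus a penalty for extra effort) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep Q-network for a plain tabular Q-learning update, and the names `step_reward`, `q_update`, `step_penalty`, `alpha`, and `gamma`, along with their default values, are hypothetical choices made for illustration only.

```python
def step_reward(prev_accuracy, new_accuracy, step_penalty=0.01):
    """Reward for one action: the change in extraction accuracy
    minus a fixed cost charged for every extra step (assumed form)."""
    return (new_accuracy - prev_accuracy) - step_penalty


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning update (a stand-in for the deep Q-network).

    q is a dict mapping (state, action) pairs to estimated values;
    unseen pairs are treated as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical usage: issuing one more query raised accuracy from 0.5 to 0.7.
actions = ["query", "stop"]
q_table = {}
r = step_reward(0.5, 0.7, step_penalty=0.05)
q_update(q_table, "s0", "query", r, "s1", actions)
```

With a per-step penalty, an action that yields no accuracy gain receives a negative reward, which is what discourages the agent from issuing queries indefinitely.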
Reading the Source Code of Social Ties
Though online social network research has exploded during the past years, not
much thought has been given to the exploration of the nature of social links.
Online interactions have been interpreted as indicative of one social process
or another (e.g., status exchange or trust), often with little systematic
justification regarding the relation between observed data and theoretical
concept. Our research aims to bridge this gap in computational social science
by proposing an unsupervised, parameter-free method to discover, with high
accuracy, the fundamental domains of interaction occurring in social networks.
By applying this method to two online datasets that differ in scope and type of
interaction (aNobii and Flickr), we observe the spontaneous emergence of three
domains of interaction representing the exchange of status, knowledge and
social support. By finding significant relations between the domains of
interaction and classic social network analysis issues (e.g., tie strength,
dyadic interaction over time) we show how the network of interactions induced
by the extracted domains can be used as a starting point for more nuanced
analysis of online social data that may one day incorporate the normative
grammar of social interaction. Our method finds applications in online social
media services ranging from recommendation to visual link summarization.
Comment: 10 pages, 8 figures, Proceedings of the 2014 ACM conference on Web
(WebSci'14)