Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps
Concept maps can be used to concisely represent important information and
bring structure into large document collections. Therefore, we study a variant
of multi-document summarization that produces summaries in the form of concept
maps. However, suitable evaluation datasets for this task are currently
missing. To close this gap, we present a newly created corpus of concept maps
that summarize heterogeneous collections of web documents on educational
topics. It was created using a novel crowdsourcing approach that allows us to
efficiently determine important elements in large document collections. We
release the corpus along with a baseline system and proposed evaluation
protocol to enable further research on this variant of summarization.
Comment: Published at EMNLP 201
Summarization from Medical Documents: A Survey
Objective:
The aim of this paper is to survey recent work in medical document
summarization.
Background:
During the last decade, document summarization has received increasing attention from
the AI research community. More recently, it has also attracted the interest of the
medical research community, due to the enormous growth of information
available to physicians and researchers in medicine through the
large and growing number of published journals, conference proceedings, medical
sites and portals on the World Wide Web, electronic medical records, etc.
Methodology:
This survey first gives a general background on document summarization,
presenting the factors that summarization depends upon, discussing evaluation
issues and describing briefly the various types of summarization techniques. It
then examines the characteristics of the medical domain through the different
types of medical documents. Finally, it presents and discusses the
summarization techniques used so far in the medical domain, referring to the
corresponding systems and their characteristics.
Discussion and conclusions:
The paper thoroughly discusses the promising paths for future research in
medical document summarization. It mainly focuses on the issue of scaling to
large collections of documents in various languages and from different media,
on personalization issues, on portability to new sub-domains, and on the
integration of summarization technology in practical applications.
Comment: 21 pages, 4 table
Words are Malleable: Computing Semantic Shifts in Political and Media Discourse
Recently, researchers have started to pay attention to the detection of temporal
shifts in the meaning of words. However, most (if not all) of these approaches
have restricted their efforts to uncovering change over time, thus neglecting other
valuable dimensions such as social or political variability. We propose an
approach for detecting semantic shifts between different viewpoints, where a
viewpoint is broadly defined as a set of texts that share a specific metadata
feature, which can be a time period but also a social entity such as a political
party. For each viewpoint, we learn a semantic space in which each word is
represented as a low-dimensional neural embedding vector. The challenge is to compare the meaning of
a word in one space to its meaning in another space and measure the size of the
semantic shifts. We compare the effectiveness of a measure based on optimal
transformations between the two spaces with a measure based on the similarity
of the neighbors of the word in the respective spaces. Our experiments
demonstrate that the combination of these two performs best. We show that
semantic shifts occur not only over time, but also across different viewpoints
within a short period of time. For evaluation, we demonstrate how this approach
captures meaningful semantic shifts and can help improve other tasks such as
contrastive viewpoint summarization and ideology detection (measured as
classification accuracy) in political texts. We also show that the two laws of
semantic change which were empirically shown to hold for temporal shifts also
hold for shifts across viewpoints. These laws state that frequent words are
less likely to shift meaning while words with many senses are more likely to do
so.
Comment: In Proceedings of the 26th ACM International on Conference on
Information and Knowledge Management (CIKM2017
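
The two kinds of measure described in the abstract above (one based on a
transformation between the two spaces, one based on the similarity of a word's
neighbors) can be illustrated with a minimal sketch. This is not the authors'
implementation. It assumes both spaces share one vocabulary with rows aligned
by word index, substitutes an orthogonal Procrustes map for the "optimal
transformation", uses Jaccard overlap of the top-k cosine neighbors as the
neighbor similarity, and combines the two by a plain average. All function
names are hypothetical.

```python
import numpy as np

def unit_rows(m):
    """Scale every row of m to unit length (guarding against zero rows)."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.clip(norms, 1e-12, None)

def procrustes_map(a, b):
    """Orthogonal matrix r minimizing ||a @ r - b||_F (assumed transformation)."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt

def transformation_shift(a, b, idx):
    """1 - cosine between word idx mapped from space a and its vector in b."""
    mapped = a[idx] @ procrustes_map(a, b)
    cos = mapped @ b[idx] / (np.linalg.norm(mapped) * np.linalg.norm(b[idx]))
    return 1.0 - float(cos)

def neighbors(m, idx, k):
    """Indices of the k nearest cosine neighbors of word idx within space m."""
    normed = unit_rows(m)
    sims = normed @ normed[idx]
    sims[idx] = -np.inf  # a word is not its own neighbor
    return set(np.argsort(-sims)[:k].tolist())

def neighbor_shift(a, b, idx, k=3):
    """1 - Jaccard overlap of the word's neighbor sets in the two spaces."""
    na, nb = neighbors(a, idx, k), neighbors(b, idx, k)
    return 1.0 - len(na & nb) / len(na | nb)

def combined_shift(a, b, idx, k=3):
    """Plain average of the two measures (an assumed way to combine them)."""
    return 0.5 * (transformation_shift(a, b, idx) + neighbor_shift(a, b, idx, k))
```

With a and b given as arrays of shape (vocabulary size, dimensions), a call
such as combined_shift(a, b, idx) returns a larger value the more the word at
row idx differs between the two viewpoints under these assumptions.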