The SENSEI Annotated Corpus: Human Summaries of Reader Comment Conversations in On-line News
Researchers are beginning to explore how to generate summaries of extended argumentative conversations in social media, such as those found in reader comments in on-line news. To date, however, there has been little discussion of what these summaries should be like and a lack of human-authored exemplars, quite likely because writing summaries of this kind of interchange is so difficult. In this paper we propose one type of reader comment summary – the conversation overview summary – that aims to capture the key argumentative content of a reader comment conversation. We describe a method we have developed to support humans in authoring conversation overview summaries and present a publicly available corpus – the first of its kind – of news articles plus comment sets, each multiply annotated, according to our method, with conversation overview summaries.
Automatic Label Generation for News Comment Clusters
We present a supervised approach to automatically labelling topic clusters of reader comments to online news. We use a feature set that includes both features capturing properties local to the cluster and features that capture aspects from the news article and from comments outside the cluster. We evaluate the approach in an automatic and a manual, task-based setting. Both evaluations show the approach to outperform a baseline method, which uses tf*idf to select comment-internal terms for use as topic labels. We illustrate how cluster labels can be used to generate cluster summaries and present two alternative summary formats: a pie chart summary and an abstractive summary.
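The tf*idf baseline mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the baseline tokenises text or computes weights, so everything below is an assumption made for illustration: the function name `tfidf_labels`, whitespace tokenisation, treating each cluster as one document, and the plain `tf * log(N / df)` weighting.

```python
import math
from collections import Counter

def tfidf_labels(clusters, top_k=3):
    """Return the top_k highest-scoring terms for each cluster.

    clusters: list of clusters, each a list of comment strings.
    Assumption: each cluster counts as one document when computing idf.
    """
    # Term counts per cluster (whitespace tokenisation, lower-cased).
    counts = [Counter(w for comment in cluster for w in comment.lower().split())
              for cluster in clusters]
    n = len(clusters)
    # Document frequency: number of clusters in which a term occurs.
    df = Counter(term for cnt in counts for term in cnt)
    labels = []
    for cnt in counts:
        scores = {t: tf * math.log(n / df[t]) for t, tf in cnt.items()}
        # Sort by descending score, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels.append(ranked[:top_k])
    return labels

# Hypothetical toy data, not taken from the corpus.
clusters = [
    ["the budget cuts hurt schools", "schools need the budget"],
    ["the trains are late again", "late trains every day"],
]
labels = tfidf_labels(clusters, top_k=2)
print(labels)
```

Terms that occur in every cluster (here "the") get an idf of zero and so never surface as labels, which is the property that makes tf*idf a natural baseline for this task.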
The SENSEI Overview of Newspaper Readers’ Comments
Automatic summarization of reader comments in on-line news is a challenging but clearly useful task. Work to date has produced extractive summaries using well-known techniques from other areas of NLP. But do users really want these, and do they support users in realistic tasks? We specify an alternative summary type for reader comments, based on the notions of issues and viewpoints, and demonstrate our user interface to present it. An evaluation to assess how well summarization systems support users in time-limited tasks (identifying issues and characterizing opinions) gives good results for this prototype.
Large Scale Semi-supervised Object Detection using Visual and Semantic Knowledge Transfer
Deep CNN-based object detection systems have achieved remarkable success on several large-scale object detection benchmarks. However, training such detectors requires a large number of labeled bounding boxes, which are more difficult to obtain than image-level annotations. Previous work addresses this issue by transforming image-level classifiers into object detectors. This is done by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. We improve this previous work by incorporating knowledge about object similarities from visual and semantic domains during the transfer process. The intuition behind our proposed method is that visually and semantically similar categories should exhibit more common transferable properties than dissimilar categories, e.g. a better detector would result from transferring the differences between a dog classifier and a dog detector onto the cat class than from transferring those of the violin class. Experimental results on the challenging ILSVRC2013 detection dataset demonstrate that each of our proposed object similarity based knowledge transfer methods outperforms the baseline methods. We found strong evidence that visual similarity and semantic relatedness are complementary for the task, and when combined notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting.
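The transfer idea in this abstract can be sketched in a few lines. This is an illustration only, not the authors' method: the abstract does not give the transfer rule, so the similarity-weighted average of classifier-to-detector differences used here, the function name `transfer_detector`, and the toy weight vectors are all assumptions.

```python
def transfer_detector(target_clf, source_clf, source_det, similarity):
    """Estimate detector weights for a category that has only a classifier.

    target_clf: classifier weight vector of the target category.
    source_clf, source_det: dicts mapping source category -> weight vector,
        for categories that have both a classifier and a detector.
    similarity: dict mapping source category -> non-negative similarity to
        the target (visual, semantic, or a combination; assumed given).
    """
    total = sum(similarity.values())
    if total <= 0:
        return list(target_clf)  # no similar sources, nothing to transfer
    estimate = list(target_clf)
    for name, sim in similarity.items():
        weight = sim / total  # normalise so the weights sum to 1
        for i, (d, c) in enumerate(zip(source_det[name], source_clf[name])):
            # Add the source's classifier-to-detector difference,
            # scaled by how similar the source is to the target.
            estimate[i] += weight * (d - c)
    return estimate

# Hypothetical two-dimensional weights, chosen only for illustration.
source_clf = {"dog": [1.0, 0.0], "violin": [0.0, 1.0]}
source_det = {"dog": [1.5, 0.25], "violin": [0.0, 3.0]}
cat_clf = [0.5, 0.5]
cat_det = transfer_detector(cat_clf, source_clf, source_det,
                            {"dog": 0.75, "violin": 0.25})
print(cat_det)
```

With these weights the dog category contributes three times as much as the violin category to the estimated cat detector, matching the abstract's intuition that similar categories should dominate the transfer.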
Visual and semantic knowledge transfer for large scale semi-supervised object detection
Deep CNN-based object detection systems have achieved remarkable success on several large-scale object detection benchmarks. However, training such detectors requires a large number of labeled bounding boxes, which are more difficult to obtain than image-level annotations. Previous work addresses this issue by transforming image-level classifiers into object detectors. This is done by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. We improve this previous work by incorporating knowledge about object similarities from visual and semantic domains during the transfer process. The intuition behind our proposed method is that visually and semantically similar categories should exhibit more common transferable properties than dissimilar categories, e.g. a better detector would result from transferring the differences between a dog classifier and a dog detector onto the cat class than from transferring those of the violin class. Experimental results on the challenging ILSVRC2013 detection dataset demonstrate that each of our proposed object similarity based knowledge transfer methods outperforms the baseline methods. We found strong evidence that visual similarity and semantic relatedness are complementary for the task, and when combined notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting.
Svestka's Research: Then and Now
Zdenek Svestka's research work influenced many fields of solar physics, especially in the area of flare research. In this article I take five of the areas that particularly interested him and assess them in a "then and now" style. His insights in each case were quite sound, although of course in the modern era we have learned things that he could not readily have envisioned. His own views about his research life have been published recently in this journal, to which he contributed so much, and his memoir contains much additional scientific and personal information (Svestka, 2010).
Comment: Invited review for "Solar and Stellar Flares," a conference in honour of Prof. Zdeněk Švestka, Prague, June 23-27, 2014. This is a contribution to a Topical Issue in Solar Physics, based on the presentations at this meeting (Editors Lyndsay Fletcher and Petr Heinzel).
Extracting bilingual terms from the Web
In this paper we make two contributions. First, we describe a multi-component system called BiTES (Bilingual Term Extraction System) designed to automatically gather domain-specific bilingual term pairs from Web data. BiTES components consist of data gathering tools, domain classifiers, monolingual text extraction systems and bilingual term aligners. BiTES is readily extendable to new language pairs and has been successfully used to gather bilingual terminology for 24 language pairs, including English and all official EU languages, save Irish. Second, we describe a novel set of methods for evaluating the main components of BiTES and present the results of our evaluation for six language pairs. Results show that the BiTES approach can be used to successfully harvest quality bilingual term pairs from the Web. Our evaluation method delivers significant insights about the strengths and weaknesses of our techniques. It can be straightforwardly reused to evaluate other bilingual term extraction systems and makes a novel contribution to the study of how to evaluate bilingual terminology extraction systems.