Semi-Supervised Learning for Neural Keyphrase Generation
We study the problem of generating keyphrases that summarize the key points
for a given document. While sequence-to-sequence (seq2seq) models have achieved
remarkable performance on this task (Meng et al., 2017), model training often
relies on large amounts of labeled data, which is only applicable to
resource-rich domains. In this paper, we propose semi-supervised keyphrase
generation methods by leveraging both labeled data and large-scale unlabeled
samples for learning. Two strategies are proposed. First, unlabeled documents
are tagged with synthetic keyphrases obtained from unsupervised keyphrase
extraction methods or a self-learning algorithm, and then combined with labeled
samples for training. Furthermore, we investigate a multi-task learning
framework to jointly learn to generate keyphrases as well as the titles of the
articles. Experimental results show that our semi-supervised learning-based
methods outperform a state-of-the-art model trained with labeled data only.
Comment: To appear in EMNLP 2018 (12 pages, 7 figures, 6 tables)
Optical tomography: Image improvement using mixed projection of parallel and fan beam modes
Mixed parallel and fan beam projection is a technique used to increase image quality. This research focuses on enhancing the image quality in optical tomography. Image quality can be defined by measuring the Peak Signal to Noise Ratio (PSNR) and Normalized Mean Square Error (NMSE) parameters. The findings of this research prove that by combining parallel and fan beam projection, the image quality can be increased by more than 10% in terms of its PSNR value and more than 100% in terms of its NMSE value compared to a single parallel beam
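As a rough illustration of the two quality measures named in this abstract, the sketch below computes PSNR and NMSE for a pair of small grayscale images. It is not taken from the paper: the function names, the 8-bit peak value of 255, and the choice to normalize NMSE by the energy of the reference image are all assumptions made here, and other NMSE conventions exist.

```python
import math


def _pixel_pairs(reference, reconstructed):
    """Yield (reference, reconstructed) pixel pairs from two equally sized images."""
    for ref_row, rec_row in zip(reference, reconstructed):
        for r, x in zip(ref_row, rec_row):
            yield r, x


def mse(reference, reconstructed):
    """Mean squared error between two images given as lists of rows."""
    squared = [(r - x) ** 2 for r, x in _pixel_pairs(reference, reconstructed)]
    return sum(squared) / len(squared)


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB; `peak` = 255 assumes 8-bit pixel values (an assumption)."""
    err = mse(reference, reconstructed)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


def nmse(reference, reconstructed):
    """Squared error normalized by the reference energy (one common convention)."""
    num = sum((r - x) ** 2 for r, x in _pixel_pairs(reference, reconstructed))
    den = sum(r ** 2 for r, _ in _pixel_pairs(reference, reconstructed))
    if den == 0:
        raise ValueError("reference image has zero energy")
    return num / den


# Hypothetical 2x2 example images, for illustration only.
reference = [[0, 255], [255, 0]]
reconstructed = [[0, 250], [255, 5]]
print(psnr(reference, reconstructed), nmse(reference, reconstructed))
```

Under these definitions a better reconstruction gives a higher PSNR and a lower NMSE, so an improvement in NMSE corresponds to a decrease in its value.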
Video Timeline Modeling For News Story Understanding
In this paper, we present a novel problem, namely video timeline modeling.
Our objective is to create a video-associated timeline from a set of videos
related to a specific topic, thereby facilitating the content and structure
understanding of the story being told. This problem has significant potential
in various real-world applications, such as news story summarization. To
bootstrap research in this area, we curate a realistic benchmark dataset,
YouTube-News-Timeline, consisting of over k timelines and k YouTube
news videos. Additionally, we propose a set of quantitative metrics as the
protocol to comprehensively evaluate and compare methodologies. With such a
testbed, we further develop and benchmark exploratory deep learning approaches
to tackle this problem. We anticipate that this exploratory work will pave the
way for further research in video timeline modeling. The assets are available
via
https://github.com/google-research/google-research/tree/master/video_timeline_modeling.
Comment: Accepted as a spotlight by NeurIPS 2023, Track on Datasets and Benchmarks
Detecting (Un)Important Content for Single-Document News Summarization
We present a robust approach for detecting intrinsic sentence importance in
news, by training on two corpora of document-summary pairs. When used for
single-document summarization, our approach, combined with the "beginning of
document" heuristic, outperforms a state-of-the-art summarizer and the
beginning-of-article baseline in both automatic and manual evaluations. These
results represent an important advance because in the absence of cross-document
repetition, single document summarizers for news have not been able to
consistently outperform the strong beginning-of-article baseline.
Comment: Accepted By EACL 201