5 research outputs found

    Differentiable Instruction Optimization for Cross-Task Generalization

    Instruction tuning has attracted much attention as a way to achieve generalization across a wide variety of tasks. Although various types of instructions have been manually created for instruction tuning, it is still unclear what kind of instruction is optimal for obtaining cross-task generalization ability. This work presents instruction optimization, which optimizes training instructions with respect to generalization ability. Rather than manually tuning instructions, we introduce learnable instructions and optimize them with gradient descent by leveraging bilevel optimization. Experimental results show that the learned instructions enhance the diversity of instructions and improve generalization ability compared to using only manually created instructions. Comment: 14 pages, 6 figures, accepted to Findings of ACL 2023
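
    The bilevel scheme described in the abstract can be pictured with a small sketch: a learnable instruction embedding is updated by differentiating a held-out (meta) loss through an inner gradient step on the training tasks. The toy linear model, data, and hyperparameters below are illustrative assumptions, not the authors' implementation.

    import torch

    torch.manual_seed(0)
    dim, n_train, n_meta = 8, 32, 32

    # Toy data standing in for instruction-conditioned training tasks
    # and held-out (meta) tasks.
    x_train, y_train = torch.randn(n_train, dim), torch.randn(n_train, 1)
    x_meta, y_meta = torch.randn(n_meta, dim), torch.randn(n_meta, 1)

    # Hypothetical stand-in for the paper's learnable instructions.
    instruction = torch.zeros(1, dim, requires_grad=True)
    meta_opt = torch.optim.Adam([instruction], lr=1e-2)
    inner_lr = 0.1

    def model(w, x, instr):
        # The model consumes the input together with the instruction.
        return (x + instr) @ w

    for step in range(100):
        w = torch.zeros(dim, 1, requires_grad=True)  # fresh inner parameters
        # Inner problem: fit the model on training tasks given the instruction.
        inner_loss = ((model(w, x_train, instruction) - y_train) ** 2).mean()
        (g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
        w_adapted = w - inner_lr * g  # differentiable update keeps the graph
        # Outer problem: the held-out loss is differentiated with respect to
        # the instruction through the inner update (the bilevel hypergradient).
        meta_loss = ((model(w_adapted, x_meta, instruction) - y_meta) ** 2).mean()
        meta_opt.zero_grad()
        meta_loss.backward()
        meta_opt.step()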

    SciReviewGen: A Large-scale Dataset for Automatic Literature Review Generation

    Automatic literature review generation is one of the most challenging tasks in natural language processing. Although large language models have been applied to literature review generation, the absence of large-scale datasets has been a stumbling block to progress. We release SciReviewGen, consisting of over 10,000 literature reviews and 690,000 papers cited in the reviews. Based on the dataset, we evaluate recent transformer-based summarization models on the literature review generation task, including Fusion-in-Decoder extended for literature review generation. Human evaluation results show that some machine-generated summaries are comparable to human-written reviews, while also revealing the challenges of automatic literature review generation, such as hallucinations and a lack of detailed information. Our dataset and code are available at https://github.com/tetsu9923/SciReviewGen. Comment: ACL Findings 2023 (to appear). arXiv admin note: text overlap with arXiv:1810.04020 by other authors
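
    As a rough illustration of the Fusion-in-Decoder setup evaluated on the dataset, the sketch below encodes each cited paper separately together with a query, then lets a single decoder attend over the concatenated encoder states. The base model, prompt format, and example texts are assumptions for illustration, not the released SciReviewGen code.

    import torch
    from transformers import T5ForConditionalGeneration, T5Tokenizer
    from transformers.modeling_outputs import BaseModelOutput

    tok = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    chapter_title = "Neural Abstractive Summarization"  # hypothetical query
    cited_papers = [  # hypothetical abstracts of papers cited in a review
        "We propose a sequence-to-sequence model for abstractive summarization ...",
        "Pointer-generator networks copy words from the source text ...",
    ]

    # Encode each (query, cited paper) pair independently.
    states = []
    for paper in cited_papers:
        enc = tok(f"title: {chapter_title} context: {paper}",
                  return_tensors="pt", truncation=True, max_length=256)
        states.append(model.encoder(**enc).last_hidden_state)

    # Fuse by concatenating along the sequence axis, so the decoder
    # attends over all cited papers at once while generating the review.
    fused = torch.cat(states, dim=1)
    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=fused),
        max_length=128)
    print(tok.decode(out[0], skip_special_tokens=True))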

    Unsupervised Abstractive Opinion Summarization by Generating Sentences with Tree-Structured Topic Guidance

    This paper presents a novel unsupervised abstractive summarization method for opinionated texts. While basic variational autoencoder-based models assume a unimodal Gaussian prior for the latent code of sentences, we replace it with a recursive Gaussian mixture, where each mixture component corresponds to the latent code of a topic sentence and is mixed by a tree-structured topic distribution. By decoding each Gaussian component, we generate sentences with tree-structured topic guidance, where the root sentence conveys generic content and the leaf sentences describe specific topics. Experimental results demonstrate that the generated topic sentences are appropriate as a summary of opinionated texts: they are more informative and cover more of the input content than those generated by the recent unsupervised summarization model (Bražinskas et al., 2020). Furthermore, we demonstrate that the variance of the latent Gaussians represents the granularity of sentences, analogous to Gaussian word embeddings (Vilnis and McCallum, 2015). Comment: accepted to TACL, pre-MIT Press publication version
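
    A minimal sketch of the prior described above: each node in a topic tree holds a Gaussian whose mean is drawn around its parent's and whose variance shrinks with depth, so codes near the root decode to generic sentences and leaf codes to specific ones. The branching factor, depth, and shrinkage rate are illustrative assumptions, not the paper's exact parameterization.

    import torch

    torch.manual_seed(0)
    latent_dim, branching, depth = 16, 3, 2

    def grow_tree(mean, var, level):
        # Recursively place child Gaussians around the parent mean,
        # halving the variance at each level of the topic tree.
        node = {"mean": mean, "var": var, "children": []}
        if level < depth:
            for _ in range(branching):
                child_mean = mean + var.sqrt() * torch.randn(latent_dim)
                node["children"].append(grow_tree(child_mean, var * 0.5, level + 1))
        return node

    root = grow_tree(torch.zeros(latent_dim), torch.ones(latent_dim), 0)

    def sample_codes(node, path="root"):
        # One latent code per node; decoding each code would yield a topic
        # sentence (generic near the root, specific at the leaves).
        z = node["mean"] + node["var"].sqrt() * torch.randn(latent_dim)
        print(f"{path}: var={node['var'][0].item():.2f}, |z|={z.norm().item():.2f}")
        for i, child in enumerate(node["children"]):
            sample_codes(child, f"{path}/{i}")

    sample_codes(root)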