Implicit Discourse Relation Classification via Multi-Task Neural Networks
Without discourse connectives, classifying implicit discourse relations is a
challenging task and a bottleneck for building a practical discourse parser.
Previous research usually makes use of a single discourse framework, such as
PDTB or RST, to improve classification performance on discourse relations. In
fact, multiple corpora annotated under different discourse frameworks exist and
are internally connected. To exploit the combination of
different discourse corpora, we design related discourse classification tasks
specific to a corpus, and propose a novel Convolutional Neural Network embedded
multi-task learning system to synthesize these tasks by learning both unique
and shared representations for each task. The experimental results on the PDTB
implicit discourse relation classification task demonstrate that our model
achieves significant gains over baseline systems.
Comment: This is the pre-print version of a paper accepted by AAAI-1
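As a concrete illustration of the shared-plus-unique representation idea, the following Python (PyTorch) sketch gives each task its own convolutional encoder alongside one encoder shared by all tasks, and concatenates the two before a per-task classifier. The layer sizes, kernel width, and two-task setup are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MultiTaskCNN(nn.Module):
    """Shared + task-specific convolutional encoders with one classifier per task."""
    def __init__(self, vocab_size, emb_dim=100, n_filters=64, task_classes=(4, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One encoder shared by every task, plus one private encoder per task.
        self.shared = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.private = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1) for _ in task_classes]
        )
        self.heads = nn.ModuleList([nn.Linear(2 * n_filters, c) for c in task_classes])

    def forward(self, token_ids, task_id):
        x = self.embed(token_ids).transpose(1, 2)                 # (batch, emb, len)
        shared = torch.relu(self.shared(x)).max(dim=2).values     # shared representation
        unique = torch.relu(self.private[task_id](x)).max(dim=2).values  # task-specific
        return self.heads[task_id](torch.cat([shared, unique], dim=1))
```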
Chinese Function Tag Labeling
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Table-to-text Generation by Structure-aware Seq2seq Learning
Table-to-text generation aims to generate a description for a factual table
which can be viewed as a set of field-value records. To encode both the content
and the structure of a table, we propose a novel structure-aware seq2seq
architecture which consists of field-gating encoder and description generator
with dual attention. In the encoding phase, we update the cell memory of the
LSTM unit by a field gate and its corresponding field value in order to
incorporate field information into the table representation. In the decoding
phase, a dual attention mechanism, which contains word-level and field-level
attention, is proposed to model the semantic relevance between the generated
description and the table. We conduct experiments on the WIKIBIO
dataset which contains over 700k biographies and corresponding infoboxes from
Wikipedia. The attention visualizations and case studies show that our model is
capable of generating coherent and informative descriptions based on the
comprehensive understanding of both the content and the structure of a table.
Automatic evaluations also show that our model outperforms the baselines by a
large margin. Code for this work is available at
https://github.com/tyliupku/wiki2bio.
Comment: Accepted by AAAI201
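One plausible reading of the field-gating update is an LSTM cell extended with an extra gate that writes the field embedding into the cell memory. The Python sketch below follows that reading; the gate names, dimensions, and the exact way the field candidate enters the memory are assumptions rather than the paper's precise equations.

```python
import torch
import torch.nn as nn

class FieldGatedLSTMCell(nn.Module):
    """LSTM cell with an extra field gate that writes field information into the
    cell memory -- a rough sketch, not the paper's exact formulation."""
    def __init__(self, input_dim, field_dim, hidden_dim):
        super().__init__()
        self.gates = nn.Linear(input_dim + hidden_dim, 4 * hidden_dim)
        self.field_gates = nn.Linear(field_dim, 2 * hidden_dim)

    def forward(self, word_emb, field_emb, state):
        h_prev, c_prev = state
        i, f, o, g = self.gates(torch.cat([word_emb, h_prev], dim=1)).chunk(4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        # Field gate l and field candidate z, computed from the field embedding,
        # let field information flow directly into the cell memory.
        l, z = self.field_gates(field_emb).chunk(2, dim=1)
        c = f * c_prev + i * g + torch.sigmoid(l) * torch.tanh(z)
        h = o * torch.tanh(c)
        return h, c
```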
Order-Planning Neural Text Generation From Structured Data
Generating texts from structured data (e.g., a table) is important for
various natural language processing tasks such as question answering and dialog
systems. In recent studies, researchers use neural language models and
encoder-decoder frameworks for table-to-text generation. However, these neural
network-based approaches do not model the order of contents during text
generation. When a human writes a summary based on a given table, he or she
would probably consider the content order before wording. In a biography, for
example, a person's nationality is typically mentioned before their occupation.
In this paper, we propose an order-planning text generation
model to capture the relationship between different fields and use such
relationship to make the generated text more fluent and smooth. We conduct
experiments on the WikiBio dataset and achieve significantly higher performance
than previous methods in terms of BLEU, ROUGE, and NIST scores.
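A simple way to capture "which field tends to follow which" is a learned field-to-field link matrix that redistributes the previous step's attention over table records. The Python snippet below sketches that idea; the function name, tensor shapes, and the way the link scores are combined are illustrative and not the paper's exact formulation.

```python
import torch

def link_attention(prev_attn, field_ids, link):
    """prev_attn: (batch, n_records) attention over table records at the previous step.
    field_ids:   (batch, n_records) field index of each record.
    link:        (num_fields, num_fields) learned field-to-field transition matrix.
    Returns unnormalized scores for attending to each record at the current step."""
    # pairwise[b, i, j] = link[field of record i, field of record j]
    pairwise = link[field_ids.unsqueeze(2), field_ids.unsqueeze(1)]   # (batch, n, n)
    # Weight each transition by where the decoder attended at the previous step.
    return torch.einsum('bi,bij->bj', prev_attn, pairwise)
```

In a full decoder, such link-based scores would typically be interpolated with ordinary content-based attention before normalization.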
Guiding AMR Parsing with Reverse Graph Linearization
Abstract Meaning Representation (AMR) parsing aims to extract an abstract
semantic graph from a given sentence. The sequence-to-sequence approaches,
which linearize the semantic graph into a sequence of nodes and edges and
generate the linearized graph directly, have achieved good performance.
However, we observed that these approaches suffer from structure loss
accumulation during the decoding process, leading to a much lower F1-score for
nodes and edges decoded later compared to those decoded earlier. To address
this issue, we propose a novel Reverse Graph Linearization (RGL) enhanced
framework. RGL defines both default and reverse linearization orders of an AMR
graph, where most structures at the back part of the default order appear at
the front part of the reversed order and vice versa. RGL incorporates the
reversed linearization to the original AMR parser through a two-pass
self-distillation mechanism, which guides the model when generating the default
linearizations. Our analysis shows that our proposed method significantly
mitigates the problem of structure loss accumulation, outperforming the
previously best AMR parsing model by 0.8 and 0.5 Smatch scores on the AMR 2.0
and AMR 3.0 datasets, respectively. The code is available at
https://github.com/pkunlp-icler/AMR_reverse_graph_linearization.
Comment: Findings of EMNLP202
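To make the contrast between a default and a reversed linearization concrete, here is a toy depth-first linearizer over a small AMR-like graph. Visiting children in reverse order is only a stand-in for the paper's reverse linearization, and the graph encoding and token format are invented for illustration.

```python
def linearize(graph, root, reverse=False):
    """Depth-first linearization of a small AMR-like graph into a token sequence.
    With reverse=True, children are visited in the opposite order, so structures
    that appear late in the default order tend to appear early. Toy format only."""
    tokens, seen = [], set()

    def visit(node):
        seen.add(node)
        tokens.append(node)
        edges = graph.get(node, [])
        for label, child in (reversed(edges) if reverse else edges):
            tokens.append(label)
            if child in seen:
                tokens.append(child)   # re-entrant node: emit a reference only
            else:
                visit(child)

    visit(root)
    return tokens

# "The boy wants to go."
amr = {"want-01": [(":ARG0", "boy"), (":ARG1", "go-02")],
       "go-02":   [(":ARG0", "boy")]}
print(linearize(amr, "want-01"))                # default order
print(linearize(amr, "want-01", reverse=True))  # reversed order
```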
Statistical Knowledge Assessment for Large Language Models
Given varying prompts regarding a factoid question, can a large language
model (LLM) reliably generate factually correct answers? Existing LLMs may
generate distinct responses for different prompts. In this paper, we study the
problem of quantifying knowledge contained in an LLM regarding a given set of
facts. We propose KaRR, a statistical approach to assess factual knowledge for
LLMs. The main idea is to estimate the ratio of the LLM generating text
corresponding to the answer entity given diverse prompts of the subject and the
querying relation, versus generating it by random chance. Our assessment suite
contains a comprehensive set of 994,123 entities and 600 relations, with
1,395,905 text aliases. We use our method to evaluate 20 LLMs of various sizes,
including LLaMA, Alpaca, OPT, etc. Experiments show that our results have a
strong correlation (0.43 Kendall's τ) with the results of human assessment
on LLMs. Our results reveal that the knowledge in LLMs with the same backbone
architecture adheres to the scaling law, while tuning on instruction-following
data sometimes compromises the model's capability to generate factually correct
text reliably.
Comment: Accepted by NeurIPS 202
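In spirit, the assessment compares how likely the model is to produce the answer when prompted about the subject and relation against how likely it is to produce the same text by chance. The Python sketch below illustrates that ratio with an assumed log_prob(prompt, answer) helper; it is a simplification, not the paper's actual KaRR estimator.

```python
import math

def knowledge_ratio(log_prob, prompts, answer):
    """Toy sketch of the ratio idea: how likely the model is to produce the answer
    given informative prompts, versus producing it with no informative prompt.
    `log_prob(prompt, answer)` is an assumed helper returning the model's
    log-probability of `answer` given `prompt`; not the paper's estimator."""
    # Average probability of the answer across diverse prompt variants.
    p_prompted = sum(math.exp(log_prob(p, answer)) for p in prompts) / len(prompts)
    # Baseline: probability of the same answer given an empty prompt.
    p_baseline = math.exp(log_prob("", answer))
    return p_prompted / max(p_baseline, 1e-12)

# A ratio well above 1 suggests the model reliably encodes the fact.
```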