167 research outputs found
A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability
This paper presents the first comprehensive analysis of ChatGPT's Text-to-SQL
ability. Given the recent emergence of the large-scale conversational language
model ChatGPT and its impressive performance in both conversation and code
generation, we sought to evaluate its Text-to-SQL capability. We conducted
experiments on 12 benchmark datasets spanning different languages, settings,
and scenarios, and the results demonstrate that ChatGPT has strong Text-to-SQL
ability. Although a gap remains to the current state-of-the-art (SOTA) models,
ChatGPT's performance is still impressive given that the experiments were
conducted in a zero-shot setting.
Notably, in the ADVETA (RPL) scenario, zero-shot ChatGPT even outperforms
the SOTA model that requires fine-tuning on the Spider dataset by 4.1%,
demonstrating its potential for practical applications. To support
further research in related fields, we have made the data generated by ChatGPT
publicly available at https://github.com/THU-BPM/chatgpt-sql.
Comment: 6 pages, 1 figure
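The zero-shot setting described above amounts to a plain prompt-construction step: the database schema and the natural-language question are rendered into a single instruction with no in-context examples. The template below is an illustrative sketch, not the authors' exact prompt.

```python
# Sketch: building a zero-shot Text-to-SQL prompt. The wording of the
# template is an assumption for illustration, not the paper's prompt.

def build_zero_shot_prompt(schema: dict, question: str) -> str:
    """Render a database schema and a question into a single prompt
    that asks the model to emit only a SQL query."""
    tables = "\n".join(
        f"Table {name}({', '.join(cols)})" for name, cols in schema.items()
    )
    return (
        "Given the database schema below, write a SQL query that answers "
        "the question. Reply with SQL only.\n\n"
        f"{tables}\n\nQuestion: {question}\nSQL:"
    )

prompt = build_zero_shot_prompt(
    {"singer": ["singer_id", "name", "country"]},
    "How many singers are from France?",
)
```

The resulting string would then be sent to the model; the model's reply is taken verbatim as the predicted SQL.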
A Semantic Invariant Robust Watermark for Large Language Models
Watermark algorithms for large language models (LLMs) have achieved extremely
high accuracy in detecting text generated by LLMs. Such algorithms typically
involve adding extra watermark logits to the LLM's logits at each generation
step. However, prior algorithms face a trade-off between attack robustness and
security robustness. This is because the watermark logits for a token are
determined by a certain number of preceding tokens; a small number leads to low
security robustness, while a large number results in insufficient attack
robustness. In this work, we propose a semantic invariant watermarking method
for LLMs that provides both attack robustness and security robustness. The
watermark logits in our work are determined by the semantics of all preceding
tokens. Specifically, we utilize another embedding LLM to generate semantic
embeddings for all preceding tokens, and then these semantic embeddings are
transformed into the watermark logits through our trained watermark model.
Subsequent analyses and experiments demonstrate the attack robustness of our
method in semantically invariant settings: synonym substitution and text
paraphrasing. Finally, we also show that our watermark possesses
adequate security robustness. Our code and data are available at
https://github.com/THU-BPM/Robust_Watermark.
Comment: 16 pages, 9 figures, 2 tables
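The generation-time mechanism described above can be sketched as follows: at each step, watermark logits derived from the entire preceding context are added to the model's logits before sampling. The embedding model and watermark model below are toy stand-ins (a hash-based pseudo-embedding and a fixed linear map), not the trained components the paper uses.

```python
# Toy sketch of semantics-conditioned watermarking. The key property
# mirrored from the paper: the watermark logits depend on ALL preceding
# tokens, not just the last few. Everything else is a placeholder.
import hashlib

VOCAB_SIZE = 8
EMB_DIM = 4

def semantic_embedding(prefix: list) -> list:
    # Placeholder for the embedding LLM: a deterministic pseudo-embedding
    # of the whole prefix (token ids assumed to fit in a byte here).
    digest = hashlib.sha256(bytes(prefix)).digest()
    return [b / 255.0 for b in digest[:EMB_DIM]]

def watermark_logits(emb: list) -> list:
    # Placeholder for the trained watermark model: a fixed linear map from
    # the embedding to a per-token bias vector.
    return [sum(emb[d] * ((tok + d) % 3 - 1) for d in range(EMB_DIM))
            for tok in range(VOCAB_SIZE)]

def watermarked_step(model_logits: list, prefix: list,
                     strength: float = 2.0) -> list:
    # Add scaled watermark logits to the model's logits for this step.
    wm = watermark_logits(semantic_embedding(prefix))
    return [l + strength * w for l, w in zip(model_logits, wm)]
```

Because the real watermark model is trained so that paraphrases of a prefix map to similar embeddings, the injected bias survives semantics-preserving edits; this toy hash-based embedding does not have that property and only illustrates the data flow.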
Scene Graph Modification as Incremental Structure Expanding
A scene graph is a semantic representation that expresses the objects,
attributes, and relationships between objects in a scene. Scene graphs play an
important role in many cross-modality tasks, as they are able to capture the
interactions between images and texts. In this paper, we focus on scene graph
modification (SGM), where the system is required to learn how to update an
existing scene graph based on a natural language query. Unlike previous
approaches that rebuild the entire scene graph, we frame SGM as a graph
expansion task by introducing incremental structure expanding (ISE). ISE
constructs the target graph by incrementally expanding the source graph without
changing the unmodified structure. Based on ISE, we further propose a model
that iterates between node prediction and edge prediction, progressively
inferring more accurate and harmonious expansion decisions. In addition, we
construct a challenging dataset that contains more complicated queries and
larger scene graphs than existing datasets. Experiments on four benchmarks
demonstrate the effectiveness of our approach, which surpasses the previous
state-of-the-art model by large margins.
Comment: In COLING 2022 as a long paper. Code and data available at
https://github.com/THU-BPM/SG
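The expansion idea above can be made concrete with a minimal data structure: the target graph is built by adding nodes and edges to the source graph, never deleting or rewriting what is already there. The alternating node/edge prediction loop is abstracted here into an explicit list of expansion steps; the query-to-steps mapping is what the paper's model learns.

```python
# Minimal sketch of incremental structure expansion (ISE) as the abstract
# frames it: expansion only, existing structure untouched.

class SceneGraph:
    def __init__(self):
        self.nodes = set()
        self.edges = set()  # (subject, relation, object) triples

    def add_node(self, node: str) -> None:
        self.nodes.add(node)

    def add_edge(self, subj: str, rel: str, obj: str) -> None:
        # Edges may only connect nodes that already exist.
        assert subj in self.nodes and obj in self.nodes
        self.edges.add((subj, rel, obj))

def expand(graph: SceneGraph, steps) -> SceneGraph:
    # Apply predicted expansion steps in order; nothing is removed.
    for kind, *args in steps:
        if kind == "node":
            graph.add_node(args[0])
        else:
            graph.add_edge(*args)
    return graph

# Source graph for "a boy holding a kite".
g = SceneGraph()
g.add_node("boy")
g.add_node("kite")
g.add_edge("boy", "holding", "kite")

# Hypothetical steps decoded from the query "the kite is red".
expand(g, [("node", "red"), ("edge", "kite", "attribute", "red")])
```

After expansion the original triple is still present, which is exactly the invariant ISE enforces.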
Prompt Me Up: Unleashing the Power of Alignments for Multimodal Entity and Relation Extraction
How can we better extract entities and relations from text? Multimodal
extraction over images and text provides additional signals for entities and
relations, which can be aligned through graphs or hierarchical fusion to aid
extraction. Despite attempts at various fusion strategies, previous works have
overlooked the many unlabeled image-caption pairs available, such as
NewsCLIPing. This paper proposes
innovative pre-training objectives for entity-object and relation-image
alignment, extracting objects from images and aligning them with entity and
relation prompts for soft pseudo-labels. These labels are used as
self-supervised signals for pre-training, enhancing the ability to extract
entities and relations. Experiments on three datasets show an average 3.41% F1
improvement over the prior SOTA. Additionally, our method is orthogonal to
previous multimodal fusions, and applying it on top of prior SOTA fusions
further improves F1 by 5.47%.
Comment: Accepted to ACM Multimedia 202
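The soft pseudo-label idea above can be sketched as a similarity-then-softmax step: features of an object extracted from an image are compared against features of entity (or relation) prompts, and the normalized similarities serve as a soft self-supervised target. The feature vectors below are toy placeholders, not outputs of the paper's encoders, and the temperature value is an assumption.

```python
# Sketch: turning object/prompt similarities into soft pseudo-labels.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def soft_pseudo_labels(object_feat, entity_prompt_feats, temperature=0.1):
    # Temperature-scaled softmax over cosine similarities, computed with
    # the usual max-subtraction trick for numerical stability.
    sims = [cosine(object_feat, p) / temperature for p in entity_prompt_feats]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    z = sum(exps)
    return [e / z for e in exps]

labels = soft_pseudo_labels([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

A pre-training loss (e.g. cross-entropy against these soft labels) would then push the extractor's representations toward the alignment, without requiring any human annotation of the image-caption pairs.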
Exploring the Compositional Generalization in Context Dependent Text-to-SQL Parsing
In the context-dependent Text-to-SQL task, the generated SQL statements are
refined iteratively based on the user input utterance from each interaction.
The input text from each interaction can be viewed as a component modification
to the previous SQL statement, which can be extracted as a modification
pattern. Since these modification patterns can also be combined with other SQL
statements, models are expected to generalize compositionally to such novel
combinations. This work is the first exploration
of compositional generalization in context-dependent Text-to-SQL scenarios. To
facilitate related studies, we constructed two challenging benchmarks named
CoSQL-CG and SParC-CG by recombining the modification
patterns with existing SQL statements. Experiments show that all
current models struggle on our proposed benchmarks. Furthermore, we found that
better aligning the previous SQL statements with the input utterance could give
models better compositional generalization ability. Based on these
observations, we propose a method named p-align to improve the
compositional generalization of Text-to-SQL models. Further experiments
validate the effectiveness of our method. Source code and data are available.
Comment: Accepted to ACL 2023 (Findings), Long Paper, 11 pages
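The benchmark-construction idea above, recombining a modification pattern from one interaction with a different previous SQL statement, can be sketched at the string level. Real patterns in the constructed benchmarks operate on parsed SQL and are far richer; the two pattern names here are hypothetical.

```python
# Toy sketch: a "modification pattern" applied to a previous SQL statement
# to form a novel (pattern, statement) combination.

def apply_modification(prev_sql: str, pattern: str, argument: str) -> str:
    if pattern == "add_where":
        return f"{prev_sql} WHERE {argument}"
    if pattern == "add_order_by":
        return f"{prev_sql} ORDER BY {argument}"
    raise ValueError(f"unknown pattern: {pattern}")

# A pattern extracted from one dialogue, recombined with another query.
recombined = apply_modification(
    "SELECT name FROM singer", "add_where", "country = 'France'"
)
```

A model with compositional generalization should handle `recombined` even if this exact pattern/statement pairing never appeared in training, which is precisely what the recombined benchmarks test.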
RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction
How to identify semantic relations among entities in a document when only a
few labeled documents are available? Few-shot document-level relation
extraction (FSDLRE) is crucial for addressing the pervasive data scarcity
problem in real-world scenarios. Metric-based meta-learning is an effective
framework widely adopted for FSDLRE, which constructs class prototypes for
classification. However, existing works often struggle to obtain class
prototypes with accurate relational semantics: 1) To build a prototype for a
target relation type, they aggregate the representations of all entity pairs
holding that relation, yet these entity pairs may also hold other relations,
thus disturbing the prototype. 2) They use a set of generic NOTA
(none-of-the-above) prototypes across all tasks, neglecting that NOTA
semantics differ across tasks with different target relation types. In this paper,
we propose a relation-aware prototype learning method for FSDLRE to strengthen
the relational semantics of prototype representations. By judiciously
leveraging the relation descriptions and realistic NOTA instances as guidance,
our method effectively refines the relation prototypes and generates
task-specific NOTA prototypes. Extensive experiments demonstrate that our
method outperforms state-of-the-art approaches by an average of 2.61% across
various settings of two FSDLRE benchmarks.
Comment: Accepted to EMNLP 202
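The metric-based prototype framework the abstract builds on can be sketched simply: each class prototype is the mean of its support-instance embeddings, and queries are classified by the nearest prototype. The "relation-aware" refinement below, mixing in an embedding of the relation description with a fixed weight, is a deliberately simplified stand-in for the paper's mechanism; the mixing weight `alpha` is an assumption.

```python
# Sketch: prototype construction and nearest-prototype classification.
# Embeddings are plain float lists for self-containment.

def mean_vec(vecs):
    # Element-wise mean of a list of equal-length vectors.
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def build_prototype(support_embs, relation_desc_emb, alpha=0.5):
    # Average the support embeddings, then pull the prototype toward an
    # embedding of the relation description (simplified refinement).
    proto = mean_vec(support_embs)
    return [(1 - alpha) * p + alpha * d
            for p, d in zip(proto, relation_desc_emb)]

def classify(query_emb, prototypes):
    # Assign the query to the relation whose prototype is nearest
    # in squared Euclidean distance.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda r: dist(query_emb, prototypes[r]))

prototypes = {
    "born_in": build_prototype([[1.0, 0.0], [0.8, 0.2]], [1.0, 0.0]),
    "NOTA": [0.0, 1.0],  # in the paper, NOTA prototypes are task-specific
}
predicted = classify([0.9, 0.1], prototypes)
```

In the paper's setting the NOTA prototype would also be generated per task from realistic NOTA instances rather than fixed as it is here.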
- …