Hexatagging: Projective Dependency Parsing as Tagging
We introduce a novel dependency parser, the hexatagger, that constructs
dependency trees by tagging the words in a sentence with elements from a finite
set of possible tags. In contrast to many approaches to dependency parsing, our
approach is fully parallelizable at training time, i.e., the structure-building
actions needed to build a dependency parse can be predicted in parallel to each
other. Additionally, exact decoding is linear in time and space complexity.
Furthermore, we derive a probabilistic dependency parser that predicts hexatags
using no more than a linear model with features from a pretrained language
model, i.e., we forsake a bespoke architecture explicitly designed for the
task. Despite the generality and simplicity of our approach, we achieve
state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test
set. Additionally, our parser's linear time complexity and parallelism
significantly improve computational efficiency, with a roughly 10-times
speed-up over previous state-of-the-art models during decoding.
Comment: accepted at ACL 202
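As a reading aid for the scores quoted above, here is a minimal sketch of how UAS and LAS are conventionally computed: UAS is the percentage of tokens whose predicted head is correct, and LAS additionally requires the dependency label to match. The function name and the (head, label) representation are assumptions made for illustration and are not taken from the paper. Standard Penn Treebank evaluation usually also excludes punctuation, which this sketch does not do.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# This representation is an illustrative assumption.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    n = len(gold)
    # UAS: only the head index has to match.
    uas_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, pred) if g_head == p_head)
    # LAS: head index and label both have to match.
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * uas_hits / n, 100.0 * las_hits / n


gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "nmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.1f} LAS={las:.1f}")
```

In the toy example, three of four heads are correct and two of four (head, label) pairs are correct.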
Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification
Multi-label text classification (MLTC) is the task of assigning multiple
labels to a given text, and has a wide range of application domains. Most
existing approaches require an enormous amount of annotated data to learn a
classifier and/or a set of well-defined constraints on the label space
structure, such as hierarchical relations which may be complicated to provide
as the number of labels increases. In this paper, we study the MLTC problem in
annotation-free and scarce-annotation settings in which the magnitude of
available supervision signals is linear in the number of labels. Our method
follows three steps: (1) mapping input text into a set of preliminary label
likelihoods by natural language inference using a pre-trained language model,
(2) calculating a signed label dependency graph from label descriptions, and (3)
updating the preliminary label likelihoods with message passing along the label
dependency graph, driven with a collective loss function that injects the
information of expected label frequency and average multi-label cardinality of
predictions. Experiments show that the proposed framework performs well in
low-supervision settings, adding almost imperceptible computational and memory
overhead to the use of the pre-trained language model while improving on its
initial performance by 70% in terms of example-based F1 score.
Comment: Proc. Conf. Lifelong Learning Agents (CoLLAs), 202
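The example-based F1 score mentioned above is computed per example and then averaged, rather than per label. The following is a minimal sketch of that metric under stated assumptions: the function name is hypothetical, labels are represented as Python sets, and an example with no gold and no predicted labels is scored as 1.0, which is one common convention and not necessarily the one used in the paper.

```python
from typing import List, Set


def example_based_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Average, over examples, of the F1 between gold and predicted label sets."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    total = 0.0
    for g, p in zip(gold, pred):
        if not g and not p:
            # Assumption: an empty prediction for an empty gold set counts as perfect.
            total += 1.0
            continue
        # Per-example F1 = 2 * |gold AND pred| / (|gold| + |pred|).
        total += 2 * len(g & p) / (len(g) + len(p))
    return total / len(gold)


gold = [{"sports", "politics"}, {"health"}]
pred = [{"sports"}, {"health", "science"}]
print(round(example_based_f1(gold, pred), 4))
```

Each of the two toy examples has one correct label out of three labels in total across gold and prediction, so both contribute 2/3 to the average.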
Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023
We present Palm, a solution to the Long-Term Action Anticipation (LTA) task
utilizing vision-language and large language models. Given an input video with
annotated action periods, the LTA task aims to predict possible future actions.
We hypothesize that an optimal solution should capture the interdependency
between past and future actions, and be able to infer future actions based on
the structure and dependency encoded in the past actions. Large language models
have demonstrated remarkable commonsense-based reasoning ability. Inspired by
that, Palm chains an image captioning model and a large language model. It
predicts future actions based on frame descriptions and action labels extracted
from the input videos. Our method outperforms other participants in the EGO4D
LTA challenge and achieves the best performance in terms of action prediction.
Our code is available at https://github.com/DanDoge/Pal
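The chaining idea described above (caption frames, then prompt a language model with the captions and past action labels) can be sketched as follows. This is only an illustration under stated assumptions, not the Palm implementation: `build_prompt`, `anticipate_actions`, the prompt wording, and the comma-separated output format are all hypothetical, and the captioner and language model are passed in as plain callables so that no real model API is assumed.

```python
from typing import Callable, List, Sequence


def build_prompt(captions: Sequence[str], past_actions: Sequence[str], num_future: int) -> str:
    """Assemble a text prompt from frame descriptions and past action labels."""
    lines = ["Frame descriptions:"]
    lines += [f"- {c}" for c in captions]
    lines.append("Past actions: " + ", ".join(past_actions))
    lines.append(f"Predict the next {num_future} actions as a comma-separated list:")
    return "\n".join(lines)


def anticipate_actions(
    frames: Sequence[object],
    past_actions: Sequence[str],
    captioner: Callable[[object], str],  # hypothetical image captioning model
    llm: Callable[[str], str],           # hypothetical text-completion model
    num_future: int = 5,
) -> List[str]:
    """Caption each frame, prompt the language model, and parse its answer."""
    captions = [captioner(frame) for frame in frames]
    completion = llm(build_prompt(captions, past_actions, num_future))
    # Assumption: the model answers with a comma-separated list of actions.
    actions = [a.strip() for a in completion.split(",") if a.strip()]
    return actions[:num_future]


# Stub models stand in for real ones so the sketch runs on its own.
stub_captioner = lambda frame: f"a person near {frame}"
stub_llm = lambda prompt: "take knife, cut onion, put knife"
print(anticipate_actions(["a counter", "a sink"], ["wash onion"], stub_captioner, stub_llm, 2))
```

Passing the models as callables keeps the sketch independent of any particular captioning or language-model library.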
What Syntactic Structures block Dependencies in RNN Language Models?
Recurrent Neural Networks (RNNs) trained on a language modeling task have
been shown to acquire a number of non-local grammatical dependencies with some
success. Here, we provide new evidence that RNN language models are sensitive
to hierarchical syntactic structure by investigating the filler--gap dependency
and constraints on it, known as syntactic islands. Previous work is
inconclusive about whether RNNs learn to attenuate their expectations for gaps
in island constructions in particular or in any sufficiently complex syntactic
environment. This paper gives new evidence for the former by providing control
studies that have been lacking so far. We demonstrate that two state-of-the-art
RNN models are able to maintain the filler--gap dependency through
unbounded sentential embeddings and are also sensitive to the hierarchical
relationship between the filler and the gap. Next, we demonstrate that the
models are able to maintain possessive pronoun gender expectations through
island constructions---this control case rules out the possibility that island
constructions block all information flow in these networks. We also evaluate
three untested island constraints: coordination islands, left branch islands,
and sentential subject islands. Models are able to learn left branch islands
and learn coordination islands gradiently, but fail to learn sentential subject
islands. Through these controls and new tests, we provide evidence that model
behavior is due to finer-grained expectations than gross syntactic complexity,
but also that the models are conspicuously un-humanlike in some of their
performance characteristics.
Comment: To Appear at the 41st Annual Meeting of the Cognitive Science
Society, Montreal, Canada, July 201