Optical Flow Requires Multiple Strategies (but only one network)
We show that the matching problem that underlies optical flow requires
multiple strategies, depending on the amount of image motion and other factors.
We then study the implications of this observation on training a deep neural
network for representing image patches in the context of descriptor based
optical flow. We propose a metric learning method, which selects suitable
negative samples based on the nature of the true match. This type of training
produces a network that displays multiple strategies depending on the input and
leads to state-of-the-art results on the KITTI 2012 and KITTI 2015 optical flow
benchmarks.
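A rough sketch of the kind of metric-learning objective described above, where the negative patch is drawn differently depending on the displacement of the true match. The sampling radii, threshold, and function names are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def sample_negative(anchor_xy, displacement, coords, small_motion_px=8.0):
    # Pick the index of a negative patch whose location depends on how far the
    # true match moved: near-miss negatives for small motions, negatives from a
    # wider neighborhood for large motions (illustrative radii).
    radius = 2.0 if displacement < small_motion_px else 16.0
    dists = torch.norm(coords - anchor_xy, dim=1)
    candidates = torch.nonzero((dists > 1.0) & (dists <= radius)).squeeze(1)
    if candidates.numel() == 0:  # fall back to any patch if the ring is empty
        candidates = torch.arange(coords.shape[0])
    return candidates[torch.randint(candidates.numel(), (1,))].item()

def descriptor_loss(net, anchor, positive, negative, margin=0.2):
    # Standard triplet objective: the descriptor of the true match must be closer
    # to the anchor than the selected negative by at least `margin`.
    return F.triplet_margin_loss(net(anchor), net(positive), net(negative), margin=margin)
```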
Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing
We introduce a novel method for multilingual transfer that utilizes deep
contextual embeddings, pretrained in an unsupervised fashion. While contextual
embeddings have been shown to yield richer representations of meaning compared
to their static counterparts, aligning them poses a challenge due to their
dynamic nature. To this end, we construct context-independent variants of the
original monolingual spaces and utilize their mapping to derive an alignment
for the context-dependent spaces. This mapping readily supports processing of a
target language, improving transfer by context-aware embeddings. Our
experimental results demonstrate the effectiveness of this approach for
zero-shot and few-shot learning of dependency parsing. Specifically, our method
consistently outperforms the previous state-of-the-art on 6 tested languages,
yielding an improvement of 6.8 LAS points on average.
Comment: NAACL 201
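A minimal sketch of the anchor-based alignment idea described above, under assumed data structures: average each word type's contextual vectors to obtain a context-independent space, solve orthogonal Procrustes between the two anchor spaces using a bilingual dictionary, and reuse that map for the context-dependent vectors.

```python
import numpy as np

def build_anchors(contextual_vecs_by_word):
    # contextual_vecs_by_word: dict mapping word type -> (n_occurrences, dim) array.
    # Averaging collapses the dynamic contextual space into a static anchor space.
    return {w: v.mean(axis=0) for w, v in contextual_vecs_by_word.items()}

def procrustes_map(src_anchors, tgt_anchors, dictionary):
    # Solve min_W ||W x_s - y_t|| over orthogonal W using word-translation pairs
    # (s, t) from a bilingual dictionary.
    X = np.stack([src_anchors[s] for s, t in dictionary])
    Y = np.stack([tgt_anchors[t] for s, t in dictionary])
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt  # orthogonal map from source anchor space into target space

def align_contextual(W, contextual_vec):
    # The anchor-derived map is applied unchanged to context-dependent embeddings.
    return W @ contextual_vec
```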
When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
Understanding longer narratives or participating in conversations requires
tracking of discourse entities that have been mentioned. Indefinite noun
phrases (NPs), such as 'a dog', frequently introduce discourse entities but
this behavior is modulated by sentential operators such as negation. For
example, 'a dog' in 'Arthur doesn't own a dog' does not introduce a discourse
entity due to the presence of negation. In this work, we adapt the
psycholinguistic assessment of language models paradigm to higher-level
linguistic phenomena and introduce an English evaluation suite that targets the
knowledge of the interactions between sentential operators and indefinite NPs.
We use this evaluation suite for a fine-grained investigation of the entity
tracking abilities of the Transformer-based models GPT-2 and GPT-3. We find
that while the models are to a certain extent sensitive to the interactions we
investigate, they are all challenged by the presence of multiple NPs and their
behavior is not systematic, which suggests that even models at the scale of
GPT-3 do not fully acquire basic entity tracking abilities.
Comment: To appear at NAACL 202
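A minimal sketch of the psycholinguistic-assessment pattern the suite builds on, using GPT-2 through Hugging Face transformers; the item wording below is illustrative and not taken from the released evaluation suite.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def continuation_logprob(context, continuation):
    # Sum of token log-probabilities of `continuation` given `context`.
    # Assumes tokenizing the context alone yields a prefix of the full tokenization.
    ctx_ids = tok(context, return_tensors="pt").input_ids
    full_ids = tok(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    cont_ids = full_ids[0, ctx_ids.shape[1]:]
    positions = torch.arange(ctx_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return log_probs[positions, cont_ids].sum().item()

affirmative = "Arthur owns a dog."
negated = "Arthur doesn't own a dog."
referring = " The dog is very cute."
# A model that tracks discourse entities should license the referring continuation
# more strongly after the affirmative context than after the negated one.
print(continuation_logprob(affirmative, referring) > continuation_logprob(negated, referring))
```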
The Limitations of Stylometry for Detecting Machine-Generated Fake News
Recent developments in neural language models (LMs) have raised concerns
about their potential misuse for automatically spreading misinformation. In
light of these concerns, several studies have proposed to detect
machine-generated fake news by capturing their stylistic differences from
human-written text. These approaches, broadly termed stylometry, have found
success in source attribution and misinformation detection in human-written
texts. However, in this work, we show that stylometry is limited against
machine-generated misinformation. While humans speak differently when trying to
deceive, LMs generate stylistically consistent text, regardless of underlying
motive. Thus, though stylometry can successfully prevent impersonation by
identifying text provenance, it fails to distinguish legitimate LM applications
from those that introduce false information. We create two benchmarks
demonstrating the stylistic similarity between malicious and legitimate uses of
LMs, employed in auto-completion and editing-assistance settings. Our findings
highlight the need for non-stylometry approaches in detecting machine-generated
misinformation, and open up the discussion on the desired evaluation
benchmarks.
Comment: Accepted for Computational Linguistics journal (squib). Previously posted with the title "Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection".
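For concreteness, a toy example of the stylometric detector family this squib stress-tests (not the specific detectors it evaluates): surface character n-gram features feeding a linear human-vs-machine classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; a real detector would train on many labeled documents.
texts = ["an example human-written passage ...", "an example LM-generated passage ..."]
labels = [0, 1]  # 0 = human-written, 1 = machine-generated

detector = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),  # stylistic surface cues
    LogisticRegression(max_iter=1000),
)
detector.fit(texts, labels)
# The squib's point: such style cues can reveal provenance, but a language model's
# legitimate and malicious outputs look stylistically alike, so this kind of
# detector cannot separate truthful from false machine text.
print(detector.predict(["another passage to screen ..."]))
```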
Automatic Fact-guided Sentence Modification
Online encyclopedias like Wikipedia contain large amounts of text that need
frequent corrections and updates. The new information may contradict existing
content in the encyclopedia. In this paper, we focus on rewriting such dynamically
changing articles. This is a challenging constrained generation task, as the
output must be consistent with the new information and fit into the rest of the
existing document. To this end, we propose a two-step solution: (1) We identify
and remove the contradicting components in a target text for a given claim,
using a neutralizing stance model; (2) We expand the remaining text to be
consistent with the given claim, using a novel two-encoder sequence-to-sequence
model with copy attention. Applied to a Wikipedia fact update dataset, our
method successfully generates updated sentences for new claims, achieving the
highest SARI score. Furthermore, we demonstrate that generating synthetic data
through such rewritten sentences can successfully augment the FEVER
fact-checking training dataset, leading to a relative error reduction of 13%.
Comment: AAAI 202
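A bare-bones interface sketch of the two-step rewrite described above. The stand-in components below are placeholders: in the paper, step 1 is a learned neutralizing stance model and step 2 is a two-encoder sequence-to-sequence model with copy attention.

```python
def rewrite(old_sentence, claim, mark_keep, generate_update):
    # Step 1: decide which tokens of the old sentence contradict the claim
    # and should be dropped (the neutralizing step).
    tokens = old_sentence.split()
    keep = mark_keep(old_sentence, claim)
    residual = " ".join(t for t, k in zip(tokens, keep) if k)
    # Step 2: expand the residual text into a sentence consistent with the claim.
    return generate_update(residual, claim)

# Toy stand-ins that only exercise the interface; the real components are learned.
keep_non_numeric = lambda s, c: [not any(ch.isdigit() for ch in w) for w in s.split()]
echo_claim = lambda residual, claim: claim
print(rewrite("The film grossed $52 million.", "The film grossed $83 million.",
              keep_non_numeric, echo_claim))
```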
Is margin all you need? An extensive empirical study of active learning on tabular data
Given a labeled training set and a collection of unlabeled data, the goal of
active learning (AL) is to identify the best unlabeled points to label. In this
comprehensive study, we analyze the performance of a variety of AL algorithms
on deep neural networks trained on 69 real-world tabular classification
datasets from the OpenML-CC18 benchmark. We consider different data regimes and
the effect of self-supervised model pre-training. Surprisingly, we find that
the classical margin sampling technique matches or outperforms all others,
including the current state of the art, in a wide range of experimental settings. We
hope to encourage researchers to benchmark rigorously against margin sampling, and to
suggest to practitioners facing tabular data labeling constraints that
hyperparameter-free margin may often be all they need.
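A minimal sketch of the margin sampling baseline the study recommends benchmarking against: query the unlabeled points whose two most probable classes are closest; the function and variable names are illustrative.

```python
import numpy as np

def margin_query(probs, batch_size):
    # probs: (n_unlabeled, n_classes) predicted probabilities from the current model.
    # Returns indices of the batch_size points with the smallest top-1/top-2 margin.
    ordered = np.sort(probs, axis=1)
    margins = ordered[:, -1] - ordered[:, -2]  # small margin = model is most uncertain
    return np.argsort(margins)[:batch_size]

# Example: 5 unlabeled points, 3 classes.
probs = np.array([[0.34, 0.33, 0.33],
                  [0.90, 0.05, 0.05],
                  [0.50, 0.45, 0.05],
                  [0.70, 0.20, 0.10],
                  [0.40, 0.39, 0.21]])
print(margin_query(probs, batch_size=2))  # indices of the two most ambiguous points
```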
Few-shot Conformal Prediction with Auxiliary Tasks
We develop a novel approach to conformal prediction when the target task has
limited data available for training. Conformal prediction identifies a small
set of promising output candidates in place of a single prediction, with
guarantees that the set contains the correct answer with high probability. When
training data is limited, however, the predicted set can easily become unusably
large. In this work, we obtain substantially tighter prediction sets while
maintaining desirable marginal guarantees by casting conformal prediction as a
meta-learning paradigm over exchangeable collections of auxiliary tasks. Our
conformalization algorithm is simple, fast, and agnostic to the choice of
underlying model, learning algorithm, or dataset. We demonstrate the
effectiveness of this approach across a number of few-shot classification and
regression tasks in natural language processing, computer vision, and
computational chemistry for drug discovery.
Comment: ICML camera ready
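A minimal sketch of the split conformal procedure this work builds on (the paper's contribution, meta-learning nonconformity scores over auxiliary tasks, is not reproduced here): calibrate a score threshold so the predicted set contains the true label with probability at least 1 - alpha.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    # Finite-sample corrected quantile: the ceil((n+1)(1-alpha))-th smallest
    # nonconformity score of the true labels on a held-out calibration set.
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(cal_scores)[min(k, n) - 1]

def prediction_set(candidate_scores, threshold):
    # Keep every candidate label whose nonconformity score is below the threshold.
    return np.nonzero(candidate_scores <= threshold)[0]

# Example with scores defined as 1 - predicted probability of each class.
cal_scores = np.array([0.10, 0.30, 0.20, 0.50, 0.15, 0.40, 0.25, 0.35])
tau = conformal_threshold(cal_scores, alpha=0.1)
print(prediction_set(np.array([0.05, 0.60, 0.30]), tau))  # indices of kept labels
```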
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters
Natural Language Inference (NLI) has been extensively studied by the NLP
community as a framework for estimating the semantic relation between sentence
pairs. While early work identified certain biases in NLI models, recent
advancements in modeling and datasets demonstrated promising performance. In
this work, we further explore the direct zero-shot applicability of NLI models
to real applications, beyond the sentence-pair setting they were trained on.
First, we analyze the robustness of these models to longer and out-of-domain
inputs. Then, we develop new aggregation methods to allow operating over full
documents, reaching state-of-the-art performance on the ContractNLI dataset.
Interestingly, we find NLI scores to provide strong retrieval signals, leading
to more relevant evidence extractions compared to common similarity-based
methods. Finally, we go further and investigate whole document clusters to
identify both discrepancies and consensus among sources. In a test case, we
find real inconsistencies between Wikipedia pages in different languages about
the same topic.
Comment: Findings of EMNLP 202
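A minimal sketch of one way to stretch a sentence-pair NLI model over a full document; the aggregation rule (max over sentences) and the model choice are illustrative assumptions rather than the paper's exact method.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()
entail_id = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]

def doc_entailment(doc_sentences, hypothesis):
    # Entailment probability of the hypothesis against each document sentence,
    # plus a document-level score taken as the max over sentences.
    enc = tok(list(doc_sentences), [hypothesis] * len(doc_sentences),
              return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1)[:, entail_id]
    return probs, probs.max().item()

sentences = ["The contract may be terminated with 30 days notice.",
             "All disputes are governed by the laws of Delaware."]
per_sentence, doc_score = doc_entailment(sentences, "The agreement can be ended early.")
# The per-sentence scores double as a retrieval signal: the best-matching sentence
# is the extracted evidence for the hypothesis.
print(per_sentence, doc_score)
```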
Conformal Risk Control
We extend conformal prediction to control the expected value of any monotone
loss function. The algorithm generalizes split conformal prediction together
with its coverage guarantee. Like conformal prediction, the conformal risk
control procedure is tight up to an O(1/n) factor. Worked examples
from computer vision and natural language processing demonstrate the usage of
our algorithm to bound the false negative rate, graph distance, and token-level
F1-score.
Comment: Code available at https://github.com/aangelopoulos/conformal-ris
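A minimal sketch of conformal risk control for the false negative rate in an assumed multilabel setting (not the authors' released code): pick the largest threshold whose corrected calibration risk stays below alpha.

```python
import numpy as np

def fnr_loss(probs, labels, lam):
    # Per-example false negative proportion when predicting {classes with prob >= lam}.
    # labels is a 0/1 matrix of the same shape as probs.
    preds = probs >= lam
    missed = np.logical_and(labels, ~preds).sum(axis=1)
    return missed / np.maximum(labels.sum(axis=1), 1)

def crc_threshold(cal_probs, cal_labels, alpha, B=1.0, grid=None):
    # Largest lambda with (n * mean_loss(lambda) + B) / (n + 1) <= alpha. The FNR
    # only shrinks as lambda shrinks (prediction sets grow), which is the
    # monotonicity the guarantee relies on; B bounds the loss (FNR <= 1).
    n = len(cal_probs)
    grid = np.linspace(0.0, 1.0, 101) if grid is None else np.asarray(grid)
    for lam in np.sort(grid)[::-1]:  # scan from strictest to loosest threshold
        risk = fnr_loss(cal_probs, cal_labels, lam).mean()
        if (n * risk + B) / (n + 1) <= alpha:
            return float(lam)
    return float(grid.min())
```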