New term in effective field theory at fixed topology
A random matrix model for lattice QCD which takes into account the positive
definite nature of the Wilson term is introduced. The corresponding effective
theory for fixed index of the Wilson Dirac operator is derived to
next-to-leading order. It reveals a new term proportional to the topological index of
the Wilson Dirac operator and the lattice spacing. The new term appears
naturally in a fixed index spurion analysis. The spurion approach reveals that
the term is the first in a new family of such terms and that equivalent terms
are relevant for the effective theory of continuum QCD.
Comment: 22 pages, 2 figures, version to appear in PR
Sequence classification with human attention
Learning attention functions requires large volumes of data, but many NLP tasks simulate human behavior, and in this paper, we show that human attention really does provide a good inductive bias on many attention functions in NLP. Specifically, we use estimated human attention derived from eye-tracking corpora to regularize attention functions in recurrent neural networks. We show substantial improvements across a range of tasks, including sentiment analysis, grammatical error detection, and detection of abusive language.
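The regularization idea described above can be sketched in a few lines. The sketch below is illustrative only, assuming a hypothetical `joint_loss` helper and a squared-error penalty between the model's attention distribution and a normalized human fixation distribution; the paper's actual regularizer and architecture are not reproduced here.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def joint_loss(task_loss, model_scores, human_attention, lam=0.1):
    """Hypothetical combined objective: task loss plus a penalty for
    diverging from human attention. `human_attention` stands in for a
    normalized fixation-duration distribution over tokens, which the
    paper estimates from eye-tracking corpora."""
    attn = softmax(model_scores)
    reg = np.mean((attn - human_attention) ** 2)  # squared-error penalty
    return task_loss + lam * reg
```

With uniform model scores and a uniform human distribution the penalty vanishes, so the joint loss reduces to the task loss alone.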
Higher-order Comparisons of Sentence Encoder Representations
Representational Similarity Analysis (RSA) is a technique developed by
neuroscientists for comparing activity patterns of different measurement
modalities (e.g., fMRI, electrophysiology, behavior). As a framework, RSA has
several advantages over existing approaches to interpretation of language
encoders based on probing or diagnostic classification: namely, it does not
require large training samples, is not prone to overfitting, and it enables a
more transparent comparison between the representational geometries of
different models and modalities. We demonstrate the utility of RSA by
establishing a previously unknown correspondence between widely-employed
pretrained language encoders and human processing difficulty via eye-tracking
data, showcasing its potential in the interpretability toolbox for neural
models.
Comment: EMNLP 201
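The core RSA computation is compact: build a representational dissimilarity matrix (RDM) per model, then correlate the RDMs' upper triangles. A minimal sketch, using cosine dissimilarity and a Pearson correlation for simplicity (the RSA literature commonly uses Spearman); the `rdm` and `rsa` names are illustrative, not from the paper.

```python
import numpy as np

def rdm(X):
    """Representational dissimilarity matrix: 1 - cosine similarity
    between every pair of item representations (rows of X)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - Xn @ Xn.T

def rsa(X, Y):
    """Second-stage comparison: correlate the upper triangles of the
    two RDMs (Pearson here; Spearman is the more common choice)."""
    iu = np.triu_indices(X.shape[0], k=1)
    a, b = rdm(X)[iu], rdm(Y)[iu]
    return np.corrcoef(a, b)[0, 1]
```

Because only the pairwise-dissimilarity geometry is compared, X and Y may come from entirely different measurement modalities or encoders with different dimensionalities, which is what makes RSA attractive here.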
Timed Parity Games: Complexity and Robustness
We consider two-player games played in real time on game structures with
clocks where the objectives of players are described using parity conditions.
The games are \emph{concurrent} in that at each turn, both players
independently propose a time delay and an action, and the action with the
shorter delay is chosen. To prevent a player from winning by blocking time, we
restrict each player to play strategies that ensure that the player cannot be
responsible for causing a zeno run. First, we present an efficient reduction of
these games to \emph{turn-based} (i.e., not concurrent) \emph{finite-state}
(i.e., untimed) parity games. Our reduction improves the best known complexity
for solving timed parity games. Moreover, the rich class of algorithms for
classical parity games can now be applied to timed parity games. The states of
the resulting game are based on clock regions of the original game, and the
state space of the finite game is linear in the size of the region graph.
Second, we consider two restricted classes of strategies for the player that
represents the controller in a real-time synthesis problem, namely,
\emph{limit-robust} and \emph{bounded-robust} winning strategies. Using a
limit-robust winning strategy, the controller cannot choose an exact
real-valued time delay but must allow for some nonzero jitter in each of its
actions. If there is a given lower bound on the jitter, then the strategy is
bounded-robust winning. We show that exact strategies are more powerful than
limit-robust strategies, which are more powerful than bounded-robust winning
strategies for any bound. For both kinds of robust strategies, we present
efficient reductions to standard timed automaton games. These reductions
provide algorithms for the synthesis of robust real-time controllers.
Analogy Training Multilingual Encoders
Language encoders encode words and phrases in ways that capture their local semantic relatedness, but are known to be globally inconsistent. Global inconsistency can seemingly be corrected for, in part, by leveraging signals from knowledge bases, but previous results are partial and limited to monolingual English encoders. We extract a large-scale multilingual, multi-word analogy dataset from Wikidata for diagnosing and correcting for global inconsistencies, and implement a four-way Siamese BERT architecture for grounding multilingual BERT (mBERT) in Wikidata through analogy training. We show that analogy training not only improves the global consistency of mBERT and the isomorphism of language-specific subspaces, but also leads to significant gains on downstream tasks such as bilingual dictionary induction and sentence retrieval.
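One simple objective that captures the spirit of analogy training on quadruples a:b :: c:d is to make the relation offsets b-a and d-c coincide. The function below is a sketch under that assumption only; the paper's four-way Siamese BERT is trained end-to-end on Wikidata quadruples, and its exact loss is not reproduced here.

```python
import numpy as np

def offset_loss(a, b, c, d):
    """Illustrative objective for an analogy quadruple a:b :: c:d:
    penalize the squared distance between the two relation offsets
    (b - a) and (d - c). Zero iff the offsets match exactly."""
    diff = (b - a) - (d - c)
    return float(diff @ diff)
```

Minimizing such a term over many knowledge-base analogies pushes the encoder toward globally consistent relation offsets, the property the abstract reports as improved.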