We Need to Talk About Random Splits

Author: Bastings Jasmijn
Ebert Sebastian
Filippova Katja
Søgaard Anders
Publication date: 01/01/2021

Gorman and Bedrick (2019) argued for using random splits rather than standard splits in NLP experiments. We argue that random splits, like standard splits, lead to overly optimistic performance estimates. We can also split data in biased or adversarial ways, e.g., training on short sentences and evaluating on long ones. Biased sampling has been used in domain adaptation to simulate real-world drift; this is known as the covariate shift assumption. In NLP, however, even worst-case splits, maximizing bias, often under-estimate the error observed on new samples of in-domain data, i.e., the data that models should minimally generalize to at test time. This invalidates the covariate shift assumption. Instead of using multiple random splits, future benchmarks should ideally include multiple, independent test sets; if infeasible, we argue that multiple biased splits lead to more realistic performance estimates than multiple random splits. Comment: Accepted at EACL 202
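
The contrast between a random split and a length-biased split ("training on short sentences and evaluating on long ones") can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the whitespace token count used as sentence length, and the `test_fraction` parameter are all assumptions made for the example.

```python
import random


def random_split(sentences, test_fraction=0.2, seed=0):
    """Shuffle the data and hold out a fraction, ignoring sentence properties."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled items form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


def length_biased_split(sentences, test_fraction=0.2):
    """Train on the shortest sentences and evaluate on the longest ones.

    Length is approximated by the whitespace token count (an assumption).
    """
    ordered = sorted(sentences, key=lambda s: len(s.split()))
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

With the biased split, every test sentence is at least as long as every training sentence, so a model's score reflects generalization to longer inputs rather than to a sample drawn from the same length distribution.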

arXiv.org e-Print Archive

Copenhagen University Research Information System

Mediation and peace

Author: Hörner Johannes
Morelli Massimo
Squintani Francesco
Publication venue: Department of Economics, University of Warwick
Publication date: 01/01/2010

This paper applies mechanism design to conflict resolution. We determine when and how unmediated communication and mediation reduce the ex ante probability of conflict in a game with asymmetric information. Mediation improves upon unmediated communication when the intensity of conflict is high, or when asymmetric information is significant. The mediator improves upon unmediated communication by not precisely reporting information to conflicting parties, and precisely, by not revealing to a player with probability one that the opponent is weak. Arbitrators who can enforce settlements are no more effective than mediators who only make non-binding recommendations.

CiteSeerX

Cadmus, EUI Research Repository

Columbia University Academic Commons

Warwick Research Archives Portal Repository

Yale University

On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

Author: Abbasnejad Ehsan
Hengel Anton van den
Kafle Kushal
Kanan Christopher
Shrestha Robik
Teney Damien
Publication date: 01/01/2020

Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices in its current use. First, most published methods rely on explicit knowledge of the construction of the OOD splits. They often rely on "inverting" the distribution of labels, e.g. answering mostly 'yes' when the common training answer is 'no'. Second, the OOD test set is used for model selection. Third, a model's in-domain performance is assessed after retraining it on in-domain splits (VQA v2) that exhibit a more balanced distribution of labels. These three practices defeat the objective of evaluating generalization, and put into question the value of methods specifically designed for this dataset. We show that embarrassingly-simple methods, including one that generates answers at random, surpass the state of the art on some question types. We provide short- and long-term solutions to avoid these pitfalls and realize the benefits of OOD evaluation.

arXiv.org e-Print Archive

Adelaide Research & Scholarship

A random tunnel number one 3-manifold does not fiber over the circle

Author: Dunfield Nathan M
Thurston Dylan P
Publication venue: 'Mathematical Sciences Publishers'
Publication date: 01/01/2006

We address the question: how common is it for a 3-manifold to fiber over the circle? One motivation for considering this is to give insight into the fairly inscrutable Virtual Fibration Conjecture. For the special class of 3-manifolds with tunnel number one, we provide compelling theoretical and experimental evidence that fibering is a very rare property. Indeed, in various precise senses it happens with probability 0. Our main theorem is that this is true for a measured lamination model of random tunnel number one 3-manifolds. The first ingredient is an algorithm of K Brown which can decide if a given tunnel number one 3-manifold fibers over the circle. Following the lead of Agol, Hass and W Thurston, we implement Brown's algorithm very efficiently by working in the context of train tracks/interval exchanges. To analyze the resulting algorithm, we generalize work of Kerckhoff to understand the dynamics of splitting sequences of complete genus 2 interval exchanges. Combining all of this with a "magic splitting sequence" and work of Mirzakhani proves the main theorem. The 3-manifold situation contrasts markedly with random 2-generator 1-relator groups; in particular, we show that such groups "fiber" with probability strictly between 0 and 1. Comment: This is the version published by Geometry & Topology on 15 December 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

Caltech Authors

The roundtable: an abstract model of conversation dynamics

Author: Lacasa Lucas
Mastrangeli Massimo
Schmidt Martin
Publication date: 01/01/2010

Is it possible to abstract a formal mechanism originating schisms and governing the size evolution of social conversations? In this work a constructive solution to this problem is proposed: an abstract model of a generic N-party turn-taking conversation. The model develops from simple yet realistic assumptions derived from experimental evidence, abstracts from conversation content and semantics while including topological information, and is driven by stochastic dynamics. We find that a single mechanism, namely the dynamics of each conversational party's individual fitness as related to conversation size, controls the development of the self-organized schisming phenomenon. Potential generalizations of the model, including individual traits and preferences, memory effects and more elaborate conversational topologies, may also find important applications in other fields of research where dynamically interacting and networked agents play a fundamental role. Comment: 18 pages, 4 figures, to be published in Journal of Artificial Societies and Social Simulation

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

DI-fusion

Secure bit commitment from relativistic constraints

Author: Hänggi Esther
Kaniewski Jędrzej
Tomamichel Marco
Wehner Stephanie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

We investigate two-party cryptographic protocols that are secure under assumptions motivated by physics, namely relativistic assumptions (no-signalling) and quantum mechanics. In particular, we discuss the security of bit commitment in so-called split models, i.e. models in which at least some of the parties are not allowed to communicate during certain phases of the protocol. We find the minimal splits that are necessary to evade the Mayers-Lo-Chau no-go argument and present protocols that achieve security in these split models. Furthermore, we introduce the notion of local versus global command, a subtle issue that arises when the split committer is required to delegate non-communicating agents to open the commitment. We argue that classical protocols are insecure under global command in the split model we consider. On the other hand, we provide a rigorous security proof in the global command model for Kent's quantum protocol [Kent 2011, Unconditionally Secure Bit Commitment by Transmitting Measurement Outcomes]. The proof employs two fundamental principles of modern physics, the no-signalling property of relativity and the uncertainty principle of quantum mechanics. Comment: published version, IEEE format, 18 pages, 8 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

OPUS - University of Technology Sydney

Copenhagen University Research Information System

ScholarBank@NUS

Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems

Author: Asri Layla El
Fine Emery
Harris Justin
Mehrotra Rahul
Schulz Hannes
Sharma Shikhar
Suleman Kaheer
Zumer Jeremie
Publication date: 01/01/2017

This paper presents the Frames dataset (Frames is available at http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which extends state tracking to a setting where several states are tracked simultaneously. We propose a baseline model for this task. We show that Frames can also be used to study memory in dialogue management and information presentation through natural language generation.

arXiv.org e-Print Archive

Crossref