7,926 research outputs found

The glass ceiling in NLP

Author: Schluter Natalie
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 31/10/2018

The IT University of Copenhagen's Repository

Why Comparing Single Performance Scores Does Not Allow to Draw Conclusions About Machine Learning Approaches

Authors: Gurevych Iryna; Reimers Nils
Publication date: 26/03/2018

Developing state-of-the-art approaches for specific tasks is a major driving force in our research community. Depending on the prestige of the task, publishing it can come along with a lot of visibility. The question arises: how reliable are our evaluation methodologies for comparing approaches? One common methodology to identify the state of the art is to partition data into a train, a development and a test set. Researchers can train and tune their approach on some part of the dataset and then select the model that worked best on the development set for a final evaluation on unseen test data. Test scores from different approaches are compared, and performance differences are tested for statistical significance. In this publication, we show that there is a high risk that a statistically significant difference in this type of evaluation is not due to a superior learning approach. Instead, there is a high risk that the difference is due to chance. For example, for the CoNLL 2003 NER dataset we observed type I errors (false positives) in up to 26% of the cases with a threshold of p < 0.05, i.e., falsely concluding a statistically significant difference between two identical approaches. We prove that this evaluation setup is unsuitable to compare learning approaches. We formalize alternative evaluation setups based on score distributions
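
As a rough illustration of comparing score distributions rather than single test scores, the sketch below simulates two identical approaches, each trained several times with different random seeds, and compares the resulting score samples with a permutation test. This is not the authors' formalization: the helper names `train_and_evaluate` and `permutation_test`, the Gaussian score model with its mean and standard deviation, and the choice of a permutation test are all assumptions made for illustration.

```python
import random
import statistics


def train_and_evaluate(rng):
    """Hypothetical stand-in for one training run: returns a test score.

    Assumption: scores vary across seeds like a Gaussian around 0.90.
    """
    return rng.gauss(0.90, 0.005)


def permutation_test(scores_a, scores_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in mean score."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(scores_a) - statistics.mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled scores to the two approaches.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


rng = random.Random(42)
# Two *identical* approaches, each run with 20 different seeds.
scores_a = [train_and_evaluate(rng) for _ in range(20)]
scores_b = [train_and_evaluate(rng) for _ in range(20)]

single_run_gap = abs(scores_a[0] - scores_b[0])
p_value = permutation_test(scores_a, scores_b)
print(f"gap between two single runs: {single_run_gap:.4f}")
print(f"p-value over score distributions: {p_value:.3f}")
```

In this setup the gap between two single runs reflects only seed noise, which is the kind of difference the abstract warns against interpreting; the test over many runs asks instead whether the two score samples plausibly come from the same distribution.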

arXiv.org e-Print Archive

TUbiblio

Multiplex Communities and the Emergence of International Conflict

Authors: Dasandi Niheer; Mikhaylov Slava Jankin; Pomeroy Caleb
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

Advances in community detection reveal new insights into multiplex and multilayer networks. Less work, however, investigates the relationship between these communities and outcomes in social systems. We leverage these advances to shed light on the relationship between the cooperative mesostructure of the international system and the onset of interstate conflict. We detect communities based upon weaker signals of affinity expressed in United Nations votes and speeches, as well as stronger signals observed across multiple layers of bilateral cooperation. Communities of diplomatic affinity display an expected negative relationship with conflict onset. Ties in communities based upon observed cooperation, however, display no effect under a standard model specification and a positive relationship with conflict under an alternative specification. These results align with some extant hypotheses but also point to a paucity in our understanding of the relationship between community structure and behavioral outcomes in networks.
Comment: arXiv admin note: text overlap with arXiv:1802.0039

arXiv.org e-Print Archive

Directory of Open Access Journals

On the Similarities Between Native, Non-native and Translated Texts

Authors: Nisioi Sergiu; Ordan Noam; Rabinovich Ella; Wintner Shuly
Publication date: 01/01/2016

We present a computational analysis of three language varieties: native, advanced non-native, and translation. Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language. Using a collection of computational methods we establish three main results: (1) the three types of texts are easily distinguishable; (2) non-native language and translations are closer to each other than each of them is to native language; and (3) some of these characteristics depend on the source or native language, while others do not, reflecting, perhaps, unified principles that similarly affect translations and non-native language.
Comment: ACL2016, 12 page

arXiv.org e-Print Archive

Crossref

Semantic Variation in Online Communities of Practice

Authors: Del Tredici Marco; Fernández Raquel
Publication date: 01/01/2017

We introduce a framework for quantifying semantic variation of common words in Communities of Practice and in sets of topic-related communities. We show that while some meaning shifts are shared across related communities, others are community-specific, and therefore independent from the discussed topic. We propose such findings as evidence in favour of sociolinguistic theories of socially-driven semantic variation. Results are evaluated using an independent language modelling task. Furthermore, we investigate extralinguistic features and show that factors such as prominence and dissemination of words are related to semantic variation.
Comment: 13 pages, Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE