7,926 research outputs found

    The glass ceiling in NLP

    Get PDF

    Why Comparing Single Performance Scores Does Not Allow to Draw Conclusions About Machine Learning Approaches

    Full text link
    Developing state-of-the-art approaches for specific tasks is a major driving force in our research community. Depending on the prestige of the task, publishing it can come along with a lot of visibility. The question arises how reliable are our evaluation methodologies to compare approaches? One common methodology to identify the state-of-the-art is to partition data into a train, a development and a test set. Researchers can train and tune their approach on some part of the dataset and then select the model that worked best on the development set for a final evaluation on unseen test data. Test scores from different approaches are compared, and performance differences are tested for statistical significance. In this publication, we show that there is a high risk that a statistical significance in this type of evaluation is not due to a superior learning approach. Instead, there is a high risk that the difference is due to chance. For example for the CoNLL 2003 NER dataset we observed in up to 26% of the cases type I errors (false positives) with a threshold of p < 0.05, i.e., falsely concluding a statistically significant difference between two identical approaches. We prove that this evaluation setup is unsuitable to compare learning approaches. We formalize alternative evaluation setups based on score distributions

    Multiplex Communities and the Emergence of International Conflict

    Full text link
    Advances in community detection reveal new insights into multiplex and multilayer networks. Less work, however, investigates the relationship between these communities and outcomes in social systems. We leverage these advances to shed light on the relationship between the cooperative mesostructure of the international system and the onset of interstate conflict. We detect communities based upon weaker signals of affinity expressed in United Nations votes and speeches, as well as stronger signals observed across multiple layers of bilateral cooperation. Communities of diplomatic affinity display an expected negative relationship with conflict onset. Ties in communities based upon observed cooperation, however, display no effect under a standard model specification and a positive relationship with conflict under an alternative specification. These results align with some extant hypotheses but also point to a paucity in our understanding of the relationship between community structure and behavioral outcomes in networks.Comment: arXiv admin note: text overlap with arXiv:1802.0039

    On the Similarities Between Native, Non-native and Translated Texts

    Full text link
    We present a computational analysis of three language varieties: native, advanced non-native, and translation. Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language. Using a collection of computational methods we establish three main results: (1) the three types of texts are easily distinguishable; (2) non-native language and translations are closer to each other than each of them is to native language; and (3) some of these characteristics depend on the source or native language, while others do not, reflecting, perhaps, unified principles that similarly affect translations and non-native language.Comment: ACL2016, 12 page

    Semantic Variation in Online Communities of Practice

    Get PDF
    We introduce a framework for quantifying semantic variation of common words in Communities of Practice and in sets of topic-related communities. We show that while some meaning shifts are shared across related communities, others are community-specific, and therefore independent from the discussed topic. We propose such findings as evidence in favour of sociolinguistic theories of socially-driven semantic variation. Results are evaluated using an independent language modelling task. Furthermore, we investigate extralinguistic features and show that factors such as prominence and dissemination of words are related to semantic variation.Comment: 13 pages, Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017
    • …
    corecore