Punishment of mainstream national parties, not Euroscepticism, is behind Irish results
The outcome of the European elections in Ireland reflected those across much of Europe — losses for the traditional establishment parties, gains for populist, Eurosceptic and anti-austerity candidates. However, in addition to the resentment of austerity measures and the democratic deficit felt across many EU states, internal historical and political circumstances have also contributed to these results.
Improving a Fundamental Measure of Lexical Association
Pointwise mutual information (PMI), a simple measure of lexical association, is part of several algorithms used as models of lexical semantic memory. Typically, it is used as a component of more complex distributional models rather than in isolation. We show that when two simple techniques are applied—(1) down-weighting co-occurrences involving low-frequency words in order to address PMI’s so-called “frequency bias,” and (2) defining co-occurrences as counts of “events in which instances of word1 and word2 co-occur in a context” rather than “contexts in which word1 and word2 co-occur”—then PMI outperforms default parameterizations of word embedding models in terms of how closely it matches human relatedness judgments. We also identify which down-weighting techniques are most helpful. The results suggest that simple measures may be capable of modeling certain phenomena in semantic memory, and that complex models which incorporate PMI might be improved with these modifications.
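The two techniques described above can be illustrated with a minimal sketch. The snippet below computes PMI over co-occurrence *events* (point 2 in the abstract) and applies context-distribution smoothing with an exponent alpha as one example of a down-weighting scheme for low-frequency words (the specific alpha value and smoothing choice are illustrative assumptions, not necessarily the paper's exact parameterization):

```python
import math
from collections import Counter

# Toy corpus: each inner list is one context window.
contexts = [
    ["cat", "purrs"], ["cat", "meows"], ["dog", "barks"],
    ["cat", "sleeps"], ["dog", "sleeps"],
]

# Count co-occurrence *events* (word pairs within a context),
# per the event-based definition discussed in the abstract.
pair_counts = Counter()
word_counts = Counter()
total_pairs = 0
for ctx in contexts:
    for i, w1 in enumerate(ctx):
        for w2 in ctx[i + 1:]:
            pair_counts[tuple(sorted((w1, w2)))] += 1
            total_pairs += 1
    word_counts.update(ctx)

def pmi(w1, w2, alpha=0.75):
    """PMI with context-distribution smoothing: raising unigram counts
    to alpha < 1 down-weights the penalty on rare words (illustrative
    choice of down-weighting scheme)."""
    p_xy = pair_counts[tuple(sorted((w1, w2)))] / total_pairs
    if p_xy == 0:
        return float("-inf")
    denom = sum(c ** alpha for c in word_counts.values())
    p_x = word_counts[w1] ** alpha / denom
    p_y = word_counts[w2] ** alpha / denom
    return math.log2(p_xy / (p_x * p_y))
```

On this toy data, `pmi("cat", "purrs")` exceeds `pmi("dog", "sleeps")` because "purrs" occurs only with "cat", while "sleeps" is split across contexts.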
Social media and political communication in the 2014 elections to the European Parliament
Social media play an increasingly important part in the communication strategies of political campaigns by reflecting information about the policy preferences and opinions of political actors and their public followers. In addition, the content of the messages provides rich information about the political issues and the framing of those issues during elections, such as whether contested issues concern Europe or rather extend pre-existing national debates. In this study, we survey the European landscape of social media using tweets originating from and referring to political actors during the 2014 European Parliament election campaign. We describe the language and national distribution of the messages, the relative volume of different types of communications, and the factors that determine the adoption and use of social media by the candidates. We also analyze the dynamics of the volume and content of the communications over the duration of the campaign with reference to both the EU integration dimension of the debate and the prominence of the most visible list-leading candidates. Our findings indicate that the lead candidates and their televised debate had a prominent influence on the volume and content of communications, and that the content and emotional tone of communications reflect preferences along the EU dimension of political contestation more than classic national issues relating to left-right differences.
The Idea of Liberty, 1600-1800: A Distributional Concept Analysis.
This article uses computational and statistical methods to analyze the concept of liberty from 1600 to 1800. Based on a bespoke set of tools for parsing conceptual structures, it contributes to the literature on the concept of liberty and engages with the thesis concerning negative liberty first put forward by Isaiah Berlin and subsequently modified by Quentin Skinner.
Tracing Shifting Conceptual Vocabularies Through Time
This paper presents work in progress on an algorithm to track and identify changes in the vocabulary used to describe particular concepts over time, with emphasis on treating concepts as distinct from changes in word meaning. We apply the algorithm to word vectors generated from Google Books n-grams from 1800-1990 and evaluate the induced networks with respect to their flexibility (robustness to changes in vocabulary) and stability (they should not leap from topic to topic). We also describe work in progress using the British National Biography Linked Open Data Serials to construct a “ground truth” evaluation dataset for algorithms which aim to detect shifts in the vocabulary used to describe concepts. Finally, we discuss limitations of the proposed method, ways in which the method could be improved in the future, and other considerations.
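The core operation such an algorithm relies on — finding which vocabulary items currently sit closest to a concept in vector space, and re-running that query per time slice — can be sketched as follows. The vectors here are random stand-ins (the paper uses embeddings trained on Google Books n-grams), and `nearest_terms` is a hypothetical helper, not the paper's actual interface:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest_terms(concept_vec, embeddings, k=2):
    """Return the k vocabulary items closest to a concept's anchor
    vector; re-running this per decade traces how the vocabulary
    describing the concept shifts over time."""
    scored = sorted(embeddings.items(),
                    key=lambda kv: cosine(concept_vec, kv[1]),
                    reverse=True)
    return [w for w, _ in scored[:k]]

# Toy single-decade embedding table (stand-in random vectors).
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=8)
       for w in ["liberty", "freedom", "licence", "rights"]}
anchor = emb["liberty"]
neighbours = nearest_terms(anchor, emb, k=2)
```

Comparing the neighbour lists produced for successive decades gives the kind of induced network whose flexibility and stability the abstract describes evaluating.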
Argument mining with graph representation learning
Argument Mining (AM) is a unique task in Natural Language Processing (NLP) that targets arguments: a meaningful logical structure in human language. Since the argument plays a significant role in the legal field, the interdisciplinary study of AM on legal texts has significant promise. For years, a pipeline architecture has been used as the standard paradigm in this area. Although this simplifies the development and management of AM systems, the connection between different parts of the pipeline causes inevitable shortcomings such as cascading error propagation. This paper presents an alternative perspective on the AM task, whereby legal documents are represented as graph structures and the AM task is undertaken as a hybrid approach incorporating Graph Neural Networks (GNNs), graph augmentation and collective classification. GNNs have been demonstrated to be an effective method for representation learning on graphs, and they have been successfully applied to many other NLP tasks. In contrast to previous pipeline-based architectures, our approach results in a single end-to-end classifier for the identification and classification of argumentative text segments. Experiments based on corpora from both the European Court of Human Rights (ECHR) and the Court of Justice of the European Union (CJEU) show that our approach achieves strong results compared to state-of-the-art baselines. Both the graph augmentation and collective classification steps are shown to improve performance on both datasets when compared to using GNNs alone.
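The representation-learning step at the heart of a GNN can be illustrated with one round of normalised neighbourhood aggregation, H' = ReLU(ÂHW), on a toy three-node graph of text segments. This is a generic graph-convolution update sketched in NumPy, not the paper's actual model or edge-construction rules:

```python
import numpy as np

# Toy graph: three argumentative text segments, edges between
# adjacent segments (illustrative structure only).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = A + np.eye(3)                      # add self-loops
D_inv_sqrt = np.diag(1 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalisation

H = np.random.default_rng(1).normal(size=(3, 4))  # node features
W = np.random.default_rng(2).normal(size=(4, 2))  # learnable weights

# One message-passing layer: each node's new representation mixes
# its own features with those of its neighbours.
H_next = np.maximum(A_norm @ H @ W, 0)     # ReLU(A_norm @ H @ W)
```

Stacking such layers lets each segment's representation absorb information from progressively larger neighbourhoods, which is what makes a single end-to-end classifier over the graph feasible.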
A decade of legal argumentation mining: datasets and approaches
The growing research field of argumentation mining (AM) in the past ten years has made it a popular topic in Natural Language Processing. However, there are still limited studies focusing on AM in the context of legal text (Legal AM), despite the fact that legal text analysis more generally has received much attention as an interdisciplinary field of traditional humanities and data science. The goal of this work is to provide a critical data-driven analysis of the current situation in Legal AM. After outlining the background of this topic, we explore the availability of annotated datasets and the mechanisms by which these are created. This includes a discussion of how arguments and their relationships can be modelled, as well as a number of different approaches to divide the overall Legal AM task into constituent sub-tasks. Finally, we review the dominant approaches that have been applied to this task in the past decade, and outline some future directions for Legal AM research.
Enhancing legal argument mining with domain pre-training and neural networks
The contextual word embedding model, BERT, has proved its ability on downstream tasks with limited quantities of annotated data. BERT and its variants help to reduce the burden of complex annotation work in many interdisciplinary research areas, for example, legal argument mining in digital humanities. Argument mining aims to develop text analysis tools that can automatically retrieve arguments and identify relationships between argumentation clauses. Since argumentation is one of the key aspects of case law, argument mining tools for legal texts are applicable to both academic and non-academic legal research. Domain-specific BERT variants (pre-trained with corpora from a particular background) have also achieved strong performance in many tasks. To our knowledge, previous machine learning studies of argument mining on judicial case law still heavily rely on statistical models. In this paper, we provide a broad study of both classic and contextual embedding models and their performance on practical case law from the European Court of Human Rights (ECHR). During our study, we also explore a number of neural networks when combined with different embeddings. Our experiments provide a comprehensive overview of a variety of approaches to the legal argument mining task. We conclude that domain pre-trained transformer models have great potential in this area, although traditional embeddings can also achieve strong performance when combined with additional neural network layers.
quanteda: An R package for the quantitative analysis of textual data
quanteda is an R package providing a comprehensive workflow and toolkit for natural language processing tasks such as corpus management, tokenization, analysis, and visualization. It has extensive functions for applying dictionary analysis, exploring texts using keywords-in-context, computing document and feature similarities, and discovering multi-word expressions through collocation scoring. Based entirely on sparse operations, it provides highly efficient methods for compiling document-feature matrices and for manipulating these or using them in further quantitative analysis. Using C++ and multi-threading extensively, quanteda is also considerably faster and more efficient than other R and Python packages in processing large textual data.
The package is designed for R users needing to apply natural language processing to texts, from documents to final analysis. Its capabilities match or exceed those provided in many end-user software applications, many of which are expensive and not open source. The package is therefore of great benefit to researchers, students, and other analysts with fewer financial resources. While using quanteda requires R programming knowledge, its API is designed to enable powerful, efficient analysis with a minimum of steps. By emphasizing consistent design, furthermore, quanteda lowers the barriers to learning and using NLP and quantitative text analysis even for proficient R programmers.
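The central data structure the abstract describes — a sparsely stored document-feature matrix — can be sketched language-agnostically. The snippet below builds one in plain Python (quanteda itself is R, and its `dfm()` uses true sparse-matrix backends; this mapping-of-counters version only illustrates why sparse storage is efficient: absent features occupy no memory at all):

```python
from collections import Counter

# Toy corpus: document name -> raw text.
docs = {"doc1": "the cat sat on the mat",
        "doc2": "the dog sat"}

# Tokenize, then compile a sparse document-feature structure:
# {document: {feature: count}}, storing only nonzero counts.
tokens = {name: text.split() for name, text in docs.items()}
features = sorted({t for toks in tokens.values() for t in toks})
dfm = {name: dict(Counter(toks)) for name, toks in tokens.items()}

def feature_count(doc, feat):
    """Look up a cell of the document-feature matrix; missing
    features implicitly count as zero."""
    return dfm[doc].get(feat, 0)
```

From a structure like this, similarity computations and further quantitative analysis reduce to operations over the stored nonzero counts, which is what makes the sparse approach scale to large corpora.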