98 research outputs found
Topology Analysis of International Networks Based on Debates in the United Nations
In complex, high dimensional and unstructured data it is often difficult to
extract meaningful patterns. This is especially the case when dealing with
textual data. Recent studies in machine learning, information theory and
network science have developed several novel instruments to extract the
semantics of unstructured data, and harness it to build a network of relations.
Such approaches serve as an efficient tool for dimensionality reduction and
pattern detection. This paper applies semantic network science to extract
ideological proximity in the international arena, by focusing on the data from
General Debates in the UN General Assembly on the topics of high salience to
the international community. The UN General Debate corpus (UNGDC) covers all
high-level debates in the UN General Assembly from 1970 to 2014 and includes all UN member
states. The research proceeds in three main steps. First, Latent Dirichlet
Allocation (LDA) is used to extract the topics of the UN speeches, and
therefore semantic information. Each country is then assigned a vector
specifying the exposure to each of the topics identified. This intermediate
output is then used to construct a network of countries based on
information-theoretic metrics, where the links capture similar vectorial patterns in the
topic distributions. The topology of the networks is then analyzed through network
properties such as density, path length and clustering. Finally, we identify
specific topological features of our networks using the map equation framework
to detect communities in our networks of countries.
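The network-construction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say which information-theoretic metric is used, so Jensen-Shannon divergence is an assumption here, as are the threshold, the function names and the toy topic vectors.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def build_network(topic_vectors, threshold):
    """Link two countries when their topic distributions diverge by less than `threshold`."""
    edges = set()
    for a, b in combinations(sorted(topic_vectors), 2):
        if js_divergence(topic_vectors[a], topic_vectors[b]) < threshold:
            edges.add((a, b))
    return edges


def density(n_nodes, edges):
    """Share of all possible undirected links that are present."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


# Hypothetical topic exposures for three countries (each row sums to 1).
topic_vectors = {
    "A": [0.70, 0.20, 0.10],
    "B": [0.65, 0.25, 0.10],
    "C": [0.10, 0.10, 0.80],
}
edges = build_network(topic_vectors, threshold=0.05)
network_density = density(len(topic_vectors), edges)
```

In this toy case only countries A and B are linked, because their topic profiles are nearly identical. Community detection with the map equation would operate on the resulting edge set and is not shown.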
Detecting Policy Preferences and Dynamics in the UN General Debate with Neural Word Embeddings
Foreign policy analysis has been struggling to find ways to measure policy
preferences and paradigm shifts in international political systems. This paper
presents a novel, potential solution to this challenge, through the application
of a neural word embedding (Word2vec) model on a dataset featuring speeches by
heads of state or government in the United Nations General Debate. The paper
provides three key contributions based on the output of the Word2vec model.
First, it presents a set of policy attention indices, synthesizing the semantic
proximity of political speeches to specific policy themes. Second, it
introduces country-specific semantic centrality indices, based on topological
analyses of countries' semantic positions with respect to each other. Third, it
tests the hypothesis that there exists a statistical relation between the
semantic content of political speeches and UN voting behavior, falsifying it
and suggesting that political speeches contain information of a different nature
than that behind voting outcomes. The paper concludes with a discussion of
the practical use of its results and consequences for foreign policy analysis,
public accountability, and transparency.
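One plausible reading of a "policy attention index" is the cosine similarity between the average embedding of a speech and the average embedding of a set of theme words. The sketch below assumes that reading; the abstract does not give the paper's exact formula. The two-dimensional embeddings are hypothetical stand-ins for vectors that a trained Word2vec model would supply.

```python
import math


def mean_vector(words, embeddings):
    """Average the embeddings of the words that are in the vocabulary."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        raise ValueError("no word in the vocabulary")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def attention_index(speech_tokens, theme_words, embeddings):
    """Semantic proximity of a speech to a policy theme (assumed definition)."""
    return cosine(mean_vector(speech_tokens, embeddings),
                  mean_vector(theme_words, embeddings))


# Hypothetical 2-dimensional embeddings, for illustration only.
embeddings = {
    "climate": [1.0, 0.0],
    "emissions": [0.9, 0.1],
    "security": [0.0, 1.0],
    "war": [0.1, 0.9],
}
speech = ["climate", "emissions", "unseen_word"]
climate_attention = attention_index(speech, ["climate"], embeddings)
security_attention = attention_index(speech, ["security"], embeddings)
```

Out-of-vocabulary tokens are skipped rather than treated as zeros, so they do not pull the speech vector toward the origin.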
Estimating Government Discretion in Fiscal Policy Making
Varieties of Capitalism (VoC) is a relatively new approach to describing macroeconomic differences across countries, classifying them into coordinated market economies (CMEs) and liberal market economies (LMEs). VoC has already had a significant impact on the field but has been criticised for its lack of linkage to political systems. Recent studies have focused on the similarities between CMEs and the Lijphartian consensus political systems, and between LMEs and majoritarian political systems. One of the practical consequences of this classification is that governments in LMEs should enjoy more discretion over fiscal policy, while governments in CMEs are more constrained in their decisions. In this paper we evaluate this proposition in two LME states -- Ireland and the UK -- where the latter is an example of a pure majoritarian state while the former bears several institutional characteristics of the consensus state (e.g. electoral system and coalition governments). We show that governments in both states enjoy relatively high degrees of discretion over fiscal policy, but that in Ireland policy outcomes are better balanced with respect to the interests represented by social partners. We thus provide empirical evidence that supports the classification proposed in the VoC approach. However, we also demonstrate that the context of decision-making has a crucial impact on the discretionary power of government, and that such context effects can change over time, even within the same system type.
Keywords: fiscal policy, computerised text analysis, EU Structural Funds, budgetary process
A new Database of Parliamentary Debates in Ireland, 1922--2008
We present a new database of parliamentary debates and written answers in Dáil Éireann for the entire time period from the third Dáil in 1922 to the thirtieth Dáil in 2008. This database was built from the Official Records of the Houses of the Oireachtas. Unlike its original version, our database integrates information about debates and information about deputies into a single database. It therefore allows users to search and retrieve contributions from individual deputies of the Dáil (Teachta Dála or TD) and to combine information about TDs' parties and constituencies with the history of political speeches and written answers. In addition, our database facilitates the application of content analysis software such as Wordscores (Laver, Benoit and Garry, 2003) or Wordfish (Slapin and Proksch, 2008) and makes it possible to estimate TDs' policy preferences from speeches. In this paper we document the structure of the database and how it was generated. We furthermore demonstrate how political debates can be used in social science research through a series of examples. These include an analysis of the policy agenda in all budget speeches from 1922 to today, the estimation of speakers' policy positions in the 2008 budget debate, and the estimation of ministers' policy positions in the 26th government in 2002.
Keywords: parliamentary debates, policy point estimation, budget speeches, text analysis
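To illustrate how policy positions can be estimated from speeches, here is a minimal sketch of the core Wordscores calculation (Laver, Benoit and Garry, 2003). Each word is scored by the positions of the reference texts, weighted by the probability of reading each reference text given that word. A new text is then scored as the frequency-weighted mean of its word scores. The reference texts, their positions and the tokens are hypothetical, and the rescaling of raw scores proposed in the original method is omitted.

```python
from collections import Counter


def relative_freqs(tokens):
    """Relative frequency of each word within one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score each word as sum over references of P(reference | word) * position."""
    freqs = {name: relative_freqs(toks) for name, toks in reference_texts.items()}
    vocab = set().union(*freqs.values())
    scores = {}
    for word in vocab:
        total = sum(f.get(word, 0.0) for f in freqs.values())
        scores[word] = sum(
            f.get(word, 0.0) / total * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_text(tokens, scores):
    """Mean word score over the tokens that appear in the reference vocabulary."""
    scored = [t for t in tokens if t in scores]
    if not scored:
        raise ValueError("text shares no words with the reference texts")
    return sum(scores[t] for t in scored) / len(scored)


# Hypothetical reference speeches with assumed positions on a left-right scale.
reference_texts = {
    "left_ref": ["welfare", "welfare", "tax"],
    "right_ref": ["tax", "tax", "market"],
}
reference_positions = {"left_ref": -1.0, "right_ref": 1.0}

scores = word_scores(reference_texts, reference_positions)
new_speech_score = score_text(["welfare", "tax", "budget"], scores)
```

Words absent from every reference text, such as "budget" here, carry no information and are ignored when scoring a new speech.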
Learning to Predict with Highly Granular Temporal Data: Estimating individual behavioral profiles with smart meter data
Big spatio-temporal datasets, available through both open and administrative
data sources, offer significant potential for social science research. The
magnitude of the data allows for increased resolution and analysis at
individual level. While there are recent advances in forecasting techniques for
highly granular temporal data, little attention is given to segmenting the time
series and finding homogeneous patterns. In this paper, it is proposed to
estimate behavioral profiles of individuals' activities over time using
Gaussian Process-based models. In particular, the aim is to investigate how
individuals or groups may be clustered according to the model parameters. Such
a Bayesian non-parametric method is then tested by looking at the
predictability of the segments using a combination of models to fit different
parts of the temporal profiles. Model validity is then tested on a set of
holdout data. The dataset consists of half hourly energy consumption records
from smart meters from more than 100,000 households in the UK and covers the
period from 2015 to 2016. The methodological approach developed in the paper
may be easily applied to datasets of similar structure and granularity, for
example social media data, and may lead to improved accuracy in the prediction
of social dynamics and behavior.
What Drives the International Development Agenda? An NLP Analysis of the United Nations General Debate 1970-2016
There is surprisingly little known about agenda setting for international
development in the United Nations (UN) despite it having a significant
influence on the process and outcomes of development efforts. This paper
addresses this shortcoming using a novel approach that applies natural language
processing techniques to countries' annual statements in the UN General Debate.
Every year UN member states deliver statements during the General Debate on
their governments' perspective on major issues in world politics. These
speeches provide invaluable information on state preferences on a wide range of
issues, including international development, but have largely been overlooked
in the study of global politics. This paper identifies the main international
development topics that states raise in these speeches between 1970 and 2016,
and examines the country-specific drivers of international development rhetoric.
Multiplex Communities and the Emergence of International Conflict
Advances in community detection reveal new insights into multiplex and
multilayer networks. Less work, however, investigates the relationship between
these communities and outcomes in social systems. We leverage these advances to
shed light on the relationship between the cooperative mesostructure of the
international system and the onset of interstate conflict. We detect
communities based upon weaker signals of affinity expressed in United Nations
votes and speeches, as well as stronger signals observed across multiple layers
of bilateral cooperation. Communities of diplomatic affinity display an
expected negative relationship with conflict onset. Ties in communities based
upon observed cooperation, however, display no effect under a standard model
specification and a positive relationship with conflict under an alternative
specification. These results align with some extant hypotheses but also point
to a paucity in our understanding of the relationship between community
structure and behavioral outcomes in networks.
Comment: arXiv admin note: text overlap with arXiv:1802.0039
Application of Natural Language Processing to Determine User Satisfaction in Public Services
Research on customer satisfaction has increased substantially in recent
years. However, the relative importance of, and relationships between, different
determinants of satisfaction remain uncertain. Moreover, quantitative studies
to date tend to test for significance of pre-determined factors thought to have
an influence with no scalable means to identify other causes of user
satisfaction. These gaps make it difficult to use available
knowledge on user preferences to improve public services. Meanwhile, digital
technology development has enabled new methods to collect user feedback, for
example through online forums where users can comment freely on their
experience. New tools are needed to analyze large volumes of such feedback. Use
of topic models is proposed as a feasible solution to aggregate open-ended user
opinions that can be easily deployed in the public sector. Generated insights
can contribute to a more inclusive decision-making process in public service
provision. This novel methodological approach is applied to a case of service
reviews of publicly-funded primary care practices in England. Findings from the
analysis of 145,000 reviews covering almost 7,700 primary care centers indicate
that the quality of interactions with staff and bureaucratic exigencies are the
key issues driving user satisfaction across England.
Data Innovation for International Development: An overview of natural language processing for qualitative data analysis
The availability and collection of quantitative data, access to it, and its
limitations often make qualitative data the resource upon which development
programs heavily rely. Both traditional interview data and social media
analysis can provide rich contextual information and are essential for
research, appraisal, monitoring and evaluation. These data may be difficult to
process and analyze both systematically and at scale. This, in turn, limits the
capacity for timely, data-driven decision-making, which is essential in
fast-evolving complex social systems. In this paper, we discuss the potential of
using natural language processing to systematize analysis of qualitative data,
and to inform quick decision-making in the development context. We illustrate
this with interview data generated in a format of micro-narratives for the UNDP
Fragments of Impact project.
Transfer Topic Labeling with Domain-Specific Knowledge Base: An Analysis of UK House of Commons Speeches 1935-2014
Topic models are widely used in natural language processing, allowing
researchers to estimate the underlying themes in a collection of documents.
Most topic models use unsupervised methods and hence require the additional
step of attaching meaningful labels to estimated topics. This process of manual
labeling is not scalable and suffers from human bias. We present a
semi-automatic transfer topic labeling method that seeks to remedy these
problems. Domain-specific codebooks form the knowledge-base for automated topic
labeling. We demonstrate our approach with a dynamic topic model analysis of
the complete corpus of UK House of Commons speeches 1935-2014, using the coding
instructions of the Comparative Agendas Project to label topics. We show that
our method works well for a majority of the topics we estimate; but we also
find that institution-specific topics, in particular on subnational governance,
require manual input. We validate our results using human expert coding.
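The idea of labeling topics from a codebook, with a fallback to manual input, can be sketched as below. This is an illustrative assumption rather than the paper's procedure: the abstract does not describe the matching rule, so Jaccard overlap between a topic's top words and each codebook entry's keywords is used here. The codebook entries, the threshold and all names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of words."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_topic(top_words, codebook, min_overlap=0.2):
    """Return (label, score); label is None when no entry matches well enough."""
    best_label, best_score = None, 0.0
    for label, keywords in codebook.items():
        score = jaccard(top_words, keywords)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < min_overlap:
        return None, best_score  # flag the topic for manual labeling
    return best_label, best_score


# Hypothetical codebook entries; a real codebook would be far richer.
codebook = {
    "Health": ["health", "hospital", "nurse", "doctor"],
    "Defence": ["army", "defence", "navy", "war"],
}

health_label, health_score = label_topic(
    ["hospital", "nurse", "patient", "health"], codebook)
local_label, local_score = label_topic(
    ["council", "borough", "parish", "rates"], codebook)
```

The second topic shares no vocabulary with the codebook and is returned unlabeled, mirroring the case of institution-specific topics that require manual input.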