98 research outputs found

    Topology Analysis of International Networks Based on Debates in the United Nations

    In complex, high-dimensional and unstructured data it is often difficult to extract meaningful patterns. This is especially the case when dealing with textual data. Recent studies in machine learning, information theory and network science have developed several novel instruments to extract the semantics of unstructured data and harness it to build a network of relations. Such approaches serve as an efficient tool for dimensionality reduction and pattern detection. This paper applies semantic network science to extract ideological proximity in the international arena, focusing on data from General Debates in the UN General Assembly on topics of high salience to the international community. The UN General Debate Corpus (UNGDC) covers all high-level debates in the UN General Assembly from 1970 to 2014 and includes all UN member states. The research proceeds in three main steps. First, Latent Dirichlet Allocation (LDA) is used to extract the topics of the UN speeches, and therefore semantic information. Each country is then assigned a vector specifying its exposure to each of the topics identified. This intermediate output is then used to construct a network of countries based on information-theoretic metrics, where the links capture similar vectorial patterns in the topic distributions. The topology of the networks is then analyzed through network properties such as density, path length and clustering. Finally, we identify specific topological features of our networks using the map equation framework to detect communities in our networks of countries.
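
The step from per-country topic vectors to a network can be illustrated with a minimal sketch. The abstract does not name the information-theoretic metric, so the Jensen-Shannon divergence used here is an assumption, as are the threshold, the function names and the made-up topic exposures; none of it is taken from the paper.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; zero-probability terms of p contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 when measured in bits."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def build_network(topic_vectors, threshold):
    """Link two countries when their topic distributions diverge by at most `threshold`."""
    edges = {}
    for a, b in combinations(sorted(topic_vectors), 2):
        d = js_divergence(topic_vectors[a], topic_vectors[b])
        if d <= threshold:
            edges[(a, b)] = d
    return edges


def density(n_nodes, n_edges):
    """Share of possible undirected links that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical topic exposures for three countries (illustrative values only).
topic_vectors = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.1, 0.8],
}
edges = build_network(topic_vectors, threshold=0.05)  # assumed cut-off
print(edges, density(len(topic_vectors), len(edges)))
```

In practice the topic vectors would come from a fitted LDA model, and the resulting edge list would be passed to a network library for path length, clustering and community detection.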

    Detecting Policy Preferences and Dynamics in the UN General Debate with Neural Word Embeddings

    Foreign policy analysis has been struggling to find ways to measure policy preferences and paradigm shifts in international political systems. This paper presents a novel potential solution to this challenge through the application of a neural word embedding (Word2vec) model to a dataset of speeches by heads of state or government in the United Nations General Debate. The paper provides three key contributions based on the output of the Word2vec model. First, it presents a set of policy attention indices, synthesizing the semantic proximity of political speeches to specific policy themes. Second, it introduces country-specific semantic centrality indices, based on topological analyses of countries' semantic positions with respect to each other. Third, it tests the hypothesis that there exists a statistical relation between the semantic content of political speeches and UN voting behavior, falsifying it and suggesting that political speeches contain information of a different nature than that behind voting outcomes. The paper concludes with a discussion of the practical use of its results and consequences for foreign policy analysis, public accountability, and transparency.
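
The idea of semantic proximity between a speech and a policy theme can be sketched with cosine similarity over word vectors. This is not the paper's index definition, which the abstract does not give: averaging word vectors into a speech vector is an assumption, and the two-dimensional embeddings below are toy values standing in for Word2vec output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no word in the speech has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Toy embeddings (hypothetical values, not trained vectors).
embeddings = {
    "climate": [0.9, 0.1],
    "emissions": [0.8, 0.2],
    "trade": [0.1, 0.9],
    "tariffs": [0.2, 0.8],
}

speech = ["climate", "emissions", "trade"]
speech_vec = mean_vector(speech, embeddings)

# Proximity of the speech to two themes, each represented by a single word vector.
climate_score = cosine(speech_vec, embeddings["climate"])
trade_score = cosine(speech_vec, embeddings["trade"])
print(climate_score, trade_score)
```

With these toy values the speech scores closer to the climate theme than to the trade theme, since two of its three words sit near the climate vector.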

    Estimating Government Discretion in Fiscal Policy Making

    Varieties of Capitalism (VoC) is a relatively new approach to describing macroeconomic differences across countries, classifying them into coordinated market economies (CMEs) and liberal market economies (LMEs). VoC has already had a significant impact on the field but has been criticised for its lack of linkage to political systems. Recent studies have focused on the similarities between CMEs and Lijphartian consensus political systems, and between LMEs and majoritarian political systems. One of the practical consequences of this classification is that governments in LMEs should enjoy more discretion over fiscal policy, while governments in CMEs are more constrained in their decisions. In this paper we evaluate this proposition in two LME states, Ireland and the UK, where the latter is an example of a pure majoritarian state while the former bears several institutional characteristics of the consensus state (e.g. electoral system and coalition governments). We show that governments in both states enjoy relatively high degrees of discretion over fiscal policy, but that in Ireland policy outcomes are better balanced with respect to the interests represented by social partners. We thus provide empirical evidence that supports the classification proposed in the VoC approach. However, we also demonstrate that the context of decision-making has a crucial impact on the discretionary power of government, and that such context effects can change over time, even within the same system type.
    Keywords: fiscal policy, computerised text analysis, EU Structural Funds, budgetary process

    A New Database of Parliamentary Debates in Ireland, 1922-2008

    We present a new database of parliamentary debates and written answers in Dáil Éireann for the entire time period from the third Dáil in 1922 to the thirtieth Dáil in 2008. This database was built from the Official Records of the Houses of the Oireachtas. Unlike its original version, our database integrates information about debates and information about deputies into a single database. It therefore makes it possible to search and retrieve contributions from individual deputies of the Dáil (Teachta Dála or TD) and to combine information about TDs' parties and constituencies with the history of political speeches and written answers. In addition, our database facilitates the application of content analysis software such as Wordscore (Laver, Benoit and Garry, 2003) or Wordfish (Slapin and Proksch, 2008) and makes it possible to estimate TDs' policy preferences from speeches. In this paper we document the structure of the database and how it was generated. We furthermore demonstrate how political debates can be used in social science research through a series of examples. These include an analysis of the policy agenda in all budget speeches from 1922 to today, the estimation of speakers' policy positions in the 2008 budget debate, and the estimation of ministers' policy positions in the 26th government in 2002.
    Keywords: parliamentary debates, policy point estimation, budget speeches, text analysis

    Learning to Predict with Highly Granular Temporal Data: Estimating individual behavioral profiles with smart meter data

    Big spatio-temporal datasets, available through both open and administrative data sources, offer significant potential for social science research. The magnitude of the data allows for increased resolution and analysis at the individual level. While there have been recent advances in forecasting techniques for highly granular temporal data, little attention has been given to segmenting the time series and finding homogeneous patterns. This paper proposes estimating behavioral profiles of individuals' activities over time using Gaussian Process-based models. In particular, the aim is to investigate how individuals or groups may be clustered according to the model parameters. This Bayesian non-parametric method is then tested by looking at the predictability of the segments, using a combination of models to fit different parts of the temporal profiles. Model validity is then tested on a set of holdout data. The dataset consists of half-hourly energy consumption records from smart meters in more than 100,000 households in the UK and covers the period from 2015 to 2016. The methodological approach developed in the paper may be easily applied to datasets of similar structure and granularity, for example social media data, and may lead to improved accuracy in the prediction of social dynamics and behavior.

    What Drives the International Development Agenda? An NLP Analysis of the United Nations General Debate 1970-2016

    Surprisingly little is known about agenda setting for international development in the United Nations (UN), despite its significant influence on the process and outcomes of development efforts. This paper addresses this shortcoming using a novel approach that applies natural language processing techniques to countries' annual statements in the UN General Debate. Every year UN member states deliver statements during the General Debate on their governments' perspective on major issues in world politics. These speeches provide invaluable information on state preferences on a wide range of issues, including international development, but have largely been overlooked in the study of global politics. This paper identifies the main international development topics that states raise in these speeches between 1970 and 2016, and examines the country-specific drivers of international development rhetoric.

    Multiplex Communities and the Emergence of International Conflict

    Advances in community detection reveal new insights into multiplex and multilayer networks. Less work, however, investigates the relationship between these communities and outcomes in social systems. We leverage these advances to shed light on the relationship between the cooperative mesostructure of the international system and the onset of interstate conflict. We detect communities based upon weaker signals of affinity expressed in United Nations votes and speeches, as well as stronger signals observed across multiple layers of bilateral cooperation. Communities of diplomatic affinity display an expected negative relationship with conflict onset. Ties in communities based upon observed cooperation, however, display no effect under a standard model specification and a positive relationship with conflict under an alternative specification. These results align with some extant hypotheses but also point to a paucity in our understanding of the relationship between community structure and behavioral outcomes in networks.

    Application of Natural Language Processing to Determine User Satisfaction in Public Services

    Research on customer satisfaction has increased substantially in recent years. However, the relative importance of, and relationships between, different determinants of satisfaction remain uncertain. Moreover, quantitative studies to date tend to test for the significance of pre-determined factors thought to have an influence, with no scalable means to identify other causes of user satisfaction. These gaps make it difficult to use available knowledge on user preferences for public service improvement. Meanwhile, digital technology development has enabled new methods of collecting user feedback, for example through online forums where users can comment freely on their experience. New tools are needed to analyze large volumes of such feedback. The use of topic models is proposed as a feasible solution for aggregating open-ended user opinions that can be easily deployed in the public sector. Generated insights can contribute to a more inclusive decision-making process in public service provision. This novel methodological approach is applied to a case of service reviews of publicly funded primary care practices in England. Findings from the analysis of 145,000 reviews covering almost 7,700 primary care centers indicate that the quality of interactions with staff and bureaucratic exigencies are the key issues driving user satisfaction across England.

    Data Innovation for International Development: An overview of natural language processing for qualitative data analysis

    The availability, collection and accessibility of quantitative data, as well as its limitations, often make qualitative data the resource upon which development programs heavily rely. Both traditional interview data and social media analysis can provide rich contextual information and are essential for research, appraisal, monitoring and evaluation. These data may be difficult to process and analyze both systematically and at scale. This, in turn, limits the capacity for timely, data-driven decision-making, which is essential in fast-evolving complex social systems. In this paper, we discuss the potential of using natural language processing to systematize the analysis of qualitative data and to inform quick decision-making in the development context. We illustrate this with interview data generated in the form of micro-narratives for the UNDP Fragments of Impact project.

    Transfer Topic Labeling with Domain-Specific Knowledge Base: An Analysis of UK House of Commons Speeches 1935-2014

    Topic models are widely used in natural language processing, allowing researchers to estimate the underlying themes in a collection of documents. Most topic models use unsupervised methods and hence require the additional step of attaching meaningful labels to estimated topics. This process of manual labeling is not scalable and suffers from human bias. We present a semi-automatic transfer topic labeling method that seeks to remedy these problems. Domain-specific codebooks form the knowledge base for automated topic labeling. We demonstrate our approach with a dynamic topic model analysis of the complete corpus of UK House of Commons speeches 1935-2014, using the coding instructions of the Comparative Agendas Project to label topics. We show that our method works well for a majority of the topics we estimate, but we also find that institution-specific topics, in particular on subnational governance, require manual input. We validate our results using human expert coding.
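
The general idea of labeling topics from a codebook, with a fallback to manual coding, can be sketched as follows. This is not the paper's method, which the abstract does not specify: the Jaccard overlap score, the `min_score` cut-off, the function names and the tiny codebook are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of words."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_topic(top_words, codebook, min_score=0.2):
    """Return (label, score) for the best-matching codebook entry.

    The label is None when no entry reaches `min_score`, which flags
    the topic for manual labeling.
    """
    best_label, best_score = None, 0.0
    for label, keywords in codebook.items():
        score = jaccard(top_words, keywords)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < min_score:
        return None, best_score
    return best_label, best_score


# Hypothetical codebook entries and topic top-words (illustrative only).
codebook = {
    "Health": ["hospital", "nurse", "patient", "health"],
    "Defence": ["army", "navy", "defence", "troops"],
}
health_topic = ["hospital", "patient", "health", "doctor"]
local_topic = ["council", "borough", "parish", "county"]

print(label_topic(health_topic, codebook))
print(label_topic(local_topic, codebook))
```

The second topic shares no vocabulary with the codebook, so it is returned unlabeled. This mirrors the abstract's finding that institution-specific topics such as subnational governance need manual input.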