32 research outputs found

    A Diffusion Network Event History Estimator

    Research on the diffusion of political decisions across jurisdictions typically accounts for units’ influence over each other either (1) with observable measures or (2) by inferring latent network ties from past decisions. The former approach assumes that interdependence is static and perfectly captured by the data. The latter mitigates these issues but requires analytical tools that are separate from the main empirical methods for studying diffusion. As a solution, we introduce network event history analysis (NEHA), which incorporates latent network inference into conventional discrete-time event history models. We demonstrate NEHA’s unique methodological and substantive benefits in applications to policy adoption in the American states. Researchers can analyze the ties and structure of inferred networks to refine model specifications, evaluate diffusion mechanisms, or test new or existing hypotheses. By capturing targeted relationships unexplained by standard covariates, NEHA can improve models, facilitate richer theoretical development, and permit novel analyses of the diffusion process.

    Active learning approaches for labeling text: review and assessment of the performance of active learning approaches

    Supervised machine learning methods are increasingly employed in political science. Such models require costly manual labeling of documents. In this paper, we introduce active learning, a framework in which data to be labeled by human coders are not chosen at random but rather targeted in such a way that the required amount of data to train a machine learning model can be minimized. We study the benefits of active learning using text data examples. We perform simulation studies that illustrate conditions where active learning can reduce the cost of labeling text data. We perform these simulations on three corpora that vary in size, document length, and domain. We find that in cases where the document class of interest is not balanced, researchers can label a fraction of the documents one would need using random sampling (or “passive” learning) to achieve equally performing classifiers. We further investigate how varying levels of intercoder reliability affect the active learning procedures and find that even with low reliability, active learning performs more efficiently than does random sampling.

    Replication Data for: Measuring Policy Similarity Through Bill Text Reuse

    Bill data and computed alignments on which the paper Measuring Policy Similarity Through Bill Text Reuse is based. Please refer to the paper for details and to README.md for information on the data. All code pertaining to the project is available on GitHub: https://github.com/desmarais-lab/text_reus

    Replication Data for: Active Learning Approaches for Labeling Text: Review and Assessment of the Performance of Active Learning Approaches

    Supervised machine learning methods are increasingly employed in political science. Such models require costly manual labeling of documents. In this paper, we introduce active learning, a framework in which data to be labeled by human coders are not chosen at random but rather targeted in such a way that the required amount of data to train a machine learning model can be minimized. We study the benefits of active learning using text data examples. We perform simulation studies that illustrate conditions where active learning can reduce the cost of labeling text data. We perform these simulations on three corpora that vary in size, document length, and domain. We find that in cases where the document class of interest is not balanced, researchers can label a fraction of the documents one would need using random sampling (or “passive” learning) to achieve equally performing classifiers. We further investigate how varying levels of inter-coder reliability affect the active learning procedures and find that even with low reliability, active learning performs more efficiently than does random sampling.

    SPID: A New Database for Inferring Public Policy Innovativeness and Diffusion Networks

    Despite its rich tradition, there are key limitations to researchers' ability to make generalizable inferences about state policy innovation and diffusion. This paper introduces new data and methods to move from empirical analyses of single policies to the analysis of comprehensive populations of policies and rigorously inferred diffusion networks. We have gathered policy adoption data appropriate for estimating policy innovativeness and tracing diffusion ties in a targeted manner (e.g., by policy domain, time period, or policy type) and extended the development of methods necessary to accurately and efficiently infer those ties. Our state policy innovation and diffusion (SPID) database includes 728 different policies coded by topic area. We provide an overview of this new dataset and illustrate two key uses: (i) static and dynamic innovativeness measures and (ii) latent diffusion networks that capture common pathways of diffusion between states across policies. The scope of the data allows us to compare patterns in both across policy topic areas. We conclude that these new resources will enable researchers to empirically investigate classes of questions that were difficult or impossible to study previously, but whose roots go back to the origins of the political science policy innovation and diffusion literature.

    State Diffusion Networks - Latent Network Ties from SPID v1.0

    This study includes data on estimated latent policy diffusion networks from the SPID data, version 1.0. Here we provide the latent diffusion ties estimated for each year from 1960 to 2014 based on a 100-year window of adoptions.

    A Diffusion Network Event History Estimator

    Preanalysis plan for a methodological replication analysis.