Replication Data for: Improving the Selection of News Reports for Event Coding Using Ensemble Classification

Abstract

We introduce an automatic classification system that eliminates irrelevant source material before political event data are coded from global newswires. Our pipeline relies on a high-performance supervised heterogeneous ensemble classifier trained on extremely unbalanced classes. The classifier's output is then supplied to human coders for further information extraction, creating a semi-automatic pipeline. The package includes the software required to train and test the classifier, as well as documentation on how to use it.

Last updated on 15/12/2019