82 research outputs found

    Two Tales of the World: Comparison of Widely Used World News Datasets GDELT and EventRegistry

    In this work, we compare GDELT and Event Registry, which monitor news articles worldwide and provide big data to researchers, with respect to scale, news sources, and news geography. We found significant differences in scale and news sources but, surprisingly, observed high similarity in news geography between the two datasets. Comment: To appear in ICWSM'1

    A Dynamic Embedding Model of the Media Landscape

    Information about world events is disseminated through a wide variety of news channels, each with specific considerations in the choice of their reporting. Although the multiplicity of these outlets should ensure a variety of viewpoints, recent reports suggest that the rising concentration of media ownership may void this assumption. This observation motivates the study of the impact of ownership on the global media landscape and its influence on the coverage the actual viewer receives. To this end, the selection of reported events has been shown to be informative about the high-level structure of the news ecosystem. However, existing methods only provide a static view into an inherently dynamic system, yielding underperforming statistical models and hindering our understanding of the media landscape as a whole. In this work, we present a dynamic embedding method that learns to capture the decision process of individual news sources in their selection of reported events while also enabling the systematic detection of large-scale transformations in the media landscape over prolonged periods of time. In an experiment covering over 580M real-world event mentions, we show that our approach outperforms static embedding methods in predictive terms. We demonstrate the potential of the method for news monitoring applications and investigative journalism by shedding light on important changes in programming induced by mergers and acquisitions, policy changes, or network-wide content diffusion. These findings offer evidence of strong content convergence trends inside large broadcasting groups, influencing the news ecosystem in a time of increasing media ownership concentration.

    Comparing Events Coverage in Online News and Social Media: The Case of Climate Change

    Social media is becoming increasingly integrated into the distribution and consumption of news. How is news in social media different from mainstream news? This paper presents a comparative analysis covering a span of 17 months and hundreds of news events, using a method that combines automatic and manual annotations. We focus on climate change, a topic that is frequently present in the news through a number of arguments, from current practices and causes (e.g. fracking, CO2 emissions) to consequences and solutions (e.g. extreme weather, electric cars). The coverage that these different aspects receive often depends on how they are framed, typically by mainstream media. Yet, evidence suggests a gap between what the news media publishes online and what the general public shares in social media. Through the analysis of a series of events, including awareness campaigns, natural disasters, governmental meetings and publications, among others, we uncover differences in terms of the triggers, actions, and news values that are prevalent in both types of media. This methodology can be extended to other important topics present in the news.

    Creating an Agglomerative Clustering Approach Using GDELT

    GDELT is a large-scale project with a continuously updated databank that provides a real-time picture of global news, published as files that anyone can download and use. However, this data is of low granularity, and each source of data does not provide much information on its own. This thesis attempts to leverage the large amount of data available by using a hierarchical agglomerative clustering method to identify news articles that report on the same real-life event. To do this, the thesis also explores whether the GDELT data is granular enough to be used without extensive preprocessing, and whether a distance metric for the clustering algorithm can be created. The findings show promising results when assessed qualitatively, but the quantitative measures are not yet optimized. Inherent flaws in GDELT and in clustering algorithms are hurdles to be overcome before the real potential of GDELT’s data can be realized, and this thesis explores some of these difficulties and makes recommendations for how to circumvent them in future work. Master's thesis in Information Science (INFO390, MASV-INF)

    D6.1 Report on the specifications and architecture of the EMT platform

    This deliverable aims to provide a first view of the design principles of the EUMigraTool, which will be developed within the ITFLOWS project. The EUMigraTool (EMT for short) is a software platform that will integrate all the knowledge created within the ITFLOWS project. It will provide relevant stakeholders with a set of tools for running simulations and predictions on various aspects of migration, ranging from the number of people expected to leave a certain region within selected countries of origin for the EU to potential challenges when migrant populations arrive in EU territories.