23,867 research outputs found

    Two Tales of the World: Comparison of Widely Used World News Datasets GDELT and EventRegistry

    Full text link
    In this work, we compare GDELT and Event Registry, which monitor news articles worldwide and provide big data to researchers regarding scale, news sources, and news geography. We found significant differences in scale and news sources, but surprisingly, we observed high similarity in news geography between the two datasets.Comment: To be appeared in ICWSM'1

    Indirect Match Highlights Detection with Deep Convolutional Neural Networks

    Full text link
    Highlights in a sport video are usually referred as actions that stimulate excitement or attract attention of the audience. A big effort is spent in designing techniques which find automatically highlights, in order to automatize the otherwise manual editing process. Most of the state-of-the-art approaches try to solve the problem by training a classifier using the information extracted on the tv-like framing of players playing on the game pitch, learning to detect game actions which are labeled by human observers according to their perception of highlight. Obviously, this is a long and expensive work. In this paper, we reverse the paradigm: instead of looking at the gameplay, inferring what could be exciting for the audience, we directly analyze the audience behavior, which we assume is triggered by events happening during the game. We apply deep 3D Convolutional Neural Network (3D-CNN) to extract visual features from cropped video recordings of the supporters that are attending the event. Outputs of the crops belonging to the same frame are then accumulated to produce a value indicating the Highlight Likelihood (HL) which is then used to discriminate between positive (i.e. when a highlight occurs) and negative samples (i.e. standard play or time-outs). Experimental results on a public dataset of ice-hockey matches demonstrate the effectiveness of our method and promote further research in this new exciting direction.Comment: "Social Signal Processing and Beyond" workshop, in conjunction with ICIAP 201

    A Dynamic Embedding Model of the Media Landscape

    Full text link
    Information about world events is disseminated through a wide variety of news channels, each with specific considerations in the choice of their reporting. Although the multiplicity of these outlets should ensure a variety of viewpoints, recent reports suggest that the rising concentration of media ownership may void this assumption. This observation motivates the study of the impact of ownership on the global media landscape and its influence on the coverage the actual viewer receives. To this end, the selection of reported events has been shown to be informative about the high-level structure of the news ecosystem. However, existing methods only provide a static view into an inherently dynamic system, providing underperforming statistical models and hindering our understanding of the media landscape as a whole. In this work, we present a dynamic embedding method that learns to capture the decision process of individual news sources in their selection of reported events while also enabling the systematic detection of large-scale transformations in the media landscape over prolonged periods of time. In an experiment covering over 580M real-world event mentions, we show our approach to outperform static embedding methods in predictive terms. We demonstrate the potential of the method for news monitoring applications and investigative journalism by shedding light on important changes in programming induced by mergers and acquisitions, policy changes, or network-wide content diffusion. These findings offer evidence of strong content convergence trends inside large broadcasting groups, influencing the news ecosystem in a time of increasing media ownership concentration

    Selection Bias in News Coverage: Learning it, Fighting it

    Get PDF
    News entities must select and filter the coverage they broadcast through their respective channels since the set of world events is too large to be treated exhaustively. The subjective nature of this filtering induces biases due to, among other things, resource constraints, editorial guidelines, ideological affinities, or even the fragmented nature of the information at a journalist's disposal. The magnitude and direction of these biases are, however, widely unknown. The absence of ground truth, the sheer size of the event space, or the lack of an exhaustive set of absolute features to measure make it difficult to observe the bias directly, to characterize the leaning's nature and to factor it out to ensure a neutral coverage of the news. In this work, we introduce a methodology to capture the latent structure of media's decision process on a large scale. Our contribution is multi-fold. First, we show media coverage to be predictable using personalization techniques, and evaluate our approach on a large set of events collected from the GDELT database. We then show that a personalized and parametrized approach not only exhibits higher accuracy in coverage prediction, but also provides an interpretable representation of the selection bias. Last, we propose a method able to select a set of sources by leveraging the latent representation. These selected sources provide a more diverse and egalitarian coverage, all while retaining the most actively covered events
    • …
    corecore