Analysis and Forecasting of Trending Topics in Online Media Streams
Among the vast information available on the web, social media streams capture
what people currently pay attention to and how they feel about certain topics.
Awareness of such trending topics plays a crucial role in multimedia systems
such as trend-aware recommendation and automatic vocabulary selection for
video concept detection systems.
Correctly utilizing trending topics requires a better understanding of their
various characteristics in different social media streams. To this end, we
present the first comprehensive study across three major online and social
media streams, Twitter, Google, and Wikipedia, covering thousands of trending
topics during an observation period of an entire year. Our results indicate
that, depending on one's requirements, one does not necessarily have to turn
to Twitter for information about current events, and that some media streams
strongly emphasize content of specific categories. As our second key
contribution, we further present a novel approach for the challenging task of
forecasting the life cycle of trending topics in the very moment they emerge.
Our fully automated approach is based on a nearest neighbor forecasting
technique exploiting our assumption that semantically similar topics exhibit
similar behavior.
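The nearest-neighbor idea above can be sketched as follows. The embeddings, the use of cosine similarity, and the toy data are our assumptions for illustration, not the paper's actual features or similarity measure:

```python
import numpy as np

def knn_forecast(query_vec, neighbor_vecs, neighbor_series, k=3):
    """Forecast a new topic's view curve as the mean of the view
    curves of its k most semantically similar known topics.
    Cosine similarity over topic embeddings is assumed here."""
    sims = neighbor_vecs @ query_vec / (
        np.linalg.norm(neighbor_vecs, axis=1) * np.linalg.norm(query_vec))
    top = np.argsort(sims)[-k:]               # indices of k nearest topics
    return neighbor_series[top].mean(axis=0)  # average their life cycles

# toy example: 4 known topics with hypothetical 8-d embeddings
# and 14-day page-view curves
rng = np.random.default_rng(0)
vecs = rng.normal(size=(4, 8))
series = rng.poisson(1000, size=(4, 14)).astype(float)
forecast = knn_forecast(vecs[0] + 0.01 * rng.normal(size=8), vecs, series)
print(forecast.shape)
```

A query topic is matched against known topics purely by semantic similarity, so a forecast is available the moment the topic emerges, before any of its own view history exists.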
We demonstrate on a large-scale dataset of Wikipedia page view statistics
that forecasts by the proposed approach are about 9-48k views closer to the
actual viewing statistics than those of baseline methods, and achieve a mean
average percentage error of 45-19% for time periods of up to 14 days.
Comment: ACM Multimedia 201
Scalable Bayesian modeling, monitoring and analysis of dynamic network flow data
Traffic flow count data in networks arise in many applications, such as
automobile or aviation transportation, certain directed social network
contexts, and Internet studies. Using an example of Internet browser traffic
flow through site-segments of an international news website, we present
Bayesian analyses of two linked classes of models which, in tandem, allow fast,
scalable and interpretable Bayesian inference. We first develop flexible
state-space models for streaming count data, able to adaptively characterize
and quantify network dynamics efficiently in real-time. We then use these
models as emulators of more structured, time-varying gravity models that allow
formal dissection of network dynamics. This yields interpretable inferences on
traffic flow characteristics, and on dynamics in interactions among network
nodes. Bayesian monitoring theory defines a strategy for sequential model
assessment and adaptation in cases when network flow data deviates from
model-based predictions. Exploratory and sequential monitoring analyses of
evolving traffic on a network of web site-segments in e-commerce demonstrate
the utility of this coupled Bayesian emulation approach to analysis of
streaming network count data.
Comment: 29 pages, 16 figures
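The paper's state-space models are considerably richer than this, but the flavor of sequential Bayesian updating on a streaming count series can be illustrated with a minimal discount-based Poisson-gamma filter. The discount factor and conjugate update are standard Bayesian dynamic-model devices, not the authors' exact specification:

```python
def poisson_gamma_filter(counts, a0=1.0, b0=1.0, delta=0.95):
    """Sequentially update a Gamma(a, b) belief about a Poisson rate.
    Between observations the belief is discounted (a and b scaled by
    delta < 1), which inflates uncertainty and lets the rate drift.
    Returns the filtered posterior mean a/b after each count."""
    a, b = a0, b0
    means = []
    for y in counts:
        a, b = delta * a, delta * b   # evolution: discount old information
        a, b = a + y, b + 1.0         # conjugate Poisson-gamma update
        means.append(a / b)
    return means

# a flow rate that jumps mid-stream; the filtered mean tracks the change
rates = poisson_gamma_filter([5, 7, 6, 40, 42, 41])
print([round(r, 1) for r in rates])
```

Because each update is a constant-time conjugate step, this kind of filter scales to many parallel node-pair flows, which is the property the abstract's "fast, scalable" claim turns on.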
Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with r² up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
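The modeling step described above can be sketched as an ordinary least-squares fit of case counts on article access counts. The article set and all numbers below are fabricated purely for illustration:

```python
import numpy as np

# toy weekly data: access counts for two Wikipedia articles (features)
# versus reported case counts (target); all values are fabricated
views = np.array([[120, 30], [340, 80], [560, 150],
                  [410, 100], [200, 55], [90, 20]], dtype=float)
cases = np.array([14, 40, 66, 49, 25, 11], dtype=float)

X = np.column_stack([np.ones(len(views)), views])   # add intercept column
beta, *_ = np.linalg.lstsq(X, cases, rcond=None)    # OLS fit
pred = X @ beta

# coefficient of determination, the goodness-of-fit metric quoted above
ss_res = np.sum((cases - pred) ** 2)
ss_tot = np.sum((cases - cases.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

Shifting the view columns backward in time relative to the case counts turns the same fit into a forecasting model, which is how lead times of up to several weeks can be evaluated.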
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.
Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarity
AUGUR: Forecasting the Emergence of New Research Topics
Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and is consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. We address this issue by introducing Augur, a novel approach to the early detection of research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. Here we also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically for supporting this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 interval and outperformed four alternative approaches in terms of both precision and recall.
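The classic clique percolation idea that ACPM builds on can be sketched in a few lines: k-cliques that share k-1 nodes are merged into one community. This is only the standard method for illustration, not the paper's ACPM extension, and the toy graph is ours:

```python
from itertools import combinations

def k_clique_communities(edges, k=3):
    """Basic clique percolation: find all k-cliques, then merge any
    two cliques that overlap in k-1 nodes into one community."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    cliques = [set(c) for c in combinations(nodes, k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # union-find over cliques that share k-1 nodes
    parent = list(range(len(cliques)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)
    comms = {}
    for i, c in enumerate(cliques):
        comms.setdefault(find(i), set()).update(c)
    return sorted(map(sorted, comms.values()))

# two triangles sharing an edge percolate into one community;
# an isolated triangle forms another
edges = [(1, 2), (2, 3), (1, 3), (2, 4), (3, 4),
         (5, 6), (6, 7), (5, 7)]
print(k_clique_communities(edges))  # → [[1, 2, 3, 4], [5, 6, 7]]
```

In the paper's setting the nodes would be research topics and the edges co-occurrence or semantic relationships, with ACPM adding the dynamics needed to flag embryonic topics.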
Weather, climate, and hydrologic forecasting for the US Southwest: A survey
As part of a regional integrated assessment of climate vulnerability, a survey was conducted from June 1998 to May 2000 of weather, climate, and hydrologic forecasts with coverage of the US Southwest and an emphasis on the Colorado River Basin. The survey addresses the types of forecasts that were issued, the organizations that provided them, and techniques used in their generation. It reflects discussions with key personnel from organizations involved in producing or issuing forecasts, providing data for making forecasts, or serving as a link for communicating forecasts. During the survey period, users faced a complex and constantly changing mix of forecast products available from a variety of sources. The abundance of forecasts was not matched in the provision of corresponding interpretive materials, documentation about how the forecasts were generated, or reviews of past performance. Potential existed for confusing experimental and research products with others that had undergone a thorough review process, including official products issued by the National Weather Service. Contrasts between the state of meteorologic and hydrologic forecasting were notable, especially in the former's greater operational flexibility and more rapid incorporation of new observations and research products. Greater attention should be given to forecast content and communication, including visualization, expression of probabilistic forecasts, and presentation of ancillary information. Regional climate models and the use of climate forecasts in water supply forecasting offer rapid improvements in predictive capabilities for the Southwest. Forecasts and production details should be archived, and publicly available forecasts should be accompanied by performance evaluations that are relevant to users.
Fostering collective intelligence education
New educational models are necessary to adapt learning environments to digitally shared communication and information. Collective intelligence is an emerging field that already has a significant impact in many areas and will have major implications for education, not only as a source of new methodologies but also as a challenge for education itself. This paper proposes an approach to a collective-intelligence model of teaching that uses the Internet to combine two strategies: idea management and real-time assessment in class. A digital tool named Fabricius has been created to support these two elements and foster the collaboration and engagement of students in the learning process. As a result of the research we propose a list of KPIs that attempt to measure individual and collective performance. We are conscious that this is just a first approach to defining which aspects of a class following a course can be qualified and quantified.