    A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection

    Time series are the primary data type used to record dynamic system measurements, generated in great volume by both physical sensors and online processes (virtual sensors). Time series analytics is therefore crucial to unlocking the wealth of information implicit in available data. With the recent advancements in graph neural networks (GNNs), there has been a surge in GNN-based approaches for time series analysis. These approaches can explicitly model inter-temporal and inter-variable relationships, which traditional and other deep neural network-based methods struggle to do. In this survey, we provide a comprehensive review of graph neural networks for time series analysis (GNN4TS), encompassing four fundamental dimensions: forecasting, classification, anomaly detection, and imputation. Our aim is to guide designers and practitioners in understanding GNN4TS, building applications with it, and advancing research on it. First, we provide a comprehensive task-oriented taxonomy of GNN4TS. Then, we present and discuss representative research works and, finally, discuss mainstream applications of GNN4TS. A comprehensive discussion of potential future research directions completes the survey. This survey, for the first time, brings together a vast array of knowledge on GNN-based time series research, highlighting the foundations, practical applications, and opportunities of graph neural networks for time series analysis. Comment: 27 pages, 6 figures, 5 tables
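
    The inter-variable modeling this abstract highlights can be illustrated with a single graph-convolution step: variables become graph nodes, and each forecast mixes a variable's history with that of its neighbors. The NumPy sketch below is a generic illustration under assumed names (normalized_adjacency, graph_forecast_step) and shapes, not code or a specific architecture from the survey.

        import numpy as np

        def normalized_adjacency(A):
            # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, a common GNN choice.
            A_hat = A + np.eye(A.shape[0])
            d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
            return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

        def graph_forecast_step(X, A, W):
            # X: (variables, window) history, A: (variables, variables) adjacency,
            # W: (window, horizon) weights. One layer mixes information across
            # related variables before projecting the window onto the horizon.
            return np.maximum(normalized_adjacency(A) @ X @ W, 0.0)  # ReLU

        # Toy usage: 4 variables, an 8-step history, a 2-step forecast horizon.
        rng = np.random.default_rng(0)
        A = (rng.random((4, 4)) > 0.5).astype(float)
        A = np.maximum(A, A.T)  # undirected inter-variable graph
        X = rng.normal(size=(4, 8))
        W = rng.normal(size=(8, 2))
        print(graph_forecast_step(X, A, W))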

    Label-efficient Time Series Representation Learning: A Review

    The scarcity of labeled data is one of the main challenges of applying deep learning models to time series data in the real world. Several approaches, e.g., transfer learning, self-supervised learning, and semi-supervised learning, have therefore been developed recently to strengthen the ability of deep learning models to learn from limited time series labels. In this survey, for the first time, we provide a novel taxonomy that categorizes existing approaches to the labeled-data scarcity problem in time series according to their dependency on external data sources. Moreover, we review the recent advances in each approach, summarize the limitations of current works, and provide future directions that could yield better progress in the field. Comment: Under Review
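
    One family of approaches covered by such a taxonomy, semi-supervised learning, can be sketched generically via pseudo-labeling: train on the few labeled series, label the unlabeled windows the model is confident about, and retrain on the enlarged set. The example below is a minimal illustration with made-up data and an assumed confidence threshold, not code or a method from this review.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)

        # Hypothetical setup: a few labeled time-series windows, many unlabeled.
        X_lab = rng.normal(size=(50, 32))
        y_lab = rng.integers(0, 3, size=50)
        X_unlab = rng.normal(size=(1000, 32))

        clf = RandomForestClassifier(random_state=0).fit(X_lab, y_lab)

        # Pseudo-label only the windows the model is confident about, then retrain.
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) > 0.8
        X_aug = np.vstack([X_lab, X_unlab[confident]])
        y_aug = np.concatenate([y_lab, clf.classes_[proba[confident].argmax(axis=1)]])
        clf = RandomForestClassifier(random_state=0).fit(X_aug, y_aug)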

    Why We Read Wikipedia

    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table

    Why does taxonomy take so long?

    Household trajectories in rural Ethiopia – what can a mixed method approach tell us about the impact of poverty on children?

    The paper explores the dynamics of child and household poverty in rural Ethiopia using three rounds of household survey and qualitative data collected by Young Lives, a longitudinal study of child poverty. It uses a mixed-method taxonomy of poverty (Roelen and Camfield 2011) to classify children and their households into four groups: ultra-poor, poor, near-poor, and non-poor. Survey and qualitative data are then used to analyse movements in and out of poverty and to explore the factors that underpin these movements. The use of mixed methods in both the identification of the poor and the analysis of their mobility illustrates that the combined use of qualitative and quantitative information can lead to deeper insights and understandings. The paper reports a reduction in the percentage of poor households from 50 to 20 percent between rounds 1 and 3 (2002-9), following the ‘stages of progress’ posited in Roelen and Camfield (2011). However, these changes were not unequivocally beneficial to children (for example, the acquisition of livestock might mean dropping out of school to herd them). Ultra-poverty proved persistent, with little change in the circumstances of the one in ten households classified as ultra-poor, who were vulnerable to illness, to lending or ‘sharecropping-out’ land on unfavourable terms, and to exclusion from the government’s food-for-work scheme.

    Knowledge Management for Foundations: Planning Study

    Outlines the objectives, methodologies, and issues for the components of a study on knowledge management among foundations and on solutions to its challenges: existing practice, a market study, copyright issues, technical standards, taxonomies, and a pilot repository.

    On Machine-Learned Classification of Variable Stars with Sparse and Noisy Time-Series Data

    With the coming data deluge from synoptic surveys, there is a growing need for frameworks that can quickly and automatically produce calibrated classification probabilities for newly observed variables based on a small number of time-series measurements. In this paper, we introduce a methodology for variable-star classification, drawing from modern machine-learning techniques. We describe how to homogenize the information gleaned from light curves by selection and computation of real-numbered metrics ("features"), detail methods to robustly estimate periodic light-curve features, introduce tree-ensemble methods for accurate variable-star classification, and show how to rigorously evaluate the classification results using cross validation. On a 25-class data set of 1542 well-studied variable stars, we achieve a 22.8% overall classification error using the random forest classifier; this represents a 24% improvement over the best previous classifier on these data. This methodology is effective for identifying samples of specific science classes: for pulsational variables used in Milky Way tomography we obtain a discovery efficiency of 98.2%, and for eclipsing systems we find an efficiency of 99.1%, both at 95% purity. We show that the random forest (RF) classifier is superior to other machine-learned methods in terms of accuracy, speed, and relative immunity to features with no useful class information; the RF classifier can also be used to estimate the importance of each feature in classification. Additionally, we present the first astronomical use of hierarchical classification methods to incorporate a known class taxonomy in the classifier, which further reduces the catastrophic error rate to 7.8%. Excluding low-amplitude sources, our overall error rate improves to 14%, with a catastrophic error rate of 3.5%. Comment: 23 pages, 9 figures
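
    The pipeline this abstract outlines (real-valued light-curve features, a random-forest classifier, cross-validated error rates, and per-feature importances) maps cleanly onto off-the-shelf tooling. The scikit-learn sketch below is a hypothetical stand-in, with synthetic features in place of the paper's period and amplitude metrics, not the authors' actual data or code.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)

        # Stand-in data: 1542 stars, 10 light-curve features, 25 classes,
        # mirroring the scale of the data set described in the abstract.
        X = rng.normal(size=(1542, 10))
        y = rng.integers(0, 25, size=1542)

        clf = RandomForestClassifier(n_estimators=500, random_state=0)

        # Cross-validated accuracy, as in the paper's evaluation protocol.
        print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

        # Feature importances, which the abstract notes the RF can estimate.
        clf.fit(X, y)
        print("feature importances:", clf.feature_importances_)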

    Keeping Research Data Safe 2: Final Report

    The first Keeping Research Data Safe study, funded by JISC, made a major contribution to the understanding of long-term preservation costs for research data by developing a cost model and identifying cost variables for preserving research data in UK universities (Beagrie et al, 2008). However, it was completed over a very constrained timescale of four months, with little opportunity to follow up the other major issues or sources of preservation cost information it identified. It noted that digital preservation costs are notoriously difficult to address, in part because of the absence of good case studies and longitudinal information for digital preservation costs or cost variables. In January 2009 JISC issued an ITT for a study on the identification of long-lived digital datasets for the purposes of cost analysis. The aim of this work was to provide a larger body of material and evidence against which existing and future data preservation cost modelling exercises could be tested and validated. The proposal for the KRDS2 study was submitted in response by a consortium consisting of four partners involved in the original Keeping Research Data Safe study (Universities of Cambridge and Southampton, Charles Beagrie Ltd, and OCLC Research) and four new partners with significant data collections and interests in preservation costs (Archaeology Data Service, University of London Computer Centre, University of Oxford, and the UK Data Archive). A range of supplementary materials in support of this main report has been made available on the KRDS2 project website at http://www.beagrie.com/jisc.php. That website will be maintained and continuously updated with future work as a resource for KRDS users.

    A History and Informal Assessment of the Slacker Astronomy Podcast

    Slacker Astronomy is a weekly podcast that covers a recent astronomical news event or discovery. The show has a unique style consisting of irreverent, over-the-top humor combined with a healthy dose of hard science. According to our demographic analysis, the combination of this style and the unique podcasting distribution mechanism allows the show to reach audiences younger and busier than those reached via traditional channels. We report on the successes and challenges of the first year of the show, and provide an informal assessment of its role as a source for astronomical news and concepts for its approximately 15,500 weekly listeners. Comment: 14 pages