
    Using Twitter to Understand Public Interest in Climate Change: The case of Qatar

    Climate change has received extensive attention from public opinion in the last couple of years, after being considered for decades as an exclusively scientific debate. Governments and worldwide organizations such as the United Nations are working more than ever on raising and maintaining public awareness of this global issue. In the present study, we examine and analyze climate change conversations in Qatar's Twittersphere, and gauge public awareness of this global and shared problem in general, and of its various related topics in particular. Such topics include, but are not limited to, politics, economy, disasters, energy, and sandstorms. To address this concern, we collect and analyze a large dataset of 109 million tweets posted by 98K distinct users living in Qatar -- one of the largest per-capita emitters of CO2 worldwide. We use a taxonomy of climate change topics, created as part of the United Nations Pulse project, to capture the climate change discourse in more than 36K tweets. We also examine which topics people refer to when they discuss climate change, and perform different analyses to understand the temporal dynamics of public interest in these topics.
    Comment: To appear in the proceedings of the International Workshop on Social Media for Environment and Ecological Monitoring (SWEEM'16).
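    A minimal sketch of how tweets could be matched against a topic taxonomy like the one described above. The taxonomy slice, field names, and keyword-matching rule are illustrative assumptions, not the authors' actual pipeline.

```python
import re

# Hypothetical slice of a climate-change topic taxonomy: topic -> keywords.
TAXONOMY = {
    "energy": ["solar", "renewable energy", "oil", "gas"],
    "disasters": ["flood", "drought", "sandstorm"],
    "politics": ["cop21", "paris agreement", "climate policy"],
}

def tag_tweet(text):
    """Return the taxonomy topics whose keywords appear in the tweet text."""
    text = text.lower()
    return [
        topic
        for topic, keywords in TAXONOMY.items()
        if any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords)
    ]

if __name__ == "__main__":
    tweet = "Another sandstorm in Doha today -- is this what climate change looks like?"
    print(tag_tweet(tweet))  # ['disasters']
```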

    Making the End-User a Priority in Benchmarking: OrionBench for Unsupervised Time Series Anomaly Detection

    Time series anomaly detection is a prevalent problem in many application domains, such as patient monitoring in healthcare, forecasting in finance, or predictive maintenance in energy. This has led to the emergence of a plethora of anomaly detection methods, including, more recently, deep learning-based methods. Although several benchmarks have been proposed to compare newly developed models, they usually rely on a one-time execution over a limited set of datasets, and the comparison is restricted to a few models. We propose OrionBench -- a user-centric, continuously maintained benchmark for unsupervised time series anomaly detection. The framework provides universal abstractions to represent models, extensibility to add new pipelines and datasets, hyperparameter standardization, pipeline verification, and frequent releases with published benchmarks. We demonstrate the usage of OrionBench and the progression of pipelines across 15 releases published over the course of three years. Moreover, we walk through two real scenarios we experienced with OrionBench that highlight the importance of continuous benchmarking in unsupervised time series anomaly detection.
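    A minimal sketch of the kind of "universal abstraction" such a benchmark describes: detectors registered behind one interface and evaluated in a single loop over datasets. The registry, pipeline name, and scoring function here are hypothetical, not OrionBench's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

import numpy as np

@dataclass
class BenchmarkResult:
    pipeline: str
    dataset: str
    score: float

# Hypothetical registry: every pipeline maps a signal to a list of (start, end) anomalies.
PIPELINES: Dict[str, Callable[[np.ndarray], List[tuple]]] = {}

def register(name):
    def wrap(fn):
        PIPELINES[name] = fn
        return fn
    return wrap

@register("moving_zscore")
def moving_zscore(signal, window=50, threshold=3.0):
    """Toy detector: flag points whose rolling z-score exceeds a threshold."""
    anomalies = []
    for i in range(window, len(signal)):
        chunk = signal[i - window:i]
        z = (signal[i] - chunk.mean()) / (chunk.std() + 1e-8)
        if abs(z) > threshold:
            anomalies.append((i, i))
    return anomalies

def run_benchmark(datasets, score_fn):
    """Evaluate every registered pipeline on every dataset with the same scorer."""
    return [
        BenchmarkResult(name, ds_name, score_fn(pipeline(signal), truth))
        for name, pipeline in PIPELINES.items()
        for ds_name, (signal, truth) in datasets.items()
    ]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.normal(size=500)
    signal[300] += 8  # inject one obvious spike
    datasets = {"toy": (signal, [(300, 300)])}
    overlap = lambda found, truth: float(any(a == t for a in found for t in truth))
    print(run_benchmark(datasets, overlap))
```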

    The SABRE Project: From Ontology to Inference

    The SABRE project aims to develop courseware to support the learning of military behaviors in the training schools of the French Army (Armée de Terre). The general military training of officer cadets relies on a corpus of reference texts and on concrete cases drawn from lessons-learned reports (records structured in XML), which allow instructors to run teaching sessions aimed at the appropriation of military behaviors. The design of the system began with the creation of an ontology, a prerequisite for building a syntactic analyzer that extracts inference rules from the XML documents.
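    A minimal, hypothetical sketch of the kind of processing described above: reading a structured XML lessons-learned record and turning tagged elements into simple condition-action rules keyed to ontology concepts. The XML schema, tag names, and rule format are invented for illustration; the real SABRE documents and ontology are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical lessons-learned record; the actual SABRE XML schema is not shown in the abstract.
RECORD = """
<fiche>
  <situation concept="Patrouille">Night patrol in an urban area</situation>
  <comportement concept="CompteRendu">Report any suspicious activity immediately</comportement>
</fiche>
"""

def extract_rules(xml_text):
    """Map (situation, behavior) pairs to IF/THEN rules tagged with ontology concepts."""
    root = ET.fromstring(xml_text)
    situation = root.find("situation")
    rules = []
    for behavior in root.findall("comportement"):
        rules.append({
            "if_concept": situation.get("concept"),
            "if": situation.text.strip(),
            "then_concept": behavior.get("concept"),
            "then": behavior.text.strip(),
        })
    return rules

print(extract_rules(RECORD))
```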

    AER: Auto-Encoder with Regression for Time Series Anomaly Detection

    Anomaly detection on time series data is increasingly common across various industrial domains that monitor metrics in order to prevent potential accidents and economic losses. However, a scarcity of labeled data and ambiguous definitions of anomalies can complicate these efforts. Recent unsupervised machine learning methods have made remarkable progress in tackling this problem using either single-timestamp predictions or time series reconstructions. While traditionally considered separately, these methods are not mutually exclusive and can offer complementary perspectives on anomaly detection. This paper first highlights the successes and limitations of prediction-based and reconstruction-based methods using visualized time series signals and anomaly scores. We then propose AER (Auto-encoder with Regression), a joint model that combines a vanilla auto-encoder and an LSTM regressor to incorporate the successes and address the limitations of each method. Our model can produce bi-directional predictions while simultaneously reconstructing the original time series by optimizing a joint objective function. Furthermore, we propose several ways of combining the prediction and reconstruction errors through a series of ablation studies. Finally, we compare the performance of the AER architecture against two prediction-based methods and three reconstruction-based methods on 12 well-known univariate time series datasets from NASA, Yahoo, Numenta, and UCR. The results show that AER has the highest average F1 score across all datasets (a 23.5% improvement compared to ARIMA) while retaining a runtime similar to its vanilla auto-encoder and regressor components. Our model is available in Orion, an open-source benchmarking tool for time series anomaly detection.
    Comment: Accepted at IEEE BigData 2022; 10 pages, 6 figures, 4 tables.
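    A simplified sketch of the joint idea the abstract describes: one encoder whose latent state feeds both a window reconstruction and a bi-directional (previous/next point) prediction, trained on a weighted sum of the two errors. This is an illustration under assumed shapes and hyperparameters, not the authors' implementation available in Orion.

```python
import torch
import torch.nn as nn

class AERSketch(nn.Module):
    """Illustrative auto-encoder + regressor sharing one encoder and one joint objective."""

    def __init__(self, window_size, hidden_size=64):
        super().__init__()
        self.encoder = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.decoder = nn.LSTM(input_size=hidden_size, hidden_size=hidden_size, batch_first=True)
        self.reconstruct = nn.Linear(hidden_size, 1)   # one value per timestep
        self.predict = nn.Linear(hidden_size, 2)       # [point before window, point after window]
        self.window_size = window_size

    def forward(self, x):                              # x: (batch, window, 1)
        _, (h, _) = self.encoder(x)
        latent = h[-1]                                 # (batch, hidden)
        # Repeat the latent vector across the window and decode it back into a signal.
        dec_out, _ = self.decoder(latent.unsqueeze(1).repeat(1, self.window_size, 1))
        return self.reconstruct(dec_out), self.predict(latent)

def joint_loss(recon, pred, x, before, after, alpha=0.5):
    """Weighted sum of reconstruction error and bi-directional prediction error."""
    mse = nn.functional.mse_loss
    return alpha * mse(recon, x) + (1 - alpha) * mse(pred, torch.stack([before, after], dim=1))

# Tiny usage example on random data.
model = AERSketch(window_size=100)
x = torch.randn(8, 100, 1)
before, after = torch.randn(8), torch.randn(8)
recon, pred = model(x)
joint_loss(recon, pred, x, before, after).backward()
```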

    Sintel: A Machine Learning Framework to Extract Insights from Signals

    The detection of anomalies in time series data is a critical task with many monitoring applications. Existing systems often fail to encompass an end-to-end detection process, to facilitate comparative analysis of various anomaly detection methods, or to incorporate human knowledge to refine output. This precludes current methods from being used in real-world settings by practitioners who are not ML experts. In this paper, we introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection. The framework uses state-of-the-art approaches to support all steps of the anomaly detection process. Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time. It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool, where they can annotate, modify, create, and remove events. Using these annotations, the framework leverages human knowledge to improve the anomaly detection pipeline. We demonstrate the usability, efficiency, and effectiveness of Sintel through a series of experiments on three public time series datasets, as well as one real-world use case involving spacecraft experts tasked with anomaly analysis. Sintel's framework, code, and datasets are open-sourced at https://github.com/sintel-dev/.
    Comment: Accepted at the ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2022).
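    A minimal, hypothetical sketch of the human-in-the-loop flow the abstract describes: detected events are logged, an expert confirms or dismisses them, and the confirmed set is what feeds back into later evaluation. The class and method names are invented for illustration, not Sintel's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Event:
    start: int
    end: int
    severity: float
    label: Optional[bool] = None   # True = confirmed anomaly, False = dismissed, None = unreviewed

@dataclass
class AnnotationLog:
    """Keeps the full history of detections and expert feedback for one signal."""
    events: List[Event] = field(default_factory=list)

    def add_detection(self, start, end, severity):
        self.events.append(Event(start, end, severity))

    def annotate(self, index, is_anomaly):
        self.events[index].label = is_anomaly

    def confirmed(self):
        return [e for e in self.events if e.label]

# Usage: record two detections, have an expert dismiss one and confirm the other.
log = AnnotationLog()
log.add_detection(120, 140, severity=0.9)
log.add_detection(400, 410, severity=0.4)
log.annotate(0, is_anomaly=True)
log.annotate(1, is_anomaly=False)
print(log.confirmed())   # [Event(start=120, end=140, severity=0.9, label=True)]
```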

    Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence

    The Web has enabled the availability of a huge amount of useful information, but has also made it easier to spread false information and rumors across multiple sources, making it hard to distinguish between what is true and what is not. Recent examples include the premature Steve Jobs obituary, the second bankruptcy of United Airlines, and the creation of black holes by the operation of the Large Hadron Collider. Since it is important to permit the expression of dissenting and conflicting opinions, it would be a fallacy to try to ensure that the Web provides only consistent information. However, to help separate the wheat from the chaff, it is essential to be able to determine dependence between sources. Given the huge number of data sources and the vast volume of conflicting data available on the Web, doing so in a scalable manner is extremely challenging and has not yet been addressed by existing work. In this paper, we present a set of research problems and propose some preliminary solutions on the issues involved in discovering dependence between sources. We also discuss how this knowledge can benefit a variety of technologies, such as data integration and Web 2.0, that help users manage and access the totality of the available information from various sources.
    Comment: CIDR 200
