
    Integrating Data Science and Earth Science

    This open access book presents the results of three years of collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches and scientific workflows.

    Detecting impacts of extreme events with ecological in situ monitoring networks

    Extreme hydrometeorological conditions typically impact ecophysiological processes on land. Satellite-based observations of the terrestrial biosphere provide an important reference for detecting and describing the spatiotemporal development of such events. However, in-depth investigations of ecological processes during extreme events require additional in situ observations. The question is whether the density of existing ecological in situ networks is sufficient for analysing the impact of extreme events, and what event detection rates can be expected from ecological in situ networks of a given size. To assess these issues, we build a baseline of extreme reductions in the fraction of absorbed photosynthetically active radiation (FAPAR), identified by a new event detection method tailored to identify extremes of regional relevance. We then investigate the event detection success rates of hypothetical networks of varying sizes. Our results show that large extremes can be reliably detected with relatively small networks, but also reveal a linear decay of detection probabilities towards smaller extreme events in log–log space. For instance, networks with ≈ 100 randomly placed sites in Europe yield a ≥ 90 % chance of detecting the eight largest (typically very large) extreme events, but only a ≥ 50 % chance of capturing the 39 largest events. These findings are consistent with probability-theoretic considerations, but the slopes of the decay rates deviate due to temporal autocorrelation and the exact implementation of the extreme event detection algorithm. Using the examples of AmeriFlux and NEON, we then investigate to what degree ecological in situ networks can capture extreme events of a given size. Consistent with our theoretical considerations, we find that today's systematically designed networks (i.e. NEON) reliably detect the largest extremes, but that the extreme event detection rates are not higher than would be achieved by randomly designed networks.
Spatio-temporal expansions of ecological in situ monitoring networks should carefully consider the size distribution characteristics of extreme events if the aim is also to monitor the impacts of such events in the terrestrial biosphere.
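
The probability-theoretic reasoning mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's detection method. It assumes, as a simplification, that sites are placed independently and uniformly at random, and that an event counts as detected when at least one site falls inside its footprint. The footprint is given as a fraction of the study area; the function names and the example fraction are hypothetical.

```python
import math


def detection_probability(area_fraction: float, n_sites: int) -> float:
    """Chance that at least one of n_sites random sites lies inside an event.

    Assumption: sites are independent and uniformly distributed, and the
    event covers `area_fraction` (0..1) of the study area.
    """
    if not 0.0 <= area_fraction <= 1.0:
        raise ValueError("area_fraction must be between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    # Every site misses the event with probability (1 - area_fraction).
    return 1.0 - (1.0 - area_fraction) ** n_sites


def sites_needed(area_fraction: float, target_probability: float) -> int:
    """Smallest network size whose detection chance reaches the target."""
    if not 0.0 < area_fraction < 1.0:
        raise ValueError("area_fraction must be strictly between 0 and 1")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be strictly between 0 and 1")
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - area_fraction)
    )


if __name__ == "__main__":
    # Hypothetical event footprint covering 2.28 % of the study area.
    print(detection_probability(0.0228, 100))
    print(sites_needed(0.0228, 0.9))
```

Under these assumptions, when the footprint is small the detection probability is approximately the number of sites times the area fraction, which is one way to see why detection rates fall off steadily for smaller events on log–log axes.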

    Spatiotemporal anomaly detection: streaming architecture and algorithms

    Includes bibliographical references. 2020 Summer. Anomaly detection is the science of identifying one or more rare or unexplainable samples or events in a dataset or data stream. The field of anomaly detection has been extensively studied by mathematicians, statisticians, economists, engineers, and computer scientists. One open research question remains the design of distributed cloud-based architectures and algorithms that can accurately identify anomalies in previously unseen, unlabeled, streaming, multivariate spatiotemporal data. With streaming data, time is of the essence, and insights are perishable. Real-world streaming spatiotemporal data originate from many sources, including mobile phones, supervisory control and data acquisition (SCADA) enabled devices, the internet-of-things (IoT), distributed sensor networks, and social media. Baseline experiments are performed on four (4) non-streaming, static anomaly detection multivariate datasets using unsupervised offline traditional machine learning (TML), and unsupervised neural network techniques. Multiple architectures, including autoencoders, generative adversarial networks, convolutional networks, and recurrent networks, are adapted for experimentation. Extensive experimentation demonstrates that neural networks produce superior detection accuracy over TML techniques. These same neural network architectures can be extended to process unlabeled spatiotemporal streaming data using online learning. Space and time relationships are further exploited to provide additional insights and increased anomaly detection accuracy. A novel domain-independent architecture and set of algorithms called the Spatiotemporal Anomaly Detection Environment (STADE) is formulated. STADE is based on a federated learning architecture. STADE streaming algorithms are based on geographically unique, persistently executing neural networks using online stochastic gradient descent (SGD).
STADE is designed to be pluggable, meaning that alternative algorithms may be substituted or combined to form an ensemble. STADE incorporates a Stream Anomaly Detector (SAD) and a Federated Anomaly Detector (FAD). The SAD executes at multiple locations on streaming data, while the FAD executes at a single server and identifies global patterns and relationships among the site anomalies. Each STADE site streams anomaly scores to the centralized FAD server for further spatiotemporal dependency analysis and logging. The FAD is based on recent advances in DNN-based federated learning. A STADE testbed is implemented to facilitate globally distributed experimentation using low-cost, commercial cloud infrastructure provided by Microsoft™. STADE testbed sites are situated in the cloud within each continent: Africa, Asia, Australia, Europe, North America, and South America. Communication occurs over the commercial internet. Three STADE case studies are investigated. The first case study processes commercial air traffic flows, the second case study processes global earthquake measurements, and the third case study processes social media (i.e., Twitter™) feeds. These case studies confirm that STADE is a viable architecture for the near real-time identification of anomalies in streaming data originating from (possibly) computationally disadvantaged, geographically dispersed sites. Moreover, the addition of the FAD provides enhanced anomaly detection capability. Since STADE is domain-independent, these findings can be easily extended to additional application domains and use cases.
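
The two-tier layout described here (a detector at each site that streams anomaly scores to one central server) can be sketched as follows. This is not STADE's implementation: the thesis uses neural networks trained with online SGD and a federated-learning-based FAD, whereas this sketch swaps in a running z-score at each site and a simple count of simultaneously flagged sites at the centre. The class names, the threshold of 3.0 and the two-site rule are all hypothetical.

```python
import math
from typing import Dict


class SiteDetector:
    """Stand-in for a per-site stream detector (assumption: running z-score)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def score(self, value: float) -> float:
        """Score `value` against the history seen so far, then learn from it."""
        if self.count < 2:
            result = 0.0  # not enough history to estimate a spread
        else:
            std = math.sqrt(self.m2 / (self.count - 1))
            deviation = abs(value - self.mean)
            if std > 0.0:
                result = deviation / std
            else:
                result = 0.0 if deviation == 0.0 else math.inf
        # Welford's online update of mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return result


class CentralAggregator:
    """Stand-in for the central server that looks across site scores."""

    def __init__(self, threshold: float = 3.0, min_sites: int = 2) -> None:
        self.threshold = threshold
        self.min_sites = min_sites

    def is_global_anomaly(self, site_scores: Dict[str, float]) -> bool:
        """True when at least `min_sites` sites exceed the score threshold."""
        flagged = [s for s, sc in site_scores.items() if sc > self.threshold]
        return len(flagged) >= self.min_sites


if __name__ == "__main__":
    sites = {"europe": SiteDetector(), "asia": SiteDetector()}
    aggregator = CentralAggregator()
    # Both sites see a quiet stream and then a simultaneous spike.
    for reading in [10.0, 11.0, 9.0, 10.0, 11.0, 9.0, 50.0]:
        scores = {name: det.score(reading) for name, det in sites.items()}
        print(reading, aggregator.is_global_anomaly(scores))
```

Only scalar scores travel from the sites to the centre in this sketch, which mirrors the description of sites streaming anomaly scores rather than raw data; any detector exposing a `score` method could be swapped in.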

    Estimating Fire Weather Indices via Semantic Reasoning over Wireless Sensor Network Data Streams

    Wildfires are frequent, devastating events in Australia that regularly cause significant loss of life and widespread property damage. Fire weather indices are a widely adopted method for measuring fire danger and they play a significant role in issuing bushfire warnings and in anticipating demand for bushfire management resources. Existing systems that calculate fire weather indices are limited due to low spatial and temporal resolution. Localized wireless sensor networks, on the other hand, gather continuous sensor data measuring variables such as air temperature, relative humidity, rainfall and wind speed at high resolutions. However, using wireless sensor networks to estimate fire weather indices is a challenge due to data quality issues, lack of standard data formats and lack of agreement on thresholds and methods for calculating fire weather indices. Within the scope of this paper, we propose a standardized approach to calculating Fire Weather Indices (a.k.a. fire danger ratings) and overcome a number of the challenges by applying Semantic Web Technologies to the processing of data streams from a wireless sensor network deployed in the Springbrook region of South East Queensland. This paper describes the underlying ontologies, the semantic reasoning and the Semantic Fire Weather Index (SFWI) system that we have developed to enable domain experts to specify and adapt rules for calculating Fire Weather Indices. We also describe the Web-based mapping interface that we have developed, which enables users to improve their understanding of how fire weather indices vary over time within a particular region. Finally, we discuss our evaluation results that indicate that the proposed system outperforms state-of-the-art techniques in terms of accuracy, precision and query performance. Comment: 20 pages, 12 figures
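
As an illustration of what computing a fire weather index from sensor readings involves, the sketch below uses the McArthur Mark 5 Forest Fire Danger Index in its commonly cited equation form (Noble et al., 1980). The abstract does not say which formula the SFWI system applies, so this choice is an assumption. The rating bounds and labels are illustrative and deliberately configurable, since the abstract notes that thresholds are not agreed upon; all function and variable names are hypothetical.

```python
import bisect
import math
from typing import Sequence

# Illustrative rating scheme only (assumption); pass your own to `rate`.
EXAMPLE_BOUNDS = (12.0, 25.0, 50.0, 75.0, 100.0)
EXAMPLE_LABELS = (
    "low-moderate", "high", "very high", "severe", "extreme", "catastrophic",
)


def ffdi(temp_c: float, rel_humidity_pct: float, wind_kmh: float,
         drought_factor: float) -> float:
    """McArthur Mark 5 FFDI in its commonly cited equation form.

    temp_c: air temperature (deg C); rel_humidity_pct: relative humidity (%);
    wind_kmh: wind speed (km/h); drought_factor: dryness on a 0-10 scale.
    """
    if drought_factor <= 0.0:
        raise ValueError("drought_factor must be positive")
    return 2.0 * math.exp(
        -0.450
        + 0.987 * math.log(drought_factor)
        - 0.0345 * rel_humidity_pct
        + 0.0338 * temp_c
        + 0.0234 * wind_kmh
    )


def rate(index: float,
         bounds: Sequence[float] = EXAMPLE_BOUNDS,
         labels: Sequence[str] = EXAMPLE_LABELS) -> str:
    """Map an index value to a label; `bounds` are ascending lower limits."""
    if len(labels) != len(bounds) + 1:
        raise ValueError("need exactly one more label than bounds")
    return labels[bisect.bisect_right(list(bounds), index)]


if __name__ == "__main__":
    value = ffdi(temp_c=30.0, rel_humidity_pct=20.0, wind_kmh=20.0,
                 drought_factor=10.0)
    print(round(value, 1), rate(value))
```

Keeping the bounds and labels as arguments reflects the paper's aim of letting domain experts adapt the rules rather than hard-coding one rating scheme.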

    A Framework and Classification for Fault Detection Approaches in Wireless Sensor Networks with an Energy Efficiency Perspective

    Wireless Sensor Networks (WSNs) are increasingly considered a key enabling technology for the realisation of the Internet of Things (IoT) vision. With the long-term goal of designing fault-tolerant IoT systems, this paper proposes a fault detection framework for WSNs with the perspective of energy efficiency to facilitate the design of fault detection methods and the evaluation of their energy efficiency. Following the same design principle as the fault detection framework, the paper proposes a classification for fault detection approaches. The classification is applied to a number of fault detection approaches for the comparison of several characteristics, namely, energy efficiency, correlation model, evaluation method, and detection accuracy. The design guidelines given in this paper aim at providing an insight into better design of energy-efficient detection approaches in resource-constrained WSNs.

    Predicting Cetacean Distributions in the Eastern North Atlantic to Support Marine Management

    Acknowledgments: We thank all the volunteers for their contribution and dedication during the monitoring campaigns. This manuscript is a product of the work of every observer who participated in the CETUS Project. We are extremely grateful to TRANSINSULAR, the cargo ship company that provided all the logistic support, and to the ships’ crews for their hospitality. We also thank Vasilis Valavanis for his valuable advice about the use of oceanographic variables. Peer reviewed. Publisher PDF

    Deep learning enhanced principal component analysis for structural health monitoring

    This paper proposes a Deep Learning Enhanced Principal Component Analysis (PCA) approach for outlier detection to assess the structural condition of bridges. We employ a partially explainable autoencoder architecture to replicate and enhance the data compression and reconstruction ability of PCA. The particularity of the method lies in the addition of residual connections to account for nonlinearities. We apply the proposed method to monitoring data obtained from two bridges under real operating conditions and compare the results before and after adding the residual connections. Results show that the addition of residual connections enhances the outlier detection ability of the network, allowing lighter damage to be detected.
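
The compression-and-reconstruction idea behind PCA-based outlier detection can be sketched in a few lines. This shows only a plain linear PCA baseline with a single component, found by power iteration; it does not reproduce the paper's autoencoder or its residual connections. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fit_first_component(rows: Sequence[Sequence[float]],
                        iterations: int = 200
                        ) -> Tuple[List[float], List[float]]:
    """Return (mean, unit vector of largest variance) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Slightly asymmetric start; it must not be orthogonal to the answer.
    v = [1.0 + 0.1 * j for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance at all in the data
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(row: Sequence[float], mean: Sequence[float],
                         component: Sequence[float]) -> float:
    """Distance between a sample and its one-component PCA reconstruction."""
    centered = [x - m for x, m in zip(row, mean)]
    # Compress to a single score, then reconstruct from that score.
    score = sum(c * v for c, v in zip(centered, component))
    residual = [c - score * v for c, v in zip(centered, component)]
    return math.sqrt(sum(r * r for r in residual))


if __name__ == "__main__":
    # Toy "healthy" data lying on the line y = 2x.
    healthy = [(float(x), 2.0 * x) for x in range(10)]
    mean, component = fit_first_component(healthy)
    print(reconstruction_error((5.0, 10.0), mean, component))  # on the line
    print(reconstruction_error((5.0, 0.0), mean, component))   # off the line
```

Samples that the compressed representation cannot reconstruct well receive a large error and are candidate outliers; a purely linear model like this one misses nonlinear structure, which is the gap the paper's residual connections are meant to address.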