6 research outputs found

    Data Mining and Machine Learning for Environmental Systems Modelling and Analysis

    This thesis investigates environmental systems modelling and analysis based on system identification techniques. In particular, it focuses on adapting and developing a new Nonlinear AutoRegressive with eXogenous inputs (NARX) framework and applying it to several environmental case studies. Such a framework has proved very convenient for modelling systems with nonlinear dynamics because it builds a model using the Orthogonal Forward Regression (OFR) algorithm, which recursively selects model regressors from a pool of candidate terms. This selection is performed by means of a dependency metric, which measures the contribution of a candidate term to explaining a signal of interest. For the first time, this thesis introduces a package in the R programming language for the construction of NARX models. The package includes a set of features for performing system identification effectively, including model selection, parameter estimation, model validation, model visualisation and model evaluation, and it is used extensively throughout the thesis. This thesis presents two new components for the original OFR algorithm. The first extends the deterministic notion of the NARX methodology by introducing the distance correlation metric, which can provide interpretability of nonlinear dependencies, together with the bagging method, which can provide an uncertainty analysis. This implementation produces a bootstrap distribution not only for the parameter estimates but also for the forecasts. Its biggest advantage is that it does not require the specification of prior distributions, as is usually done in Bayesian analysis. The NARX methodology has been employed with systems where both inputs and outputs are continuous variables. Nevertheless, in real-life problems, variables can also appear in categorical form. Of special interest are systems where the output signal is binary. The second new component of the OFR algorithm deals with this type of variable by finding relationships with regressors that are continuous lagged input variables. This improvement helps to identify model terms that play a key role in a classification process. Furthermore, this thesis discusses two environmental case studies: the first on the analysis of the Atlantic Meridional Overturning Circulation (AMOC) anomaly, and the second on the study of global magnetic disturbances in near-Earth space. Although the AMOC anomaly has been studied in the past, this thesis analyses it using NARX models for the first time. The task is challenging because the available sample size is small. This requires some preprocessing steps in order to obtain a feasible model that can forecast future AMOC values and hindcast back to January 1980. In the second case study, magnetic disturbances in near-Earth space are studied by means of the Kp index. This index goes from 0 (very quiet) to 9 (very disturbed) in 28 levels. There is special interest in forecasting high magnetic disturbances given their impact on terrestrial technology and astronauts' safety, but these events are rare and therefore difficult to predict. Two approaches are analysed using the NARX methodology in order to assess the best modelling strategy. Although this phenomenon has been studied with other techniques that provide very promising results, the NARX models are able to provide an insightful relationship between the Kp index and solar wind parameters, which can be useful in other geomagnetic analyses.
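
The forward selection step described in the abstract above can be illustrated with a minimal sketch. It is not the thesis's R package and makes several assumptions: the function names `dot` and `ofr_select` are hypothetical, the toy system in the demo is invented for illustration, and the classic error reduction ratio (ERR) stands in for the dependency metric, whereas the thesis proposes distance correlation.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def ofr_select(candidates, y, n_terms):
    """Greedy forward selection with Gram-Schmidt orthogonalisation.

    candidates: dict mapping a term name to its column of values.
    Returns a list of (term name, ERR) in order of selection.
    ERR is an assumed stand-in for the dependency metric.
    """
    yy = dot(y, y)
    basis = []                      # orthogonalised columns of the selected terms
    selected = []
    remaining = dict(candidates)
    for _ in range(n_terms):
        best = None
        for name, column in remaining.items():
            w = list(column)
            for q in basis:         # remove what the selected terms already explain
                c = dot(q, w) / dot(q, q)
                w = [wi - c * qi for wi, qi in zip(w, q)]
            ww = dot(w, w)
            if ww < 1e-12:          # candidate is (numerically) redundant
                continue
            err = dot(w, y) ** 2 / (ww * yy)
            if best is None or err > best[1]:
                best = (name, err, w)
        if best is None:
            break
        name, err, w = best
        selected.append((name, err))
        basis.append(w)
        del remaining[name]
    return selected


if __name__ == "__main__":
    # Hypothetical toy system: y(k) = 0.5*y(k-1) + u(k-1)
    u = [math.sin(0.7 * k) + math.cos(1.9 * k) for k in range(60)]
    y = [0.0]
    for k in range(1, 60):
        y.append(0.5 * y[k - 1] + u[k - 1])
    ks = range(2, 60)
    pool = {
        "y(k-1)": [y[k - 1] for k in ks],
        "u(k-1)": [u[k - 1] for k in ks],
        "u(k-2)": [u[k - 2] for k in ks],
        "u(k-1)^2": [u[k - 1] ** 2 for k in ks],
    }
    target = [y[k] for k in ks]
    for name, err in ofr_select(pool, target, n_terms=2):
        print(f"{name}: ERR = {err:.3f}")
```

Each ERR value is the fraction of the output's energy explained by the newly selected term after the contribution of earlier terms has been removed, so the values can be summed to decide when to stop adding terms.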

    A novel logistic-NARX model as a classifier for dynamic binary classification

    System identification and data-driven modeling techniques have seen ubiquitous applications in the past decades. In particular, parametric modeling methodologies such as linear and nonlinear autoregressive with exogenous input models (ARX and NARX) and related model types have been preferred for diverse data-driven modeling problems because of their easy-to-compute linear-in-the-parameters structure, which allows the resulting models to be easily interpreted. In recent years, several variations of the NARX methodology have been proposed that improve the performance of the original algorithm. Nevertheless, in most cases, NARX models are applied to regression problems where all output variables involve continuous or discrete-time sequences sampled from a continuous process, and little attention has been paid to classification problems where the output signal is a binary sequence. Therefore, we developed a novel classification algorithm that combines the NARX methodology with logistic regression, referred to as the logistic-NARX model. Such a combination is advantageous since the NARX methodology helps to deal with the multicollinearity problem while the logistic regression produces a model that predicts categorical outcomes. Furthermore, the NARX approach allows for the inclusion of lagged terms and interactions between them in a straightforward manner, resulting in interpretable models where users can identify which input variables play an important role individually and/or interactively in the classification process, something that is not achievable using other classification techniques like random forests, support vector machines, and k-nearest neighbors. The efficiency of the proposed method is tested with five case studies.
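
As a rough illustration of the "lagged terms and interactions between them" mentioned in the abstract above, the sketch below builds a polynomial pool of candidate regressors from lagged input signals. The function name `candidate_pool` and the naming scheme for terms are assumptions made for this example; the logistic regression fit and the term selection of the published method are not reproduced here.

```python
from itertools import combinations_with_replacement


def candidate_pool(inputs, max_lag, degree=2):
    """Build candidate terms from lagged inputs and their products.

    inputs: dict mapping a signal name to its list of samples.
    Returns a dict mapping a term name to its column, with one row
    for each time index k = max_lag, ..., N-1.
    """
    n = len(next(iter(inputs.values())))
    base = {}
    for name, series in inputs.items():
        for lag in range(1, max_lag + 1):
            base[f"{name}(k-{lag})"] = [series[k - lag] for k in range(max_lag, n)]

    pool = {}
    names = sorted(base)
    for d in range(1, degree + 1):
        # Every product of d lagged terms, including repeated factors (squares).
        for combo in combinations_with_replacement(names, d):
            column = [1.0] * (n - max_lag)
            for term in combo:
                column = [c * v for c, v in zip(column, base[term])]
            pool["*".join(combo)] = column
    return pool


if __name__ == "__main__":
    pool = candidate_pool({"u": [1, 2, 3, 4]}, max_lag=2, degree=2)
    for name, column in pool.items():
        print(name, column)
```

In a classifier of this kind, each column in the pool would be a candidate for selection, and the selected terms would then enter a logistic regression on the binary output; because every term is named after the lags it contains, the selected model stays readable.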

    Combined dark matter searches towards dwarf spheroidal galaxies with Fermi-LAT, HAWC, H.E.S.S., MAGIC, and VERITAS

    Cosmological and astrophysical observations suggest that 85% of the total matter of the Universe is made of Dark Matter (DM). However, its nature remains one of the most challenging and fundamental open questions of particle physics. Assuming particle DM, this exotic form of matter cannot consist of Standard Model (SM) particles. Many models have been developed in an attempt to unravel the nature of DM, such as Weakly Interacting Massive Particles (WIMPs), the most favored particle candidates. WIMP annihilations and decay could produce SM particles, which in turn hadronize and decay to give SM secondaries such as high-energy γ rays. In the framework of indirect DM search, observations of promising targets are used to search for signatures of DM annihilation. Among these, the dwarf spheroidal galaxies (dSphs) are commonly favored owing to their expected high DM content and negligible astrophysical background. In this work, we present the very first combination of 20 dSph observations, performed by the Fermi-LAT, HAWC, H.E.S.S., MAGIC, and VERITAS collaborations, in order to maximize the sensitivity of DM searches and improve the current results. We use a joint maximum likelihood approach combining each experiment's individual analysis to derive more constraining upper limits on the WIMP DM self-annihilation cross-section as a function of DM particle mass. We present new DM constraints over the widest mass range ever reported, extending from 5 GeV to 100 TeV thanks to the combination of these five different γ-ray instruments.

    Multimessenger observations of a flaring blazar coincident with high-energy neutrino IceCube-170922A
