122 research outputs found

    Quaternion Information Theoretic Learning Adaptive Algorithms for Nonlinear Adaptive

    Information Theoretic Learning (ITL) is gaining popularity for designing adaptive filters for non-stationary or non-Gaussian environments [1, 2]. ITL cost functions such as the Minimum Error Entropy (MEE) have been applied to both linear and nonlinear adaptive filtering with better overall performance than typical mean squared error (MSE) and least-squares type adaptive filtering, especially for nonlinear systems in higher-order statistic noise environments [3]. Quaternion-valued data processing is beneficial in applications such as robotics and image processing, in particular because it performs data transformations in 3- or 4-dimensional space more conveniently than vector algebra [4, 5, 6, 7, 8]. Adaptive filtering in the quaternion domain operates intrinsically on augmented statistics: the covariance of the quaternion input vector is taken into account naturally, so the filter incorporates the component-wise real-valued cross-correlation, or coupling, within the dimensions of the quaternion input [9]. The generalized Hamilton-real (GHR) calculus for quaternion data simplifies the product and chain rules and allows the gradient and Hessian of quaternion-based cost functions of learning algorithms to be calculated efficiently [10, 11]. The quaternion reproducing kernel Hilbert space and its uniqueness provide a mathematical foundation for developing quaternion-valued kernel learning algorithms [12]. The reproducing property of the feature space replaces the inner product of feature samples with a kernel evaluation. In this dissertation, we first propose a kernel adaptive filter for quaternion data based on the minimum error entropy cost function. The new algorithm is based on the error entropy function and is referred to as the quaternion kernel minimum error entropy (QKMEE) algorithm [13].
We apply the generalized Hamilton-real (GHR) calculus, which is applicable to quaternion Hilbert space, to evaluate the cost function gradient and develop the QKMEE algorithm. The minimum error entropy (MEE) algorithm [3, 14, 15] minimizes Renyi's quadratic entropy of the error between the filter output and the desired response or, indirectly, maximizes the error information potential. The ITL methodology improves the performance of adaptive algorithms in biased or non-Gaussian signal and noise environments compared with mean squared error (MSE) criterion algorithms such as the kernel least mean square algorithm. Second, we develop a kernel adaptive filter for quaternion data based on the normalized minimum error entropy cost function [14]. We apply the GHR calculus, which is applicable to Hilbert space, to evaluate the cost function gradient and develop the quaternion kernel normalized minimum error entropy (QKNMEE) algorithm [16]. The proposed algorithm enhances the QKMEE algorithm in that the selection of the filter update step size is independent of the input power and the kernel size. Third, we develop a kernel adaptive filter for quaternion-domain data based on an information theoretic learning cost function, which could be useful for quaternion-based kernel applications of nonlinear filtering. The new algorithm is based on the error entropy function with a fiducial point and is referred to as the quaternion kernel minimum error entropy with fiducial point (QKMEEF) algorithm [17]. In our previous work we developed a quaternion kernel adaptive filter based on minimum error entropy, referred to as the QKMEE algorithm [13]. Since entropy does not change with the mean of the distribution, the algorithm may converge to a set of optimal weights without having zero-mean error. Traditionally, to make the output error zero-mean, the output during the testing session was biased with the mean of the errors from the training session.
However, for non-symmetric or heavy-tailed error PDFs, estimating the error mean is problematic [18]. The minimum error entropy criterion minimizes Renyi's quadratic entropy of the error between the filter output and the desired response or, indirectly, maximizes the error information potential [19]. Here, the approach is applied to quaternions. Adaptive filtering in the quaternion domain intrinsically incorporates the component-wise real-valued cross-correlation, or coupling, within the dimensions of the quaternion input. We apply the generalized Hamilton-real (GHR) calculus, which is applicable to Hilbert space, to evaluate the cost function gradient and develop the quaternion minimum error entropy algorithm with fiducial point. Simulation results show the behavior of the new algorithm (QKMEEF) when the signal is non-Gaussian in the presence of unimodal versus bimodal noise distributions. Simulation results also show that QKMEEF can track and predict 4-dimensional non-stationary process signals with correlated components better than the quadruple real-valued KMEEF and Quat-KLMS algorithms. Fourth, we develop a kernel adaptive filter for quaternion data using a stochastic information gradient (SIG) cost function based on the information theoretic learning (ITL) approach. The new algorithm (QKSIG) is useful for quaternion-based kernel applications of nonlinear filtering [20]. Adaptive filtering in the quaternion domain intrinsically incorporates the component-wise real-valued cross-correlation, or coupling, within the dimensions of the quaternion input. We apply the GHR calculus, which is applicable to quaternion Hilbert space, to evaluate the cost function gradient. The QKSIG algorithm minimizes Shannon's entropy of the error between the filter output and the desired response and minimizes the divergence between the joint densities of input-desired and input-output pairs.
The SIG technique reduces the computational complexity of error entropy estimation. Here, ITL with the SIG approach is applied to quaternion adaptive filtering for three reasons. First, it reduces the computational complexity of the algorithm compared with our previous quaternion kernel minimum error entropy (QKMEE) algorithm. Second, it improves filtering performance by considering the coupling within the dimensions of the quaternion input. Third, it performs better in biased or non-Gaussian signal and noise environments owing to the ITL approach. We present convergence analysis and steady-state performance analysis results for the new algorithm (QKSIG). Simulation results show the behavior of QKSIG in quaternion non-Gaussian signal and noise environments compared with existing algorithms such as the quadruple real-valued kernel stochastic information gradient (KSIG) and quaternion kernel LMS (QKLMS) algorithms. Fifth, we develop a kernel adaptive filter for quaternion data based on a stochastic information gradient (SIG) cost function with self-adjusting step size. The new algorithm (QKSIG-SAS) is based on the information theoretic learning (ITL) approach and converges faster than our previous QKSIG algorithm.
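The quantities named in this abstract (Renyi's quadratic entropy and the error information potential) can be illustrated with a small sketch. It is not the quaternion algorithm from the dissertation: it assumes real-valued scalar errors, a Gaussian Parzen window, and a hypothetical kernel size `sigma`, and the function names are made up for illustration. It shows the standard estimator V(e) = (1/N^2) sum_i sum_j G(e_i - e_j) and the entropy H2 = -log V(e).

```python
import math


def gaussian_kernel(u, sigma):
    """Gaussian density with standard deviation sigma, evaluated at u."""
    return math.exp(-u * u / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def information_potential(errors, sigma=1.0):
    """Parzen estimate of the information potential of real-valued errors.

    Uses a kernel of width sigma * sqrt(2), which arises when two Gaussian
    Parzen windows of width sigma are convolved.
    """
    n = len(errors)
    width = sigma * math.sqrt(2.0)
    total = sum(gaussian_kernel(ei - ej, width) for ei in errors for ej in errors)
    return total / (n * n)


def renyi_quadratic_entropy(errors, sigma=1.0):
    """Renyi's quadratic entropy; minimizing it maximizes the potential."""
    return -math.log(information_potential(errors, sigma))
```

Because the estimator depends only on pairwise differences of the errors, adding a constant to every error leaves the entropy unchanged. This is the mean-invariance that the abstract cites as the motivation for the fiducial-point variant.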

    Time series forecasting using wavelet and support vector machine

    Master's thesis (Master of Engineering)

    The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting

    The numerous recent breakthroughs in machine learning (ML) make it imperative to carefully ponder how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper is focused on the present and future role of machine learning in space weather. The purpose is twofold. On one hand, we discuss previous works that use ML for space weather forecasting, focusing in particular on the few areas that have seen the most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flare occurrence, of coronal mass ejection propagation time, and of solar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a number of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box. Comment: under review.

    Broad Learning System Based on Maximum Correntropy Criterion

    As an effective and efficient discriminative learning method, the Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is not always a good choice because of its sensitivity to outliers. To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy-based broad learning system (C-BLS). Thanks to the inherent advantages of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environments. In addition, three alternative incremental learning algorithms for C-BLS are developed, derived from a weighted regularized least-squares solution rather than the pseudoinverse formula. With the incremental learning algorithms, the system can be updated quickly, without retraining from the beginning, when new samples arrive or the network is deemed to need expansion. Experiments on various regression and classification datasets are reported to demonstrate the desirable performance of the new methods.
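To make the contrast between the MMSE and maximum correntropy criteria concrete, here is a toy sketch. It is not the C-BLS algorithm: it assumes a single scalar weight `w` in the model y = w * x, a Gaussian kernel of hypothetical width `sigma`, a small ridge term `lam`, and a fixed-point iteration in which each pass solves a kernel-weighted regularized least-squares problem. The data and function names are invented for illustration.

```python
import math


def ls_slope(xs, ys, lam=1e-6):
    """Regularized least-squares slope (MMSE criterion)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mcc_slope(xs, ys, sigma=2.0, lam=1e-6, iters=20):
    """Slope under the maximum correntropy criterion via fixed-point iteration.

    Each pass reweights the samples by a Gaussian kernel of their current
    error, then solves the weighted regularized least-squares problem.
    """
    w = 0.0
    for _ in range(iters):
        k = [math.exp(-((y - w * x) ** 2) / (2.0 * sigma ** 2)) for x, y in zip(xs, ys)]
        num = sum(ki * x * y for ki, x, y in zip(k, xs, ys))
        den = sum(ki * x * x for ki, x in zip(k, xs)) + lam
        w = num / den
    return w


# Toy data: y = 2x, except that the last sample is a gross outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 40.0]

print("least squares:", ls_slope(xs, ys))
print("correntropy:  ", mcc_slope(xs, ys))
```

On this data the least-squares slope is pulled well above 2 by the outlier, while the correntropy estimate stays close to 2 because the kernel weight of the outlying sample is negligible.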

    Applications of fuzzy counterpropagation neural networks to non-linear function approximation and background noise elimination

    An adaptive filter that can operate in an unknown environment by means of a learning mechanism is suitable for the speech enhancement process. This research develops a novel ANN model that incorporates the fuzzy set approach and can perform non-linear function approximation. The model is used as the basic structure of an adaptive filter. The learning capability of the ANN is expected to reduce the development time and cost of designing adaptive filters based on the fuzzy set approach. A combination of both techniques may result in a learnable system that can tackle the vagueness of the changing environment in which the adaptive filter operates. The proposed model is called the Fuzzy Counterpropagation Network (Fuzzy CPN). It has a fast learning capability and a self-growing structure. The model is applied to non-linear function approximation, chaotic time series prediction and background noise elimination.

    proceedings of a workshop held at Göttingen September 27 - 29, 2006

    An international workshop entitled Modern Solar Facilities - Advanced Solar Science was held in Göttingen from September 27 until September 29, 2006. The workshop, which was attended by 88 participants from 24 different countries, gave a broad overview of the current state of solar research, with emphasis on modern telescopes and techniques, advanced observational methods and results, and modern theoretical methods of modelling, computation, and data reduction in solar physics. This book collects written versions of contributions that were presented at the workshop as invited or contributed talks and as poster contributions.

    Interdisciplinary application of nonlinear time series methods

    This paper reports on the application to field measurements of time series methods developed on the basis of the theory of deterministic chaos. It points out the major difficulties that arise when the data cannot be assumed to be purely deterministic and discusses the potential that remains in this situation. For signals with weakly nonlinear structure, the presence of nonlinearity in a general sense has to be inferred statistically. The paper reviews the relevant methods and discusses the implications for deterministic modeling. Most field measurements yield nonstationary time series, which poses a severe problem for their analysis. Recent progress in the detection and understanding of nonstationarity is reported. If a clear signature of approximate determinism is found, the notions of phase space, attractors, invariant manifolds, etc. provide a convenient framework for time series analysis. Although the results have to be interpreted with great care, superior performance can be achieved for typical signal processing tasks. In particular, prediction and filtering of signals are discussed, as well as the classification of system states by means of time series recordings. Comment: 86 pages, 26 figures.

    Statistical Inference for the Duffing Process

    This research concerns inference methods for non-linear dynamical systems. In particular, the focus is on a differential equation called the Duffing oscillator. This equation is suitable for modelling non-linear phenomena such as jumps, hysteresis, or subharmonics, and it may lead to chaotic behaviour as control parameters vary. Such behaviour has been observed in many different real-world scenarios, for example in economics and biology. Inference in the Duffing process is performed with the unscented Kalman filter (UKF) by casting the system in state space form. In the context of ordinary differential equations, the uncertainty of the UKF estimates for chaotic systems is quantified by a simulation study. To overcome the limitations of the UKF when applied to the Duffing process, a new algorithm that combines Bayesian optimization (BO) and approximate Bayesian computation (ABC) within the UKF scheme is proposed. The novelty consists in (i) optimizing the location of the sigma points by maximizing the likelihood of the observations with BO, and (ii) initializing the UKF with candidate parameters coming from the ABC scheme. The proposed algorithm can outperform the UKF in complex systems where the likelihood function is highly multi-modal. Concerning stochastic differential equations, a massive simulation study is presented to evaluate the performance of the UKF for parameter estimation. Finally, illustrations of the method with real data and further developments of the research are discussed.
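The abstract does not write out the model, so the sketch below assumes the commonly used form of the Duffing oscillator, x'' + delta x' + alpha x + beta x^3 = gamma cos(omega t), rewritten as a first-order state space system in (x, v). The parameter values, function names, and the fixed-step fourth-order Runge-Kutta integrator are illustrative choices only. It covers the deterministic equation, not the UKF, BO, or ABC steps of the thesis.

```python
import math


def duffing_rhs(t, x, v, delta, alpha, beta, gamma, omega):
    """State space form: returns (dx/dt, dv/dt) for the assumed Duffing equation."""
    return v, -delta * v - alpha * x - beta * x ** 3 + gamma * math.cos(omega * t)


def simulate_duffing(x0, v0, dt, steps, delta=0.2, alpha=1.0, beta=1.0,
                     gamma=0.0, omega=1.0):
    """Integrate with fixed-step RK4 and return a list of (t, x, v) tuples."""
    p = (delta, alpha, beta, gamma, omega)
    t, x, v = 0.0, x0, v0
    traj = [(t, x, v)]
    for _ in range(steps):
        k1x, k1v = duffing_rhs(t, x, v, *p)
        k2x, k2v = duffing_rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v, *p)
        k3x, k3v = duffing_rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v, *p)
        k4x, k4v = duffing_rhs(t + dt, x + dt * k3x, v + dt * k3v, *p)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        traj.append((t, x, v))
    return traj


def energy(x, v, alpha=1.0, beta=1.0):
    """Mechanical energy of the unforced oscillator."""
    return 0.5 * v * v + 0.5 * alpha * x * x + 0.25 * beta * x ** 4
```

With the forcing switched off (`gamma=0.0`) and positive damping, the energy along the trajectory should decay, which gives a quick sanity check of the integrator before a filter is layered on top.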

    Deep learning architectures applied to wind time series multi-step forecasting

    Forecasting is a critical task for the integration of wind-generated energy into electricity grids. Numerical weather models applied to wind prediction work with grid sizes too large to reproduce all the local features that influence wind, which makes time series of past observations a necessary tool for wind forecasting. This research work applies deep neural networks to multi-step forecasting, using multivariate time series as input, to forecast wind speed 12 hours ahead. Wind time series are sequences of meteorological observations such as wind speed, temperature, pressure, humidity, and direction. Wind series have two statistically relevant properties, non-linearity and non-stationarity, which make modelling with traditional statistical tools very inaccurate. In this thesis we design, test and validate novel deep learning models for the wind energy prediction task, applying new deep architectures to the largest open wind data repository available, from the National Renewable Energy Laboratory of the US (NREL), with 126,692 wind sites evenly distributed across US geography. The heterogeneity of the series, obtained from several data origins, allows us to draw conclusions about how well each model fits time series that range from highly stationary locations to variable sites in complex areas. We propose multi-layer, convolutional and recurrent networks as basic building blocks, which are then combined into heterogeneous architectures with different variants, trained with optimisation strategies such as drop and skip connections, early stopping, adaptive learning rates, and filters and kernels of different sizes, among others. The architectures are optimised with structured hyper-parameter setting strategies to obtain the best performing model across the whole dataset.
Applying the architectures to the various sites reveals relationships between site characteristics (terrain complexity, wind variability, geographical location) and model accuracy, establishing novel measures of site predictability that relate the fit of the models to indexes from spectral or stationarity analysis of the time series. The designed methods offer new and superior alternatives to traditional methods. Postprint (published version).
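The multi-step, multivariate setup described in this abstract can be sketched as a sliding-window dataset builder. This is only an illustration of how such training pairs are commonly formed, not the thesis pipeline: it assumes evenly sampled observations (for example hourly, so that 12 steps correspond to 12 hours), a hypothetical input window length `lag`, and that the forecast target is the variable at position `target_index` of each observation.

```python
def make_windows(series, lag, horizon, target_index=0):
    """Build (inputs, targets) pairs for multi-step forecasting.

    series: list of observation tuples, e.g. (wind_speed, temperature, ...).
    Each input holds `lag` consecutive multivariate observations; each target
    holds the next `horizon` values of the variable at `target_index`.
    """
    inputs, targets = [], []
    for start in range(len(series) - lag - horizon + 1):
        past = series[start:start + lag]
        future = series[start + lag:start + lag + horizon]
        inputs.append([list(obs) for obs in past])
        targets.append([obs[target_index] for obs in future])
    return inputs, targets


# Synthetic two-variable series, used only to show the shapes involved.
series = [(float(t), 10.0 + t) for t in range(30)]
X, Y = make_windows(series, lag=6, horizon=12)
print(len(X), len(X[0]), len(X[0][0]), len(Y[0]))
```

Any of the network types named in the abstract (multi-layer, convolutional, recurrent) could consume windows of this shape; the choice of `lag` is a tuning decision that this sketch does not address.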