
    A First Application of Independent Component Analysis to Extracting Structure from Stock Returns

    This paper discusses the application of a modern signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). This can be viewed as a factorization of the portfolio since joint probabilities become simple products in the coordinate system of the ICs. We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories: (i) infrequent but large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. Independent component analysis is a potentially powerful method of analyzing and understanding driving mechanisms in financial markets. There are further promising applications to risk management since ICA focuses on higher-order statistics. (Information Systems Working Papers Series)
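
    As a rough illustration of the factorization described above, the sketch below runs FastICA (via the fastICA R package) on a simulated return panel, keeps only the large shocks in each estimated IC, and cumulates the back-projected returns into a reconstructed price path. The panel size, the simulated data, and the 2-standard-deviation cutoff are assumptions made for illustration; the paper itself uses three years of daily returns on 28 Japanese stocks.

        # Hedged sketch (not the paper's code): ICA factorization of simulated
        # returns, then price reconstruction from thresholded, weighted ICs.
        library(fastICA)
        set.seed(42)
        n <- 750; p <- 8                                 # ~3 trading years, small panel
        shocks <- matrix(rnorm(n * p) * (runif(n * p) < 0.02) * 0.05, n, p)  # rare large shocks
        noise  <- matrix(rnorm(n * p, sd = 0.003), n, p)                     # frequent small noise
        R <- (shocks + noise) %*% matrix(runif(p * p, -1, 1), p, p)          # observed returns
        ica <- fastICA(R, n.comp = p)                    # S = estimated ICs, A = mixing matrix
        S <- ica$S; A <- ica$A                           # ica$X ~= S %*% A (centered returns)

        # Keep only the large shocks in each IC (cutoff assumed at 2 sd), map them
        # back through the mixing matrix, and cumulate into a price path.
        cutoff <- 2 * apply(S, 2, sd)
        S_thr  <- S * (abs(S) > matrix(cutoff, n, p, byrow = TRUE))
        R_hat  <- S_thr %*% A
        plot(cumsum(ica$X[, 1]), type = "l")             # (centered) price path of stock 1
        lines(cumsum(R_hat[, 1]), col = "red")           # reconstruction from the big shocks only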

    Different Estimation Methods for the Basic Independent Component Analysis Model

    Inspired by the classic cocktail-party problem, the basic Independent Component Analysis (ICA) model was created. What distinguishes Independent Component Analysis (ICA) from other kinds of analysis is the intrinsic non-Gaussianity assumption on the data. Several estimation approaches have been proposed based on maximizing the non-Gaussianity of the data, measured for example by kurtosis or mutual information. Each estimation requires optimizing expectations of non-quadratic functions, since these give access to the higher-order statistics of the non-Gaussian part of the data. In this thesis, our goal is to review one of the most efficient estimation methods, the Fast Fixed-Point Independent Component Analysis (FastICA) algorithm, and to illustrate it with examples using an R package.
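
    To make the fixed-point idea concrete, here is a minimal sketch of the one-unit FastICA update with the logcosh contrast, whose derivative is g(u) = tanh(u). It assumes the input matrix has already been centered and whitened; it illustrates the update rule only and is not a replacement for the fastICA R package the thesis reviews.

        # One-unit FastICA fixed-point iteration: w <- E[x g(w'x)] - E[g'(w'x)] w,
        # followed by normalization. X: n x p whitened data (rows = observations).
        fastica_one_unit <- function(X, max_iter = 200, tol = 1e-6) {
          w <- rnorm(ncol(X)); w <- w / sqrt(sum(w^2))
          for (i in seq_len(max_iter)) {
            wx    <- as.vector(X %*% w)                 # projections w'x
            g     <- tanh(wx)                           # g(w'x), logcosh contrast
            gp    <- 1 - tanh(wx)^2                     # g'(w'x)
            w_new <- colMeans(X * g) - mean(gp) * w     # fixed-point update
            w_new <- w_new / sqrt(sum(w_new^2))
            converged <- abs(abs(sum(w_new * w)) - 1) < tol
            w <- w_new
            if (converged) break
          }
          w                                             # one estimated unmixing direction
        }
        # Further components would be obtained by deflation (orthogonalizing against
        # previously found directions) or by a symmetric decorrelation step.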

    On Independent Component Analysis and Supervised Dimension Reduction for Time Series

    The main goal of this thesis work has been to develop tools to recover hidden structures, latent variables, or latent subspaces for multivariate and dependent time series data. The secondary goal has been to write computationally efficient algorithms for the methods into an R package. In Blind Source Separation (BSS) the goal is to find uncorrelated latent sources by transforming the observed data in an appropriate way. In Independent Component Analysis (ICA) the latent sources are assumed to be independent. The well-known ICA methods FOBI and JADE are generalized to work with multivariate time series where the latent components exhibit stochastic volatility. In such time series the volatility cannot be regarded as constant in time, as there are often periods of high and periods of low volatility. The new methods are called gFOBI and gJADE. Also SOBI, a classic method that works well when the volatility can be assumed constant, is given a variant called vSOBI, which also works for time series with stochastic volatility. In dimension reduction the idea is to transform the data into a new coordinate system, where the components are uncorrelated or even independent, and then keep only some of the transformed variables in such a way that not too much of the important information in the data is lost. The aforementioned BSS methods can be used in unsupervised dimension reduction; all the variables or time series have the same role. In supervised dimension reduction the relationship between a response and predictor variables needs to be considered as well. Well-known supervised dimension reduction methods for independent and identically distributed data, SIR and SAVE, are generalized to work for time series data. The methods TSIR and TSAVE are introduced and shown to work well for time series, as they also use the information on the past values of the predictor time series. Also TSSH, a hybrid version of TSIR and TSAVE, is introduced. All the methods developed in this thesis have also been implemented in the R package tsBSS.
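
    For orientation, the sketch below implements classical FOBI, the i.i.d. baseline that gFOBI generalizes to series with stochastic volatility: whiten the data, then rotate with the eigenvectors of the fourth-moment matrix E[||z||^2 z z']. It is a toy base-R illustration of the underlying idea, not code from the tsBSS package mentioned above.

        # Classical FOBI: whitening followed by an eigendecomposition of the
        # fourth-moment matrix of the whitened data.
        fobi <- function(X) {
          X <- scale(X, center = TRUE, scale = FALSE)
          E <- eigen(cov(X), symmetric = TRUE)
          W_white <- E$vectors %*% diag(1 / sqrt(E$values), length(E$values)) %*% t(E$vectors)
          Z <- X %*% W_white                               # whitened data
          M <- crossprod(Z * rowSums(Z^2), Z) / nrow(Z)    # E[||z||^2 z z']
          V <- eigen(M, symmetric = TRUE)$vectors
          list(S = Z %*% V, W = W_white %*% V)             # estimated sources and unmixing matrix
        }

        # Toy check: mix two independent non-Gaussian series and recover them
        set.seed(1)
        S0  <- cbind(rt(1000, df = 5), runif(1000) - 0.5)
        X   <- S0 %*% matrix(c(2, 1, 1, 3), 2, 2)
        est <- fobi(X)                                     # est$S recovers S0 up to sign and order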

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher-order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. (Comment: 232 pages)
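
    To make the tensor-train (TT) format concrete, the sketch below decomposes a 3-way array into three TT cores with two sequential truncated SVDs, the standard TT-SVD recipe. It is a toy base-R illustration under an assumed tolerance-based rank truncation, not code from the monograph.

        # TT-SVD for a 3-way array A (n1 x n2 x n3): A[i,j,k] ~= sum over a, b of
        # G1[i, a] * G2[a, j, b] * G3[b, k].
        tt_svd_3way <- function(A, tol = 1e-10) {
          d  <- dim(A)
          s1 <- svd(matrix(A, nrow = d[1]))                # mode-1 unfolding: n1 x (n2*n3)
          r1 <- max(1, sum(s1$d > tol))                    # first TT rank
          G1 <- s1$u[, 1:r1, drop = FALSE]                 # core 1: n1 x r1
          rest <- diag(s1$d[1:r1], r1) %*% t(s1$v[, 1:r1, drop = FALSE])
          s2 <- svd(matrix(rest, nrow = r1 * d[2]))        # reshape to (r1*n2) x n3
          r2 <- max(1, sum(s2$d > tol))                    # second TT rank
          G2 <- array(s2$u[, 1:r2, drop = FALSE], c(r1, d[2], r2))      # core 2: r1 x n2 x r2
          G3 <- diag(s2$d[1:r2], r2) %*% t(s2$v[, 1:r2, drop = FALSE])  # core 3: r2 x n3
          list(G1 = G1, G2 = G2, G3 = G3)
        }

        # A rank-1 test tensor is compressed to TT ranks (1, 1)
        A  <- outer(outer(1:4, 1:3), 1:2)
        tt <- tt_svd_3way(A)
        lapply(tt, dim)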

    Nonlinear independent component analysis for discrete-time and continuous-time signals

    We study the classical problem of recovering a multidimensional source signal from observations of nonlinear mixtures of this signal. We show that this recovery is possible (up to a permutation and monotone scaling of the source's original component signals) if the mixture is due to a sufficiently differentiable and invertible but otherwise arbitrarily nonlinear function and the component signals of the source are statistically independent with 'non-degenerate' second-order statistics. The latter assumption requires the source signal to meet one of three regularity conditions which essentially ensure that the source is sufficiently far away from the non-recoverable extremes of being deterministic or constant in time. These assumptions, which cover many popular time series models and stochastic processes, allow us to reformulate the initial problem of nonlinear blind source separation as a simple-to-state problem of optimisation-based function approximation. We propose to solve this approximation problem by minimizing a novel type of objective function that efficiently quantifies the mutual statistical dependence between multiple stochastic processes via cumulant-like statistics. This yields a scalable and direct new method for nonlinear Independent Component Analysis with widely applicable theoretical guarantees, for which our experiments indicate good performance.
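
    The paper's objective function is its own contribution, but its flavour can be sketched with a much simpler cumulant-style score of dependence between two candidate recovered components: sum the squared lagged cross-covariances of the series and of their centred squares, quantities that vanish in expectation when the processes are independent. The sketch below is a simplified stand-in for illustration only, not the paper's actual objective.

        # Toy cumulant-style dependence score between two series (smaller ~ "more independent").
        dep_score <- function(x, y, lags = 0:3) {
          x <- as.numeric(scale(x)); y <- as.numeric(scale(y))
          total <- 0
          for (k in lags) {
            n <- length(x) - k
            total <- total +
              mean(x[1:n] * y[(1 + k):(n + k)])^2 +                    # lagged cross-covariance
              mean((x[1:n]^2 - 1) * (y[(1 + k):(n + k)]^2 - 1))^2      # covariance of squares
          }
          total
        }

        # Independent white noise scores near zero; a shared volatility factor does not.
        set.seed(3)
        v <- abs(rnorm(500)) + 0.5
        dep_score(rnorm(500), rnorm(500))            # close to 0
        dep_score(rnorm(500) * v, rnorm(500) * v)    # noticeably larger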

    Hyperspectral Image Classification With Independent Component Discriminant Analysis
