
    Stochastic trapping in a solvable model of on-line independent component analysis

    Previous analytical studies of on-line Independent Component Analysis (ICA) learning rules have focused on asymptotic stability and efficiency. In practice, the transient stages of learning will often be more significant in determining the success of an algorithm. This is demonstrated here with an analysis of a Hebbian ICA algorithm which can find a small number of non-Gaussian components given data composed of a linear mixture of independent source signals. An idealised data model is considered in which the sources comprise a number of non-Gaussian and Gaussian sources, and a solution to the dynamics is obtained in the limit where the number of Gaussian sources is infinite. Previous stability results are confirmed by expanding around optimal fixed points, where a closed-form solution to the learning dynamics is obtained. However, stochastic effects are shown to stabilise otherwise unstable sub-optimal fixed points. Conditions required to destabilise one such fixed point are obtained for the case of a single non-Gaussian component, indicating that the initial learning rate \eta required to escape successfully is very low (\eta = O(N^{-2}), where N is the data dimension), resulting in very slow learning that typically requires O(N^3) iterations. Simulations confirm that this picture holds for a finite system.
    Comment: 17 pages, 3 figures. To appear in Neural Computation.
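    The escape problem this abstract describes is straightforward to simulate. Below is a minimal sketch of a one-unit online Hebbian ICA rule of the kind analysed, under illustrative assumptions: a cubic nonlinearity, a random unit-norm mixing matrix, and one Laplacian (non-Gaussian) source among N-1 Gaussian ones. It uses the abstract's learning-rate scaling \eta = O(N^{-2}) and runs for O(N^3) iterations; it is not necessarily the paper's exact update rule or data model.

```python
import numpy as np

# Sketch: one-unit online Hebbian ICA with a single non-Gaussian source.
# Illustrative choices throughout; not the paper's exact rule.

rng = np.random.default_rng(0)
N = 50                     # data dimension
T = N**3                   # O(N^3) iterations, matching the abstract's scaling
eta = 1.0 / N**2           # learning rate eta = O(N^-2)

A = rng.standard_normal((N, N))
A /= np.linalg.norm(A, axis=0)            # unit-norm mixing columns

w = rng.standard_normal(N)
w /= np.linalg.norm(w)

def g(y):
    return y**3                           # illustrative Hebbian nonlinearity

for _ in range(T):
    s = rng.standard_normal(N)            # Gaussian sources
    s[0] = rng.laplace(scale=1 / np.sqrt(2))  # the one non-Gaussian source
    x = A @ s                             # observed linear mixture
    y = w @ x                             # projection onto current weight
    w += eta * x * g(y)                   # Hebbian update
    w /= np.linalg.norm(w)                # constrain w to the unit sphere

# Overlap with the non-Gaussian direction: escaping the sub-optimal fixed
# point corresponds to this overlap growing away from ~0 towards 1.
print(abs(w @ A[:, 0]))
```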

    Further study of independent component analysis in foreign exchange rate markets

    by Zhi-Bin Lai. Thesis submitted in December 1998. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 111-116).
    Chapter 1 --- Introduction --- p.1
    1.1 --- ICA Model --- p.1
    1.2 --- ICA Algorithms --- p.3
    1.3 --- Foreign Exchange Rate Scheme --- p.9
    1.4 --- Problem Motivation --- p.10
    1.5 --- Main Contribution of the Thesis --- p.10
    1.6 --- Other Contributions of the Thesis --- p.11
    1.7 --- Organization of the Thesis --- p.11
    Chapter 2 --- Heuristic Dominant ICs Sorting --- p.13
    2.1 --- L1 Norm Sorting --- p.13
    2.2 --- Lp Norm (L3 Norm) Sorting --- p.14
    2.3 --- Problem Motivation --- p.15
    2.4 --- Determination of Dominant ICs --- p.15
    2.5 --- ICA in Foreign Exchange Rate Markets --- p.16
    2.6 --- Comparison of Two Heuristic Methods --- p.16
    2.6.1 --- Experiment 1: US Dollar vs Swiss Franc --- p.18
    2.6.2 --- Experiment 2: US Dollar vs Australian Dollar --- p.21
    2.6.3 --- Experiment 3: US Dollar vs Canadian Dollar --- p.24
    2.6.4 --- Experiment 4: US Dollar vs French Franc --- p.27
    Chapter 3 --- Forward Selection under MSE Measurement --- p.30
    3.1 --- Order-Sorting Criterion --- p.30
    3.2 --- Order-Sorting Approaches --- p.30
    3.3 --- Forward Selection Approach --- p.31
    3.4 --- Comparison of Three Dominant ICs Sorting Methods --- p.32
    3.4.1 --- Experiment 1: US Dollar vs Swiss Franc --- p.33
    3.4.2 --- Experiment 2: US Dollar vs Australian Dollar --- p.37
    3.4.3 --- Experiment 3: US Dollar vs Canadian Dollar --- p.41
    3.4.4 --- Experiment 4: US Dollar vs French Franc --- p.45
    Chapter 4 --- Backward Elimination Tendency Error --- p.49
    4.1 --- Tendency Error Scheme --- p.49
    4.2 --- Order-Sorting Criterion --- p.50
    4.3 --- Order-Sorting Approaches --- p.50
    4.4 --- Backward Elimination Tendency Error Approach --- p.51
    4.5 --- Determination of Dominant ICs --- p.52
    4.6 --- Comparison Between Three Approaches --- p.53
    4.6.1 --- Experiment Results on USD-SWF Return --- p.53
    4.6.2 --- Experiment Results on USD-AUD Return --- p.57
    4.6.3 --- Experiment Results on USD-CAD Return --- p.61
    4.6.4 --- Experiment Results on USD-FRN Return --- p.65
    Chapter 5 --- Other Analysis of ICA in Foreign Exchange Rate Markets --- p.69
    5.1 --- Variance Characteristics of ICs and PCs --- p.69
    5.2 --- Reconstruction Ability between PCA and ICA --- p.70
    5.3 --- Properties of Independent Components --- p.70
    5.4 --- Autocorrelation --- p.73
    5.5 --- Rescaled Analysis --- p.73
    Chapter 6 --- Conclusion and Further Work --- p.78
    6.1 --- Conclusion --- p.78
    6.2 --- Further Work --- p.79
    Appendix A --- Fast Implementation of the LPM Algorithm --- p.80
    A.1 --- Review of Selecting Subsets from Regression Variables --- p.80
    A.2 --- Unconstrained Gradient-Based Optimization Methods Survey --- p.85
    A.3 --- Characteristics of the Original LPM Algorithm --- p.88
    A.4 --- Constrained Learning Rate Adaptation Method --- p.89
    A.5 --- Gradient Descent with Momentum Method --- p.9
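    Chapter 2's heuristic sorting ranks recovered independent components by the L1 or L3 norm of their time courses and keeps the largest as dominant ICs. A rough sketch of that ranking step follows; the function name and toy data are illustrative, not the thesis's actual pipeline.

```python
import numpy as np

# Hypothetical illustration of L1 / Lp norm sorting of independent
# components: rank the recovered ICs by the Lp norm of their time
# courses and keep the top-k as "dominant". Y holds one IC per row.

def dominant_ics(Y, p=1, k=2):
    """Indices of the k components with the largest Lp norm."""
    norms = np.sum(np.abs(Y) ** p, axis=1) ** (1.0 / p)
    return np.argsort(norms)[::-1][:k]

# Toy usage: 5 ICs of 1000 samples with very different scales.
rng = np.random.default_rng(1)
scales = np.array([[3.0], [0.5], [1.0], [2.0], [0.1]])
Y = scales * rng.laplace(size=(5, 1000))
print(dominant_ics(Y, p=1))   # L1 norm sorting
print(dominant_ics(Y, p=3))   # Lp (L3) norm sorting
```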

    Slowness learning for curiosity-driven agents

    In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? I study methods that achieve this by making robots self-motivated (curious) to continually build compact representations of sensory inputs that encode different aspects of the changing environment. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world-model improvements, and used reinforcement learning (RL) to learn how to obtain these intrinsic rewards. Unlike previous implementations, I consider streams of high-dimensional visual inputs, where the world model is a set of compact low-dimensional representations of the high-dimensional inputs. To learn these representations, I use the slowness learning principle, which states that the underlying causes of the changing sensory inputs vary on a much slower time scale than the observed sensory inputs. The representations learned through the slowness learning principle are called slow features (SFs). Slow features have been shown to be useful for RL, since they capture the underlying transition process by extracting spatio-temporal regularities in the raw sensory inputs. However, existing techniques that learn slow features are not readily applicable to curiosity-driven online learning agents, as they estimate computationally expensive covariance matrices from the data via batch processing. The first contribution, called incremental SFA (IncSFA), is a low-complexity online algorithm that extracts slow features without storing any input data or estimating costly covariance matrices, making it suitable for several online learning applications. However, IncSFA gradually forgets previously learned representations whenever the statistics of the input change. In open-ended online learning, it becomes essential to store learned representations to avoid re-learning previously learned inputs. The second contribution is an online, active, modular extension of IncSFA called curiosity-driven modular incremental slow feature analysis (Curious Dr. MISFA). Curious Dr. MISFA addresses the forgetting problem faced by IncSFA and learns expert slow-feature abstractions in order from least to most costly, with theoretical guarantees. The third contribution uses the Curious Dr. MISFA algorithm in a continual curiosity-driven skill acquisition framework that enables robots to acquire, store, and re-use both abstractions and skills in an online and continual manner. I provide (a) a formal analysis of the workings of the proposed algorithms; (b) comparisons with existing methods; and (c) demonstrations of their application in real-world environments on the iCub humanoid robot. Together, these contributions demonstrate that online implementations of slowness learning make it suitable for an open-ended, curiosity-driven RL agent to acquire a repertoire of skills that map the many raw pixels of high-dimensional images to multiple sets of action sequences.
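    For context, the batch covariance estimation that the abstract says makes standard SFA unsuitable for online agents can be sketched in a few lines; IncSFA's contribution is precisely to avoid these matrix estimates. The sketch below (function name, toy data) is illustrative of textbook batch SFA, not of IncSFA itself.

```python
import numpy as np

# Minimal batch SFA sketch -- the costly baseline that IncSFA avoids.
# Whiten the input using its full covariance matrix, then take the
# directions in which the whitened signal changes most slowly, i.e. the
# eigenvectors of the derivative covariance with the smallest eigenvalues.

def batch_sfa(X, n_features=1):
    X = X - X.mean(axis=0)
    cov = X.T @ X / len(X)              # full N x N covariance (batch step)
    d, E = np.linalg.eigh(cov)
    W_white = E / np.sqrt(d)            # whitening transform
    Z = X @ W_white
    dZ = np.diff(Z, axis=0)             # discrete-time derivative
    dcov = dZ.T @ dZ / len(dZ)          # derivative covariance (batch step)
    d2, E2 = np.linalg.eigh(dcov)       # eigenvalues in ascending order
    return W_white @ E2[:, :n_features] # slowest directions first

# Toy usage: a slow sine mixed into four noisy, faster channels.
rng = np.random.default_rng(2)
t = np.linspace(0, 8 * np.pi, 5000)
S = np.column_stack([np.sin(0.25 * t), rng.standard_normal(5000)])
X = S @ rng.standard_normal((2, 4)) + 0.05 * rng.standard_normal((5000, 4))
slow = X @ batch_sfa(X)                 # recovers the sine up to sign/scale
```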

    Adaptive blind signal separation

    by Chi-Chiu Cheung. Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. Includes bibliographical references (leaves 124-131).
    Abstract --- p.i
    Acknowledgments --- p.iii
    Chapter 1 --- Introduction --- p.1
    1.1 --- The Blind Signal Separation Problem --- p.1
    1.2 --- Contributions of this Thesis --- p.3
    1.3 --- Applications of the Problem --- p.4
    1.4 --- Organization of the Thesis --- p.5
    Chapter 2 --- The Blind Signal Separation Problem --- p.7
    2.1 --- The General Blind Signal Separation Problem --- p.7
    2.2 --- Convolutive Linear Mixing Process --- p.8
    2.3 --- Instantaneous Linear Mixing Process --- p.9
    2.4 --- Problem Definition and Assumptions in this Thesis --- p.9
    Chapter 3 --- Literature Review --- p.13
    3.1 --- Previous Works on Blind Signal Separation with Instantaneous Mixture --- p.13
    3.1.1 --- Algebraic Approaches --- p.14
    3.1.2 --- Neural Approaches --- p.15
    3.2 --- Previous Works on Blind Signal Separation with Convolutive Mixture --- p.20
    Chapter 4 --- The Information-theoretic ICA Scheme --- p.22
    4.1 --- The Bayesian YING-YANG Learning Scheme --- p.22
    4.2 --- The Information-theoretic ICA Scheme --- p.25
    4.2.1 --- Derivation of the cost function from the YING-YANG Machine --- p.25
    4.2.2 --- Connections to previous information-theoretic approaches --- p.26
    4.2.3 --- Derivation of the Algorithms --- p.27
    4.2.4 --- Roles and Constraints on the Nonlinearities --- p.30
    4.3 --- Direction and Motivation for the Analysis of the Nonlinearity --- p.30
    Chapter 5 --- Properties of the Cost Function and the Algorithms --- p.32
    5.1 --- Lemmas and Corollaries --- p.32
    5.1.1 --- Singularity of J(V) --- p.33
    5.1.2 --- Continuity of J(V) --- p.34
    5.1.3 --- Behavior of J(V) along a radially outward line --- p.35
    5.1.4 --- Impossibility of divergence of the information-theoretic ICA algorithms with a large class of nonlinearities --- p.36
    5.1.5 --- Number and stability of correct solutions in the 2-channel case --- p.37
    5.1.6 --- Scale for the equilibrium points --- p.39
    5.1.7 --- Absence of local maximum of J(V) --- p.43
    Chapter 6 --- The Algorithms with Cubic Nonlinearity --- p.44
    6.1 --- The Cubic Nonlinearity --- p.44
    6.2 --- Theoretical Results on the 2-Channel Case --- p.46
    6.2.1 --- Equilibrium points --- p.46
    6.2.2 --- Stability of the equilibrium points --- p.49
    6.2.3 --- An alternative proof for the stability of the equilibrium points --- p.50
    6.2.4 --- Convergence Analysis --- p.52
    6.3 --- Experiments on the 2-Channel Case --- p.53
    6.3.1 --- Experiments on two sub-Gaussian sources --- p.54
    6.3.2 --- Experiments on two super-Gaussian sources --- p.55
    6.3.3 --- Experiments on one super-Gaussian source and one sub-Gaussian source which are globally sub-Gaussian --- p.57
    6.3.4 --- Experiments on one super-Gaussian source and one sub-Gaussian source which are globally super-Gaussian --- p.59
    6.3.5 --- Experiments on asymmetric exponentially distributed signals --- p.60
    6.3.6 --- Demonstration on exactly and nearly singular initial points --- p.61
    6.4 --- Theoretical Results on the 3-Channel Case --- p.63
    6.4.1 --- Equilibrium points --- p.63
    6.4.2 --- Stability --- p.66
    6.5 --- Experiments on the 3-Channel Case --- p.66
    6.5.1 --- Experiments on three pairwise globally sub-Gaussian sources --- p.67
    6.5.2 --- Experiments on three sources consisting of globally sub-Gaussian and globally super-Gaussian pairs --- p.67
    6.5.3 --- Experiments on three pairwise globally super-Gaussian sources --- p.69
    Chapter 7 --- Nonlinearity and Separation Capability --- p.71
    7.1 --- Theoretical Argument --- p.71
    7.1.1 --- Nonlinearities that strictly match the source distribution --- p.72
    7.1.2 --- Nonlinearities that loosely match the source distribution --- p.72
    7.2 --- Experimental Verification --- p.76
    7.2.1 --- Experiments on the reversed sigmoid --- p.76
    7.2.2 --- Experiments on the cubic root nonlinearity --- p.77
    7.2.3 --- Experimental verification of Theorem 2 --- p.77
    7.2.4 --- Experiments on the MMI algorithm --- p.78
    Chapter 8 --- Implementation with Mixture of Densities --- p.80
    8.1 --- Implementation of the Information-theoretic ICA Scheme with Mixture of Densities --- p.80
    8.1.1 --- The mixture of densities --- p.81
    8.1.2 --- Derivation of the algorithms --- p.82
    8.2 --- Experimental Verification of the Nonlinearity Adaptation --- p.84
    8.2.1 --- Experiment 1: Two channels of sub-Gaussian sources --- p.84
    8.2.2 --- Experiment 2: Two channels of super-Gaussian sources --- p.85
    8.2.3 --- Experiment 3: Three channels of different signals --- p.89
    8.3 --- Seeking the Simplest Workable Mixtures of Densities --- p.91
    8.3.1 --- Number of components --- p.91
    8.3.2 --- Mixture of two densities with only biases changeable --- p.93
    Chapter 9 --- ICA with Non-Kullback Cost Function --- p.97
    9.1 --- Derivation of ICA Algorithms from Non-Kullback Separation Functionals --- p.97
    9.1.1 --- Positive Convex Divergence --- p.97
    9.1.2 --- Lp Divergence --- p.100
    9.1.3 --- De-correlation Index --- p.102
    9.2 --- Experiments on the ICA Algorithm Based on Positive Convex Divergence --- p.103
    9.2.1 --- Experiments on the algorithm with fixed nonlinearities --- p.103
    9.2.2 --- Experiments on the algorithm with mixture of densities --- p.106
    Chapter 10 --- Conclusions --- p.107
    Appendix A --- Proof for Stability of the Equilibrium Points of the Algorithm with Cubic Nonlinearity on Two Channels of Signals --- p.110
    A.1 --- Stability of Solution Group A --- p.110
    A.2 --- Stability of Solution Group B --- p.111
    Appendix B --- Proof for Stability of the Equilibrium Points of the Algorithm with Cubic Nonlinearity on Three Channels of Signals --- p.119
    Appendix C --- Proof for Theorem 2 --- p.122
    Bibliography --- p.12
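    Chapter 6 studies ICA algorithms with a cubic nonlinearity on two and three channels. As a rough, generic illustration of how such a nonlinearity enters a de-mixing update, the sketch below runs a standard natural-gradient ICA step on two sub-Gaussian channels; it is not the exact algorithm the thesis derives from the Bayesian YING-YANG cost J(V).

```python
import numpy as np

# Generic natural-gradient ICA step with a cubic nonlinearity, in the
# spirit of Chapter 6's 2-channel analysis. Illustrative only: the
# thesis derives its updates from an information-theoretic cost J(V),
# which this textbook-style rule does not reproduce.

rng = np.random.default_rng(3)
T, n = 20000, 2
# Two sub-Gaussian (bimodal) sources -- the stable case for a cubic g.
S = np.sign(rng.standard_normal((T, n))) * rng.uniform(0.5, 1.5, (T, n))
A = rng.standard_normal((n, n))        # unknown mixing matrix
X = S @ A.T                            # observed mixtures

V = np.eye(n)                          # de-mixing matrix
eta = 1e-3

for x in X:
    y = V @ x
    # V <- V + eta * (I - g(y) y^T) V  with g(y) = y^3
    V += eta * (np.eye(n) - np.outer(y**3, y)) @ V

print(np.round(V @ A, 2))              # ~ scaled permutation if separated
```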