    A Fast Learning Variable Lambda TD Model: Used to Realize Home Aware Robot Navigation

    Role of homeostasis in learning sparse representations

    Neurons in the input layer of primary visual cortex in primates develop edge-like receptive fields. One approach to understanding the emergence of this response is to state that neural activity has to efficiently represent sensory data with respect to the statistics of natural scenes. Furthermore, it is believed that such an efficient coding is achieved using a competition across neurons so as to generate a sparse representation, that is, one where a relatively small number of neurons are simultaneously active. Indeed, different models of sparse coding, coupled with Hebbian learning and homeostasis, have been proposed that successfully match the observed emergent response. However, the specific role of homeostasis in learning such sparse representations is still largely unknown. By quantitatively assessing the efficiency of the neural representation during learning, we derive a cooperative homeostasis mechanism that optimally tunes the competition between neurons within the sparse coding algorithm. We apply this homeostasis while learning small patches taken from natural images and compare its efficiency with state-of-the-art algorithms. Results show that while different sparse coding algorithms give similar coding results, the homeostasis provides an optimal balance for the representation of natural images within the population of neurons. Competition in sparse coding is optimized when it is fair. By contributing to optimizing statistical competition across neurons, homeostasis is crucial in providing a more efficient solution to the emergence of independent components.

    Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)

    In applications of Gaussian processes where quantification of uncertainty is of primary interest, it is necessary to accurately characterize the posterior distribution over covariance parameters. This paper proposes an adaptation of the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the posterior distribution over covariance parameters with negligible bias and without the need to compute the marginal likelihood. In Gaussian process regression, this has the enormous advantage that stochastic gradients can be computed by solving linear systems only. A novel unbiased linear systems solver based on parallelizable covariance matrix-vector products is developed to accelerate the unbiased estimation of gradients. The results demonstrate the possibility to enable scalable and exact (in a Monte Carlo sense) quantification of uncertainty in Gaussian processes without imposing any special structure on the covariance or reducing the number of input vectors. Comment: 10 pages - paper accepted at ICML 201

    Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

    A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows us to parametrize matrices with unit-norm eigenspectra without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences.
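
The idea of keeping unit-norm eigenvalues while dropping orthogonality can be illustrated with a toy 2x2 case. This is only a sketch under stated assumptions, not the nnRNN parametrization itself: the function names (`from_schur`, `eigenvalue_moduli`, `orthogonality_gap`) are hypothetical, the eigenvalues are fixed to +1 and -1 for simplicity, and a single off-diagonal entry stands in for the non-normal part of the Schur form.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def from_schur(angle, off_diagonal):
    """Build V = P T P^T, with P a rotation and T upper triangular.

    The diagonal of T (+1, -1) fixes the eigenvalues of V.
    off_diagonal is the strictly upper-triangular (non-normal) part;
    setting it to 0 gives an orthogonal V.
    """
    c, s = math.cos(angle), math.sin(angle)
    p = [[c, -s], [s, c]]
    t = [[1.0, off_diagonal], [0.0, -1.0]]
    return matmul(matmul(p, t), transpose(p))

def eigenvalue_moduli(v):
    """Moduli of the roots of lambda^2 - trace*lambda + det = 0."""
    trace = v[0][0] + v[1][1]
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    root = cmath.sqrt(complex(trace * trace - 4.0 * det))
    return [abs((trace + root) / 2.0), abs((trace - root) / 2.0)]

def orthogonality_gap(v):
    """Largest absolute entry of V^T V - I (zero for an orthogonal V)."""
    g = matmul(transpose(v), v)
    return max(abs(g[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))

normal = from_schur(0.3, 0.0)       # orthogonal: no non-normal part
non_normal = from_schur(0.3, 2.0)   # same spectrum, not orthogonal
print(eigenvalue_moduli(normal), orthogonality_gap(normal))
print(eigenvalue_moduli(non_normal), orthogonality_gap(non_normal))
```

Both matrices have eigenvalues of modulus one, but only the first satisfies V^T V = I, which is the sense in which the spectrally constrained family is larger than the orthogonal one.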

    Hybrid artificial intelligence algorithms for short-term load and price forecasting in competitive electric markets

    The liberalization and deregulation of electricity markets have forced market participants to accommodate several challenges, including a considerable accumulation of new generation capacity from renewable sources (fundamentally wind energy), the unpredictability associated with these new forms of generation, and new consumption patterns, all of which contribute to greater electricity price volatility (e.g. in the Iberian market). Given the competitive framework in which market participants operate, efficient computational forecasting techniques are a differentiating factor. These forecasts form the basis for defining a suitable bidding strategy and for planning the effective operation of generation systems, which, together with better use of installed transmission capacity, maximizes profits while making better use of energy resources. This dissertation presents a new hybrid method for forecasting load and electricity prices over a one-day-ahead (24-hour) horizon. The method is based on an optimization scheme that combines different techniques, notably artificial neural networks, several optimization algorithms, and the wavelet transform. The method was validated on different real case studies. A subsequent accuracy comparison with results published in reference journals showed excellent performance for the proposed hybrid method.

    Parallelizable sparse inverse formulation Gaussian processes (SpInGP)

    We propose a parallelizable sparse inverse formulation Gaussian process (SpInGP) for temporal models. It uses a sparse precision GP formulation and sparse matrix routines to speed up the computations. Due to the state-space formulation used in the algorithm, the time complexity of the basic SpInGP is linear, and because all the computations are parallelizable, the parallel form of the algorithm is sublinear in the number of data points. We provide example algorithms to implement the sparse matrix routines and experimentally test the method using both simulated and real data. Comment: Presented at Machine Learning in Signal Processing (MLSP2017)

    Computational Methods for Support Vector Machine Classification and Large-Scale Kalman Filtering

    The first half of this dissertation focuses on computational methods for solving the constrained quadratic program (QP) within the support vector machine (SVM) classifier. One of the SVM formulations requires the solution of bound and equality constrained QPs. We begin by describing an augmented Lagrangian approach which incorporates the equality constraint into the objective function, resulting in a bound constrained QP. Furthermore, all constraints may be incorporated into the objective function to yield an unconstrained quadratic program, allowing us to apply the conjugate gradient (CG) method. Lastly, we adapt the scaled gradient projection method of [10] to the SVM QP and compare the performance of these methods with the state-of-the-art sequential minimal optimization algorithm and MATLAB's built-in constrained QP solver, quadprog. The augmented Lagrangian method outperforms other state-of-the-art methods on three image test cases. The second half of this dissertation focuses on computational methods for large-scale Kalman filtering applications. The Kalman filter (KF) is a method for solving a dynamic, coupled system of equations. While these methods require only linear algebra, standard KF is often infeasible in large-scale implementations due to the storage requirements and inverse calculations of large, dense covariance matrices. We introduce the use of the CG and Lanczos methods into various forms of the Kalman filter for low-rank approximations of the covariance matrices, with low-storage requirements. We also use CG for efficient Gaussian sampling within the ensemble Kalman filter method. The CG-based KF methods perform similarly in root-mean-square error when compared to the standard KF methods, when the standard implementations are feasible, and outperform the limited-memory Broyden-Fletcher-Goldfarb-Shanno approximation method.
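
For readers unfamiliar with the conjugate gradient method mentioned in this abstract, the following is a minimal textbook sketch for a symmetric positive-definite system A x = b. It is not the dissertation's implementation: the function name `conjugate_gradient` and the `matvec` callback are assumptions made for illustration. The solver touches A only through matrix-vector products, which is what makes CG attractive when a dense covariance matrix is too large to store or invert.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A.

    matvec(p) must return the product A p; A itself is never formed here.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)                       # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Small illustrative system (values chosen for this example only).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(lambda p: [dot(row, p) for row in A], b)
print(solution)  # approximately [1/11, 7/11]
```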

    Dynamic analysis of synchronous machine using neural network based characterization clustering and pattern recognition

    Synchronous generators form the principal source of electric energy in power systems. Dynamic analysis for transient condition of a synchronous machine is done under different fault conditions. Synchronous machine models are simulated numerically based on mathematical models where saturation on main flux was ignored in one model and taken into account in another. The developed models were compared and scrutinized for transient conditions under different kinds of faults: loss of field (LOF), disturbance in torque (DIT) and short circuit (SC). The simulation was done for LOF and DIT for different levels of fault and time durations, whereas for SC, simulation was done for different time durations. The model is also scrutinized for stability stipulations. Based on the synchronous machine model, a neural network model of the synchronous machine is developed using neural network based characterization. The model is trained to approximate different transient conditions, such as loss of field, disturbance in torque and short circuit conditions. In the case of multiple faults or a mixture of different kinds of faults, neural network based clustering is used to distinguish and identify specific fault conditions by looking at the behaviour of the load angle. By observing the weight distribution pattern of the Self Organizing Map (SOM) space, specific kinds of faults are recognized. Neural network pattern identification is used to identify and specify unknown fault patterns. Once the faults are identified, neural network pattern identification is used to recognize and indicate the level or time duration of the fault.

    Experiments with Infinite-Horizon, Policy-Gradient Estimation

    In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter and Bartlett, this volume), which computes biased estimates of the performance gradient in POMDPs. The algorithm's chief advantages are that it uses only one free parameter beta, which has a natural interpretation in terms of bias-variance trade-off, it requires no knowledge of the underlying state, and it can be applied to infinite state, control and observation spaces. We show how the gradient estimates produced by GPOMDP can be used to perform gradient ascent, both with a traditional stochastic-gradient algorithm, and with an algorithm based on conjugate-gradients that utilizes gradient information to bracket maxima in line searches. Experimental results are presented illustrating both the theoretical results of (Baxter and Bartlett, this volume) on a toy problem, and practical aspects of the algorithms on a number of more realistic problems.
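
A GPOMDP-style estimate can be sketched on a deliberately trivial problem. The following is an illustration under stated assumptions, not the paper's experiments: the environment (two actions, reward 1 for action 1 and 0 otherwise), the single-parameter sigmoid policy, and the function name `gpomdp_gradient` are all made up for this example. The update keeps an eligibility trace `z` discounted by beta and a running average `delta` of reward times trace.

```python
import math
import random

def gpomdp_gradient(theta, beta, steps, seed=0):
    """Estimate d(average reward)/d(theta) from one sampled trajectory.

    Assumed toy setup: the policy picks action 1 with probability
    sigmoid(theta); the reward equals the action taken (0 or 1).
    """
    rng = random.Random(seed)
    p_one = 1.0 / (1.0 + math.exp(-theta))
    z = 0.0       # eligibility trace of score-function terms
    delta = 0.0   # running average of reward * trace
    for t in range(steps):
        action = 1 if rng.random() < p_one else 0
        # d/dtheta of log policy(action): (1 - p) for action 1, -p for action 0.
        score = action - p_one
        reward = float(action)
        z = beta * z + score
        delta += (reward * z - delta) / (t + 1)
    return delta

estimate = gpomdp_gradient(theta=0.0, beta=0.5, steps=20000)
exact = 0.5 * (1.0 - 0.5)   # derivative of sigmoid(theta) at theta = 0
print(estimate, exact)
```

In this memoryless toy problem the reward does not depend on earlier actions, so larger beta only adds variance; in a genuine POMDP, beta trades bias against variance as the abstract describes.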