Role of homeostasis in learning sparse representations
Neurons in the input layer of primary visual cortex in primates develop
edge-like receptive fields. One approach to understanding the emergence of this
response is to state that neural activity has to efficiently represent sensory
data with respect to the statistics of natural scenes. Furthermore, it is
believed that such an efficient coding is achieved using a competition across
neurons so as to generate a sparse representation, that is, where a relatively
small number of neurons are simultaneously active. Indeed, different models of
sparse coding, coupled with Hebbian learning and homeostasis, have been
proposed that successfully match the observed emergent response. However, the
specific role of homeostasis in learning such sparse representations is still
largely unknown. By quantitatively assessing the efficiency of the neural
representation during learning, we derive a cooperative homeostasis mechanism
that optimally tunes the competition between neurons within the sparse coding
algorithm. We apply this homeostasis while learning small patches taken from
natural images and compare its efficiency with state-of-the-art algorithms.
Results show that while different sparse coding algorithms give similar coding
results, the homeostasis provides an optimal balance for the representation of
natural images within the population of neurons. Competition in sparse coding
is optimized when it is fair. By contributing to optimizing statistical
competition across neurons, homeostasis is crucial in providing a more
efficient solution to the emergence of independent components.
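The competition step described in this abstract can be illustrated with plain matching pursuit, a common greedy sparse coding scheme. This is only a minimal sketch: the homeostasis mechanism derived in the paper is not implemented, and the orthonormal dictionary, the function name `matching_pursuit` and the test signal are assumptions made for illustration.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_active):
    """Greedy sparse coding; `dictionary` holds unit-norm atoms as columns."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_active):
        # Competition: each atom ("neuron") is scored by its correlation
        # with the part of the signal that is not yet explained.
        scores = dictionary.T @ residual
        winner = int(np.argmax(np.abs(scores)))
        # Only the winning atom becomes active and explains away its share.
        coeffs[winner] += scores[winner]
        residual -= scores[winner] * dictionary[:, winner]
    return coeffs, residual

rng = np.random.default_rng(0)
# Hypothetical orthonormal dictionary (an assumption, not learned from images).
dictionary, _ = np.linalg.qr(rng.standard_normal((16, 16)))
true_coeffs = np.zeros(16)
true_coeffs[3], true_coeffs[7] = 2.0, -1.5
signal = dictionary @ true_coeffs

coeffs, residual = matching_pursuit(signal, dictionary, n_active=2)
```

With an orthonormal dictionary the two active coefficients are recovered exactly; with learned, overcomplete dictionaries the selection is only approximate, which is where a mechanism balancing how often each atom wins would come into play.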
Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)
In applications of Gaussian processes where quantification of uncertainty is
of primary interest, it is necessary to accurately characterize the posterior
distribution over covariance parameters. This paper proposes an adaptation of
the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the
posterior distribution over covariance parameters with negligible bias and
without the need to compute the marginal likelihood. In Gaussian process
regression, this has the enormous advantage that stochastic gradients can be
computed by solving linear systems only. A novel unbiased linear systems solver
based on parallelizable covariance matrix-vector products is developed to
accelerate the unbiased estimation of gradients. The results demonstrate the
possibility to enable scalable and exact (in a Monte Carlo sense)
quantification of uncertainty in Gaussian processes without imposing any
special structure on the covariance or reducing the number of input vectors. Comment: 10 pages - paper accepted at ICML 201
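The claim that gradients need only linear solves can be made concrete with the standard expression for the gradient of the GP log marginal likelihood and a stochastic trace estimate. The notation below is assumed for illustration and is not taken from the paper: K is the covariance matrix, theta_i a covariance parameter, r_j random probe vectors with identity covariance, and epsilon_t the Langevin step size.

```latex
\frac{\partial}{\partial\theta_i}\log p(\mathbf{y}\mid\boldsymbol{\theta})
  = -\tfrac{1}{2}\operatorname{Tr}\!\left(K^{-1}\frac{\partial K}{\partial\theta_i}\right)
    + \tfrac{1}{2}\,\mathbf{y}^{\top}K^{-1}\frac{\partial K}{\partial\theta_i}K^{-1}\mathbf{y}

% Unbiased trace estimate with probes satisfying E[r_j r_j^T] = I:
\operatorname{Tr}\!\left(K^{-1}\frac{\partial K}{\partial\theta_i}\right)
  \approx \frac{1}{N_r}\sum_{j=1}^{N_r}
    \mathbf{r}_j^{\top}K^{-1}\frac{\partial K}{\partial\theta_i}\,\mathbf{r}_j

% Stochastic Gradient Langevin Dynamics step using the estimated
% gradient \hat{\mathbf{g}}_t of the log posterior:
\boldsymbol{\theta}_{t+1}
  = \boldsymbol{\theta}_t + \frac{\epsilon_t}{2}\,\hat{\mathbf{g}}_t + \boldsymbol{\eta}_t,
\qquad \boldsymbol{\eta}_t \sim \mathcal{N}(\mathbf{0},\,\epsilon_t I)
```

Every term involves K only through products of the form K^{-1}v, so each can be obtained from a linear system solve without computing the determinant that appears in the marginal likelihood itself.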
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
A recent strategy to circumvent the exploding and vanishing gradient problem
in RNNs, and to allow the stable propagation of signals over long time scales,
is to constrain recurrent connectivity matrices to be orthogonal or unitary.
This ensures eigenvalues with unit norm and thus stable dynamics and training.
However, this comes at the cost of reduced expressivity due to the limited
variety of orthogonal transformations. We propose a novel connectivity
structure based on the Schur decomposition and a splitting of the Schur form
into normal and non-normal parts. This allows matrices to be parametrized with
unit-norm eigenspectra without orthogonality constraints on eigenbases. The
resulting architecture ensures access to a larger space of spectrally
constrained matrices, of which orthogonal matrices are a subset. This crucial
difference retains the stability advantages and training speed of orthogonal
RNNs while enhancing expressivity, especially on tasks that require
computations over ongoing input sequences.
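The Schur-based construction can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name `schur_recurrent_matrix`, the use of 2x2 rotation blocks for the normal part and the random parameter values are all assumptions.

```python
import numpy as np

def schur_recurrent_matrix(angles, upper, basis):
    """Build W = P (Lambda + T) P^T from a real Schur form.

    angles : rotation angles of the 2x2 diagonal blocks (normal part Lambda)
    upper  : dense matrix; only entries above the diagonal blocks are kept (T)
    basis  : orthogonal matrix P
    """
    n = 2 * len(angles)
    normal = np.zeros((n, n))
    for k, phi in enumerate(angles):
        c, s = np.cos(phi), np.sin(phi)
        # Each rotation block contributes eigenvalues exp(+i phi), exp(-i phi).
        normal[2 * k:2 * k + 2, 2 * k:2 * k + 2] = [[c, -s], [s, c]]
    # Keep entries strictly above the 2x2 diagonal blocks so that the
    # eigenvalues of the block-triangular matrix stay on the unit circle.
    block_id = np.arange(n) // 2
    mask = block_id[:, None] < block_id[None, :]
    non_normal = np.where(mask, upper, 0.0)
    return basis @ (normal + non_normal) @ basis.T

rng = np.random.default_rng(0)
n = 6
basis, _ = np.linalg.qr(rng.standard_normal((n, n)))
angles = rng.uniform(0.1, 3.0, size=n // 2)
upper = rng.standard_normal((n, n))

W = schur_recurrent_matrix(angles, upper, basis)
eigenvalue_moduli = np.abs(np.linalg.eigvals(W))
is_normal = np.allclose(W @ W.T, W.T @ W)
```

Here all eigenvalue moduli equal one although W is not orthogonal (nor even normal); setting `upper` to zero recovers an orthogonal matrix, which is the subset relation mentioned in the abstract.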
Hybrid artificial intelligence algorithms for short-term load and price forecasting in competitive electric markets
The liberalization and deregulation of electricity markets have forced market participants to accommodate several challenges, including a considerable accumulation of new generation capacity from renewable sources (fundamentally wind energy), the unpredictability associated with these new forms of generation, and new consumption patterns. These contribute to greater electricity price volatility (e.g. in the Iberian market).
Given the competitive framework in which market participants operate, efficient computational forecasting techniques are a distinguishing factor. Based on these forecasts, bidding strategies are defined and the operation of generation systems is planned effectively which, together with better exploitation of the installed transmission capacity, maximizes profits while making better use of energy resources.
This dissertation presents a new hybrid method for forecasting load and electricity prices over a one-day-ahead (24-hour) time horizon. The method is based on an optimization scheme that combines different techniques, notably artificial neural networks, several optimization algorithms and the wavelet transform. The method was validated on different real case studies. A subsequent comparison of accuracy with results already published in reference journals confirmed the suitability and excellent performance of the proposed hybrid method.
Parallelizable sparse inverse formulation Gaussian processes (SpInGP)
We propose a parallelizable sparse inverse formulation Gaussian process
(SpInGP) for temporal models. It uses a sparse precision GP formulation and
sparse matrix routines to speed up the computations. Due to the state-space
formulation used in the algorithm, the time complexity of the basic SpInGP is
linear, and because all the computations are parallelizable, the parallel form
of the algorithm is sublinear in the number of data points. We provide example
algorithms to implement the sparse matrix routines and experimentally test the
method using both simulated and real data. Comment: Presented at Machine Learning in Signal Processing (MLSP2017)
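The idea of a sparse precision formulation can be illustrated on the simplest temporal GP. The sketch below assumes an exponential (Ornstein-Uhlenbeck) kernel on a uniform time grid, for which the prior precision matrix is tridiagonal, and solves for the posterior mean in linear time with the Thomas algorithm. The function names, kernel choice and parameter values are assumptions, and the parallel form of the algorithm is not shown.

```python
import numpy as np

def ou_precision_bands(n, dt, length_scale, variance):
    """Diagonal and off-diagonal of the tridiagonal OU prior precision (n >= 2)."""
    a = np.exp(-dt / length_scale)   # state transition over one time step
    q = variance * (1.0 - a * a)     # process noise variance
    diag = np.full(n, (1.0 + a * a) / q)
    diag[0] = diag[-1] = 1.0 / q
    off = np.full(n - 1, -a / q)
    return diag, off

def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system, O(n) time."""
    n = len(diag)
    c = np.zeros(n - 1)
    d = np.zeros(n)
    c[0] = off[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):            # forward elimination
        denom = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / denom
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def posterior_mean(y, dt, length_scale, variance, noise):
    """Posterior mean (Q + I / noise)^{-1} y / noise, with Q the prior precision."""
    diag, off = ou_precision_bands(len(y), dt, length_scale, variance)
    return solve_tridiagonal(diag + 1.0 / noise, off, y / noise)

rng = np.random.default_rng(0)
n, dt, length_scale, variance, noise = 50, 0.1, 1.0, 2.0, 0.3
y = rng.standard_normal(n)
sparse_mean = posterior_mean(y, dt, length_scale, variance, noise)
```

The result agrees with the usual dense formula K (K + noise I)^{-1} y, but it never forms or inverts the dense covariance matrix.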
Computational Methods for Support Vector Machine Classification and Large-Scale Kalman Filtering
The first half of this dissertation focuses on computational methods for solving the constrained quadratic program (QP) within the support vector machine (SVM) classifier. One of the SVM formulations requires the solution of bound and equality constrained QPs. We begin by describing an augmented Lagrangian approach which incorporates the equality constraint into the objective function, resulting in a bound constrained QP. Furthermore, all constraints may be incorporated into the objective function to yield an unconstrained quadratic program, allowing us to apply the conjugate gradient (CG) method. Lastly, we adapt the scaled gradient projection method of [10] to the SVM QP and compare the performance of these methods with the state-of-the-art sequential minimal optimization algorithm and MATLAB's built-in constrained QP solver, quadprog. The augmented Lagrangian method outperforms other state-of-the-art methods on three image test cases.
The second half of this dissertation focuses on computational methods for large-scale Kalman filtering applications. The Kalman filter (KF) is a method for solving a dynamic, coupled system of equations. While these methods require only linear algebra, the standard KF is often infeasible in large-scale implementations due to the storage requirements and inverse calculations of large, dense covariance matrices. We introduce the use of the CG and Lanczos methods into various forms of the Kalman filter for low-rank approximations of the covariance matrices, with low storage requirements. We also use CG for efficient Gaussian sampling within the ensemble Kalman filter method. When the standard implementations are feasible, the CG-based KF methods perform similarly to the standard KF methods in root-mean-square error, and they outperform the limited-memory Broyden-Fletcher-Goldfarb-Shanno approximation method.
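Both halves of this abstract rely on the conjugate gradient method. As a reminder of what that method does, here is a minimal textbook sketch for a symmetric positive definite system A x = b. It is not the dissertation's code, and the function name and test matrix are assumptions.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x              # residual, also the negative gradient of the quadratic
    p = r.copy()               # first search direction
    rs = r @ r
    for _ in range(max_iter or 10 * len(b)):
        if np.sqrt(rs) < tol:
            break
        Ap = A @ p             # the only place the matrix is used
        alpha = rs / (p @ Ap)  # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p   # next A-conjugate direction
        rs = rs_new
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20.0 * np.eye(20)    # hypothetical well-conditioned SPD matrix
b = rng.standard_normal(20)
x = conjugate_gradient(A, b)
```

Because the matrix enters only through products `A @ p`, it never has to be stored or inverted explicitly, which is the property that makes the method attractive for large covariance matrices.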
Dynamic analysis of synchronous machine using neural network based characterization clustering and pattern recognition
Synchronous generators form the principal source of electric energy in power systems. Dynamic analysis of a synchronous machine under transient conditions is carried out for different fault conditions. Synchronous machine models are simulated numerically based on mathematical models in which saturation of the main flux is ignored in one model and taken into account in another. The developed models were compared and scrutinized for transient conditions under different kinds of faults: loss of field (LOF), disturbance in torque (DIT) and short circuit (SC). The simulation was done for LOF and DIT for different fault levels and time durations, whereas for SC it was done for different time durations. The model is also scrutinized for stability stipulations.
Based on the synchronous machine model, a neural network model of the synchronous machine is developed using neural network based characterization. The model is trained to approximate different transient conditions, such as loss of field, disturbance in torque and short circuit conditions. In the case of multiple or mixed kinds of faults, neural network based clustering is used to distinguish and identify specific fault conditions by looking at the behaviour of the load angle. By observing the weight distribution pattern of the Self Organizing Map (SOM) space, specific kinds of faults are recognized. Neural network pattern identification is used to identify and specify unknown fault patterns. Once the faults are identified, neural network pattern identification is used to recognize and indicate the level or time duration of the fault.
Experiments with Infinite-Horizon, Policy-Gradient Estimation
In this paper, we present algorithms that perform gradient ascent of the
average reward in a partially observable Markov decision process (POMDP). These
algorithms are based on GPOMDP, an algorithm introduced in a companion paper
(Baxter and Bartlett, this volume), which computes biased estimates of the
performance gradient in POMDPs. The algorithm's chief advantages are that it
uses only one free parameter beta, which has a natural interpretation in terms
of bias-variance trade-off, it requires no knowledge of the underlying state,
and it can be applied to infinite state, control and observation spaces. We
show how the gradient estimates produced by GPOMDP can be used to perform
gradient ascent, both with a traditional stochastic-gradient algorithm, and
with an algorithm based on conjugate-gradients that utilizes gradient
information to bracket maxima in line searches. Experimental results are
presented illustrating both the theoretical results of (Baxter and Bartlett,
this volume) on a toy problem, and practical aspects of the algorithms on a
number of more realistic problems.
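The role of the single parameter beta can be seen in a small sketch of a GPOMDP-style gradient estimate: an eligibility trace of policy score functions is discounted by beta and averaged against the observed rewards. The toy problem here (one state, two actions, a logistic policy, reward equal to the chosen action) and all names are assumptions chosen to keep the example short.

```python
import math
import random

def gpomdp_gradient(theta, beta, n_steps, seed=0):
    """Estimate d(average reward)/d(theta) for a one-parameter logistic policy."""
    rng = random.Random(seed)
    p_one = 1.0 / (1.0 + math.exp(-theta))   # probability of choosing action 1
    trace = 0.0                              # eligibility trace z_t
    grad = 0.0                               # running gradient estimate
    for t in range(n_steps):
        action = 1 if rng.random() < p_one else 0
        score = action - p_one               # d/dtheta of log policy(action)
        reward = float(action)               # toy reward: action 1 pays 1
        trace = beta * trace + score         # beta discounts older scores
        grad += (reward * trace - grad) / (t + 1)   # running average
    return grad

estimate = gpomdp_gradient(theta=0.0, beta=0.5, n_steps=20000)
```

In this toy problem the average reward is the probability of action 1, so the true gradient at theta = 0 is 0.25 and the estimate should be close to that value. Larger values of beta lengthen the memory of the trace, which in general reduces bias at the price of higher variance.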