66 research outputs found
Laplace Approximation for Divisive Gaussian Processes for Nonstationary Regression
The standard Gaussian Process (GP) regression is usually formulated under stationarity hypotheses: the noise power is considered constant throughout the input space, and the covariance of the prior distribution is typically modeled as depending only on the difference between input samples. These assumptions can be too restrictive and unrealistic for many real-world problems. Although nonstationarity can be achieved using specific covariance functions, these require prior knowledge of the kind of nonstationarity, which is not available for most applications. In this paper we propose to use the Laplace approximation to make inference in a divisive GP model to perform nonstationary regression, including heteroscedastic noise cases. The log-concavity of the likelihood ensures a unimodal posterior, so the Laplace approximation converges to a unique maximum. The characteristics of the likelihood also allow us to obtain posterior approximations that are accurate when compared to Expectation Propagation (EP) and to the asymptotically exact posterior provided by a Markov Chain Monte Carlo implementation with Elliptical Slice Sampling (ESS), at a reduced computational load with respect to both EP and ESS.
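The mechanics behind the Laplace approximation can be illustrated on a toy one-dimensional posterior. This sketch uses a Poisson-count likelihood with a Gaussian prior as an illustrative stand-in for the paper's divisive GP likelihood (the count `y` and all details are invented): Newton's method climbs the log-concave log-posterior to its unique mode, and the negative inverse curvature there gives the variance of the approximating Gaussian.

```python
import math

# Toy log-concave posterior over a latent f: Poisson likelihood with
# rate exp(f) and observed count y, plus a standard Gaussian prior.
# (Illustrative stand-in, NOT the paper's divisive GP model.)
y = 3.0
logp_grad = lambda f: y - math.exp(f) - f   # d/df log p(f | y)
logp_hess = lambda f: -math.exp(f) - 1.0    # always < 0: log-concave

# Newton's method converges to the unique mode because the Hessian
# is strictly negative everywhere.
f = 0.0
for _ in range(25):
    f -= logp_grad(f) / logp_hess(f)

mode = f
var = -1.0 / logp_hess(mode)  # Laplace: N(mode, var) approximates the posterior
```

The same two quantities (mode and inverse negative Hessian) are what a GP-scale Laplace approximation computes, only with vectors and matrices in place of scalars.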
Slope stabilization in photovoltaic installations through the use of shrub species
The goal of this work was to determine whether planting shrub species can minimize the soil movement occurring on a slope created during the construction of a solar photovoltaic installation. To that end, several shrub species were planted in 2008, and various measurements of soil movement and leaf area were taken in order to determine which species would be most suitable. From April 2011 to April 2012, new measurements were recorded and compared with the previous ones. No direct relationship could be found between the soil movement produced and the leaf area, nor was it evident that the planted vegetation was more effective at minimizing soil movement than the vegetation that grew spontaneously in control plots.
Ingeniería Agrícola y Forestal. Máster en Investigación en Ingeniería para el Desarrollo Agroforestal.
Sparse Gaussian processes for large-scale machine learning
Gaussian Processes (GPs) are non-parametric, Bayesian models able to achieve state-of-the-art performance in supervised learning tasks such as non-linear regression and classification, thus being used as building blocks for more sophisticated machine learning applications. GPs also enjoy a number of other desirable properties: They are virtually overfitting-free, have sound and convenient model selection procedures, and provide so-called “error bars”, i.e., estimations of their predictions’ uncertainty. Unfortunately, full GPs cannot be directly applied to real-world, large-scale data sets due to their high computational cost. For n data samples, training a GP requires O(n³) computation time, which renders modern desktop computers unable to handle databases with more than a few thousand instances. Several sparse approximations that scale linearly with the number of data samples have been recently proposed, with the Sparse Pseudo-inputs GP (SPGP) representing the current state of the art. Sparse GP approximations can be used to deal with large databases but, of course, do not usually achieve the performance of full GPs. In this thesis we present several novel sparse GP models that compare favorably with SPGP, both in terms of predictive performance and error bar quality. Our models converge to the full GP under some conditions, but our goal is not so much to faithfully approximate full GPs as it is to develop useful models that provide high-quality probabilistic predictions. By doing so, even full GPs are occasionally outperformed. We provide two broad classes of models: Marginalized Networks (MNs) and Inter-Domain GPs (IDGPs). MNs can be seen as models that lie in between classical Neural Networks (NNs) and full GPs, trying to combine the advantages of both.
Though trained differently, when used for prediction they retain the structure of classical NNs, so they can be interpreted as a novel way to train a classical NN, while adding the benefit of input-dependent error bars and overfitting resistance. IDGPs generalize SPGP by allowing the “pseudo-inputs” to lie in a different domain, thus adding extra flexibility and performance. Furthermore, they provide a convenient probabilistic framework in which previous sparse methods can be more easily understood. All the proposed algorithms are tested and compared with the current state of the art on several standard, large-scale data sets with different properties. Their strengths and weaknesses are also discussed and compared, so that it is easier to select the best suited candidate for each potential application.
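The cost gap that motivates the thesis can be seen in a small numerical sketch. The full GP predictive mean requires solving an n×n linear system, while a generic inducing-point approximation only solves an m×m system; the Subset-of-Regressors-style construction below is a simpler relative of SPGP, and every size, kernel, and variable name is illustrative.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(0)
n, m, noise = 200, 20, 0.1
X = np.sort(rng.uniform(-3, 3, n))
y = np.sin(X) + noise * rng.standard_normal(n)
Xs = np.linspace(-3, 3, 50)          # test inputs

# Full GP predictive mean: an n x n solve, O(n^3).
Knn = rbf(X, X) + noise**2 * np.eye(n)
mu_full = rbf(Xs, X) @ np.linalg.solve(Knn, y)

# Inducing-point (Subset-of-Regressors) mean: only an m x m solve, O(n m^2).
Z = np.linspace(-3, 3, m)            # inducing inputs (here: a fixed grid)
Kmn = rbf(Z, X)
A = Kmn @ Kmn.T + noise**2 * rbf(Z, Z)
mu_sor = rbf(Xs, Z) @ np.linalg.solve(A, Kmn @ y)

err = np.max(np.abs(mu_full - mu_sor))
```

With m much smaller than n the approximation is cheap, and on smooth data like this the two predictive means are nearly indistinguishable.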
Overlapping Mixtures of Gaussian Processes for the data association problem
In this work we introduce a mixture of GPs to address the data association
problem, i.e. to label a group of observations according to the sources that
generated them. Unlike several previously proposed GP mixtures, the novel
mixture has the distinct characteristic of using no gating function to
determine the association of samples and mixture components. Instead, all the
GPs in the mixture are global and samples are clustered following
"trajectories" across input space. We use a non-standard variational Bayesian
algorithm to efficiently recover sample labels and learn the hyperparameters.
We show how multi-object tracking problems can be disambiguated and also
explore the characteristics of the model in traditional regression settings.
Pseudospectral Model Predictive Control under Partially Learned Dynamics
Trajectory optimization of a controlled dynamical system is an essential part
of autonomy; however, many trajectory optimization techniques are limited by the
fidelity of the underlying parametric model. In the field of robotics, a lack
of model knowledge can be overcome with machine learning techniques, utilizing
measurements to build a dynamical model from the data. This paper aims to take
the middle ground between these two approaches by introducing a semi-parametric
representation of the underlying system dynamics. Our goal is to leverage the
considerable information contained in a traditional physics based model and
combine it with a data-driven, non-parametric regression technique known as a
Gaussian Process. Integrating this semi-parametric model with model predictive
pseudospectral control, we demonstrate this technique on both a cart pole and
quadrotor simulation with unmodeled damping and parametric error. In order to
manage parametric uncertainty, we introduce an algorithm that utilizes Sparse
Spectrum Gaussian Processes (SSGP) for online learning after each rollout. We
implement this online learning technique on a cart pole and quadrotor, then
demonstrate the use of online learning and obstacle avoidance for the Dubins
vehicle dynamics.
Comment: Accepted but withdrawn from AIAA Scitech 201
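A Sparse Spectrum GP approximates the kernel with trigonometric basis functions whose frequencies are sampled from the kernel's spectral density, turning GP regression into Bayesian linear regression in feature space. The minimal sketch below uses a toy one-dimensional target and a closed-form batch solve; the sizes, the target function, and the batch fit all stand in for the paper's per-rollout online updates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, D, ls, noise = 300, 200, 0.5, 0.1
X = rng.uniform(-3, 3, n)
y = np.sin(2 * X) + noise * rng.standard_normal(n)  # toy 1-D "dynamics" target

# Sparse spectrum: draw D frequencies from the RBF kernel's spectral
# density N(0, 1/ls^2); paired cos/sin features approximate the kernel.
w = rng.standard_normal(D) / ls

def phi(x):
    ang = np.outer(x, w)
    return np.concatenate([np.cos(ang), np.sin(ang)], axis=1) / np.sqrt(D)

# Regularized linear regression in feature space: O(n D^2) rather than
# O(n^3), and amenable to cheap rank-one updates as rollout data arrives.
Phi = phi(X)
A = Phi.T @ Phi + noise**2 * np.eye(2 * D)
weights = np.linalg.solve(A, Phi.T @ y)
rmse = np.sqrt(np.mean((Phi @ weights - y) ** 2))
```

Because the model is linear in its weights, incorporating a new rollout's data reduces to updating `A` and the projected targets, which is what makes the per-rollout learning in the paper practical.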
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
Despite their stellar performance on a wide range of tasks, including
in-context tasks only revealed during inference, vanilla transformers and
variants trained for next-token predictions (a) do not learn an explicit world
model of their environment which can be flexibly queried and (b) cannot be used
for planning or navigation. In this paper, we consider partially observed
environments (POEs), where an agent receives perceptually aliased observations
as it navigates, which makes path planning hard. We introduce a transformer
with (multiple) discrete bottleneck(s), TDB, whose latent codes learn a
compressed representation of the history of observations and actions. After
training a TDB to predict the future observation(s) given the history, we
extract interpretable cognitive maps of the environment from its active
bottleneck(s) indices. These maps are then paired with an external solver to
solve (constrained) path planning problems. First, we show that a TDB trained
on POEs (a) retains the near perfect predictive performance of a vanilla
transformer or an LSTM while (b) solving shortest path problems exponentially
faster. Second, a TDB extracts interpretable representations from text
datasets, while reaching higher in-context accuracy than vanilla sequence
models. Finally, in new POEs, a TDB (a) reaches near-perfect in-context
accuracy, (b) learns accurate in-context cognitive maps, and (c) solves
in-context path planning problems.
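Once each history of observations and actions is collapsed to a discrete bottleneck index, "pairing the extracted map with an external solver" reduces to ordinary graph search. A minimal stdlib sketch (the four-state ring of latent indices and the action names are invented, not taken from the paper):

```python
from collections import deque

# Transitions harvested while an agent explores, each recorded as
# (latent_index, action, next_latent_index). Toy 4-state ring, assumed data.
transitions = [(0, "R", 1), (1, "R", 2), (2, "R", 3), (3, "R", 0),
               (1, "L", 0), (2, "L", 1), (3, "L", 2), (0, "L", 3)]

def build_map(transitions):
    """Turn observed transitions into an adjacency map: state -> {action: next}."""
    graph = {}
    for s, a, t in transitions:
        graph.setdefault(s, {})[a] = t
    return graph

def shortest_path(graph, start, goal):
    """BFS over the extracted cognitive map; returns a list of actions."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, actions = queue.popleft()
        if s == goal:
            return actions
        for a, t in graph.get(s, {}).items():
            if t not in seen:
                seen.add(t)
                queue.append((t, actions + [a]))
    return None

plan = shortest_path(build_map(transitions), 0, 3)  # one "L" step suffices
```

The transformer's job is only to make the discrete states well-defined under perceptual aliasing; the planning itself is handled by a classical solver such as this BFS, which is why path problems can be solved so much faster than by autoregressive rollout.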
- …