66 research outputs found
Laplace Approximation for Divisive Gaussian Processes for Nonstationary Regression
The standard Gaussian Process (GP) regression is usually formulated under stationarity hypotheses: the noise power is considered constant throughout the input space, and the covariance of the prior distribution is typically modeled as depending only on the difference between input samples. These assumptions can be too restrictive and unrealistic for many real-world problems. Although nonstationarity can be achieved using specific covariance functions, these require prior knowledge of the kind of nonstationarity, which is not available for most applications. In this paper we propose to use the Laplace approximation to make inference in a divisive GP model to perform nonstationary regression, including heteroscedastic noise cases. The log-concavity of the likelihood ensures a unimodal posterior, so the Laplace approximation converges to a unique maximum. The characteristics of the likelihood also allow us to obtain posterior approximations that are accurate when compared to Expectation Propagation (EP) and to the asymptotically exact posterior provided by a Markov Chain Monte Carlo implementation with Elliptical Slice Sampling (ESS), at a reduced computational load with respect to both EP and ESS.
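The mechanics behind the Laplace approximation can be illustrated on a toy one-dimensional posterior. This sketch uses a Poisson-count likelihood with a Gaussian prior as an illustrative stand-in for the paper's divisive GP likelihood (the count `y` and all details are invented): Newton's method climbs the log-concave log-posterior to its unique mode, and the negative inverse curvature there gives the variance of the approximating Gaussian.

```python
import math

# Toy log-concave posterior over a latent f: Poisson likelihood with
# rate exp(f) and observed count y, plus a standard Gaussian prior.
# (Illustrative stand-in, NOT the paper's divisive GP model.)
y = 3.0
logp_grad = lambda f: y - math.exp(f) - f   # d/df log p(f | y)
logp_hess = lambda f: -math.exp(f) - 1.0    # always < 0: log-concave

# Newton's method converges to the unique mode because the Hessian
# is strictly negative everywhere.
f = 0.0
for _ in range(25):
    f -= logp_grad(f) / logp_hess(f)

mode = f
var = -1.0 / logp_hess(mode)  # Laplace: N(mode, var) approximates the posterior
```

The same two quantities (mode and inverse negative Hessian) are what a GP-scale Laplace approximation computes, only with vectors and matrices in place of scalars.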
Slope stabilization in photovoltaic installations through the use of shrub species
The goal of this work was to determine whether planting shrub species can minimize the soil movement occurring on a slope created during the construction of a solar photovoltaic installation. To that end, several shrub species were planted in 2008, and various measurements of soil movement and leaf area were taken in order to determine which species would be most suitable. From April 2011 to April 2012, new measurements were recorded and compared with the previous ones. No direct relationship could be found between the soil movement produced and the leaf area, nor was it evident that the planted vegetation was more effective at minimizing soil movement than the vegetation that grew spontaneously in control plots.
Ingeniería Agrícola y Forestal. Máster en Investigación en Ingeniería para el Desarrollo Agroforestal.
Sparse Gaussian processes for large-scale machine learning
Gaussian Processes (GPs) are non-parametric, Bayesian models able to achieve state-of-the-art performance in supervised learning tasks such as non-linear regression and classification, thus being used as building blocks for more sophisticated machine learning applications. GPs also enjoy a number of other desirable properties: They are virtually overfitting-free, have sound and convenient model selection procedures, and provide so-called “error bars”, i.e., estimations of their predictions’ uncertainty. Unfortunately, full GPs cannot be directly applied to real-world, large-scale data sets due to their high computational cost. For n data samples, training a GP requires O(n³) computation time, which renders modern desktop computers unable to handle databases with more than a few thousand instances. Several sparse approximations that scale linearly with the number of data samples have been recently proposed, with the Sparse Pseudo-inputs GP (SPGP) representing the current state of the art. Sparse GP approximations can be used to deal with large databases but, of course, do not usually achieve the performance of full GPs. In this thesis we present several novel sparse GP models that compare favorably with SPGP, both in terms of predictive performance and error bar quality. Our models converge to the full GP under some conditions, but our goal is not so much to faithfully approximate full GPs as it is to develop useful models that provide high-quality probabilistic predictions. By doing so, even full GPs are occasionally outperformed. We provide two broad classes of models: Marginalized Networks (MNs) and Inter-Domain GPs (IDGPs). MNs can be seen as models that lie in between classical Neural Networks (NNs) and full GPs, trying to combine the advantages of both.
Though trained differently, when used for prediction they retain the structure of classical NNs, so they can be interpreted as a novel way to train a classical NN, while adding the benefit of input-dependent error bars and overfitting resistance. IDGPs generalize SPGP by allowing the “pseudo-inputs” to lie in a different domain, thus adding extra flexibility and performance. Furthermore, they provide a convenient probabilistic framework in which previous sparse methods can be more easily understood. All the proposed algorithms are tested and compared with the current state of the art on several standard, large-scale data sets with different properties. Their strengths and weaknesses are also discussed and compared, so that it is easier to select the best suited candidate for each potential application.
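The cost gap that motivates the thesis can be seen in a small numerical sketch. The full GP predictive mean requires solving an n×n linear system, while a generic inducing-point approximation only solves an m×m system; the Subset-of-Regressors-style construction below is a simpler relative of SPGP, and every size, kernel, and variable name is illustrative.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(0)
n, m, noise = 200, 20, 0.1
X = np.sort(rng.uniform(-3, 3, n))
y = np.sin(X) + noise * rng.standard_normal(n)
Xs = np.linspace(-3, 3, 50)          # test inputs

# Full GP predictive mean: an n x n solve, O(n^3).
Knn = rbf(X, X) + noise**2 * np.eye(n)
mu_full = rbf(Xs, X) @ np.linalg.solve(Knn, y)

# Inducing-point (Subset-of-Regressors) mean: only an m x m solve, O(n m^2).
Z = np.linspace(-3, 3, m)            # inducing inputs (here: a fixed grid)
Kmn = rbf(Z, X)
A = Kmn @ Kmn.T + noise**2 * rbf(Z, Z)
mu_sor = rbf(Xs, Z) @ np.linalg.solve(A, Kmn @ y)

err = np.max(np.abs(mu_full - mu_sor))
```

With m much smaller than n the approximation is cheap, and on smooth data like this the two predictive means are nearly indistinguishable.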
Overlapping Mixtures of Gaussian Processes for the data association problem
In this work we introduce a mixture of GPs to address the data association
problem, i.e. to label a group of observations according to the sources that
generated them. Unlike several previously proposed GP mixtures, the novel
mixture has the distinct characteristic of using no gating function to
determine the association of samples and mixture components. Instead, all the
GPs in the mixture are global and samples are clustered following
"trajectories" across input space. We use a non-standard variational Bayesian
algorithm to efficiently recover sample labels and learn the hyperparameters.
We show how multi-object tracking problems can be disambiguated and also
explore the characteristics of the model in traditional regression settings.
Pseudospectral Model Predictive Control under Partially Learned Dynamics
Trajectory optimization of a controlled dynamical system is an essential part
of autonomy; however, many trajectory optimization techniques are limited by the
fidelity of the underlying parametric model. In the field of robotics, a lack
of model knowledge can be overcome with machine learning techniques, utilizing
measurements to build a dynamical model from the data. This paper aims to take
the middle ground between these two approaches by introducing a semi-parametric
representation of the underlying system dynamics. Our goal is to leverage the
considerable information contained in a traditional physics based model and
combine it with a data-driven, non-parametric regression technique known as a
Gaussian Process. Integrating this semi-parametric model with model predictive
pseudospectral control, we demonstrate this technique on both a cart pole and
quadrotor simulation with unmodeled damping and parametric error. In order to
manage parametric uncertainty, we introduce an algorithm that utilizes Sparse
Spectrum Gaussian Processes (SSGP) for online learning after each rollout. We
implement this online learning technique on a cart pole and quadrotor, then
demonstrate the use of online learning and obstacle avoidance for the Dubins
vehicle dynamics.
Comment: Accepted but withdrawn from AIAA Scitech 201
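A Sparse Spectrum GP approximates the kernel with trigonometric basis functions whose frequencies are sampled from the kernel's spectral density, turning GP regression into Bayesian linear regression in feature space. The minimal sketch below uses a toy one-dimensional target and a closed-form batch solve; the sizes, the target function, and the batch fit all stand in for the paper's per-rollout online updates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, D, ls, noise = 300, 200, 0.5, 0.1
X = rng.uniform(-3, 3, n)
y = np.sin(2 * X) + noise * rng.standard_normal(n)  # toy 1-D "dynamics" target

# Sparse spectrum: draw D frequencies from the RBF kernel's spectral
# density N(0, 1/ls^2); paired cos/sin features approximate the kernel.
w = rng.standard_normal(D) / ls

def phi(x):
    ang = np.outer(x, w)
    return np.concatenate([np.cos(ang), np.sin(ang)], axis=1) / np.sqrt(D)

# Regularized linear regression in feature space: O(n D^2) rather than
# O(n^3), and amenable to cheap rank-one updates as rollout data arrives.
Phi = phi(X)
A = Phi.T @ Phi + noise**2 * np.eye(2 * D)
weights = np.linalg.solve(A, Phi.T @ y)
rmse = np.sqrt(np.mean((Phi @ weights - y) ** 2))
```

Because the model is linear in its weights, incorporating a new rollout's data reduces to updating `A` and the projected targets, which is what makes the per-rollout learning in the paper practical.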
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
Despite their stellar performance on a wide range of tasks, including
in-context tasks only revealed during inference, vanilla transformers and
variants trained for next-token predictions (a) do not learn an explicit world
model of their environment which can be flexibly queried and (b) cannot be used
for planning or navigation. In this paper, we consider partially observed
environments (POEs), where an agent receives perceptually aliased observations
as it navigates, which makes path planning hard. We introduce a transformer
with (multiple) discrete bottleneck(s), TDB, whose latent codes learn a
compressed representation of the history of observations and actions. After
training a TDB to predict the future observation(s) given the history, we
extract interpretable cognitive maps of the environment from its active
bottleneck(s) indices. These maps are then paired with an external solver to
solve (constrained) path planning problems. First, we show that a TDB trained
on POEs (a) retains the near perfect predictive performance of a vanilla
transformer or an LSTM while (b) solving shortest path problems exponentially
faster. Second, a TDB extracts interpretable representations from text
datasets, while reaching higher in-context accuracy than vanilla sequence
models. Finally, in new POEs, a TDB (a) reaches near-perfect in-context
accuracy, (b) learns accurate in-context cognitive maps, and (c) solves
in-context path planning problems.
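Once each history of observations and actions is collapsed to a discrete bottleneck index, "pairing the extracted map with an external solver" reduces to ordinary graph search. A minimal stdlib sketch (the four-state ring of latent indices and the action names are invented, not taken from the paper):

```python
from collections import deque

# Transitions harvested while an agent explores, each recorded as
# (latent_index, action, next_latent_index). Toy 4-state ring, assumed data.
transitions = [(0, "R", 1), (1, "R", 2), (2, "R", 3), (3, "R", 0),
               (1, "L", 0), (2, "L", 1), (3, "L", 2), (0, "L", 3)]

def build_map(transitions):
    """Turn observed transitions into an adjacency map: state -> {action: next}."""
    graph = {}
    for s, a, t in transitions:
        graph.setdefault(s, {})[a] = t
    return graph

def shortest_path(graph, start, goal):
    """BFS over the extracted cognitive map; returns a list of actions."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, actions = queue.popleft()
        if s == goal:
            return actions
        for a, t in graph.get(s, {}).items():
            if t not in seen:
                seen.add(t)
                queue.append((t, actions + [a]))
    return None

plan = shortest_path(build_map(transitions), 0, 3)  # one "L" step suffices
```

The transformer's job is only to make the discrete states well-defined under perceptual aliasing; the planning itself is handled by a classical solver such as this BFS, which is why path problems can be solved so much faster than by autoregressive rollout.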
- …