Contextual directed acyclic graphs
Estimating the structure of directed acyclic graphs (DAGs) from observational
data remains a significant challenge in machine learning. Most research in this
area concentrates on learning a single DAG for the entire population. This
paper considers an alternative setting where the graph structure varies across
individuals based on available "contextual" features. We tackle this contextual
DAG problem via a neural network that maps the contextual features to a DAG,
represented as a weighted adjacency matrix. The neural network is equipped with
a novel projection layer that ensures the output matrices are sparse and
satisfy a recently developed characterization of acyclicity. We devise a
scalable computational framework for learning contextual DAGs and provide a
convergence guarantee and an analytical gradient for backpropagating through
the projection layer. Our experiments suggest that the new approach can recover
the true context-specific graph where existing approaches fail.
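
The following is a minimal sketch of the kind of model this abstract describes: a neural network mapping contextual features to a weighted adjacency matrix, one DAG per individual. PyTorch, the layer sizes, and the acyclicity mechanism are all assumptions; in particular, the strictly lower-triangular mask used here is a simple sufficient condition under a fixed node ordering, not the paper's sparsity-and-acyclicity projection layer.

# Sketch only: masking to a fixed ordering stands in for the paper's
# projection layer, which enforces sparsity and a recently developed
# characterization of acyclicity.
import torch
import torch.nn as nn

class ContextualDAG(nn.Module):
    def __init__(self, context_dim: int, num_nodes: int, hidden: int = 64):
        super().__init__()
        self.d = num_nodes
        self.net = nn.Sequential(
            nn.Linear(context_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_nodes * num_nodes),
        )
        # Strictly lower-triangular mask: zero diagonal, no back-edges,
        # so every output matrix is acyclic by construction.
        self.register_buffer(
            "mask", torch.tril(torch.ones(num_nodes, num_nodes), -1)
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        w = self.net(context).view(-1, self.d, self.d)
        return w * self.mask  # context-specific weighted adjacency matrices

model = ContextualDAG(context_dim=5, num_nodes=10)
W = model(torch.randn(3, 5))  # a batch of 3 context-specific DAGs
print(W.shape)                # torch.Size([3, 10, 10])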
Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
Variational inference techniques based on inducing variables provide an
elegant framework for scalable posterior estimation in Gaussian process (GP)
models. Besides enabling scalability, one of their main advantages over sparse
approximations using direct marginal likelihood maximization is that they
provide a robust alternative for point estimation of the inducing inputs, i.e.
the location of the inducing variables. In this work we challenge the common
wisdom that optimizing the inducing inputs in the variational framework yields
optimal performance. We show that, by revisiting old model approximations such
as the fully-independent training conditionals endowed with powerful
sampling-based inference methods, treating both inducing locations and GP
hyper-parameters in a Bayesian way can improve performance significantly. Based
on stochastic gradient Hamiltonian Monte Carlo, we develop a fully Bayesian
approach to scalable GP and deep GP models, and demonstrate its
state-of-the-art performance through an extensive experimental campaign across
several regression and classification problems.
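
To make the sampling step concrete, below is a minimal sketch of a stochastic gradient Hamiltonian Monte Carlo (SGHMC) update applied jointly to inducing locations and a hyper-parameter, in the spirit of the fully Bayesian treatment described above. The toy Gaussian log-posterior, step size, and friction term are stand-in assumptions; the paper's target would be the sparse GP marginal likelihood plus priors.

import numpy as np

rng = np.random.default_rng(0)

def grad_log_post(theta):
    # Toy stand-in target: standard Gaussian posterior over all parameters.
    return -theta

def sghmc(theta, n_steps=1000, eta=1e-3, alpha=0.1):
    # Standard SGHMC recursion: friction-damped momentum plus injected noise.
    v = np.zeros_like(theta)
    samples = []
    for _ in range(n_steps):
        noise = rng.normal(0.0, np.sqrt(2 * alpha * eta), size=theta.shape)
        v = (1 - alpha) * v + eta * grad_log_post(theta) + noise
        theta = theta + v
        samples.append(theta.copy())
    return np.array(samples)

# Stack flattened inducing locations Z and a log-lengthscale into one vector,
# so both are sampled rather than point-optimised.
theta0 = np.concatenate([rng.normal(size=20), [0.0]])
samples = sghmc(theta0)
print(samples.mean(axis=0)[:3])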
Portable compiler optimisation across embedded programs and microarchitectures using machine learning
Building an optimising compiler is a difficult and time-consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases of the compiler to effectively exploit a processor’s performance potential, by which time a new generation has appeared and the process starts again. We address this challenge by developing a portable optimising compiler. Our approach employs machine learning to automatically learn the best optimisations to apply for any new program on a new microarchitectural configuration. It achieves this by learning a model off-line which maps a microarchitecture description plus the hardware counters from a single run of the program to the best compiler optimisation passes. Our compiler gains 67% of the maximum speedup obtainable by an iterative compiler search using 1000 evaluations. We obtain, on average, a 1.16x speedup over the highest default optimisation level across an entire microarchitecture configuration space, achieving a 4.3x speedup in the best case. We demonstrate the robustness of this technique by applying it to an extended microarchitectural space where we achieve comparable performance.
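
Below is a minimal sketch of the offline model this abstract describes: a classifier mapping a microarchitecture description plus one profiling run's hardware counters to a predicted-best optimisation setting. The feature names, the random-forest learner, and the synthetic labels are illustrative assumptions, not the paper's exact feature set or model.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical features: microarchitecture parameters (e.g. cache size,
# issue width) concatenated with hardware counters from a single run
# (e.g. branch-miss rate, IPC, L1-miss rate).
X = rng.random((200, 5))
# Hypothetical labels: index of the best-performing optimisation pass
# sequence for each (program, microarchitecture) pair, found offline.
y = rng.integers(0, 4, size=200)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# At deployment: profile the new program once on the new microarchitecture,
# then predict which passes to apply, replacing an iterative search.
new_run = rng.random((1, 5))
print("predicted pass sequence id:", model.predict(new_run)[0])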