Variational Counterfactual Prediction under Runtime Domain Corruption
To date, various neural methods have been proposed for causal effect
estimation based on observational data, where a default assumption is the same
distribution and availability of variables at both training and inference
(i.e., runtime) stages. However, distribution shift (i.e., domain shift) could
happen during runtime, and bigger challenges arise from the impaired
accessibility of variables. This is commonly caused by increasing privacy and
ethical concerns, which can make arbitrary variables unavailable in the entire
runtime data and imputation impractical. We term the co-occurrence of domain
shift and inaccessible variables runtime domain corruption, which seriously
impairs the generalizability of a trained counterfactual predictor. To counter
runtime domain corruption, we subsume counterfactual prediction under the
notion of domain adaptation. Specifically, we upper-bound the error w.r.t. the
target domain (i.e., runtime covariates) by the sum of source domain error and
inter-domain distribution distance. In addition, we build an adversarially
unified variational causal effect model, named VEGAN, with a novel two-stage
adversarial domain adaptation scheme to reduce the latent distribution
disparity between treated and control groups first, and between training and
runtime variables afterwards. We demonstrate that VEGAN outperforms other
state-of-the-art baselines on individual-level treatment effect estimation in
the presence of runtime domain corruption on benchmark datasets.
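The error bound referenced in this abstract follows the classic domain-adaptation form; below is a hedged reconstruction in conventional notation (the symbols are standard, not taken from the paper, which does not state its exact bound here):

```latex
% Runtime (target-domain) error bounded by training (source-domain) error
% plus an inter-domain distribution distance and an ideal-joint-error term
\epsilon_T(h) \;\le\; \epsilon_S(h) \;+\; d\!\left(\mathcal{D}_S, \mathcal{D}_T\right) \;+\; \lambda
```

where $h$ is the learned predictor, $\epsilon_S(h)$ and $\epsilon_T(h)$ are its errors on the source (training) and target (runtime) domains, $d(\cdot,\cdot)$ is an inter-domain distribution distance, and $\lambda$ is the error of the best hypothesis on both domains jointly.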
Sampling-Based Optimization for Multi-Agent Model Predictive Control
We systematically review the Variational Optimization, Variational Inference
and Stochastic Search perspectives on sampling-based dynamic optimization and
discuss their connections to state-of-the-art optimizers and Stochastic Optimal
Control (SOC) theory. A general convergence and sample complexity analysis on
the three perspectives is provided through the unifying Stochastic Search
perspective. We then extend these frameworks to their distributed versions for
multi-agent control by combining them with consensus Alternating Direction
Method of Multipliers (ADMM) to decouple the full problem into local
neighborhood-level ones that can be solved in parallel. Model Predictive
Control (MPC) algorithms are then developed based on these frameworks, leading
to fully decentralized sampling-based dynamic optimizers. The capabilities of
the proposed algorithmic framework are demonstrated on multiple complex
multi-agent tasks for vehicle and quadcopter systems in simulation. The results
compare different distributed sampling-based optimizers and their centralized
counterparts using unimodal Gaussian, mixture of Gaussians, and Stein
variational policies. The scalability of the proposed distributed algorithms is
demonstrated on a 196-vehicle scenario where a direct application of
centralized sampling-based methods is shown to be prohibitive.
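As a toy illustration of the sampling-based dynamic optimization these frameworks build on, here is a minimal single-agent, one-dimensional MPPI-style update in Python. This is a generic sketch, not the distributed consensus-ADMM algorithm of the paper; all function names and parameters are illustrative.

```python
import math
import random

def mppi_step(x0, dynamics, cost, horizon=20, samples=256, sigma=0.5, temp=1.0):
    # Sample random control sequences, roll each out through the dynamics,
    # and accumulate the trajectory cost.
    rollouts = []
    for _ in range(samples):
        u_seq = [random.gauss(0.0, sigma) for _ in range(horizon)]
        x, total_cost = x0, 0.0
        for u in u_seq:
            x = dynamics(x, u)
            total_cost += cost(x, u)
        rollouts.append((total_cost, u_seq))
    # Softmax weighting of trajectories by exponentiated negative cost
    # (subtracting the best cost for numerical stability).
    best = min(c for c, _ in rollouts)
    weights = [math.exp(-(c - best) / temp) for c, _ in rollouts]
    z = sum(weights)
    # Return the weighted average of the first control of each sequence.
    return sum(w * u_seq[0] for w, (_, u_seq) in zip(weights, rollouts)) / z
```

For example, with dynamics `x + 0.1 * u` and cost `x**2`, starting from `x0 = 2.0`, the returned first control is negative, pushing the state toward the origin.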
Probabilistic Models of Motor Production
N. Bernstein defined the ability of the central nervous system (CNS) to control many degrees of freedom of a physical body, with all its redundancy and flexibility, as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested how the brain might deal with this: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in the directions that have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant even today.
One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs): small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, having a homogeneous hierarchical architecture built of similar computational elements may be beneficial.
Studying such a complicated object as the brain, it is important to define at which level of detail one works and which questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, questioning which representations the system uses and which computations it performs; 3. implementational, finding how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output.
In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment. Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty.
The special type of coupling of dynamical systems we propose, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organised hierarchically through serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents the sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows learning a library of motor primitives. When primitives are learned separately, a combined motion can be represented as a matrix of coupling values.
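As a minimal illustration of why Product-of-Experts coupling yields Bayesian cue integration, here is a sketch of fusing Gaussian expert predictions. This is the generic closed-form identity for Gaussian experts, not the thesis's dynamical model; the function name is illustrative.

```python
def product_of_gaussian_experts(means, variances):
    # The product of Gaussian densities is (up to normalisation) Gaussian:
    # precisions add, and the mean is the precision-weighted average of
    # the expert means -- exactly the Bayesian cue-integration rule.
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return mean, 1.0 / total_precision
```

Two equally reliable cues at 0 and 2 fuse to `product_of_gaussian_experts([0.0, 2.0], [1.0, 1.0])` → `(1.0, 0.5)`: the combined estimate sits between the cues and is more certain than either alone.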
We performed a set of experiments to compare different models of motor primitives. In a series of two-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, thus running a graphics Turing test. When available, the Bayesian model score predicted the naturalness of the perceived movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like combined walking and waving, motion is better represented as a set of coupled dynamical systems. We also experimentally confirmed that a Bayesian treatment of model learning on motion data is superior to a simple point estimate of the latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite their high kinematic complexity.
By having fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness. We confirmed that rich and correct dynamics are more important than the kinematic representation.
There are numerous further directions of research. In the models we devised for multiple body parts, even though the latent dynamics was factorised into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by the latent dynamics interactions. A more flexible model would allow dense interaction at the kinematic level too.
Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains form an approximation to continuous dynamics. As the time step is assumed to be fixed, we face the problem of time-step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicit optimization of time as a parameter and reasoning (inference) about it. For example, in optimal control, boundary conditions are usually set at exact time points, which is not an ecological scenario, where time is usually a parameter of the optimization. Making time an explicit parameter of the dynamics may alleviate these issues.
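The time-step selection problem can be seen in a toy discretisation. The following sketch (a hypothetical example, not from the thesis) shows how the accuracy of a fixed-step, Markov-chain-like approximation of the continuous dynamics dx/dt = -rate * x depends on the chosen step size.

```python
import math

def euler_decay(x0, rate, dt, t_end):
    # Fixed-step Euler discretisation of dx/dt = -rate * x: a discrete-time
    # Markov chain whose accuracy is tied to the chosen step dt.
    x = x0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * (-rate * x)
    return x
```

With `x0 = 1, rate = 1, t_end = 1`, the exact solution is `exp(-1) ≈ 0.3679`; a coarse step `dt = 0.1` lands noticeably further from it than `dt = 0.001`, illustrating why a fixed, implicit time step is a modelling liability.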
Roq: Robust Query Optimization Based on a Risk-aware Learned Cost Model
Query optimizers in relational database management systems (RDBMSs) search
for execution plans expected to be optimal for a given query. They use
parameter estimates, often inaccurate, and make assumptions that may not hold
in practice. Consequently, they may select execution plans that are suboptimal
at runtime, when these estimates and assumptions are not valid, which may
result in poor query performance. Therefore, query optimizers do not
sufficiently support robust query optimization. Recent years have seen a surge
of interest in using machine learning (ML) to improve efficiency of data
systems and reduce their maintenance overheads, with promising results obtained
in the area of query optimization in particular. In this paper, inspired by
these advancements, and based on several years of experience of IBM Db2 in this
journey, we propose Robust Optimization of Queries, (Roq), a holistic framework
that enables robust query optimization based on a risk-aware learning approach.
Roq includes a novel formalization of the notion of robustness in the context
of query optimization and a principled approach for its quantification and
measurement based on approximate probabilistic ML. It also includes novel
strategies and algorithms for query plan evaluation and selection. Roq further
includes a novel learned cost model that is designed to predict query execution
cost and the associated risks and performs query optimization accordingly. We
demonstrate experimentally that Roq provides significant improvements to robust
query optimization compared to the state-of-the-art.
Comment: 13 pages, 9 figures, submitted to SIGMOD 202
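As a hedged sketch of what risk-aware plan selection can look like: the `mean_cost`/`std_cost` keys and the linear mean-plus-risk penalty below are assumptions for illustration, not Roq's actual algorithm, which the abstract does not specify.

```python
def select_plan(plans, risk_aversion=1.0):
    # Risk-adjusted score: predicted mean execution cost plus a multiple
    # of the model's predictive uncertainty (std) for that plan, so a
    # slightly costlier but more predictable plan can win.
    return min(plans, key=lambda p: p["mean_cost"] + risk_aversion * p["std_cost"])
```

For instance, a plan with predicted cost 12 ± 1 beats one with 10 ± 5 under `risk_aversion=1.0`, while a purely cost-greedy selection (`risk_aversion=0.0`) would pick the riskier plan.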
Autonomous Exploration over Continuous Domains
Motion planning is an essential aspect of robot autonomy, and as such it has been studied for decades, producing a wide range of planning methodologies. Path planners are generally categorised as either trajectory optimisers or sampling-based planners. The latter is the predominant planning paradigm, as it can resolve a path efficiently while explicitly reasoning about path safety. Yet, with a limited budget, the resulting paths are far from optimal. In contrast, state-of-the-art trajectory optimisers explicitly trade off between path safety and efficiency to produce locally optimal paths. However, these planners cannot incorporate updates from a partially observed model, such as an occupancy map, and fail to plan around information gaps caused by incomplete sensor coverage. Autonomous exploration adds another twist to path planning. The objective of exploration is to safely and efficiently traverse through an unknown environment in order to map it. The desired output of such a process is a sequence of paths that safely and efficiently minimises the uncertainty of the map. However, optimising over the entire space of trajectories is computationally intractable. Therefore, most exploration algorithms relax the general formulation by optimising a simpler one, for example finding the single next best view, resulting in suboptimal performance. This thesis investigates methodologies for optimal and safe exploration over continuous paths. Contrary to existing exploration algorithms that break exploration into the independent sub-problems of finding goal points and planning safe paths to these points, our holistic approach simultaneously optimises the coupled problems of where and how to explore, offering a shift in paradigm from next best view to next best path. With exploration defined as an optimisation problem over continuous paths, this thesis explores two different optimisation paradigms: Bayesian and functional.
ProSpar-GP: scalable Gaussian process modeling with massive non-stationary datasets
Gaussian processes (GPs) are a popular class of Bayesian nonparametric
models, but their training can be computationally burdensome for massive training
datasets. While there has been notable work on scaling up these models for big
data, existing methods typically rely on a stationary GP assumption for
approximation, and can thus perform poorly when the underlying response surface
is non-stationary, i.e., it has some regions of rapid change and other regions
with little change. Such non-stationarity is, however, ubiquitous in real-world
problems, including our motivating application for surrogate modeling of
computer experiments. We thus propose a new Product of Sparse GP (ProSpar-GP)
method for scalable GP modeling with massive non-stationary data. The
ProSpar-GP makes use of a carefully-constructed product-of-experts formulation
of sparse GP experts, where different experts are placed within local regions
of non-stationarity. These GP experts are fit via a novel variational inference
approach, which capitalizes on mini-batching and GPU acceleration for efficient
optimization of inducing points and length-scale parameters for each expert. We
further show that the ProSpar-GP is Kolmogorov-consistent, in that its
generative distribution defines a valid stochastic process over the prediction
space; such a property provides essential stability for variational inference,
particularly in the presence of non-stationarity. We then demonstrate the
improved performance of the ProSpar-GP over the state-of-the-art, in a suite of
numerical experiments and an application for surrogate modeling of a satellite
drag simulator.
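A toy sketch of the product-of-experts idea with local experts (a hypothetical interface, not the ProSpar-GP implementation): when Gaussian predictions are multiplied, experts that are uncertain outside their local region are automatically down-weighted, so the combined prediction follows the expert responsible for that region.

```python
def poe_predict(x, experts):
    # Each expert returns a (mean, variance) prediction at input x.
    # Multiplying Gaussian predictions adds precisions, so a confident
    # local expert dominates uncertain far-away experts.
    total_precision, weighted_mean = 0.0, 0.0
    for expert in experts:
        mean, var = expert(x)
        total_precision += 1.0 / var
        weighted_mean += mean / var
    return weighted_mean / total_precision, 1.0 / total_precision

def toy_expert(center, fn, width=1.0):
    # Hypothetical local expert: confident near its centre, very
    # uncertain elsewhere (variance grows with distance from the centre).
    def predict(x):
        var = 0.01 + (x - center) ** 2 / width
        return fn(x), var
    return predict
```

Querying near one expert's centre yields a combined mean close to that expert's prediction, with variance close to that expert's own; this locality is the intuition behind placing sparse GP experts in regions of non-stationarity.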