Conservative objective models are a special kind of contrastive divergence-based energy model
In this work we theoretically show that conservative objective models (COMs)
for offline model-based optimisation (MBO) are a special kind of contrastive
divergence-based energy model, one where the energy function represents both
the unconditional probability of the input and the conditional probability of
the reward variable. While the initial formulation only samples modes from its
learned distribution, we propose a simple fix that replaces its gradient ascent
sampler with a Langevin MCMC sampler. This gives rise to a special
probabilistic model where the probability of sampling an input is proportional
to its predicted reward. Lastly, we show that better samples can be obtained if
the model is decoupled so that the unconditional and conditional probabilities
are modelled separately.
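
To illustrate the sampler swap described in the abstract above, here is a minimal sketch of unadjusted Langevin dynamics next to plain gradient descent on an energy. It is not the authors' implementation: the toy quadratic energy, the function names, and the step size are all assumptions chosen for illustration. The gradient-only update collapses every start point onto the mode, while the Langevin update yields samples spread according to exp(-E(x)).

```python
import numpy as np

def langevin_sample(grad_energy, x0, step_size=0.01, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics targeting p(x) proportional to exp(-E(x))."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step on the energy plus Gaussian noise scaled by sqrt(2 * step).
        x = x - step_size * grad_energy(x) + np.sqrt(2.0 * step_size) * noise
    return x

def gradient_descent_on_energy(grad_energy, x0, step_size=0.01, n_steps=1000):
    """Noise-free counterpart: converges to a mode of the distribution."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step_size * grad_energy(x)
    return x

def toy_grad_energy(x):
    # Hypothetical energy E(x) = 0.5 * x^2, i.e. a standard normal target.
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal(2000) * 3.0  # 2000 independent one-dimensional chains
samples = langevin_sample(toy_grad_energy, x0, step_size=0.01, n_steps=1000, rng=rng)
modes = gradient_descent_on_energy(toy_grad_energy, x0, step_size=0.01, n_steps=1000)
```

In a reward-driven setting, the energy would be defined so that low energy corresponds to high predicted reward; that mapping is left out here because the abstract does not specify it.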
Towards good validation metrics for generative models in offline model-based optimisation
In this work we propose a principled evaluation framework for model-based
optimisation to measure how well a generative model can extrapolate. We achieve
this by interpreting the training and validation splits as draws from their
respective 'truncated' ground truth distributions, where examples in the
validation set contain scores much larger than those in the training set. Model
selection is performed on the validation set for some prescribed validation
metric. A major research question however is in determining what validation
metric correlates best with the expected value of generated candidates with
respect to the ground truth oracle; work towards answering this question can
translate to large economic gains since it is expensive to evaluate the ground
truth oracle in the real world. We compare various validation metrics for
generative adversarial networks using our framework. We also discuss
limitations with our framework with respect to existing datasets and how
progress can be made to mitigate them.
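
The train/validation construction described above can be sketched as a split on a score threshold, so that every validation example scores higher than every training example. This is only an illustration under assumptions: the threshold name `gamma`, the median cut, and the synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np

def truncated_split(x, y, threshold):
    """Put examples with score <= threshold in train, the rest in validation."""
    train_mask = y <= threshold
    return (x[train_mask], y[train_mask]), (x[~train_mask], y[~train_mask])

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 4))   # hypothetical designs
y = x.sum(axis=1)                   # hypothetical ground truth scores
gamma = np.quantile(y, 0.5)         # assumed cut point: the median score

(train_x, train_y), (valid_x, valid_y) = truncated_split(x, y, gamma)
```

A model selected on `valid_x`, `valid_y` is then being judged on scores it never saw during training, which is the extrapolation setting the abstract targets.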
Causal Graphs Underlying Generative Models: Path to Learning with Limited Data
Training generative models that capture rich semantics of the data and
interpreting the latent representations encoded by such models are very
important problems in unsupervised learning. In this work, we provide a simple
algorithm that relies on perturbation experiments on latent codes of a
pre-trained generative autoencoder to uncover a causal graph that is implied by
the generative model. We leverage pre-trained attribute classifiers and perform
perturbation experiments to check for influence of a given latent variable on a
subset of attributes. Given this, we show that one can fit an effective causal
graph that models a structural equation model between latent codes taken as
exogenous variables and attributes taken as observed variables. One interesting
aspect is that a single latent variable controls multiple overlapping subsets
of attributes, unlike conventional approaches that try to impose full
independence. Using a pre-trained RNN-based generative autoencoder trained on a
dataset of peptide sequences, we demonstrate that the learnt causal graph from
our algorithm between various attributes and latent codes can be used to
predict a specific property for sequences which are unseen. We compare
prediction models trained on either all available attributes or only the ones
in the Markov blanket and empirically show that in both the unsupervised and
supervised regimes, typically, using the predictor that relies on Markov
blanket attributes generalizes better for out-of-distribution sequences.
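
The perturbation experiments described above can be sketched as follows: shift one latent dimension at a time, decode, and record how much each attribute prediction moves. This is a toy illustration, not the paper's algorithm; the linear `decode`, the identity `attributes` classifier, the perturbation size, and the 0.5 edge threshold are all hypothetical stand-ins for a pre-trained autoencoder and attribute classifiers.

```python
import numpy as np

def influence_matrix(decode, attributes, z_batch, delta=1.0):
    """Mean absolute change in each attribute when one latent dimension is shifted."""
    base = attributes(decode(z_batch))              # shape (n_samples, n_attributes)
    n_latent = z_batch.shape[1]
    influence = np.zeros((n_latent, base.shape[1]))
    for j in range(n_latent):
        z_perturbed = z_batch.copy()
        z_perturbed[:, j] += delta                  # perturb latent dimension j only
        changed = attributes(decode(z_perturbed))
        influence[j] = np.abs(changed - base).mean(axis=0)
    return influence

# Hypothetical linear decoder: latent 0 drives attributes 0 and 1,
# latent 1 drives attributes 1 and 2 (attribute 1 is shared).
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

def decode(z):
    return z @ W

def attributes(x):
    return x  # stand-in for pre-trained attribute classifiers

rng = np.random.default_rng(0)
z = rng.standard_normal((64, 2))
influence = influence_matrix(decode, attributes, z)
edges = influence > 0.5   # assumed threshold: keep a latent -> attribute edge
```

In this toy setup both latent variables end up with an edge to attribute 1, which mirrors the overlapping attribute subsets mentioned in the abstract.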
Parallel-mentoring for Offline Model-based Optimization
We study offline model-based optimization to maximize a black-box objective
function with a static dataset of designs and scores. These designs encompass a
variety of domains, including materials, robots and DNA sequences. A common
approach trains a proxy on the static dataset to approximate the black-box
objective function and performs gradient ascent to obtain new designs. However,
this often results in poor designs due to proxy inaccuracies on
out-of-distribution designs. Recent studies indicate that: (a) gradient ascent
with a mean ensemble of proxies generally outperforms simple gradient ascent,
and (b) a trained proxy provides weak ranking supervision signals for design
selection. Motivated by (a) and (b), we propose parallel-mentoring as
an effective and novel method that facilitates mentoring among parallel
proxies, creating a more robust ensemble to mitigate the out-of-distribution
issue. We focus on the three-proxy case and our method consists of two modules.
The first module, voting-based pairwise supervision, operates on three
parallel proxies and captures their ranking supervision signals as pairwise
comparison labels. These labels are combined through majority voting to
generate consensus labels, which incorporate ranking supervision signals from
all proxies and enable mutual mentoring. However, label noise arises due to
possible incorrect consensus. To alleviate this, we introduce an
adaptive soft-labeling module with soft-labels initialized as
consensus labels. Based on bi-level optimization, this module fine-tunes
proxies in the inner level and learns more accurate labels in the outer level
to adaptively mentor proxies, resulting in a more robust ensemble. Experiments
validate the effectiveness of our method. Our code is available here. Comment: Accepted by NeurIPS 202
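
The voting step of the first module can be sketched as below: each proxy turns its scores into pairwise comparison labels, and a majority vote over the three proxies gives consensus labels. This is a minimal sketch under assumptions, not the released code; the function names and the example scores are hypothetical, and the adaptive soft-labeling stage is not shown.

```python
import numpy as np

def pairwise_labels(scores):
    """label[i, j] = 1 if the proxy scores design i above design j, else 0."""
    return (scores[:, None] > scores[None, :]).astype(int)

def consensus_labels(proxy_scores):
    """Majority vote over the pairwise labels of three proxies."""
    votes = np.stack([pairwise_labels(s) for s in proxy_scores])
    return (votes.sum(axis=0) >= 2).astype(int)

# Hypothetical scores from three proxies (rows) for three designs (columns).
proxy_scores = np.array([
    [0.9, 0.2, 0.5],   # proxy A ranks design 0 > 2 > 1
    [0.8, 0.6, 0.4],   # proxy B ranks design 0 > 1 > 2
    [0.1, 0.3, 0.7],   # proxy C ranks design 2 > 1 > 0
])
consensus = consensus_labels(proxy_scores)
```

Here two of the three proxies agree that design 0 beats the other two, so the consensus overrides proxy C on those pairs.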
ROMO: Retrieval-enhanced Offline Model-based Optimization
Data-driven black-box model-based optimization (MBO) problems arise in a
great number of practical application scenarios, where the goal is to find a
design over the whole space maximizing a black-box target function based on a
static offline dataset. In this work, we consider a more general but
challenging MBO setting, named constrained MBO (CoMBO), where only part of the
design space can be optimized while the rest is constrained by the environment.
A new challenge arising from CoMBO is that most observed designs that satisfy
the constraints are mediocre in evaluation. Therefore, we focus on optimizing
these mediocre designs in the offline dataset while maintaining the given
constraints rather than further boosting the best observed design in the
traditional MBO setting. We propose retrieval-enhanced offline model-based
optimization (ROMO), a new derivable forward approach that retrieves the
offline dataset and aggregates relevant samples to provide a trusted
prediction, and uses it for gradient-based optimization. ROMO is simple to
implement and outperforms state-of-the-art approaches in the CoMBO setting.
Empirically, we conduct experiments on a synthetic Hartmann (3D) function
dataset, an industrial CIO dataset, and a suite of modified tasks in the
Design-Bench benchmark. Results show that ROMO performs well in a wide range of
constrained optimization tasks. Comment: 15 pages, 9 figures
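
As a rough picture of "retrieve, then aggregate", the sketch below uses a generic distance-weighted nearest-neighbour predictor. It is not ROMO's architecture, which the abstract does not detail; the function name, the choice of squared Euclidean distance, the softmax weighting, and the toy dataset are all assumptions.

```python
import numpy as np

def retrieve_and_aggregate(query, designs, scores, k=3, temperature=1.0):
    """Retrieve the k nearest offline designs and average their scores,
    weighting closer designs more heavily (softmax over negative distance)."""
    sq_dist = ((designs - query) ** 2).sum(axis=1)
    idx = np.argsort(sq_dist)[:k]                 # indices of retrieved designs
    logits = -sq_dist[idx] / temperature
    weights = np.exp(logits - logits.max())       # numerically stable softmax
    weights /= weights.sum()
    return float(weights @ scores[idx]), idx

# Hypothetical offline dataset: three nearby designs and one distant outlier.
designs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
scores = np.array([1.0, 2.0, 3.0, 100.0])

prediction, retrieved = retrieve_and_aggregate(np.array([0.0, 0.0]), designs, scores)
```

Because the prediction is a convex combination of retrieved scores, it stays within the range of scores actually observed near the query, which is one way to read the abstract's "trusted prediction".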
Physics-Driven ML-Based Modelling for Correcting Inverse Estimation
When deploying machine learning estimators in science and engineering (SAE)
domains, it is critical to avoid failed estimations that can have disastrous
consequences, e.g., in aero engine design. This work focuses on detecting and
correcting failed state estimations before adopting them in SAE inverse
problems, by utilizing simulations and performance metrics guided by physical
laws. We suggest flagging a machine learning estimation when its physical model
error exceeds a feasible threshold, and propose a novel approach, GEESE, to
correct it through optimization, aiming at delivering both low error and high
efficiency. The key designs of GEESE include (1) a hybrid surrogate error model
to provide fast error estimations to reduce simulation cost and to enable
gradient based backpropagation of error feedback, and (2) two generative models
to approximate the probability distributions of the candidate states for
simulating the exploitation and exploration behaviours. All three models are
constructed as neural networks. GEESE is tested on three real-world SAE inverse
problems and compared to a number of state-of-the-art optimization/search
approaches. Results show that it fails the least number of times in terms of
finding a feasible state correction, and requires physical evaluations less
frequently in general. Comment: 19 pages; the paper is accepted by NeurIPS 2023 as a spotlight
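
The flagging rule described above (mark an estimate as failed when its physical model error exceeds a threshold) can be sketched as follows. This covers only the detection step, not GEESE's correction procedure; the toy forward model, the Euclidean error, and the threshold value are hypothetical.

```python
import numpy as np

def physical_error(forward_model, state_estimate, observation):
    """Mismatch between the simulated output of an estimated state and the observation."""
    return float(np.linalg.norm(forward_model(state_estimate) - observation))

def needs_correction(forward_model, state_estimate, observation, threshold):
    """Flag the estimate when its physical model error exceeds the threshold."""
    return physical_error(forward_model, state_estimate, observation) > threshold

def toy_forward_model(state):
    # Hypothetical stand-in for a physics simulation.
    return state ** 2

observation = np.array([4.0, 9.0])
good_estimate = np.array([2.0, 3.0])   # reproduces the observation exactly
bad_estimate = np.array([2.0, 1.0])    # second component is far off

good_flag = needs_correction(toy_forward_model, good_estimate, observation, threshold=0.1)
bad_flag = needs_correction(toy_forward_model, bad_estimate, observation, threshold=0.1)
```

In practice each call to the forward model may be an expensive simulation, which is the cost the abstract's surrogate error model is meant to reduce.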
Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
Large Language Models (LLMs) have made unprecedented breakthroughs, yet their
increasing integration into everyday life might raise societal risks due to
generated unethical content. Despite extensive study on specific issues like
bias, the intrinsic values of LLMs remain largely unexplored from a moral
philosophy perspective. This work delves into ethical values utilizing Moral
Foundation Theory. Moving beyond conventional discriminative evaluations with
poor reliability, we propose DeNEVIL, a novel prompt generation algorithm
tailored to dynamically exploit LLMs' value vulnerabilities and elicit the
violation of ethics in a generative manner, revealing their underlying value
inclinations. On such a basis, we construct MoralPrompt, a high-quality dataset
comprising 2,397 prompts covering 500+ value principles, and then benchmark the
intrinsic values across a spectrum of LLMs. We discovered that most models are
essentially misaligned, necessitating further ethical value alignment. In
response, we develop VILMO, an in-context alignment method that substantially
enhances the value compliance of LLM outputs by learning to generate
appropriate value instructions, outperforming existing competitors. Our methods
are suitable for black-box and open-source models, offering a promising initial
step in studying the ethical values of LLMs.