DROPMAX: ADAPTIVE STOCHASTIC SOFTMAX
Department of Computer Science and Engineering
We propose DropMax, a stochastic version of the softmax classifier which, at each iteration, drops non-target classes according to dropout probabilities adaptively decided for each instance. Specifically, we overlay binary masking variables over class output probabilities, which are input-adaptively learned via variational inference. This stochastic regularization has the effect of building an ensemble classifier out of exponentially many classifiers with different decision boundaries. Moreover, learning dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public datasets for classification, on which it obtains significantly improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout probabilities shows that our model indeed selects confusing classes more often when it performs classification.
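The masking step the abstract describes can be sketched in a few lines. This is a minimal toy forward pass, not the paper's implementation: the per-class retain probabilities are given as constants here, whereas DropMax learns them input-adaptively via variational inference.

```python
import numpy as np

def dropmax(logits, retain_prob, target, rng):
    """One stochastic forward pass of a DropMax-style masked softmax.

    logits:      (C,) class scores
    retain_prob: (C,) per-class keep probabilities (instance-adaptive in the paper)
    target:      index of the true class, always kept during training
    """
    mask = rng.random(logits.shape) < retain_prob  # Bernoulli masks over classes
    mask[target] = True                            # never drop the target class
    exp = np.exp(logits - logits.max()) * mask     # zero out dropped classes
    return exp / exp.sum()                         # softmax over retained classes

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
retain = np.array([0.9, 0.2, 0.8, 0.1])
p = dropmax(logits, retain, target=0, rng=rng)
```

Each sampled mask induces a different decision boundary, so averaging over many such passes behaves like an ensemble of exponentially many sub-classifiers.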
Online Hyperparameter Meta-Learning with Hypergradient Distillation
Many gradient-based meta-learning methods assume a set of parameters that do
not participate in inner-optimization, which can be considered as
hyperparameters. Although such hyperparameters can be optimized using the
existing gradient-based hyperparameter optimization (HO) methods, they suffer
from the following issues. Unrolled differentiation methods do not scale well
to high-dimensional hyperparameters or horizon length, Implicit Function
Theorem (IFT) based methods are restrictive for online optimization, and short
horizon approximations suffer from short horizon bias. In this work, we propose
a novel HO method that can overcome these limitations, by approximating the
second-order term with knowledge distillation. Specifically, we parameterize a
single Jacobian-vector product (JVP) for each HO step and minimize the distance
from the true second-order term. Our method allows online optimization and also
is scalable to the hyperparameter dimension and the horizon length. We
demonstrate the effectiveness of our method on two different meta-learning
methods and three benchmark datasets.
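The core idea of distilling the second-order term can be illustrated on a toy problem. The sketch below is our own construction under simplifying assumptions (a linear inner step, a finite-difference JVP as the distillation target), not the paper's algorithm: it distills the Jacobian-vector product of an inner-loop update with respect to a hyperparameter into a learned vector by minimizing a squared distance to the true JVP.

```python
import numpy as np

def inner_map(theta, lam):
    # one SGD step on a toy inner loss 0.5 * ||theta - lam||^2,
    # where lam plays the role of the hyperparameter
    return theta - 0.1 * (theta - lam)

def true_jvp(theta, lam, v, eps=1e-5):
    # finite-difference Jacobian-vector product of inner_map w.r.t. lam
    return (inner_map(theta, lam + eps * v) - inner_map(theta, lam - eps * v)) / (2 * eps)

theta = np.array([1.0, -2.0])
lam = np.array([0.5, 0.5])
v = np.array([1.0, 0.0])

u = np.zeros_like(v)                 # distilled JVP estimate (the "student")
target = true_jvp(theta, lam, v)     # true second-order term (the "teacher")
for _ in range(200):
    u -= 0.5 * (u - target)          # gradient step on 0.5 * ||u - target||^2
```

Because the distilled JVP is a single vector per HO step, the cost is independent of unrolling depth, which is what makes the method scale in hyperparameter dimension and horizon length.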
Delta-AI: Local objectives for amortized inference in sparse graphical models
We present a new algorithm for amortized inference in sparse probabilistic
graphical models (PGMs), which we call Δ-amortized inference
(Δ-AI). Our approach is based on the observation that when the sampling
of variables in a PGM is seen as a sequence of actions taken by an agent,
sparsity of the PGM enables local credit assignment in the agent's policy
learning objective. This yields a local constraint that can be turned into a
local loss in the style of generative flow networks (GFlowNets) that enables
off-policy training but avoids the need to instantiate all the random variables
for each parameter update, thus speeding up training considerably. The
Δ-AI objective matches the conditional distribution of a variable given
its Markov blanket in a tractable learned sampler, which has the structure of a
Bayesian network, with the same conditional distribution under the target PGM.
As such, the trained sampler recovers marginals and conditional distributions
of interest and enables inference of partial subsets of variables. We
illustrate Δ-AI's effectiveness for sampling from synthetic PGMs and
training latent variable models with sparse factor structure.
Comment: ICLR 2024; 19 pages, code: https://github.com/GFNOrg/Delta-AI
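The local constraint can be made concrete on the smallest interesting example. This is a toy of our own (not the released Delta-AI code, and a squared log-ratio loss rather than the GFlowNet-style objective): for a binary chain x1 - x2 - x3 with pairwise potentials exp(J·xi·xj), the Markov blanket of x2 is {x1, x3}, so matching q(x2 | x1, x3) to p(x2 | x1, x3) needs only local factors and never instantiates the full joint.

```python
import numpy as np

J = 0.8  # coupling strength of each pairwise potential exp(J * xi * xj)

def target_cond_x2(x1, x3):
    # p(x2 = +1 | x1, x3) under the chain PGM, from local factors only:
    # log p(+1|..) - log p(-1|..) = 2J(x1 + x3)
    return 1.0 / (1.0 + np.exp(-2 * J * (x1 + x3)))

# learned sampler's conditional: one logit per Markov-blanket configuration
logits = {(x1, x3): 0.0 for x1 in (-1, 1) for x3 in (-1, 1)}

def q_cond_x2(x1, x3):
    return 1.0 / (1.0 + np.exp(-logits[(x1, x3)]))

# local squared loss on conditional log-ratios, minimized by gradient descent;
# each update touches only x2's Markov blanket (local credit assignment)
for _ in range(100):
    for key in logits:
        x1, x3 = key
        delta_q = logits[key]            # learned log-ratio
        delta_p = 2 * J * (x1 + x3)      # target log-ratio from local factors
        logits[key] -= 0.5 * (delta_q - delta_p)  # grad of 0.5*(dq - dp)^2
```

In a sparse PGM every variable has a small blanket, so each parameter update is cheap, which is the source of the training speedup the abstract describes.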
Aggregation-Prone Structural Ensembles of Transthyretin Collected With Regression Analysis for NMR Chemical Shift
Monomer dissociation and subsequent misfolding of transthyretin (TTR) is one of the most critical causative factors of TTR amyloidosis. TTR amyloidosis causes several human diseases, such as senile systemic amyloidosis and familial amyloid cardiomyopathy/polyneuropathy; therefore, it is important to understand the molecular details of the structural deformation and aggregation mechanisms of TTR. However, such molecular characteristics are still elusive because of the complicated structural heterogeneity of TTR and its high sensitivity to various environmental factors. Several nuclear magnetic resonance (NMR) spectroscopy and molecular dynamics (MD) studies of TTR variants have recently reported evidence of transient aggregation-prone structural states of TTR. According to these studies, the stability of the DAGH β-sheet, one of the two main β-sheets in TTR, is a crucial determinant of the TTR amyloidosis mechanism. In addition, its conformational perturbation and the possible involvement of nearby structural motifs facilitate TTR aggregation. This study proposes aggregation-prone structural ensembles of TTR obtained by MD simulation with enhanced sampling and a multiple linear regression approach. This method provides plausible structural models that are composed of ensemble structures consistent with NMR chemical shift data. This study validated the ensemble models with experimental data obtained from circular dichroism (CD) spectroscopy and NMR order parameter analysis. In addition, our results suggest that the structural deformation of the DAGH β-sheet and the AB loop regions may correlate with the manifestation of the aggregation-prone conformational states of TTR. In summary, our method employing MD techniques to extend the structural ensembles from NMR experimental data analysis may provide new opportunities to investigate various transient yet important structural states of amyloidogenic proteins.
Copyright © 2021 Yang, Kim, Muniyappan, Lee, Kim and Yu.
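The regression step described above, selecting an ensemble whose averaged predicted chemical shifts match experiment, can be sketched generically. All numbers below are invented for illustration, not the paper's data, and clipping-plus-renormalization is a crude stand-in for a properly constrained (e.g. non-negative least squares) fit.

```python
import numpy as np

# rows: candidate MD structures, cols: predicted chemical shifts per residue
pred = np.array([
    [8.1, 7.9, 8.4],
    [8.4, 8.2, 8.0],
    [8.0, 8.3, 8.2],
])
exp_shift = np.array([8.17, 8.07, 8.24])  # "experimental" shifts (invented)

# multiple linear regression: find weights w with pred.T @ w ~= exp_shift
w, *_ = np.linalg.lstsq(pred.T, exp_shift, rcond=None)
w = np.clip(w, 0.0, None)   # ensemble populations cannot be negative
w = w / w.sum()             # renormalize to a population distribution
fit = pred.T @ w            # back-calculated shifts of the weighted ensemble
```

The resulting weights define the population of each candidate conformation in the ensemble; structures with non-zero weight are the ones needed to reproduce the NMR data.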