Function Space Bayesian Pseudocoreset for Bayesian Neural Networks
A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential
information of a large-scale dataset and thus can be used as a proxy dataset
for scalable Bayesian inference. Typically, a Bayesian pseudocoreset is
constructed by minimizing a divergence measure between the posterior
conditioning on the pseudocoreset and the posterior conditioning on the full
dataset. However, evaluating this divergence can be challenging, particularly
for models such as deep neural networks with high-dimensional parameters. In
this paper, we propose a novel Bayesian pseudocoreset construction method that
operates on a function space. Unlike previous methods, which construct and
match the coreset and full data posteriors in the space of model parameters
(weights), our method constructs variational approximations to the coreset
posterior on a function space and matches it to the full data posterior in the
function space. By working directly in function space, our method can bypass
several challenges that arise in weight space, including limited scalability
and the multi-modality issue. Through various experiments, we demonstrate that
Bayesian pseudocoresets constructed with our method enjoy enhanced uncertainty
quantification and better robustness across various model architectures.
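
To make the function-space matching concrete, a minimal sketch follows (our own heavily simplified illustration, not the paper's exact objective: both posteriors are represented by finite sets of weight samples, the divergence is replaced by a squared distance between mean outputs on probe inputs, and all names such as function_space_loss and probe_x are hypothetical):

    import torch
    import torch.nn.functional as F

    def function_space_loss(net, coreset_weight_samples, full_weight_samples, probe_x):
        # Compare the two posteriors through the network outputs f(probe_x; w)
        # rather than through the high-dimensional weight vectors w themselves.
        def mean_outputs(weight_samples):
            outs = []
            for w in weight_samples:  # each w: a flattened parameter vector
                torch.nn.utils.vector_to_parameters(w, net.parameters())
                outs.append(net(probe_x))
            return torch.stack(outs).mean(dim=0)

        # Crude stand-in for the divergence: squared distance between the two
        # posterior mean functions evaluated at the probe inputs.
        return F.mse_loss(mean_outputs(coreset_weight_samples),
                          mean_outputs(full_weight_samples))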
SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning
Given the ever-increasing size of modern neural networks, the significance of
sparse architectures has surged due to their accelerated inference speeds and
minimal memory demands. When it comes to global pruning techniques, Iterative
Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite
its simple nature, particularly in extremely sparse regimes. In light of the
recent finding that two successive matching IMP solutions are linearly
connected without a loss barrier, we propose Sparse Weight Averaging with
Multiple Particles (SWAMP), a straightforward modification of IMP that achieves
performance comparable to an ensemble of two IMP solutions. For every
iteration, we concurrently train multiple sparse models, referred to as
particles, using different batch orders yet the same matching ticket, and then
weight-average these models to produce a single mask. Through extensive
experiments on various datasets and neural network architectures, we
demonstrate that our method consistently outperforms existing baselines across
different sparsities.
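
One round of the procedure can be summarized in the following sketch (an illustration under our own assumptions; train_particle, the pruning fraction, and the particle count are placeholders rather than the authors' exact recipe):

    import copy
    import torch

    def swamp_round(matching_ticket, train_particle, prune_fraction=0.2, n_particles=4):
        # 1. Train several copies ("particles") of the same matching ticket,
        #    each with a different batch order.
        particles = []
        for seed in range(n_particles):
            particle = copy.deepcopy(matching_ticket)
            train_particle(particle, seed=seed)  # same init, different batches
            particles.append(particle)

        # 2. Average the particles' weights into a single model.
        model = copy.deepcopy(particles[0])
        avg_state = {
            name: torch.stack([p.state_dict()[name].float() for p in particles]).mean(dim=0)
            for name in particles[0].state_dict()
        }
        model.load_state_dict(avg_state)

        # 3. Global magnitude pruning: zero out the smallest-magnitude weights
        #    of the averaged model to obtain the next, sparser mask.
        all_weights = torch.cat([p.abs().flatten() for p in model.parameters()])
        threshold = torch.quantile(all_weights, prune_fraction)
        with torch.no_grad():
            for p in model.parameters():
                p.mul_((p.abs() > threshold).float())
        return model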
Regularizing Towards Soft Equivariance Under Mixed Symmetries
Datasets often have intrinsic symmetries, and particular deep-learning
models called equivariant or invariant models have been developed to exploit
these symmetries. However, if some or all of these symmetries are only
approximate, which frequently happens in practice, these models may be
suboptimal due to the architectural restrictions imposed on them. We tackle
this issue of approximate symmetries in a setup where symmetries are mixed,
i.e., they are of multiple different types rather than a single type, and the
degree of approximation varies across these types. Instead of proposing a new
architectural restriction as in most of the previous approaches, we present a
regularizer-based method for building a model for a dataset with mixed
approximate symmetries. The key component of our method is what we call
the equivariance regularizer for a given type of symmetry, which measures how
equivariant a model is with respect to symmetries of that type. Our method is
trained with these regularizers, one per symmetry type, and the
strength of the regularizers is automatically tuned during training, leading to
the discovery of the approximation levels of some candidate symmetry types
without explicit supervision. Using synthetic function approximation and motion
forecasting tasks, we demonstrate that our method achieves better accuracy than
prior approaches while discovering the approximate symmetry levels correctly.
Comment: Proceedings of the International Conference on Machine Learning
(ICML), 2023
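
As a rough sketch of what such an equivariance regularizer can look like (our own illustration, not the paper's exact formulation; sample_group_action is a hypothetical helper that returns a randomly drawn group element's action on inputs and on outputs):

    import torch

    def equivariance_regularizer(model, x, sample_group_action, n_samples=8):
        # For an exactly equivariant model, f(g . x) == g . f(x) for every
        # group element g, so the expected squared residual below is zero;
        # its magnitude quantifies how approximate the symmetry is.
        penalty = 0.0
        for _ in range(n_samples):
            g_in, g_out = sample_group_action()   # action on inputs / outputs
            residual = model(g_in(x)) - g_out(model(x))
            penalty = penalty + residual.pow(2).mean()
        return penalty / n_samples

The full training loss would add one such term per candidate symmetry type, each weighted by a strength coefficient that is tuned automatically during training.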
Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling
Transfer learning has recently shown significant performance across various
tasks involving deep neural networks. In these transfer learning scenarios, the
prior distribution for downstream data becomes crucial in Bayesian model
averaging (BMA). While previous works proposed the prior over the neural
network parameters centered around the pre-trained solution, such strategies
have limitations when dealing with distribution shifts between upstream and
downstream data. This paper introduces nonparametric transfer learning (NPTL),
a flexible posterior sampling method to address the distribution shift issue
within the context of nonparametric learning. Nonparametric learning (NPL) is
a recent approach that employs a nonparametric prior for posterior sampling,
efficiently accounting for model misspecification, which makes it well suited
to transfer learning settings that may involve distribution shift between
upstream and downstream tasks. Through extensive empirical
validations, we demonstrate that our approach surpasses other baselines in BMA
performance.
Comment: ICLR 2024
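
For intuition, a single NPL-style posterior sample can be drawn with a Bayesian-bootstrap-flavored procedure along the following lines (our simplification: the actual NPTL prior also incorporates the pre-trained solution, which is omitted here, and all names are placeholders):

    import torch

    def npl_posterior_sample(make_model, xs, ys, loss_fn, steps=200, lr=1e-2):
        # Sample Dirichlet weights over the n downstream data points, then
        # minimize the reweighted empirical loss; repeating this yields an
        # ensemble of parameter samples for Bayesian model averaging.
        n = xs.shape[0]
        w = torch.distributions.Dirichlet(torch.ones(n)).sample() * n
        model = make_model()
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            per_example = loss_fn(model(xs), ys)  # per-example losses, shape (n,)
            (w * per_example).mean().backward()
            opt.step()
        return model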
Joint-Embedding Masked Autoencoder for Self-supervised Learning of Dynamic Functional Connectivity from the Human Brain
Graph Neural Networks (GNNs) have shown promise in learning dynamic
functional connectivity for distinguishing phenotypes from human brain
networks. However, obtaining extensive labeled clinical data for training is
often resource-intensive, making practical application difficult. Leveraging
unlabeled data thus becomes crucial for representation learning in a
label-scarce setting. Although generative self-supervised learning techniques,
especially masked autoencoders, have shown promising results in representation
learning in various domains, their application to dynamic graphs for dynamic
functional connectivity remains underexplored, facing challenges in capturing
high-level semantic representations. Here, we introduce the Spatio-Temporal
Joint Embedding Masked Autoencoder (ST-JEMA), drawing inspiration from the
Joint Embedding Predictive Architecture (JEPA) in computer vision. ST-JEMA
employs a JEPA-inspired strategy for reconstructing dynamic graphs, which
enables the learning of higher-level semantic representations considering
temporal perspectives, addressing the challenges in fMRI data representation
learning. Utilizing the large-scale UK Biobank dataset for self-supervised
learning, ST-JEMA shows exceptional representation learning performance on
dynamic functional connectivity, outperforming previous methods in predicting
phenotypes and psychiatric diagnoses across eight benchmark fMRI datasets,
even with limited samples, and remaining effective for temporal reconstruction
in missing-data scenarios. These findings highlight the potential of our
approach as a robust representation learning method for leveraging
label-scarce fMRI data.
Comment: Under review
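
In generic (not ST-JEMA-specific) form, a JEPA-style objective looks roughly like the sketch below (module names and the masking scheme are our assumptions; in practice the target encoder is typically a slowly updated, gradient-free copy of the context encoder):

    import torch
    import torch.nn.functional as F

    def jepa_style_loss(context_encoder, target_encoder, predictor, x, mask):
        # x: (batch, nodes, features); mask: boolean (batch, nodes), True = masked.
        # Reconstruct the *embeddings* of masked regions rather than the raw
        # (noisy) signal, encouraging higher-level semantic representations.
        with torch.no_grad():                       # no gradients to the target branch
            target = target_encoder(x)              # embeddings of the full input
        visible = x * (~mask).unsqueeze(-1)         # zero out the masked regions
        pred = predictor(context_encoder(visible))  # predict embeddings from context
        return F.mse_loss(pred[mask], target[mask])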
Martingale Posterior Neural Processes
A Neural Process (NP) estimates a stochastic process implicitly defined with
neural networks given a stream of data, rather than pre-specifying an already
known prior such as a Gaussian process. An ideal NP would learn everything
from data without any inductive biases, but in practice, we often restrict the
class of stochastic processes for the ease of estimation. One such restriction
is the use of a finite-dimensional latent variable accounting for the
uncertainty in the functions drawn from NPs. Some recent works show that this
can be improved with a more "data-driven" source of uncertainty, such as
bootstrapping. In this work, we take a different approach based on the
martingale posterior, a recently developed alternative to Bayesian inference.
For the martingale posterior, instead of specifying prior-likelihood pairs, a
predictive distribution for future data is specified. Under specific conditions
on the predictive distribution, it can be shown that the uncertainty in the
generated future data actually corresponds to the uncertainty of the implicitly
defined Bayesian posteriors. Based on this result, instead of assuming any form
of latent variables, we equip an NP with a predictive distribution
implicitly defined with neural networks and use the corresponding martingale
posteriors as the source of uncertainty. The resulting model, which we name
the Martingale Posterior Neural Process (MPNP), is demonstrated to outperform
baselines on various tasks.
Comment: ICLR 2023
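
Schematically, drawing uncertainty from a martingale posterior can look like the following sketch (our illustration; predictive is a hypothetical neural module that returns a sampling distribution for y at a new input given all data seen so far):

    import torch

    def martingale_posterior_samples(predictive, x_ctx, y_ctx, x_target, n_samples=20):
        # Roll the predictive distribution forward over the target inputs:
        # each chain of imagined future data induces one plausible function,
        # and the spread across chains plays the role of posterior uncertainty.
        samples = []
        for _ in range(n_samples):
            xs, ys = x_ctx.clone(), y_ctx.clone()
            for x_new in x_target:
                x_new = x_new.unsqueeze(0)
                y_new = predictive(xs, ys, x_new).sample()  # imagine a future point
                xs, ys = torch.cat([xs, x_new]), torch.cat([ys, y_new])
            samples.append(ys[-len(x_target):])             # this chain's predictions
        return torch.stack(samples)                         # (n_samples, n_target, ...)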
Portable Amperometric Perchlorate Selective Sensors with Microhole Array-water/organic Gel Interfaces
A novel stick-shaped portable sensing device featuring a microhole array interface between a polyvinylchloride-2-nitrophenyloctylether (PVC-NPOE) gel and a water phase was developed for in-situ sensing of perchlorate ions in real water samples. Perchlorate-sensitive responses were obtained by measuring current changes associated with the assisted transfer of perchlorate ions by a perchlorate-selective ligand, namely bis(dibenzoylmethanato)Ni(II) (Ni(DBM)2), across the polarized microhole array interface. Cyclic voltammetry was used to characterize the assisted transfer reaction of perchlorate ions by the Ni(DBM)2 ligand when using the portable sensing device. The current response for the transfer of perchlorate anions by Ni(DBM)2 across the micro-water/gel interface increased linearly as a function of perchlorate ion concentration. Differential pulse stripping voltammetry was also utilized to improve the sensitivity of perchlorate anion detection down to 10 ppb. This was achieved by preconcentrating perchlorate anions in the gel layer, holding the ion transfer potential at 0 mV (vs. Ag/AgCl) for 30 s, and then stripping the ligand-complexed perchlorate ions. The effect of various potentially interfering anions on the perchlorate sensor was also investigated, and the sensor showed excellent selectivity over Br⁻, NO₂⁻, NO₃⁻, CO₃²⁻, CH₃COO⁻, and SO₄²⁻ ions. As a final demonstration, regional water samples from the Sincheon river in Daegu city were analyzed, and the data were verified against ion chromatography (IC) analysis from one of the Korean-certified water quality evaluation centers.
Solitary Extrahepatic Intraabdominal Metastasis from Hepatocellular Carcinoma after Liver Transplantation
A liver transplantation is a treatment option in selected patients with hepatocellular carcinoma (HCC). Despite adequate selection of candidates, recurrences of HCC may still develop. Solitary extrahepatic metastasis from HCC after a liver transplantation is rare. Here we report two cases of HCC with extrahepatic recurrence to the adrenal gland and the spleen, respectively, within one year after liver transplantation. Since the treatment of solitary extrahepatic metastasis from HCC after a liver transplantation is not standardized, surgical resection was performed in both cases. In the case of adrenal metastasis, innumerable intrahepatic metastases were found two months after the adrenalectomy, and 16 months after the adrenalectomy the patient expired due to tumor progression and hepatic failure. In the case of splenic metastasis, postoperative radiation therapy was performed. However, two recurrent HCC nodules were found 15 months after the splenectomy and were treated with transarterial chemoembolization (TACE). Twenty-nine months after the splenectomy, this patient also expired of the same causes as the former patient.
Protein transfer learning improves identification of heat shock protein families.
Heat shock proteins (HSPs) play a pivotal role as molecular chaperones against unfavorable conditions. Although HSPs are of great importance, their computational identification remains a significant challenge. Previous studies have two major limitations. First, they relied heavily on amino acid composition features, which inevitably limited their prediction performance. Second, their prediction performance was overestimated because of independent two-stage evaluations and train-test data redundancy. To overcome these limitations, we introduce two novel deep learning algorithms: (1) the time-efficient DeepHSP and (2) the high-performance DeeperHSP. We propose a convolutional neural network (CNN)-based DeepHSP that classifies both non-HSPs and six HSP families simultaneously. It outperforms state-of-the-art algorithms despite taking 14-15 times less time for both training and inference. We further improve the performance of DeepHSP by taking advantage of protein transfer learning. While DeepHSP is trained on raw protein sequences, DeeperHSP is trained on top of pre-trained protein representations. As a result, DeeperHSP remarkably outperforms state-of-the-art algorithms, increasing F1 scores in both cross-validation and independent test experiments by 20% and 10%, respectively. We envision that the proposed algorithms can provide proteome-wide prediction of HSPs and help in various downstream analyses for pathology and clinical research.
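
For concreteness, a DeepHSP-like classifier could be sketched as follows (an illustration under our own assumptions; the layer sizes and the 20-letter one-hot input are placeholders, not the published configuration):

    import torch
    import torch.nn as nn

    class HSPClassifier(nn.Module):
        # 7-way classification: non-HSP plus the six HSP families.
        def __init__(self, n_classes=7, vocab=20, channels=128, kernel=9):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv1d(vocab, channels, kernel, padding=kernel // 2),
                nn.ReLU(),
                nn.Conv1d(channels, channels, kernel, padding=kernel // 2),
                nn.ReLU(),
                nn.AdaptiveMaxPool1d(1),  # pooling makes the model length-invariant
            )
            self.head = nn.Linear(channels, n_classes)

        def forward(self, x):  # x: (batch, 20, seq_len) one-hot amino acids
            return self.head(self.conv(x).squeeze(-1))

A DeeperHSP-style variant would replace the one-hot input channels with pre-trained protein representations, i.e., swap the vocab dimension for the embedding dimension while keeping the rest of the network unchanged.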