14 research outputs found

    Function Space Bayesian Pseudocoreset for Bayesian Neural Networks

    A Bayesian pseudocoreset is a compact synthetic dataset that summarizes the essential information of a large-scale dataset and can thus be used as a proxy dataset for scalable Bayesian inference. Typically, a Bayesian pseudocoreset is constructed by minimizing a divergence measure between the posterior conditioned on the pseudocoreset and the posterior conditioned on the full dataset. However, evaluating the divergence can be challenging, particularly for models such as deep neural networks that have high-dimensional parameters. In this paper, we propose a novel Bayesian pseudocoreset construction method that operates on a function space. Unlike previous methods, which construct and match the coreset and full data posteriors in the space of model parameters (weights), our method constructs variational approximations to the coreset posterior on a function space and matches them to the full data posterior in the function space. By working directly on the function space, our method can bypass several challenges that may arise when working on a weight space, including limited scalability and multi-modality issues. Through various experiments, we demonstrate that the Bayesian pseudocoresets constructed by our method enjoy enhanced uncertainty quantification and better robustness across various model architectures.
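
The construction described in this abstract can be written as a generic optimization problem. The notation below is an assumption introduced for illustration and does not come from the listing: $x$ is the full dataset, $u$ the pseudocoreset, $\pi_x$ and $\pi_u$ the posteriors conditioned on each, and $D$ an unspecified divergence measure.

```latex
% Generic pseudocoreset objective (notation assumed, not taken from the paper):
% choose the synthetic set u whose posterior is closest to the full-data posterior.
\[
  u^{\star} \;=\; \operatorname*{arg\,min}_{u} \; D\bigl(\pi_{u},\, \pi_{x}\bigr),
  \qquad
  \pi_{u}(\theta) \propto \pi_{0}(\theta)\, p(u \mid \theta),
  \quad
  \pi_{x}(\theta) \propto \pi_{0}(\theta)\, p(x \mid \theta).
\]
```

Here $\pi_0$ is the prior over weights $\theta$. The function-space variant in the abstract would replace the distributions over $\theta$ with distributions over the functions the network represents; the exact form of that objective is not given in this listing.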

    SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning

    Given the ever-increasing size of modern neural networks, the significance of sparse architectures has surged due to their accelerated inference speeds and minimal memory demands. When it comes to global pruning techniques, Iterative Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite its simple nature, particularly in extremely sparse regimes. In light of the recent finding that two successive matching IMP solutions are linearly connected without a loss barrier, we propose Sparse Weight Averaging with Multiple Particles (SWAMP), a straightforward modification of IMP that achieves performance comparable to an ensemble of two IMP solutions. In every iteration, we concurrently train multiple sparse models, referred to as particles, using different batch orders yet the same matching ticket, and then weight-average these models to produce a single mask. Through extensive experiments on various data and neural network structures, we demonstrate that our method consistently outperforms existing baselines across different sparsities.
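
The averaging-then-pruning step described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `average_particles` and `magnitude_prune_mask` are hypothetical, each particle is represented as a single flat weight array, and training of the particles is omitted entirely.

```python
import numpy as np

def average_particles(particles):
    """Element-wise mean of the particles' weight arrays (hypothetical helper)."""
    return np.mean(np.stack(particles, axis=0), axis=0)

def magnitude_prune_mask(weights, mask, prune_fraction):
    """Return a new boolean mask that drops the smallest-magnitude surviving weights.

    Ties at the threshold are all pruned, so slightly more than
    `prune_fraction` of the surviving weights may be removed.
    """
    surviving = np.abs(weights[mask])
    n_prune = int(np.floor(prune_fraction * surviving.size))
    if n_prune == 0:
        return mask.copy()
    # Magnitude of the n_prune-th smallest surviving weight.
    threshold = np.partition(surviving, n_prune - 1)[n_prune - 1]
    return mask & (np.abs(weights) > threshold)

# Two particles that share the same mask but ended training at different weights.
mask = np.array([True, True, True, True])
particles = [np.array([1.0, -4.0, 0.5, 3.0]), np.array([3.0, -2.0, 1.5, 5.0])]

averaged = average_particles(particles)                  # one averaged weight vector
new_mask = magnitude_prune_mask(averaged, mask, 0.5)     # prune half of the survivors
```

In an iterative scheme, `new_mask` would be applied to all particles before the next round of training; how the particles are re-initialized between rounds is not specified in this listing.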

    Regularizing Towards Soft Equivariance Under Mixed Symmetries

    Datasets often have intrinsic symmetries, and particular deep-learning models called equivariant or invariant models have been developed to exploit these symmetries. However, if some or all of these symmetries are only approximate, which frequently happens in practice, these models may be suboptimal due to the architectural restrictions imposed on them. We tackle this issue of approximate symmetries in a setup where symmetries are mixed, i.e., they are symmetries of not one but multiple different types, and the degree of approximation varies across these types. Instead of proposing a new architectural restriction as in most of the previous approaches, we present a regularizer-based method for building a model for a dataset with mixed approximate symmetries. The key component of our method is what we call an equivariance regularizer for a given type of symmetry, which measures how equivariant a model is with respect to the symmetries of that type. Our model is trained with these regularizers, one per symmetry type, and the strength of the regularizers is automatically tuned during training, leading to the discovery of the approximation levels of some candidate symmetry types without explicit supervision. Using synthetic function approximation and motion forecasting tasks, we demonstrate that our method achieves better accuracy than prior approaches while discovering the approximate symmetry levels correctly. Comment: Proceedings of the International Conference on Machine Learning (ICML), 202
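
One generic way to measure how equivariant a model is with respect to a symmetry type is the mean squared gap between f(g·x) and g·f(x) over sampled group elements g. The sketch below illustrates that idea for 2D rotations only. It is an assumption-laden illustration, not the regularizer defined in the paper: the names `rotation` and `equivariance_penalty` are hypothetical, and the automatic tuning of regularizer strength mentioned in the abstract is not shown.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def equivariance_penalty(f, xs, thetas):
    """Mean squared gap between f(g x) and g f(x) over rotations g and inputs x.

    `f` maps an (n, 2) array to an (n, 2) array; `xs` is an (n, 2) array of inputs.
    The penalty is zero when f commutes with every sampled rotation.
    """
    total = 0.0
    for theta in thetas:
        g = rotation(theta)
        gx = xs @ g.T                      # apply g to every row of xs
        diff = f(gx) - f(xs) @ g.T         # f(g x) - g f(x)
        total += np.mean(np.sum(diff ** 2, axis=1))
    return total / len(thetas)

rng = np.random.default_rng(0)
xs = rng.normal(size=(8, 2))
thetas = np.array([0.3, 1.1, 2.0])

scaled = lambda x: 2.0 * x                        # commutes with rotations
shifted = lambda x: x + np.array([1.0, 0.0])      # a fixed shift breaks equivariance

penalty_scaled = equivariance_penalty(scaled, xs, thetas)
penalty_shifted = equivariance_penalty(shifted, xs, thetas)
```

With several symmetry types, one such penalty per type could be added to the training loss with its own weight, which matches the "one per symmetry type" description in the abstract at a high level.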

    Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling

    Transfer learning has recently shown significant performance across various tasks involving deep neural networks. In these transfer learning scenarios, the prior distribution for downstream data becomes crucial in Bayesian model averaging (BMA). While previous works proposed priors over the neural network parameters centered around the pre-trained solution, such strategies have limitations when dealing with distribution shifts between upstream and downstream data. This paper introduces nonparametric transfer learning (NPTL), a flexible posterior sampling method that addresses the distribution shift issue within the context of nonparametric learning. Nonparametric learning (NPL) is a recent approach that employs a nonparametric prior for posterior sampling and efficiently accounts for model misspecification, which makes it suitable for transfer learning scenarios that may involve a distribution shift between upstream and downstream tasks. Through extensive empirical validation, we demonstrate that our approach surpasses other baselines in BMA performance. Comment: ICLR 202

    Joint-Embedding Masked Autoencoder for Self-supervised Learning of Dynamic Functional Connectivity from the Human Brain

    Graph Neural Networks (GNNs) have shown promise in learning dynamic functional connectivity for distinguishing phenotypes from human brain networks. However, obtaining extensive labeled clinical data for training is often resource-intensive, making practical application difficult. Leveraging unlabeled data thus becomes crucial for representation learning in a label-scarce setting. Although generative self-supervised learning techniques, especially masked autoencoders, have shown promising results in representation learning in various domains, their application to dynamic graphs for dynamic functional connectivity remains underexplored and faces challenges in capturing high-level semantic representations. Here, we introduce the Spatio-Temporal Joint Embedding Masked Autoencoder (ST-JEMA), drawing inspiration from the Joint Embedding Predictive Architecture (JEPA) in computer vision. ST-JEMA employs a JEPA-inspired strategy for reconstructing dynamic graphs, which enables the learning of higher-level semantic representations that account for temporal perspectives, addressing the challenges in fMRI data representation learning. Using the large-scale UK Biobank dataset for self-supervised learning, ST-JEMA shows exceptional representation learning performance on dynamic functional connectivity. It demonstrates superiority over previous methods in predicting phenotypes and psychiatric diagnoses across eight benchmark fMRI datasets, even with limited samples, and its temporal reconstruction is effective in missing-data scenarios. These findings highlight the potential of our approach as a robust representation learning method for leveraging label-scarce fMRI data. Comment: Under review

    Martingale Posterior Neural Processes

    A Neural Process (NP) estimates a stochastic process implicitly defined with neural networks given a stream of data, rather than pre-specifying priors already known, such as Gaussian processes. An ideal NP would learn everything from data without any inductive biases, but in practice, we often restrict the class of stochastic processes for ease of estimation. One such restriction is the use of a finite-dimensional latent variable accounting for the uncertainty in the functions drawn from NPs. Some recent works show that this can be improved with a more "data-driven" source of uncertainty such as bootstrapping. In this work, we take a different approach based on the martingale posterior, a recently developed alternative to Bayesian inference. For the martingale posterior, instead of specifying prior-likelihood pairs, a predictive distribution for future data is specified. Under specific conditions on the predictive distribution, it can be shown that the uncertainty in the generated future data actually corresponds to the uncertainty of the implicitly defined Bayesian posteriors. Based on this result, instead of assuming any form of the latent variables, we equip an NP with a predictive distribution implicitly defined with neural networks and use the corresponding martingale posteriors as the source of uncertainty. The resulting model, which we name the Martingale Posterior Neural Process (MPNP), is demonstrated to outperform baselines on various tasks. Comment: ICLR 202

    Portable Amperometric Perchlorate Selective Sensors with Microhole Array-water/organic Gel Interfaces

    A novel stick-shaped portable sensing device featuring a microhole array interface between the polyvinylchloride-2-nitrophenyloctylether (PVC-NPOE) gel and water phase was developed for in-situ sensing of perchlorate ions in real water samples. Perchlorate-sensitive sensing responses were obtained by measuring the current changes associated with the assisted transfer reaction of perchlorate ions by a perchlorate-selective ligand, namely bis(dibenzoylmethanato)Ni(II) (Ni(DBM)₂), across the polarized microhole array interface. Cyclic voltammetry was used to characterize the assisted transfer reaction of perchlorate ions by the Ni(DBM)₂ ligand when using the portable sensing device. The current response for the transfer of perchlorate anions by Ni(DBM)₂ across the micro-water/gel interface increased linearly as a function of the perchlorate ion concentration. Differential pulse stripping voltammetry was also used to improve the sensitivity of perchlorate anion detection down to 10 ppb. This was achieved by preconcentrating perchlorate anions in the gel layer by holding the ion transfer potential at 0 mV (vs. Ag/AgCl) for 30 s, followed by stripping the perchlorate ion complexed with the ligand. The effect of various potentially interfering anions on the perchlorate sensor was also investigated, and the sensor showed excellent selectivity over Br⁻, NO₂⁻, NO₃⁻, CO₃²⁻, CH₃COO⁻ and SO₄²⁻ ions. As a final demonstration, some regional water samples from the Sincheon river in Daegu city were analyzed, and the data were verified against ion chromatography (IC) analysis from one of the Korean-certified water quality evaluation centers.

    Solitary Extrahepatic Intraabdominal Metastasis from Hepatocellular Carcinoma after Liver Transplantation

    Liver transplantation is a treatment option in selected patients with hepatocellular carcinoma (HCC). Despite adequate selection of candidates, recurrences of HCC may still develop. Solitary extrahepatic metastasis from HCC after liver transplantation is rare. Here we report two cases of HCC that demonstrated extrahepatic recurrence to the adrenal gland and spleen, respectively, within one year after liver transplantation. Since the treatment of solitary extrahepatic metastasis from HCC after liver transplantation is not standardized, surgical resection was performed. In the case of HCC adrenal metastasis, innumerable intrahepatic metastases were found two months after the adrenalectomy. Sixteen months after the adrenalectomy, the patient expired due to tumor progression and hepatic failure. In the case of HCC splenic metastasis, postoperative radiation therapy was performed. However, two recurrent HCC nodules were found 15 months after the splenectomy and were treated with transarterial chemoembolization (TACE). Twenty-nine months after the splenectomy, this patient also expired from the same causes as the former patient.

    Protein transfer learning improves identification of heat shock protein families.

    Heat shock proteins (HSPs) play a pivotal role as molecular chaperones against unfavorable conditions. Although HSPs are of great importance, their computational identification remains a significant challenge. Previous studies have two major limitations. First, they relied heavily on amino acid composition features, which inevitably limited their prediction performance. Second, their prediction performance was overestimated because of independent two-stage evaluations and train-test data redundancy. To overcome these limitations, we introduce two novel deep learning algorithms: (1) the time-efficient DeepHSP and (2) the high-performance DeeperHSP. We propose a convolutional neural network (CNN)-based DeepHSP that classifies non-HSPs and six HSP families simultaneously. It outperforms state-of-the-art algorithms, despite taking 14-15 times less time for both training and inference. We further improve the performance of DeepHSP by taking advantage of protein transfer learning. While DeepHSP is trained on raw protein sequences, DeeperHSP is trained on top of pre-trained protein representations. As a result, DeeperHSP remarkably outperforms state-of-the-art algorithms, increasing F1 scores in cross-validation and independent test experiments by 20% and 10%, respectively. We envision that the proposed algorithms can provide a proteome-wide prediction of HSPs and help in various downstream analyses for pathology and clinical research.