167 research outputs found

    HyperBO+: Pre-training a universal prior for Bayesian optimization with hierarchical Gaussian processes

    Bayesian optimization (BO), while proven highly effective for many black-box function optimization tasks, requires practitioners to carefully select priors that model their functions of interest well. Rather than specifying priors by hand, researchers have investigated transfer-learning-based methods to learn the priors automatically, e.g. multi-task BO (Swersky et al., 2013), few-shot BO (Wistuba and Grabocka, 2021) and HyperBO (Wang et al., 2022). However, those prior-learning methods typically assume that the input domains are the same for all tasks, weakening their ability to use observations on functions with different domains or to generalize the learned priors to BO on different search spaces. In this work, we present HyperBO+: a pre-training approach for hierarchical Gaussian processes that enables the same prior to work universally for Bayesian optimization on functions with different domains. We propose a two-step pre-training method and analyze its appealing asymptotic properties and benefits to BO both theoretically and empirically. On real-world hyperparameter tuning tasks that involve multiple search spaces, we demonstrate that HyperBO+ is able to generalize to unseen search spaces and achieves lower regrets than competitive baselines. Comment: Full version of the workshop paper at 2022 NeurIPS Workshop on Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems.

    Transfer Learning for Bayesian Optimization on Heterogeneous Search Spaces

    Bayesian optimization (BO) is a popular black-box function optimization method, which makes sequential decisions based on a Bayesian model, typically a Gaussian process (GP), of the function. To ensure the quality of the model, transfer learning approaches have been developed to automatically design GP priors by learning from observations on "training" functions. These training functions are typically required to have the same domain as the "test" function (the black-box function to be optimized). In this paper, we introduce MPHD, a model pre-training method on heterogeneous domains, which uses a neural net mapping from domain-specific contexts to specifications of hierarchical GPs. MPHD can be seamlessly integrated with BO to transfer knowledge across heterogeneous search spaces. Our theoretical and empirical results demonstrate the validity of MPHD and its superior performance on challenging black-box function optimization tasks.

    ALP: Action-Aware Embodied Learning for Perception

    Current methods in training and benchmarking vision models exhibit an over-reliance on passive, curated datasets. Although models trained on these datasets have shown strong performance in a wide variety of tasks such as classification, detection, and segmentation, they fundamentally are unable to generalize to an ever-evolving world due to constant out-of-distribution shifts of input data. Therefore, instead of training on fixed datasets, can we approach learning in a more human-centric and adaptive manner? In this paper, we introduce Action-aware Embodied Learning for Perception (ALP), an embodied learning framework that incorporates action information into representation learning through a combination of optimizing policy gradients through reinforcement learning and inverse dynamics prediction objectives. Our method actively explores complex 3D environments both to learn generalizable task-agnostic representations and to collect downstream training data. We show that ALP outperforms existing baselines in object detection and semantic segmentation. In addition, we show that by training on actively collected data more relevant to the environment and task, our method generalizes more robustly to downstream tasks compared to models pre-trained on fixed datasets such as ImageNet. Comment: preprint

    UAV first view landmark localization with active reinforcement learning

    We present an active reinforcement learning framework for unmanned aerial vehicle (UAV) first view landmark localization. We formulate the problem of landmark localization as a Markov decision process and introduce an active landmark-localization network (ALLNet) to address it. The aim of the ALLNet is to locate a bounding box that surrounds the landmark in a first view image sequence. To this end, it is trained in a reinforcement learning fashion. Specifically, it employs support vector machine (SVM) scores on the bounding box patches as rewards and learns the bounding box transformations as actions. Furthermore, each SVM score indicates whether or not the landmark is detected by the bounding box, which enables the ALLNet to judge whether the landmark leaves or re-enters a first view image. Therefore, the operation of the ALLNet is not only dominated by the reinforcement learning process but also supplemented in an active-learning-motivated manner. Once the landmark is considered to have left the first view image, the ALLNet stops operating until the SVM detects its re-entry to the view. The active reinforcement learning model enables training a robust ALLNet for landmark localization. The experimental results validate the effectiveness of the proposed model for UAV first view landmark localization.

    Site-Agnostic 3D Dose Distribution Prediction with Deep Learning Neural Networks

    Typically, current dose prediction models are limited to small amounts of data and require re-training for a specific site, often leading to suboptimal performance. We propose a site-agnostic, 3D dose distribution prediction model using deep learning that can leverage data from any treatment site, thus increasing the total data available to train the model. Applying our proposed model to a new target treatment site requires only a brief fine-tuning of the model to the new data and involves no modifications to the model input channels or its parameters. Thus, it can be efficiently adapted to a different treatment site, even with a small training dataset.

    Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation

    Computational simulation of chemical and biological systems using ab initio molecular dynamics has been a challenge for decades. Researchers have attempted to address the problem with machine learning and fragmentation-based methods; however, the two approaches fail to give a satisfactory description of long-range and many-body interactions, respectively. Inspired by fragmentation-based methods, we propose the Long-Short-Range Message-Passing (LSR-MP) framework as a generalization of the existing equivariant graph neural networks (EGNNs) with the intent to incorporate long-range interactions efficiently and effectively. We apply the LSR-MP framework to the recently proposed ViSNet and demonstrate state-of-the-art results with up to 40% error reduction for molecules in the MD22 and Chignolin datasets. Consistent improvements to various EGNNs will also be discussed to illustrate the general applicability and robustness of our LSR-MP framework.

    Formation of sclerotia in Sclerotinia ginseng and composition of the sclerotial exudate

    Background: Sclerotinia ginseng is a major devastating soil-borne pathogen of ginseng that can cause irreparable damage and large economic losses. This pathogen produces sclerotia, which are among the most persistent resting structures produced by filamentous fungi. The production of an exudate is a common feature of sclerotial development. Methods: S. ginseng was cultured on 10 different media and the following parameters were measured: mycelial growth rate (mm/day), initial formation time of exudate droplets, total quantity of exudate, number of sclerotia per dish, and sclerotial fresh/dry weight. The composition of the sclerotial exudate was analyzed using four methods (high performance liquid chromatography, gas chromatography-mass spectrometry, flame atomic absorption spectrometry, and Nessler’s reagent spectrophotometry). Results: We found that PDA was the optimal medium for exudate production, while SDA medium resulted in the highest mycelial growth rate. The earliest emergence of exudate droplets from sclerotia was on OA-YE and V8 media. The largest amount of sclerotia and the smallest sclerotia were produced on V8 medium. The maximum and minimum dry/fresh weight were obtained on MEA medium and V8 medium, respectively. The exudate contained organic acids (oxalic acid, gallic acid, ferulic acid, vanillic acid, caffeic acid, and tannic acid), carbohydrates (inositol, glucose, and trehalose), various ions (potassium, sodium, and magnesium), and ammonia. Discussion: The functions of the identified compounds are discussed within the context of pathogenicity, sclerotial development, and antimicrobial activity. Our findings provide information about the production of sclerotia and the composition of sclerotial exudate that may be useful to develop strategies to control this disease.

    The effect of aging on network structure

    In network evolution, the effect of aging is universal: in scientific collaboration networks, scientists have a finite time span of being active; in movie actor networks, once-popular stars retire from the stage; devices on the Internet may become outmoded as techniques develop so rapidly. Here we find in citation networks that this effect can be represented by an exponential decay factor, $e^{-\beta \tau}$, where $\tau$ is the node age, while other evolving networks (the Internet for instance) may have different types of aging, for example, a power-law decay factor, which is also studied and compared. It has been found that as soon as such a factor is introduced to the Barabasi-Albert Scale-Free model, the network will be significantly transformed. The network will be clustered even with infinitely large size, and the clustering coefficient varies greatly with the intensity of the aging effect, i.e. it increases linearly with $\beta$ for small values of $\beta$ and decays exponentially for large values of $\beta$. At the same time, the aging effect may also result in a hierarchical structure and a disassortative degree-degree correlation. Generally the aging effect will increase the average distance between nodes, but the result depends on the type of the decay factor. The network appears like a one-dimensional chain when exponential decay is chosen, but with power-law decay, a transformation process is observed, i.e., from a small-world network to a hypercubic lattice, and to a one-dimensional chain finally. The disparities observed for different choices of the decay factor, in clustering, average node distance and probably other aspects not yet identified, are believed to bear significant meaning on empirical data acquisition. Comment: 8 pages, 9 figures, V2, accepted for publication in Phys. Rev.
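
    To make the aging mechanism in the abstract above concrete, here is a minimal sketch of a growth model with an exponential decay factor. It assumes one plausible reading of "introducing the factor to the Barabasi-Albert model": a new node attaches to an existing node i with probability proportional to k_i * exp(-beta * tau_i), where k_i is the degree and tau_i the age of node i. The function names, the fully connected seed core, and the parameter defaults are illustrative assumptions, not taken from the paper.

```python
import math
import random


def grow_aging_network(n_nodes, m=2, beta=0.1, seed=0):
    """Grow a BA-style network with exponentially decaying attractiveness.

    Assumed attachment rule (hypothetical, for illustration only):
    weight of existing node i = degree_i * exp(-beta * age_i).
    Returns (edges, degree), where each edge is (new_node, older_node).
    """
    rng = random.Random(seed)
    # Seed graph: a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    degree = [m] * (m + 1)
    for new in range(m + 1, n_nodes):
        # Node i was born at step i, so its age at this step is new - i.
        weights = [degree[i] * math.exp(-beta * (new - i)) for i in range(new)]
        targets = set()
        while len(targets) < m:  # draw m distinct targets
            targets.add(rng.choices(range(new), weights=weights)[0])
        degree.append(0)
        for t in targets:
            edges.append((new, t))
            degree[t] += 1
            degree[new] += 1
    return edges, degree


def mean_age_gap(edges):
    """Average difference in birth time between the two ends of an edge."""
    return sum(a - b for a, b in edges) / len(edges)


if __name__ == "__main__":
    for beta in (0.0, 0.5, 2.0):
        edges, _ = grow_aging_network(300, m=2, beta=beta)
        print(f"beta={beta}: mean age gap = {mean_age_gap(edges):.2f}")
```

    With beta = 0 the sketch reduces to plain preferential attachment. As beta grows, new nodes link almost only to their most recent predecessors, which is consistent with the chain-like structure the abstract reports for exponential decay.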