HyperBO+: Pre-training a universal prior for Bayesian optimization with hierarchical Gaussian processes
Bayesian optimization (BO), while proven highly effective for many black-box
function optimization tasks, requires practitioners to carefully select priors
that model their functions of interest well. Rather than specifying them by hand,
researchers have investigated transfer learning based methods to automatically
learn the priors, e.g. multi-task BO (Swersky et al., 2013), few-shot BO
(Wistuba and Grabocka, 2021) and HyperBO (Wang et al., 2022). However, those
prior learning methods typically assume that the input domains are the same for
all tasks, weakening their ability to use observations on functions with
different domains or generalize the learned priors to BO on different search
spaces. In this work, we present HyperBO+: a pre-training approach for
hierarchical Gaussian processes that enables the same prior to work universally
for Bayesian optimization on functions with different domains. We propose a
two-step pre-training method and analyze its appealing asymptotic properties
and benefits to BO both theoretically and empirically. On real-world
hyperparameter tuning tasks that involve multiple search spaces, we demonstrate
that HyperBO+ is able to generalize to unseen search spaces and achieves lower
regrets than competitive baselines.
Comment: Full version of the workshop paper at 2022 NeurIPS Workshop on
Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems
Transfer Learning for Bayesian Optimization on Heterogeneous Search Spaces
Bayesian optimization (BO) is a popular black-box function optimization
method, which makes sequential decisions based on a Bayesian model, typically a
Gaussian process (GP), of the function. To ensure the quality of the model,
transfer learning approaches have been developed to automatically design GP
priors by learning from observations on "training" functions. These training
functions are typically required to have the same domain as the "test" function
(black-box function to be optimized). In this paper, we introduce MPHD, a model
pre-training method on heterogeneous domains, which uses a neural net mapping
from domain-specific contexts to specifications of hierarchical GPs. MPHD can
be seamlessly integrated with BO to transfer knowledge across heterogeneous
search spaces. Our theoretical and empirical results demonstrate the validity
of MPHD and its superior performance on challenging black-box function
optimization tasks.
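As background for the two Bayesian optimization abstracts above, the following is a minimal sketch of a generic BO loop: a zero-mean GP with a hand-fixed squared-exponential kernel, and an upper-confidence-bound rule for choosing the next point. It is not HyperBO+ or MPHD; those methods pre-train the prior, whereas here the kernel settings (`LENGTH_SCALE`, `SIGNAL_VAR`), the `beta` trade-off, and all function names are assumptions chosen for illustration.

```python
import numpy as np

LENGTH_SCALE = 0.2  # assumed kernel length scale (hand-picked, not learned)
SIGNAL_VAR = 1.0    # assumed prior variance (hand-picked, not learned)


def rbf_kernel(a, b):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return SIGNAL_VAR * np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def gp_posterior(x_obs, y_obs, x_query, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_oo = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    chol = np.linalg.cholesky(k_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_oq.T @ alpha
    v = np.linalg.solve(chol, k_oq)
    var = SIGNAL_VAR - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def bayes_opt(objective, candidates, n_init=3, n_iter=15, beta=2.0, seed=0):
    """Maximize `objective` over a finite candidate grid with GP-UCB."""
    rng = np.random.default_rng(seed)
    x_obs = rng.choice(candidates, size=n_init, replace=False)
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        # Upper confidence bound: favor high predicted value or high uncertainty.
        x_next = candidates[np.argmax(mean + beta * std)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]
```

Calling `bayes_opt(lambda x: -(x - 0.3) ** 2, np.linspace(0.0, 1.0, 101))` runs the loop on a toy objective. The quality of the result depends on how well the fixed kernel matches the function, which is the prior-selection problem the abstracts address.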
ALP: Action-Aware Embodied Learning for Perception
Current methods in training and benchmarking vision models exhibit an
over-reliance on passive, curated datasets. Although models trained on these
datasets have shown strong performance in a wide variety of tasks such as
classification, detection, and segmentation, they fundamentally are unable to
generalize to an ever-evolving world due to constant out-of-distribution shifts
of input data. Therefore, instead of training on fixed datasets, can we
approach learning in a more human-centric and adaptive manner? In this paper,
we introduce Action-aware Embodied Learning for
Perception (ALP), an embodied learning framework that incorporates
action information into representation learning by combining a
reinforcement learning policy-gradient objective with an inverse dynamics
prediction objective. Our method actively explores complex 3D environments both
to learn generalizable task-agnostic representations and to collect
downstream training data. We show that ALP outperforms existing baselines in
object detection and semantic segmentation. In addition, we show that by
training on actively collected data more relevant to the environment and task,
our method generalizes more robustly to downstream tasks compared to models
pre-trained on fixed datasets such as ImageNet.
Comment: preprint
UAV first view landmark localization with active reinforcement learning
We present an active reinforcement learning framework for unmanned aerial vehicle (UAV) first view landmark localization. We formulate the problem of landmark localization as that of a Markov decision process and introduce an active landmark-localization network (ALLNet) to address it. The aim of the ALLNet is to locate a bounding box that surrounds the landmark in a first view image sequence. To this end, it is trained in a reinforcement learning fashion. Specifically, it employs support vector machine (SVM) scores on the bounding box patches as rewards and learns the bounding box transformations as actions. Furthermore, each SVM score indicates whether or not the landmark is detected by the bounding box, which enables the ALLNet to judge whether the landmark leaves or re-enters a first view image. Therefore, the operation of the ALLNet is not only driven by the reinforcement learning process but also supplemented by an active-learning-motivated mechanism. Once the landmark is considered to have left the first view image, the ALLNet stops operating until the SVM detects its re-entry to the view. The active reinforcement learning model enables training a robust ALLNet for landmark localization. The experimental results validate the effectiveness of the proposed model for UAV first view landmark localization.
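The formulation in this abstract (bounding-box transformations as actions, a detector score as reward) can be illustrated with a small sketch. This is not ALLNet: the action set, the step size, and the `localize` function are hypothetical, a greedy one-step search stands in for the learned policy, and any callable `score` stands in for the SVM score on the box patch.

```python
from typing import Callable, Dict, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def _shift(dx: float, dy: float) -> Callable[[Box, float], Box]:
    """Build an action that moves the box by (dx, dy) times the step size."""
    return lambda b, s: (b[0] + dx * s, b[1] + dy * s, b[2], b[3])


def _resize(sign: float) -> Callable[[Box, float], Box]:
    """Build an action that grows (+1) or shrinks (-1) the box around its center."""
    def apply(b: Box, s: float) -> Box:
        w = max(b[2] + sign * s, 1.0)
        h = max(b[3] + sign * s, 1.0)
        return (b[0] - (w - b[2]) / 2, b[1] - (h - b[3]) / 2, w, h)
    return apply


# Hypothetical discrete action set; the abstract does not list the actual one.
ACTIONS: Dict[str, Callable[[Box, float], Box]] = {
    "left": _shift(-1, 0), "right": _shift(1, 0),
    "up": _shift(0, -1), "down": _shift(0, 1),
    "grow": _resize(1), "shrink": _resize(-1),
}


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; used below as a stand-in reward."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def localize(start: Box, score: Callable[[Box], float], step: float = 4.0,
             max_steps: int = 50, detect_threshold: float = 0.5):
    """Greedily apply the transformation with the highest score until none helps.

    Returns the final box and whether its score passes the detection threshold,
    mirroring the abstract's use of the score to decide if the landmark is in view.
    """
    box = start
    for _ in range(max_steps):
        candidates = [action(box, step) for action in ACTIONS.values()]
        best = max(candidates, key=score)
        if score(best) <= score(box):
            break  # no transformation improves the reward
        box = best
    return box, score(box) >= detect_threshold
```

For a quick check, `localize((20, 28, 20, 20), lambda b: iou(b, (40, 40, 20, 20)))` uses overlap with a known target as the reward. In the setting described above the reward would come from a trained classifier instead, and the policy would be learned rather than greedy.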
Site-Agnostic 3D Dose Distribution Prediction with Deep Learning Neural Networks
Typically, the current dose prediction models are limited to small amounts of
data and require re-training for a specific site, often leading to suboptimal
performance. We propose a site-agnostic, 3D dose distribution prediction model
using deep learning that can leverage data from any treatment site, thus
increasing the total data available to train the model. Applying our proposed
model to a new target treatment site requires only a brief fine-tuning of the
model to the new data and involves no modifications to the model input channels
or its parameters. Thus, it can be efficiently adapted to a different treatment
site, even with a small training dataset.
Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation
Computational simulation of chemical and biological systems using ab initio
molecular dynamics has been a challenge for decades. Researchers have
attempted to address the problem with machine learning and fragmentation-based
methods; however, the two approaches fail to give a satisfactory description of
long-range and many-body interactions, respectively. Inspired by
fragmentation-based methods, we propose the Long-Short-Range Message-Passing
(LSR-MP) framework as a generalization of the existing equivariant graph neural
networks (EGNNs) with the intent to incorporate long-range interactions
efficiently and effectively. We apply the LSR-MP framework to the recently
proposed ViSNet and demonstrate the state-of-the-art results with up to
error reduction for molecules in MD22 and Chignolin datasets. Consistent
improvements to various EGNNs will also be discussed to illustrate the general
applicability and robustness of our LSR-MP framework.
Formation of sclerotia in Sclerotinia ginseng and composition of the sclerotial exudate
Background: Sclerotinia ginseng is a major devastating soil-borne pathogen of ginseng that can cause irreparable damage and large economic losses. This pathogen produces sclerotia, which are among the most persistent resting structures produced by filamentous fungi. The production of an exudate is a common feature of sclerotial development.
Methods: S. ginseng was cultured on 10 different media and the following parameters were measured: mycelial growth rate (mm/day), initial formation time of exudate droplets, total quantity of exudate, number of sclerotia per dish, and sclerotial fresh/dry weight. The composition of the sclerotial exudate was analyzed using four methods (high performance liquid chromatography, gas chromatography-mass spectrometry, flame atomic absorption spectrometry, and Nessler’s reagent spectrophotometry).
Results: We found that PDA was the optimal medium for exudate production, while SDA medium resulted in the highest mycelial growth rate. The earliest emergence of exudate droplets from sclerotia was on OA-YE and V8 media. The largest amount of sclerotia and the smallest sclerotia were produced on V8 medium. The maximum and minimum dry/fresh weight were obtained on MEA medium and V8 medium, respectively. The exudate contained organic acids (oxalic acid, gallic acid, ferulic acid, vanillic acid, caffeic acid, and tannic acid), carbohydrates (inositol, glucose, and trehalose), various ions (potassium, sodium, and magnesium), and ammonia.
Discussion: The functions of the identified compounds are discussed within the context of pathogenicity, sclerotial development, and antimicrobial activity. Our findings provide information about the production of sclerotia and the composition of sclerotial exudate that may be useful to develop strategies to control this disease.
The effect of aging on network structure
In network evolution, the effect of aging is universal: in scientific
collaboration networks, scientists have a finite active time span; in
movie actor networks, once-popular stars retire from the stage; devices on
the Internet may become outmoded as techniques develop so rapidly. Here we
find in citation networks that this effect can be represented by an exponential
decay factor, , where is the node age, while other
evolving networks (the Internet for instance) may have different types of
aging, for example, a power-law decay factor, which is also studied and
compared. It has been found that as soon as such a factor is introduced to the
Barabasi-Albert Scale-Free model, the network will be significantly
transformed. The network will be clustered even with infinitely large size, and
the clustering coefficient varies greatly with the intensity of the aging
effect, i.e. it increases linearly with for small values of
and decays exponentially for large values of . At the same time, the
aging effect may also result in a hierarchical structure and a disassortative
degree-degree correlation. Generally the aging effect will increase the average
distance between nodes, but the result depends on the type of the decay factor.
The network appears like a one-dimensional chain when exponential decay is
chosen, but with power-law decay, a transformation process is observed, i.e.,
from a small-world network to a hypercubic lattice, and to a one-dimensional
chain finally. The disparities observed for different choices of the decay
factor, in clustering, average node distance and probably other aspects not yet
identified, are believed to bear significant meaning on empirical data
acquisition.
Comment: 8 pages, 9 figures, V2, accepted for publication in Phys. Rev.
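The growth rule described in this abstract, preferential attachment damped by an exponential aging factor, can be sketched as follows. The sketch assumes an attachment weight of degree times exp(-beta * age), with age counted in growth steps; the names `beta`, `m`, and `grow_aging_network` are illustrative choices and not the paper's notation, and the power-law variant is not shown.

```python
import math
import random


def grow_aging_network(n_nodes, m=2, beta=0.1, seed=0):
    """Grow a Barabasi-Albert style network with exponential aging.

    Each new node links to m distinct existing nodes, chosen with probability
    proportional to degree * exp(-beta * age). beta = 0 gives plain
    preferential attachment. Very large beta can underflow every weight to
    zero, in which case random.choices raises ValueError.
    """
    rng = random.Random(seed)
    # Start from a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    degree = [m] * (m + 1)
    for new in range(m + 1, n_nodes):
        # Age of node i is the number of growth steps since it was added.
        weights = [degree[i] * math.exp(-beta * (new - i)) for i in range(new)]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(new), weights=weights)[0])
        degree.append(0)
        for target in sorted(targets):
            edges.append((new, target))
            degree[target] += 1
            degree[new] += 1
    return edges, degree


def mean_link_span(edges):
    """Average index gap between linked nodes, a rough proxy for chain-likeness."""
    return sum(abs(a - b) for a, b in edges) / len(edges)
```

Comparing `mean_link_span` for a run with `beta=0.0` against a run with a larger `beta` gives a rough view of the tendency noted above: with strong exponential aging, new nodes link mostly to recent ones, so the network becomes more chain-like.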