Gaussian Max-Value Entropy Search for Multi-Agent Bayesian Optimization
We study the multi-agent Bayesian optimization (BO) problem, where multiple
agents maximize a black-box function via iterative queries. We focus on Entropy
Search (ES), a sample-efficient BO algorithm that selects queries to maximize
the mutual information about the maximum of the black-box function. One of the
main challenges of ES is that calculating the mutual information requires
computationally-costly approximation techniques. For multi-agent BO problems,
the computational cost of ES is exponential in the number of agents. To address
this challenge, we propose the Gaussian Max-value Entropy Search, a multi-agent
BO algorithm with favorable sample and computational efficiency. The key to our
idea is to use a normal distribution to approximate the function maximum and
calculate its mutual information accordingly. The resulting approximation
allows queries to be cast as the solution of a closed-form optimization problem
which, in turn, can be solved via a modified gradient ascent algorithm and
scaled to a large number of agents. We demonstrate the effectiveness of
Gaussian Max-value Entropy Search through numerical experiments on standard
test functions and real-robot experiments on the source-seeking problem.
Results show that the proposed algorithm outperforms the multi-agent BO
baselines in the numerical experiments and can stably seek the source with a
limited number of noisy observations on real robots.
Comment: 10 pages, 9 figures
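The closed-form acquisition this abstract describes can be illustrated with a minimal single-agent sketch: fit a normal distribution to sampled maxima of a GP posterior, then score each candidate by the mutual information between two jointly Gaussian variables, I = -1/2 log(1 - rho^2). The RBF kernel, its hyperparameters, and the moment-matching step below are simplifying assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def rbf_kernel(X1, X2, length=0.3, var=1.0):
    """Squared-exponential kernel for 1D inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / length**2)

def gaussian_mes_acquisition(X_train, y_train, X_cand, noise=1e-3,
                             n_samples=512, seed=0):
    """Gaussian approximation of max-value entropy search (illustrative).

    Draws joint GP posterior samples over the candidates, treats the sampled
    maxima as draws of f*, and scores each candidate x by the closed-form
    mutual information of jointly Gaussian (y(x), f*):
        I = -0.5 * log(1 - rho^2).
    """
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_cand)
    Kss = rbf_kernel(X_cand, X_cand)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y_train                       # posterior mean
    cov = Kss - Ks.T @ Kinv @ Ks                     # posterior covariance
    L = np.linalg.cholesky(cov + 1e-6 * np.eye(len(X_cand)))
    f = mu[:, None] + L @ rng.standard_normal((len(X_cand), n_samples))
    f_max = f.max(axis=0)                            # sampled global maxima
    # correlation of each candidate's value with the (Gaussian-approximated) max
    rho = np.array([np.corrcoef(f[i], f_max)[0, 1] for i in range(len(X_cand))])
    rho = np.clip(rho, -0.999, 0.999)
    return -0.5 * np.log(1.0 - rho**2)               # nonnegative MI score
```

Because the score is a differentiable function of the query location, it can in principle be maximized per agent by gradient ascent, which is the scalability argument the abstract makes.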
3D Model-based Zero-Shot Pose Estimation Pipeline
Most existing learning-based pose estimation methods are typically developed
for non-zero-shot scenarios, where they can only estimate the poses of objects
present in the training dataset. This setting restricts their applicability to
unseen objects in the training phase. In this paper, we introduce a fully
zero-shot pose estimation pipeline that leverages the 3D models of objects as
clues. Specifically, we design a two-step pipeline consisting of 3D model-based
zero-shot instance segmentation and a zero-shot pose estimator. For the first
step, we propose a novel way to perform zero-shot instance segmentation based on
the 3D models instead of text descriptions, which can handle complex properties
of unseen objects. For the second step, we utilize a hierarchical geometric
structure matching mechanism to perform zero-shot pose estimation which is 10
times faster than the current render-based method. Extensive experimental
results on the seven core datasets on the BOP challenge show that the proposed
method outperforms the state-of-the-art zero-shot method with higher speed and
lower computation cost.
A Secure Mechanism for Big Data Collection in Large Scale Internet of Vehicles
As an extension of the Internet of Things (IoT), the Internet of Vehicles (IoV) achieves unified management in the smart transportation area. With the development of IoV, an increasing number of vehicles are connected to the network. Large-scale IoV collects data from different places and with various attributes, which conforms with the heterogeneous nature of big data in size, volume, and dimensionality. Big data collection between vehicles and the application platform has become more and more frequent through various communication technologies, which gives rise to evolving security attacks. However, the existing protocols in IoT cannot be directly applied to big data collection in large-scale IoV. The dynamic network structure and the growing number of vehicle nodes increase the complexity and necessity of a secure mechanism. In this paper, a secure mechanism for big data collection in large-scale IoV is proposed for improved security performance and efficiency. To begin with, vehicles need to register with the big data center to connect to the network. Afterwards, vehicles associate with the big data center via mutual authentication and a single sign-on algorithm. Two different secure protocols are proposed for business data and confidential data collection. The collected big data is stored securely using distributed storage. The discussion and performance evaluation results show the security and efficiency of the proposed secure mechanism.
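The registration and mutual-authentication flow described above can be sketched as a generic challenge-response exchange over a pre-shared key issued at registration time. The function names and the HMAC-SHA256 construction here are illustrative assumptions, not the paper's concrete protocol:

```python
import hashlib
import hmac
import secrets

def tag(key: bytes, *parts: bytes) -> bytes:
    """HMAC-SHA256 over the delimited message parts."""
    return hmac.new(key, b"|".join(parts), hashlib.sha256).digest()

def mutual_auth(vehicle_key: bytes, center_key: bytes) -> bool:
    """Challenge-response mutual authentication sketch.

    Each side answers the other's fresh nonce with a keyed MAC, so the
    exchange succeeds only when the vehicle and the big data center hold
    the same pre-shared key (the one issued at registration).
    """
    n_vehicle = secrets.token_bytes(16)   # fresh challenge from the vehicle
    n_center = secrets.token_bytes(16)    # fresh challenge from the center
    center_resp = tag(center_key, n_vehicle, b"center")    # center's proof
    vehicle_resp = tag(vehicle_key, n_center, b"vehicle")  # vehicle's proof
    vehicle_ok = hmac.compare_digest(center_resp,
                                     tag(vehicle_key, n_vehicle, b"center"))
    center_ok = hmac.compare_digest(vehicle_resp,
                                    tag(center_key, n_center, b"vehicle"))
    return vehicle_ok and center_ok
```

Once both checks pass, a single sign-on token could be derived from the shared key and reused across the data-collection protocols; that derivation is not shown here.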
MIAD: A Maintenance Inspection Dataset for Unsupervised Anomaly Detection
Visual anomaly detection plays a crucial role not only in manufacturing
inspection, which finds product defects during manufacturing processes, but
also in maintenance inspection, which keeps equipment in optimum working
condition, particularly outdoors. Due to the scarcity of defective samples,
unsupervised anomaly detection has attracted great attention in recent years.
However, existing datasets for unsupervised anomaly detection are biased
towards manufacturing inspection and do not consider maintenance inspection,
which is usually conducted in uncontrolled outdoor environments with varying
camera viewpoints, messy backgrounds, and surface degradation after long-term
operation. We focus on outdoor maintenance inspection and contribute a
comprehensive Maintenance Inspection Anomaly Detection (MIAD) dataset which
contains more than 100K high-resolution color images in various outdoor
industrial scenarios. This dataset is generated by a 3D graphics software and
covers both surface and logical anomalies with pixel-precise ground truth.
Extensive evaluations of representative algorithms for unsupervised anomaly
detection are conducted, and we expect that MIAD and the corresponding
experimental results will inspire the research community in outdoor
unsupervised anomaly detection tasks. Worthwhile related future work can be
spawned from our new dataset.
Balancing Logit Variation for Long-tailed Semantic Segmentation
Semantic segmentation usually suffers from a long-tail data distribution. Due
to the imbalanced number of samples across categories, the features of those
tail classes may get squeezed into a narrow area in the feature space. Towards
a balanced feature distribution, we introduce category-wise variation into the
network predictions in the training phase such that an instance is no longer
projected to a feature point, but a small region instead. Such a perturbation
is highly dependent on the category scale, which appears as assigning smaller
variation to head classes and larger variation to tail classes. In this way, we
manage to close the gap between the feature areas of different categories,
resulting in a more balanced representation. It is noteworthy that the
introduced variation is discarded at the inference stage to facilitate a
confident prediction. Despite its embarrassingly simple implementation, our
method shows strong generalizability across various datasets and
task settings. Extensive experiments suggest that our plug-in design lends
itself well to a range of state-of-the-art approaches and boosts the
performance on top of them.
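The training-time perturbation described above can be sketched as follows. The exact per-class scaling rule (inverse square-root class frequency below) is an assumption for illustration, not the paper's formula; only the qualitative behavior matches the abstract: tail classes get larger variation, head classes smaller, and the noise is dropped at inference:

```python
import numpy as np

def balanced_logit_variation(logits, class_counts, base_scale=1.0,
                             training=True, seed=None):
    """Category-wise logit perturbation for long-tailed training (sketch).

    Adds zero-mean Gaussian noise to each class's logit, with a standard
    deviation that grows as the class becomes rarer, so each instance maps
    to a small region rather than a single feature point. At inference the
    logits are returned unchanged, matching the described behavior.
    """
    if not training:
        return logits                      # perturbation discarded at inference
    rng = np.random.default_rng(seed)
    freq = class_counts / class_counts.sum()
    # assumed scaling: larger std for rarer (tail) classes
    std = base_scale / np.sqrt(freq * len(class_counts))
    noise = rng.standard_normal(logits.shape) * std   # broadcast over class axis
    return logits + noise
```

As a plug-in, this touches only the logits of an existing segmentation head, which is consistent with the abstract's claim that the design composes with state-of-the-art approaches.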
Geo6D: Geometric Constraints Learning for 6D Pose Estimation
Numerous 6D pose estimation methods have been proposed that employ end-to-end
regression to directly estimate the target pose parameters. Since the visible
features of objects are implicitly influenced by their poses, the network
allows inferring the pose by analyzing the differences in features in the
visible region. However, due to the unpredictable and unrestricted range of
pose variations, the implicitly learned visible feature-pose constraints are
insufficiently covered by the training samples, making the network vulnerable
to unseen object poses. To tackle these challenges, we propose a novel
geometric constraints learning approach called Geo6D for direct regression 6D
pose estimation methods. It introduces a pose transformation formula expressed
in relative offset representation, which is leveraged as geometric constraints
to reconstruct the input and output targets of the network. These reconstructed
data enable the network to estimate the pose based on explicit geometric
constraints, while the relative offset representation mitigates the pose
distribution gap. Extensive experimental results show that when equipped with
Geo6D, the direct 6D methods achieve state-of-the-art performance on multiple
datasets and demonstrate significant effectiveness, even with only 10% of the
training data.
Baichuan 2: Open Large-scale Language Models
Large language models (LLMs) have demonstrated remarkable performance on a
variety of natural language tasks based on just a few examples of natural
language instructions, reducing the need for extensive feature engineering.
However, most powerful LLMs are closed-source or limited in their capability
for languages other than English. In this technical report, we present Baichuan
2, a series of large-scale multilingual language models containing 7 billion
and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on
public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan
2 excels in vertical domains such as medicine and law. We will release all
pre-training model checkpoints to benefit the research community in better
understanding the training dynamics of Baichuan 2.
Comment: Baichuan 2 technical report. Github:
https://github.com/baichuan-inc/Baichuan