74 research outputs found
Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction
Traditional geometric registration based estimation methods only exploit the
CAD model implicitly, which leads to their dependence on observation quality
and sensitivity to occlusion. To address the problem, the paper proposes a
bidirectional correspondence prediction network with a point-wise
attention-aware mechanism. This network not only requires the model points to
predict the correspondence but also explicitly models the geometric
similarities between observations and the model prior. Our key insight is that
the correlations between each model point and scene point provide essential
information for learning point-pair matches. To further tackle the correlation
noises brought by feature distribution divergence, we design a simple but
effective pseudo-siamese network to improve feature homogeneity. Experimental
results on the public datasets of LineMOD, YCB-Video, and Occ-LineMOD show that
the proposed method achieves better performance than other state-of-the-art
methods under the same evaluation criteria. Its robustness in estimating poses
is greatly improved, especially in an environment with severe occlusions.
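The abstract's key insight is that correlations between each model point and each scene point carry the information needed for point-pair matching. A minimal sketch of that idea, assuming each point already has a feature vector and using cosine similarity as the correlation measure (both are assumptions for illustration; the function names are hypothetical and this is not the paper's network):

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def correlation_matrix(model_feats, scene_feats):
    """Entry [i][j] correlates model point i with scene point j."""
    return [[cosine(m, s) for s in scene_feats] for m in model_feats]

def best_scene_match(corr):
    """For each model point, index of the most correlated scene point."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]

# Hypothetical 2-D features for two model points and three scene points.
model = [[1.0, 0.0], [0.0, 1.0]]
scene = [[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]
corr = correlation_matrix(model, scene)
matches = best_scene_match(corr)
```

A learned attention mechanism would replace the hard argmax with soft weights, but the matrix of pairwise correlations is the shared starting point.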
Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects
With the widespread deployment of deep neural networks (DNNs), ensuring the
reliability of DNN-based systems is of great importance. Serious reliability
issues such as system failures can be caused by numerical defects, one of the
most frequent defects in DNNs. To assure high reliability against numerical
defects, in this paper, we propose the RANUM approach including novel
techniques for three reliability assurance tasks: detection of potential
numerical defects, confirmation of potential-defect feasibility, and suggestion
of defect fixes. To the best of our knowledge, RANUM is the first approach that
confirms potential-defect feasibility with failure-exhibiting tests and
suggests fixes automatically. Extensive experiments on the benchmarks of 63
real-world DNN architectures show that RANUM outperforms state-of-the-art
approaches across the three reliability assurance tasks. In addition, when the
RANUM-generated fixes are compared with developers' fixes on open-source
projects, in 37 out of 40 cases, RANUM-generated fixes are equivalent to or
even better than human fixes.
Comment: To appear at the 45th International Conference on Software Engineering
(ICSE 2023), camera-ready version
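To make "numerical defect" concrete, here is a small sketch of one well-known instance: a softmax that overflows on large inputs, next to a commonly used fix. It is only an illustration of the defect class and of what a failure-exhibiting input looks like; it is not RANUM's detection or fix-suggestion technique, and the function names are hypothetical.

```python
import math

def naive_softmax(logits):
    # math.exp overflows for large arguments and raises OverflowError.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(logits):
    # Subtracting the maximum keeps every exponent <= 0, so exp cannot overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# An input that triggers the defect in the naive version.
failing_input = [1000.0, 1000.0]
try:
    naive_softmax(failing_input)
    naive_failed = False
except OverflowError:
    naive_failed = True

fixed_output = stable_softmax(failing_input)
```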
Anomalous quantum transport in 2D asymptotic quasiperiodic system
Quasiperiodic systems extend the concept of Anderson transition to the
quasi-random and low-dimensional realm, exhibiting intricate behaviors even in
one dimension, while their investigation in higher dimensions remains less
explored. Here, we delve into a series of two-dimensional lattice models of
Hall systems with asymptotically incommensurate flux, and reveal the impact of
asymptotic incommensurability together with relaxation on transport phenomena.
Specifically, we demonstrate anomalous bulk transport with universal scaling
characteristics in the wave-packet dynamics and conductivity, and predict novel
interplay effects involving asymptotic incommensurability, temperature, and
relaxation, leading to unprecedented multiple anisotropic metal-insulator
transitions. The asymptotic quasiperiodicity also leads to the quantized
anisotropic edge tunneling transport. Our work enriches the universal quantum
transport phenomena, and adds to the fundamental mechanisms underlying the
metal-insulator transitions driven by incommensurability in higher dimensions,
potentially opening a new avenue for exploring novel transport physics in
quasiperiodic systems.
Comment: 5+6 pages, 4+6 figures
Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound
3D ultrasound (US) is widely used due to its rich diagnostic information,
portability and low cost. Automated standard plane (SP) localization in US
volumes not only improves efficiency and reduces user-dependence, but also
boosts 3D US interpretation. In this study, we propose a novel Multi-Agent
Reinforcement Learning (MARL) framework to localize multiple uterine SPs in 3D
US simultaneously. Our contribution is two-fold. First, we equip the MARL with
a one-shot neural architecture search (NAS) module to obtain the optimal agent
for each plane. Specifically, Gradient-based search using Differentiable
Architecture Sampler (GDAS) is employed to accelerate and stabilize the
training process. Second, we propose a novel collaborative strategy to
strengthen agents' communication. Our strategy uses a recurrent neural network
(RNN) to learn the spatial relationship among SPs effectively. Extensively
validated on a large dataset, our approach achieves an accuracy of 7.05
degrees/2.21 mm, 8.62 degrees/2.36 mm and 5.93 degrees/0.89 mm for the mid-sagittal,
transverse and coronal plane localization, respectively. The proposed MARL
framework can significantly increase the plane localization accuracy and reduce
the computational cost and model size.
Comment: Early accepted by MICCAI 202
FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound
Fetal pose estimation in 3D ultrasound (US) involves identifying a set of
associated fetal anatomical landmarks. Its primary objective is to provide
comprehensive information about the fetus through landmark connections, thus
benefiting various critical applications, such as biometric measurements, plane
localization, and fetal movement monitoring. However, accurately estimating the
3D fetal pose in US volumes poses several challenges, including poor image
quality, limited GPU memory for tackling high dimensional data, symmetrical or
ambiguous anatomical structures, and considerable variations in fetal poses. In
this study, we propose a novel 3D fetal pose estimation framework (called
FetusMapV2) to overcome the above challenges. Our contribution is three-fold.
First, we propose a heuristic scheme that explores the complementary network
structure-unconstrained and activation-unreserved GPU memory management
approaches, which can enlarge the input image resolution for better results
under limited GPU memory. Second, we design a novel Pair Loss to mitigate
confusion caused by symmetrical and similar anatomical structures. It separates
the hidden classification task from the landmark localization task and thus
progressively eases model learning. Last, we propose a shape priors-based
self-supervised learning by selecting the relatively stable landmarks to refine
the pose online. Extensive experiments and diverse applications on a
large-scale fetal US dataset including 1000 volumes with 22 landmarks per
volume demonstrate that our method outperforms other strong competitors.
Comment: 16 pages, 11 figures, accepted by Medical Image Analysis (2023)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
The advancement of large language models (LLMs) has significantly propelled
the field of code generation. Previous work integrated reinforcement learning
(RL) with compiler feedback for exploring the output space of LLMs to enhance
code generation quality. However, the lengthy code generated by LLMs in
response to complex human requirements makes RL exploration a challenge. Also,
since the unit tests may not cover the complicated code, optimizing LLMs by
using these unexecuted code snippets is ineffective. To tackle these
challenges, we introduce StepCoder, a novel RL framework for code generation,
consisting of two main components: CCCS addresses the exploration challenge by
breaking the long-sequence code generation task into a Curriculum of Code
Completion Subtasks, while FGO only optimizes the model by masking the
unexecuted code segments to provide Fine-Grained Optimization. In addition, we
construct the APPS+ dataset for RL training, which is manually
verified to ensure the correctness of unit tests. Experimental results show
that our method improves the ability to explore the output space and
outperforms state-of-the-art approaches in corresponding benchmarks. Our
dataset APPS+ and StepCoder are available online.
Comment: 13 pages, 5 figures
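The FGO idea described in this abstract, optimizing only on code that the unit tests actually executed, can be pictured as a masked loss. The sketch below assumes a per-token loss and a boolean "executed" mask are already available; both inputs and the function name are hypothetical, and the averaging rule is an assumption rather than StepCoder's exact objective.

```python
def masked_mean_loss(token_losses, executed_mask):
    """Average the loss over tokens whose code was executed by the tests.

    token_losses:  per-token loss values (hypothetical input)
    executed_mask: True where the token belongs to an executed code segment
    """
    if len(token_losses) != len(executed_mask):
        raise ValueError("losses and mask must have the same length")
    kept = [loss for loss, executed in zip(token_losses, executed_mask) if executed]
    if not kept:
        # Nothing was executed, so there is no signal to optimize on.
        return 0.0
    return sum(kept) / len(kept)

# Tokens 2 and 4 sit in a branch the unit tests never reached.
loss = masked_mean_loss([1.0, 2.0, 3.0, 4.0], [True, False, True, False])
```

Masking rather than dropping the tokens keeps sequence positions aligned, so the same mask can be applied to any other per-token quantity.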
Secrets of RLHF in Large Language Models Part I: PPO
Large language models (LLMs) have formulated a blueprint for the advancement
of artificial general intelligence. Their primary objective is to function as a
human-centric (helpful, honest, and harmless) assistant. Alignment with humans
assumes paramount significance, and reinforcement learning with human feedback
(RLHF) emerges as the pivotal technological paradigm underpinning this pursuit.
Current technical routes usually include reward models to measure
human preferences, Proximal Policy Optimization (PPO) to optimize
policy model outputs, and process supervision to improve step-by-step
reasoning capabilities. However, due to the challenges of reward design,
environment interaction, and agent training, coupled with the huge trial-and-error
cost of large language models, there is a significant barrier for AI
researchers to motivate the development of technical alignment and safe landing
of LLMs. Stable RLHF training remains a puzzle. In the first
report, we dissect the framework of RLHF, re-evaluate the inner workings of
PPO, and explore how the parts comprising PPO algorithms impact policy agent
training. We identify policy constraints as the key factor for the effective
implementation of the PPO algorithm. Therefore, we explore PPO-max, an
advanced version of the PPO algorithm, to efficiently improve the training
stability of the policy model. Based on our main results, we perform a
comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT.
The absence of open-source implementations has posed significant challenges to
the investigation of LLM alignment. Therefore, we are eager to release
technical reports, reward models and PPO code
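One familiar form of policy constraint in PPO is the clipped surrogate objective, which limits how far the updated policy's probability ratio can move from the old policy. The sketch below shows that standard per-sample objective only; it is not PPO-max, whose details the abstract does not give, and the function name and default clip range are assumptions.

```python
import math

def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for a single action (to be maximized)."""
    # Probability ratio between the updated policy and the old policy.
    ratio = math.exp(logp_new - logp_old)
    # Constrain the ratio to the interval [1 - eps, 1 + eps].
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum gives a pessimistic bound on the improvement.
    return min(ratio * advantage, clipped * advantage)

# Ratio of about 2 with a positive advantage: the gain is capped at 1 + eps.
capped = ppo_clipped_objective(math.log(2.0), 0.0, advantage=1.0)
```

With a negative advantage the unclipped term is the smaller one, so the penalty for moving toward a bad action is not capped.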