ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Visual perception tasks often require vast amounts of labelled data,
including 3D poses and image-space segmentation masks. Creating such training
datasets can be difficult or time-intensive to scale for general use. Consider
the task of pose estimation for rigid objects. Deep neural network based
approaches have shown good performance when trained on large, public datasets.
However, adapting these networks to novel objects, or fine-tuning existing
models for different environments, requires significant time investment to
generate newly labelled instances. Towards this end, we propose
ProgressLabeller as a method for more efficiently generating large amounts of
6D pose training data from color image sequences for custom scenes in a
scalable manner. ProgressLabeller is also intended to support transparent or
translucent objects, for which previous methods based on dense depth
reconstruction fail. We demonstrate the effectiveness of ProgressLabeller by
rapidly creating a dataset of over 1M samples with which we fine-tune a
state-of-the-art pose estimation network to markedly improve downstream
robotic grasp success rates. ProgressLabeller is open-source at
https://github.com/huijieZH/ProgressLabeller.
Comment: IROS 2022 accepted paper; project page:
https://progress.eecs.umich.edu/projects/progress-labeller
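The scalability of such pipelines comes from labelling an object's pose once against a reconstructed scene and propagating it to every registered frame. A minimal sketch of that general multi-view idea (the function name, identity-rotation poses, and 4x4 homogeneous-matrix convention are illustrative assumptions, not taken from ProgressLabeller's codebase):

```python
import numpy as np

def per_frame_pose(T_world_obj, T_world_cam):
    """Express a scene-level object pose in a camera's frame: labelling
    the pose once in the reconstructed scene yields a 6D label for every
    frame whose camera pose is known."""
    return np.linalg.inv(T_world_cam) @ T_world_obj

# Object placed 1 m along +z in the world; camera translated 0.5 m along +x.
T_obj = np.eye(4); T_obj[2, 3] = 1.0
T_cam = np.eye(4); T_cam[0, 3] = 0.5
T_cam_obj = per_frame_pose(T_obj, T_cam)
```

Repeating this per frame of a color image sequence is what turns one manual annotation into many training samples.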
Adaptive Integration of Partial Label Learning and Negative Learning for Enhanced Noisy Label Learning
There has been significant attention devoted to the effectiveness of various
domains, such as semi-supervised learning, contrastive learning, and
meta-learning, in enhancing the performance of methods for noisy label learning
(NLL) tasks. However, most existing methods still depend on prior assumptions
regarding clean samples amidst different sources of noise (e.g., a pre-defined
drop rate or a small subset of clean samples). In this paper, we propose a
simple yet powerful idea called NPN, which revolutionizes Noisy label learning
by integrating Partial label learning (PLL) and Negative learning (NL). Toward
this goal, we initially decompose the given label space adaptively into
candidate and complementary labels, thereby establishing the conditions for PLL
and NL. We propose two adaptive data-driven paradigms of label disambiguation
for PLL: hard disambiguation and soft disambiguation. Furthermore, we generate
reliable complementary labels using all non-candidate labels for NL to enhance
model robustness through indirect supervision. To maintain label reliability
during the later stage of model training, we introduce a consistency
regularization term that encourages agreement between the outputs of multiple
augmentations. Experiments conducted on both synthetically corrupted and
real-world noisy datasets demonstrate the superiority of NPN compared to other
state-of-the-art (SOTA) methods. The source code has been made available at
https://github.com/NUST-Machine-Intelligence-Laboratory/NPN.
Comment: accepted by AAAI 202
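The label-space decomposition can be pictured with a small sketch. Assume (this is our illustrative rule, not necessarily the paper's exact one) that the candidate set is the given, possibly noisy, label plus the model's most confident classes; every remaining class then becomes a complementary "not this class" label for negative learning:

```python
import numpy as np

def decompose_labels(probs, given_label, top_k=2):
    """Adaptively split the label space: the given (possibly noisy) label
    plus the model's top-k most confident classes form the candidate set
    for PLL; all non-candidate classes become complementary labels for NL."""
    ranked = np.argsort(probs)[::-1]                    # classes by confidence
    candidates = {int(c) for c in ranked[:top_k]} | {given_label}
    complementary = [c for c in range(len(probs)) if c not in candidates]
    return sorted(candidates), complementary

probs = np.array([0.05, 0.60, 0.25, 0.05, 0.05])        # model predictions
cands, comps = decompose_labels(probs, given_label=3)
```

Here classes 1 and 2 (most confident) join the given label 3 as candidates, while classes 0 and 4 supply the indirect "negative" supervision.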
Siamese DETR
Recent self-supervised methods are mainly designed for representation
learning with base models, e.g., ResNets or ViTs. They cannot be easily
transferred to DETR, which has task-specific Transformer modules. In this work, we
present Siamese DETR, a Siamese self-supervised pretraining approach for the
Transformer architecture in DETR. We consider learning view-invariant and
detection-oriented representations simultaneously through two complementary
tasks, i.e., localization and discrimination, in a novel multi-view learning
framework. Two self-supervised pretext tasks are designed: (i) Multi-View
Region Detection aims at learning to localize regions-of-interest between
augmented views of the input, and (ii) Multi-View Semantic Discrimination
attempts to improve object-level discrimination for each region. The proposed
Siamese DETR achieves state-of-the-art transfer performance on COCO and PASCAL
VOC detection using different DETR variants in all setups. Code is available at
https://github.com/Zx55/SiameseDETR.
Comment: 10 pages, 11 figures. Accepted in CVPR 202
Statistical characteristics of ionospheric hiss waves
In this study, we use observations of electromagnetic waves by the DEMETER satellite to investigate the propagation characteristics of low-altitude ionospheric hiss. In an event study, intense hiss wave power is concentrated over a narrow frequency band with a central frequency that decreases as latitude decreases, which coincides with the variation of the local proton cyclotron frequency fCH. The wave propagates obliquely to the background magnetic field and equatorward from the high-latitude region. We use about six years of observations to statistically study the dependence of ionospheric hiss wave power on location, local time, geomagnetic activity, and season. The results demonstrate that the ionospheric hiss power is stronger on the dayside, under higher geomagnetic activity, and in local summer, and is confined near the region where the local fCH equals the wave frequency. To explain the concentration of wave power, a ray tracing simulation is performed, which reproduces the wave propagation process.
TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
Transparent objects present multiple distinct challenges to visual perception
systems. First, their lack of distinguishing visual features makes transparent
objects harder to detect and localize than opaque objects. Even humans find
certain transparent surfaces with little specular reflection or refraction,
like glass doors, difficult to perceive. A second challenge is that depth
sensors typically used for opaque object perception cannot obtain accurate
depth measurements on transparent surfaces due to their unique reflective
properties. Stemming from these challenges, we observe that transparent object
instances within the same category, such as cups, look more similar to each
other than to ordinary opaque objects of that same category. Given this
observation, the present paper explores the possibility of category-level
transparent object pose estimation rather than instance-level pose estimation.
We propose TransNet, a two-stage pipeline that estimates
category-level transparent object pose using localized depth completion and
surface normal estimation. TransNet is evaluated in terms of pose estimation
accuracy on a large-scale transparent object dataset and compared to a
state-of-the-art category-level pose estimation approach. Results from this
comparison demonstrate that TransNet achieves improved pose estimation accuracy
on transparent objects. Moreover, we use TransNet to build an autonomous
transparent object manipulation system for robotic pick-and-place and pouring
tasks.
Bi-level Actor-Critic for Multi-agent Coordination
Coordination is one of the essential problems in multi-agent systems.
Typically, multi-agent reinforcement learning (MARL) methods treat agents
equally, and the goal is to solve the Markov game to an arbitrary Nash
equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE
selection. In this paper, we treat agents unequally and consider the
Stackelberg equilibrium as a potentially better convergence point than the
Nash equilibrium in terms of Pareto superiority, especially in cooperative
environments. Under Markov games, we formally define the bi-level
reinforcement learning problem of finding a Stackelberg equilibrium. We
propose a novel bi-level actor-critic learning method that allows agents to
have different knowledge bases (and thus different levels of intelligence),
while their actions can still be executed simultaneously and in a distributed
manner. A convergence proof is given, and the resulting learning algorithm is
tested against state-of-the-art methods. We found that the proposed bi-level
actor-critic algorithm successfully converges to Stackelberg equilibria in
matrix games and finds an asymmetric solution in a highway merge environment.
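In a matrix game the Stackelberg idea is easy to make concrete: the leader commits to an action first, the follower best-responds, and the commitment can select a Pareto-superior point among multiple Nash equilibria. A self-contained sketch with invented payoffs (pure strategies only; the paper's learning method handles the general Markov-game case):

```python
import numpy as np

def pure_stackelberg(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg equilibrium of a bimatrix game: for each
    leader commitment the follower best-responds; the leader then picks
    the commitment yielding its highest resulting payoff."""
    best = None
    for a in range(leader_payoff.shape[0]):
        b = int(np.argmax(follower_payoff[a]))      # follower's best response
        if best is None or leader_payoff[a, b] > best[2]:
            best = (a, b, float(leader_payoff[a, b]))
    return best

# Coordination game with two pure Nash equilibria, (0, 0) and (1, 1);
# leader commitment selects the Pareto-superior one, (0, 0).
L = np.array([[4.0, 0.0], [0.0, 2.0]])
F = np.array([[4.0, 0.0], [0.0, 2.0]])
a, b, v = pure_stackelberg(L, F)
```

Both (0, 0) and (1, 1) are Nash equilibria here, but commitment resolves the selection problem in favor of the joint payoff of 4.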
HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors
Conventional knowledge distillation (KD) methods for object detection mainly
concentrate on homogeneous teacher-student detectors. However, the design of a
lightweight detector for deployment is often significantly different from a
high-capacity detector. Thus, we investigate KD among heterogeneous
teacher-student pairs for wider applicability. We observe that the core
difficulty for heterogeneous KD (hetero-KD) is the significant semantic gap
between the backbone features of heterogeneous detectors due to the different
optimization manners. Conventional homogeneous KD (homo-KD) methods suffer from
such a gap and are hard to directly obtain satisfactory performance for
hetero-KD. In this paper, we propose the HEtero-Assists Distillation (HEAD)
framework, leveraging heterogeneous detection heads as assistants to guide the
optimization of the student detector to reduce this gap. In HEAD, the assistant
is an additional detection head with the architecture homogeneous to the
teacher head attached to the student backbone. Thus, a hetero-KD is transformed
into a homo-KD, allowing efficient knowledge transfer from the teacher to the
student. Moreover, we extend HEAD into a Teacher-Free HEAD (TF-HEAD) framework
when a well-trained teacher detector is unavailable. Our method has achieved
significant improvement compared to current detection KD methods. For example,
on the MS-COCO dataset, TF-HEAD helps R18 RetinaNet achieve 33.9 mAP (+2.2),
while HEAD further pushes the limit to 36.2 mAP (+4.5).
Comment: ECCV 2022, Code: https://github.com/LutingWang/HEA
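The assistant idea can be sketched with toy linear "heads": because the assistant shares the teacher head's architecture, its outputs are directly comparable to the teacher's, turning hetero-KD into ordinary output matching on top of the student backbone. Everything below (linear heads, MSE regression, the training loop, all dimensions) is a deliberate simplification for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

teacher_feats = rng.normal(size=(8, 16))   # teacher backbone features
student_feats = rng.normal(size=(8, 16))   # heterogeneous student backbone

W_teacher = rng.normal(size=(16, 4))       # frozen teacher head
W_assist = np.zeros((16, 4))               # assistant head on the student

# Homogeneous distillation: the assistant head (same architecture as the
# teacher head) regresses the teacher head's outputs by gradient descent,
# bridging the semantic gap between the two backbones.
target = teacher_feats @ W_teacher
initial_loss = float(np.mean(target ** 2))           # loss with W_assist = 0
for _ in range(500):
    residual = student_feats @ W_assist - target
    W_assist -= 0.05 * student_feats.T @ residual / len(residual)

final_loss = float(np.mean((student_feats @ W_assist - target) ** 2))
```

The distillation signal never compares the two backbones' features directly, only the comparable head outputs, which is the point of the assistant.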
Design of a precision calibration unit for Keck NIRC2 AO instrument
High-precision astrometry has the potential to address questions in planet formation, black hole science, Galactic structure, and more. However, in order to achieve sub-milliarcsecond (sub-mas) precision, we need a calibration method better than current techniques such as on-sky calibration against calibrated stellar or stellar-cluster systems, which have a precision of ~1 mas. A precision calibration unit with a regular grid of photo-lithographically manufactured pinholes, combined with self-calibration techniques, is instead a new and innovative way to potentially achieve sub-mas precision over the entire field of view. This technique benefits adaptive optics (AO) instruments for future telescopes such as the Thirty Meter Telescope (TMT). In this work, we present our design for a new astrometric calibration unit to feed the NIRC2 AO instrument at the W. M. Keck Observatory. It allows calibration over a large field of view of 47" x 47".
Theoretical foundations of studying criticality in the brain
Criticality is hypothesized as a physical mechanism underlying efficient
transitions between cortical states and remarkable information processing
capacities in the brain. While considerable evidence generally supports this
hypothesis, non-negligible controversies persist regarding the ubiquity of
criticality in neural dynamics and its role in information processing. Validity
issues frequently arise when identifying potential brain criticality from
empirical data. Moreover, the functional benefits implied by brain criticality
are frequently misconceived or unduly generalized. These problems stem from the
non-triviality and immaturity of the physical theories that analytically derive
brain criticality and the statistical techniques that estimate brain criticality
from empirical data. To help solve these problems, we present a systematic
review and reformulate the foundations of studying brain criticality, i.e.,
ordinary criticality (OC), quasi-criticality (qC), self-organized criticality
(SOC), and self-organized quasi-criticality (SOqC), using the terminology of
neuroscience. We offer accessible explanations of the physical theories and
statistical techniques of brain criticality, providing step-by-step derivations
to characterize neural dynamics as a physical system with avalanches. We
summarize error-prone details and existing limitations in brain criticality
analysis and suggest possible solutions. Moreover, we present a forward-looking
perspective on how optimizing the foundations of studying brain criticality can
deepen our understanding of various neuroscience questions.
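The avalanche characterization at the heart of these analyses can be illustrated with a minimal sketch: threshold an activity time series, then treat each maximal supra-threshold run as one avalanche whose size is its summed activity. This is a common operational definition for illustration; the estimators discussed in the review are more careful about binning and thresholds.

```python
import numpy as np

def avalanche_sizes(activity, threshold=0.0):
    """Segment an activity time series into avalanches: maximal runs of
    supra-threshold activity, each summed to give its size. The size (and
    duration) distributions of these avalanches are what power-law
    criticality analyses are fit to."""
    sizes, current = [], 0.0
    for x in activity:
        if x > threshold:
            current += float(x)
        elif current > 0:
            sizes.append(current)     # run ended: record one avalanche
            current = 0.0
    if current > 0:
        sizes.append(current)         # series ended mid-avalanche
    return sizes

series = np.array([0, 2, 3, 0, 0, 1, 0, 5, 1, 2, 0])
sizes = avalanche_sizes(series)
```

Here the series contains three avalanches of sizes 5, 1, and 8; a criticality analysis would then test whether such sizes follow a power law over many avalanches.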