114 research outputs found
Uncertainty Estimation for 3D Dense Prediction via Cross-Point Embeddings
Dense prediction tasks are common for 3D point clouds, but the uncertainties inherent in massive points and their embeddings have long been ignored. In this work, we present CUE, a novel uncertainty estimation method for dense prediction tasks in 3D point clouds. Inspired by metric learning, the key idea of CUE is to exploit cross-point embeddings on top of a conventional 3D dense prediction pipeline. Specifically, CUE builds a probabilistic embedding model and then enforces metric alignment of massive points in the embedding space. We also propose CUE+, which enhances CUE by explicitly modeling cross-point dependencies in the covariance matrix. We demonstrate that both CUE and CUE+ are generic and effective for uncertainty estimation in 3D point clouds on two different tasks: (1) in 3D geometric feature learning, we obtain well-calibrated uncertainty for the first time, and (2) in semantic segmentation, we reduce the Expected Calibration Error of the uncertainty of state-of-the-art methods by 16.5%. All uncertainties are estimated without compromising predictive performance.
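The Expected Calibration Error cited above is a standard binned calibration metric rather than anything specific to CUE; as a minimal sketch (not the paper's code), it can be computed from per-sample confidences and correctness flags like this:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the bin-weighted mean absolute gap between
    average accuracy and average confidence within each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        acc = correct[mask].mean()
        conf = confidences[mask].mean()
        ece += mask.mean() * abs(acc - conf)
    return ece
```

A perfectly calibrated model (confidence matching empirical accuracy in every bin) scores 0; the 16.5% reduction in the abstract is relative, not an absolute ECE value.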
STUN: Self-Teaching Uncertainty Estimation for Place Recognition
Place recognition is key to Simultaneous Localization and Mapping (SLAM) and
spatial perception. However, place recognition in the wild often suffers from
erroneous predictions due to image variations, e.g., changing viewpoints and
street appearance. Integrating uncertainty estimation into the life cycle of
place recognition is a promising way to mitigate the impact of such variations
on place recognition performance. However, existing uncertainty estimation
approaches in this vein are either computationally inefficient (e.g., Monte
Carlo dropout) or come at the cost of reduced accuracy. This paper proposes STUN, a
self-teaching framework that learns to simultaneously predict the place and
estimate the prediction uncertainty given an input image. To this end, we first
train a teacher net using a standard metric learning pipeline to produce
embedding priors. Then, supervised by the pretrained teacher net, a student net
with an additional variance branch is trained to finetune the embedding priors
and estimate the uncertainty sample by sample. During the online inference
phase, we only use the student net to generate a place prediction in
conjunction with its uncertainty. Compared with place recognition systems
that are ignorant of uncertainty, our framework provides uncertainty
estimation for free, without sacrificing any prediction accuracy. Our
experimental results on the large-scale Pittsburgh30k dataset demonstrate that
STUN outperforms the state-of-the-art methods in both recognition accuracy and
the quality of uncertainty estimation.
Comment: To appear at the 35th IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2022).
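The self-teaching step described above pairs a frozen teacher embedding with a student that predicts both a mean embedding and a per-dimension variance. A minimal sketch of such a heteroscedastic regression loss (names and form are illustrative, assuming a Gaussian likelihood; this is not STUN's actual code):

```python
import numpy as np

def self_teaching_loss(student_mu, student_log_var, teacher_emb):
    """Gaussian negative log-likelihood (up to a constant) of the
    teacher embedding under the student's predicted mean and
    per-dimension variance. Large predicted variance down-weights
    hard samples but is penalized by the log-variance term, so the
    variance branch learns a per-sample uncertainty estimate."""
    student_mu = np.asarray(student_mu, dtype=float)
    student_log_var = np.asarray(student_log_var, dtype=float)
    teacher_emb = np.asarray(teacher_emb, dtype=float)
    var = np.exp(student_log_var)
    per_dim = 0.5 * ((teacher_emb - student_mu) ** 2 / var + student_log_var)
    return per_dim.mean()
```

At inference only the student runs, which is why the uncertainty comes "for free": the variance branch adds one extra head, not a sampling loop as in Monte Carlo dropout.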
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
Diffusion models have exhibited excellent performance in various domains. The
probability flow ordinary differential equation (ODE) of diffusion models
(i.e., diffusion ODEs) is a particular case of continuous normalizing flows
(CNFs), which enables deterministic inference and exact likelihood evaluation.
However, the likelihood estimation results by diffusion ODEs are still far from
those of the state-of-the-art likelihood-based generative models. In this work,
we propose several improved techniques for maximum likelihood estimation for
diffusion ODEs, including both training and evaluation perspectives. For
training, we propose velocity parameterization and explore variance reduction
techniques for faster convergence. We also derive an error-bounded high-order
flow matching objective for finetuning, which improves the ODE likelihood and
smooths its trajectory. For evaluation, we propose a novel training-free
truncated-normal dequantization to fill the training-evaluation gap commonly
existing in diffusion ODEs. Building upon these techniques, we achieve
state-of-the-art likelihood estimation results on image datasets (2.56
bits/dim on CIFAR-10, 3.43/3.69 bits/dim on ImageNet-32) without variational
dequantization or data augmentation.
Comment: Accepted at ICML 2023.
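The likelihood numbers quoted for image datasets follow the standard bits-per-dimension convention, converting the model's total negative log-likelihood in nats by the dimensionality and a log-2 factor (a generic convention, not code from the paper):

```python
import math

def nats_to_bits_per_dim(nll_nats, n_dims):
    """Convert a total negative log-likelihood in nats into
    bits per dimension, the usual image-likelihood metric."""
    return nll_nats / (n_dims * math.log(2.0))
```

For a 32x32x3 image, n_dims is 3072, so a reported 2.56 bits/dim corresponds to a total NLL of about 3072 * ln(2) * 2.56 nats per image.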
DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics
Diffusion probabilistic models (DPMs) have exhibited excellent performance
for high-fidelity image generation while suffering from inefficient sampling.
Recent works accelerate the sampling procedure by proposing fast ODE solvers
that leverage the specific ODE form of DPMs. However, they rely heavily on a
specific parameterization during inference (such as noise/data prediction),
which might not be the optimal choice. In this work, we propose a novel
formulation towards the optimal parameterization during sampling that minimizes
the first-order discretization error of the ODE solution. Based on such
formulation, we propose DPM-Solver-v3, a new fast ODE solver for DPMs by
introducing several coefficients efficiently computed on the pretrained model,
which we call empirical model statistics. We further incorporate multistep
methods and a predictor-corrector framework, and propose some techniques for
improving sample quality at small numbers of function evaluations (NFE) or
large guidance scales. Experiments show that DPM-Solver-v3 achieves
consistently better or comparable performance in both unconditional and
conditional sampling with both pixel-space and latent-space DPMs, especially in
5~10 NFEs. We achieve FIDs of 12.21 (5 NFE) and 2.51 (10 NFE) on
unconditional CIFAR10, and an MSE of 0.55 (5 NFE, 7.5 guidance scale) on Stable
Diffusion, bringing a speed-up of 15%~30% over previous
state-of-the-art training-free methods. Code is available at
https://github.com/thu-ml/DPM-Solver-v3.
Comment: Accepted at NeurIPS 2023.
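To make the object these solvers integrate concrete, here is a toy numpy illustration of a probability-flow ODE: a variance-preserving diffusion on 1-D Gaussian data, where the marginal variance and hence the score are known in closed form. It uses plain Euler steps, not the DPM-Solver-v3 scheme, and all constants are illustrative:

```python
import numpy as np

# Toy VP diffusion: x0 ~ N(0, 4), constant beta = 8, t in [0, 1].
BETA, DATA_VAR = 8.0, 4.0

def marginal_var(t):
    """Var[x_t] = e^{-beta t} * Var[x_0] + (1 - e^{-beta t})."""
    decay = np.exp(-BETA * t)
    return decay * DATA_VAR + (1.0 - decay)

def sample_via_prob_flow(n, steps=200, seed=0):
    """Integrate the probability-flow ODE backward from t=1 to t=0
    with Euler steps, starting from (approximately) N(0, 1) noise.
    Because the flow is deterministic, each noise sample maps to a
    unique data sample."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)            # x_1 ~ N(0, 1), approximately
    dt = -1.0 / steps                     # backward in time
    for k in range(steps):
        t = 1.0 + k * dt
        score = -x / marginal_var(t)      # score of N(0, v_t)
        drift = -0.5 * BETA * x - 0.5 * BETA * score
        x = x + dt * drift
    return x
```

In this toy case the recovered samples should have standard deviation near 2 (the data's), and the number of Euler steps plays the role of the NFE budget that fast solvers like DPM-Solver-v3 try to shrink.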
Risk Controlled Image Retrieval
Most image retrieval research focuses on improving predictive performance,
but it may fall short in scenarios where the reliability of the prediction is
crucial. Although uncertainty quantification can help by assessing uncertainty
for query and database images, it provides only a heuristic
estimate rather than a guarantee. To address these limitations, we present
Risk Controlled Image Retrieval (RCIR), which generates retrieval sets that are
guaranteed to contain the ground truth samples with a predefined probability.
RCIR can be easily plugged into any image retrieval method, agnostic to data
distribution and model selection. To the best of our knowledge, this is the
first work that provides coverage guarantees for image retrieval. The validity
and efficiency of RCIR are demonstrated on four real-world image retrieval
datasets: Stanford CAR-196 (Krause et al. 2013), CUB-200 (Wah et
al. 2011), the Pittsburgh dataset (Torii et al. 2013), and the ChestX-Det
dataset (Lian et al. 2021).
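The kind of coverage guarantee described is in the spirit of split conformal prediction: calibrate a similarity threshold on held-out query/true-match pairs so that retrieval sets built from it contain the ground truth with probability at least 1 - alpha. A hedged numpy sketch of that idea (illustrative names, not the authors' implementation):

```python
import numpy as np

def calibrate_threshold(cal_true_scores, alpha=0.1):
    """Split-conformal threshold: pick the floor(alpha*(n+1))-th
    smallest similarity score between calibration queries and their
    true matches. Future retrieval sets that keep every database item
    scoring at or above this threshold then cover the true match with
    probability >= 1 - alpha (under exchangeability)."""
    s = np.sort(np.asarray(cal_true_scores, dtype=float))
    n = len(s)
    k = int(np.floor(alpha * (n + 1))) - 1   # zero-indexed order statistic
    return s[max(k, 0)]

def retrieval_set(query_scores, tau):
    """Indices of all database items scoring at least tau for a query."""
    return np.flatnonzero(np.asarray(query_scores, dtype=float) >= tau)
```

This matches the abstract's claims of being plug-in and model-agnostic: the calibration step only needs scores from the underlying retrieval model, never its internals.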