In this paper, we present a novel shape reconstruction method that leverages a
diffusion model to generate a sparse 3D point cloud for an object captured in a
single RGB image. Recent methods typically use a global embedding or local
projection-based features as the condition to guide the diffusion model.
However, such strategies fail to consistently align the denoised point cloud
with the given image, leading to unstable conditioning and inferior
performance. To address this, we present CCD-3DR, which exploits a novel centered
diffusion probabilistic model for consistent local feature conditioning. We
constrain the noise and the sampled point cloud of the diffusion model to a
subspace in which the point cloud center remains unchanged during both the
forward and reverse diffusion processes. The stable point cloud center further
serves as an anchor to align each point with its corresponding local
projection-based features. Extensive experiments on the synthetic benchmark
ShapeNet-R2N2 demonstrate that CCD-3DR outperforms all competitors by a large
margin, with over 40% improvement. We also provide results on the real-world
dataset Pix3D to thoroughly demonstrate the potential of CCD-3DR in real-world
applications. Code will be released soon.
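The centered-diffusion constraint described above can be illustrated with a minimal NumPy sketch. This is our own illustration, not the authors' released implementation: projecting Gaussian noise onto the zero-mean subspace guarantees that the point cloud centroid is unchanged by the forward diffusion step. The function names `centered_noise` and `forward_diffuse` are hypothetical.

```python
import numpy as np

def centered_noise(shape, rng):
    """Sample Gaussian noise projected onto the zero-mean subspace,
    so adding it never shifts the point cloud centroid."""
    eps = rng.standard_normal(shape)              # (N, 3) per-point noise
    return eps - eps.mean(axis=0, keepdims=True)  # remove the mean

def forward_diffuse(x0, alpha_bar, rng):
    """One forward-diffusion sample x_t = sqrt(a)*x0 + sqrt(1-a)*eps.
    With centered noise and a centered x0, the centroid of x_t stays
    at the origin for every timestep."""
    eps = centered_noise(x0.shape, rng)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((1024, 3))
x0 -= x0.mean(axis=0, keepdims=True)              # center the clean cloud
xt = forward_diffuse(x0, alpha_bar=0.5, rng=rng)
print(np.allclose(xt.mean(axis=0), 0.0))          # centroid preserved
```

Because the centroid is fixed in this subspace, it can act as the stable anchor that aligns each denoised point with its local projection-based image features.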