274 research outputs found
Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM
The central problem in cryo-electron microscopy (cryo-EM) is to recover the
3D structure from noisy 2D projection images, which requires estimating the
missing projection angles (poses). Recent methods have attempted to solve the 3D
reconstruction problem with an autoencoder architecture, which suffers from
the latent vector space sampling problem and frequently produces suboptimal
pose inferences and inferior 3D reconstructions. Here we present an improved
autoencoder architecture called ACE (Asymmetric Complementary autoEncoder),
based on which we designed the ACE-EM method for cryo-EM 3D reconstructions.
Compared to previous methods, ACE-EM reached higher pose space coverage within
the same training time and boosted the reconstruction performance regardless of
the choice of decoders. With this method, the Nyquist resolution (highest
possible resolution) was reached for 3D reconstructions of both simulated and
experimental cryo-EM datasets. Furthermore, ACE-EM is the only amortized
inference method that reached the Nyquist resolution.
Amortized Bayesian Inference of GISAXS Data with Normalizing Flows
Grazing-Incidence Small-Angle X-ray Scattering (GISAXS) is a modern imaging
technique used in material research to study nanoscale materials.
Reconstruction of the parameters of an imaged object poses an ill-posed
inverse problem that is further complicated when only an in-plane GISAXS signal
is available. Traditionally used inference algorithms such as Approximate
Bayesian Computation (ABC) rely on computationally expensive scattering
simulation software, rendering analysis highly time-consuming. We propose a
simulation-based framework that combines variational auto-encoders and
normalizing flows to estimate the posterior distribution of object parameters
given its GISAXS data. We apply the inference pipeline to experimental data and
demonstrate that our method reduces the inference cost by orders of magnitude
while producing results consistent with ABC.
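The density evaluation that a normalizing flow relies on can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a single 1D affine flow with a standard normal base density, and the names `affine_flow_logpdf`, `shift`, and `log_scale` are hypothetical. In an amortized setting, `shift` and `log_scale` would be predicted from the observed data by a trained network.

```python
import math

def standard_normal_logpdf(z):
    """Log density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def affine_flow_logpdf(theta, shift, log_scale):
    """Log density of theta = shift + exp(log_scale) * z with z ~ N(0, 1).

    shift and log_scale are assumed inputs here; in a conditional flow they
    would be outputs of a network that sees the observation.
    """
    # Invert the flow to recover the base variable.
    z = (theta - shift) * math.exp(-log_scale)
    # Change of variables: log q(theta) = log p(z) + log|dz/dtheta|,
    # and dz/dtheta = exp(-log_scale).
    return standard_normal_logpdf(z) - log_scale

def gaussian_logpdf(x, mean, std):
    """Reference closed-form Gaussian log density, used only as a check."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)
```

For this affine case the flow density coincides with a Gaussian of mean `shift` and standard deviation `exp(log_scale)`, which gives a simple sanity check before stacking more expressive transforms.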
Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder
Despite its practical importance across a wide range of modalities, recent
advances in self-supervised learning (SSL) have been primarily focused on a few
well-curated domains, e.g., vision and language, often relying on their
domain-specific knowledge. For example, Masked Auto-Encoder (MAE) has become
one of the popular architectures in these domains, but its potential in other
modalities has been less explored. In this paper, we develop MAE as a unified,
modality-agnostic SSL framework. In turn, we argue that meta-learning is key to
interpreting MAE as a modality-agnostic learner, and propose enhancements to
MAE, motivated by jointly improving its SSL across diverse modalities; we call
the result MetaMAE. Our key idea is to view the mask reconstruction of
MAE as a meta-learning task: masked tokens are predicted by adapting the
Transformer meta-learner through the amortization of unmasked tokens. Based on
this novel interpretation, we propose to integrate two advanced meta-learning
techniques. First, we adapt the amortized latent of the Transformer encoder
using gradient-based meta-learning to enhance the reconstruction. Then, we
maximize the alignment between amortized and adapted latents through task
contrastive learning which guides the Transformer encoder to better encode the
task-specific knowledge. Our experiments demonstrate the superiority of MetaMAE
in the modality-agnostic SSL benchmark (called DABS), significantly
outperforming prior baselines. Code is available at
https://github.com/alinlab/MetaMAE.
Comment: Accepted to NeurIPS 2023. The first two authors contributed equally.
MeshDiffusion: Score-based Generative 3D Mesh Modeling
We consider the task of generating realistic 3D shapes, which is useful for a
variety of applications such as automatic scene generation and physical
simulation. Compared to other 3D representations like voxels and point clouds,
meshes are more desirable in practice, because (1) they enable easy and
arbitrary manipulation of shapes for relighting and simulation, and (2) they
can fully leverage the power of modern graphics pipelines which are mostly
optimized for meshes. Previous scalable methods for generating meshes typically
rely on sub-optimal post-processing, and they tend to produce overly smooth or
noisy surfaces without fine-grained geometric details. To overcome these
shortcomings, we take advantage of the graph structure of meshes and use a
simple yet very effective generative modeling method to generate 3D meshes.
Specifically, we represent meshes with deformable tetrahedral grids, and then
train a diffusion model on this direct parametrization. We demonstrate the
effectiveness of our model on multiple generative tasks.
Comment: Published in ICLR 2023 (Spotlight, Notable-top-25%).
Learning 3D Shape Completion under Weak Supervision
We address the problem of 3D shape completion from sparse and noisy point
clouds, a fundamental problem in computer vision and robotics. Recent
approaches are either data-driven or learning-based: data-driven approaches
rely on a shape model whose parameters are optimized to fit the observations;
learning-based approaches, in contrast, avoid the expensive optimization step
by learning to directly predict complete shapes from incomplete observations in
a fully-supervised setting. However, full supervision is often not available in
practice. In this work, we propose a weakly-supervised learning-based approach
to 3D shape completion which requires neither slow optimization nor direct
supervision. While we also learn a shape prior on synthetic data, we amortize,
i.e., learn, maximum likelihood fitting using deep neural networks, resulting in
efficient shape completion without sacrificing accuracy. On synthetic
benchmarks based on ShapeNet and ModelNet as well as on real robotics data from
KITTI and Kinect, we demonstrate that the proposed amortized maximum likelihood
approach is able to compete with recent fully supervised baselines and
outperforms data-driven approaches, while requiring less supervision and being
significantly faster.
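The difference between per-observation fitting and amortized maximum likelihood can be sketched on a toy problem. This is an illustration under strong assumptions, not the paper's model: the fixed "shape prior" is a hypothetical linear decoder `decode(z) = (z, 2z)`, the encoder is linear, and the likelihood is Gaussian, so maximizing it is the same as minimizing squared error. All function names are invented for this sketch.

```python
import random

def decode(z):
    """Hypothetical fixed shape prior: maps a scalar code to a 2D 'shape'."""
    return (z, 2.0 * z)

def loss_grad_wrt_code(z, y):
    """Gradient of ||decode(z) - y||^2 with respect to the code z."""
    return 2.0 * (z - y[0]) + 4.0 * (2.0 * z - y[1])

def fit_per_instance(y, steps=50, lr=0.05):
    """Baseline: run gradient descent on z separately for each observation."""
    z = 0.0
    for _ in range(steps):
        z -= lr * loss_grad_wrt_code(z, y)
    return z

def encode(w, y):
    """Linear encoder: predicts the code directly from the observation."""
    return w[0] * y[0] + w[1] * y[1]

def train_amortized_encoder(steps=5000, lr=0.01, seed=0):
    """Train the encoder on the likelihood loss itself.

    Only noisy observations enter the loss; the true code is used solely to
    simulate data and is never shown to the encoder as a target.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(steps):
        clean = decode(rng.uniform(-1.0, 1.0))
        y = (clean[0] + rng.gauss(0.0, 0.2), clean[1] + rng.gauss(0.0, 0.2))
        g = loss_grad_wrt_code(encode(w, y), y)
        # Chain rule: dL/dw = dL/dz * dz/dw, and dz/dw = y.
        w[0] -= lr * g * y[0]
        w[1] -= lr * g * y[1]
    return w
```

For this decoder the maximum likelihood code has the closed form `(y[0] + 2 * y[1]) / 5`, so the trained weights should approach `(0.2, 0.4)`. After training, a single call to `encode` replaces the inner optimization loop of `fit_per_instance`, which is the source of the speed-up in amortized approaches.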