A Revisit of Shape Editing Techniques: from the Geometric to the Neural Viewpoint
3D shape editing is widely used in a range of applications such as movie
production, computer games and computer aided design. It is also a popular
research topic in computer graphics and computer vision. In past decades,
researchers have developed a series of editing methods to make the editing
process faster, more robust, and more reliable. Traditionally, the deformed
shape is determined by the optimal transformation and weights for an energy
term. With increasing availability of 3D shapes on the Internet, data-driven
methods were proposed to improve the editing results. More recently, as deep
neural networks became popular, many deep learning based editing methods have
been developed in this field, which are naturally data-driven. We survey
recent research ranging from the geometric viewpoint to emerging neural
deformation techniques, and categorize the methods into organic shape editing
and man-made model editing. Both traditional methods and recent neural
network based methods are reviewed.
TapMo: Shape-aware Motion Generation of Skeleton-free Characters
Previous motion generation methods are limited to pre-rigged 3D human
models, hindering their application to the animation of various non-rigged
characters. In this work, we present TapMo, a Text-driven Animation Pipeline
for synthesizing Motion in a broad spectrum of skeleton-free 3D characters. The
pivotal innovation in TapMo is its use of shape deformation-aware features as a
condition to guide the diffusion model, thereby enabling the generation of
mesh-specific motions for various characters. Specifically, TapMo comprises two
main components - Mesh Handle Predictor and Shape-aware Diffusion Module. Mesh
Handle Predictor predicts the skinning weights and clusters mesh vertices into
adaptive handles for deformation control, which eliminates the need for
traditional skeletal rigging. Shape-aware Motion Diffusion synthesizes motion
with mesh-specific adaptations. This module employs text-guided motions and
mesh features extracted during the first stage, preserving the geometric
integrity of the animations by accounting for the character's shape and
deformation. Trained in a weakly-supervised manner, TapMo can accommodate a
multitude of non-human meshes, both with and without associated text motions.
We demonstrate the effectiveness and generalizability of TapMo through rigorous
qualitative and quantitative experiments. Our results reveal that TapMo
consistently outperforms existing auto-animation methods, delivering
superior-quality animations for both seen and unseen heterogeneous 3D
characters.
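The handle-based deformation control mentioned in the TapMo abstract can be illustrated with plain linear blend skinning. The sketch below is not TapMo's implementation: it assumes translation-only handles and hand-written skinning weights, and the names `blend_handles`, `weights`, and `handle_offsets` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def blend_handles(
    vertices: Sequence[Vec3],
    weights: Sequence[Sequence[float]],
    handle_offsets: Sequence[Vec3],
) -> List[Vec3]:
    """Move each vertex by the weight-blended translation of the handles.

    weights[i][j] is the influence of handle j on vertex i; each row is
    assumed to sum to 1 (a partition of unity), as is usual for skinning.
    """
    deformed: List[Vec3] = []
    for vertex, row in zip(vertices, weights):
        if len(row) != len(handle_offsets):
            raise ValueError("one weight per handle is required")
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("skinning weights of a vertex must sum to 1")
        # Blend the handle translations component by component.
        dx = sum(w * t[0] for w, t in zip(row, handle_offsets))
        dy = sum(w * t[1] for w, t in zip(row, handle_offsets))
        dz = sum(w * t[2] for w, t in zip(row, handle_offsets))
        deformed.append((vertex[0] + dx, vertex[1] + dy, vertex[2] + dz))
    return deformed


if __name__ == "__main__":
    verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    w = [[1.0, 0.0], [0.5, 0.5]]            # second vertex is shared by both handles
    offsets = [(0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]
    print(blend_handles(verts, w, offsets))
```

In a learned pipeline the weights would come from a predictor rather than being written by hand, and handles would typically carry full rigid or affine transforms instead of translations only.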
HIGH QUALITY HUMAN 3D BODY MODELING, TRACKING AND APPLICATION
Geometric reconstruction of dynamic objects is a fundamental task in computer vision and graphics, and high-fidelity modeling of the human body is considered to be at the core of this problem. Traditional human shape and motion capture techniques require an array of surrounding cameras or require subjects to wear reflective markers, which limits working space and portability. In this dissertation, a complete process is designed, from geometric modeling of a detailed 3D full human body and capturing its shape dynamics over time using a flexible setup, to guiding clothes/person re-targeting with such data-driven models. As the mechanical movement of the human body can be considered an articulated motion, which makes it easy to guide skin animation but difficult to perform the reverse process of finding parameters from images without manual intervention, we present a novel parametric model, GMM-BlendSCAPE, which jointly takes both the linear skinning model and the prior art of BlendSCAPE (Blend Shape Completion and Animation for PEople) into consideration, and develop a Gaussian Mixture Model (GMM) to infer both body shape and pose from incomplete observations. We show the increased accuracy of joint and skin surface estimation using our model compared to skeleton-based motion tracking. To model the detailed body, we start by capturing high-quality partial 3D scans using a single-view commercial depth camera. Based on GMM-BlendSCAPE, we can then reconstruct multiple complete static models with large pose differences via our novel non-rigid registration algorithm. With vertex correspondences established, these models can be further converted into a personalized drivable template and used for robust pose tracking in a similar GMM framework. Moreover, we design a general-purpose real-time non-rigid deformation algorithm to accelerate this registration.
Last but not least, we demonstrate a novel virtual clothes try-on application based on our personalized model, utilizing both image and depth cues to synthesize and re-target clothes for single-view videos of different people.
AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Due to its wide range of
applications and the demonstrated potential of recent works, AIGC developments
have been attracting lots of attention recently, and AIGC methods have been
developed for various data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include going from various modalities to image, video, 3D
shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar),
and audio modalities. In this paper, we provide a comprehensive review of AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey the
representative datasets throughout the modalities, and present comparative
results for various modalities. Finally, we discuss the challenges and
potential future research directions.
Non-isometric 3D shape registration.
3D shape registration is an important task in computer graphics and computer vision. It has been widely used in the film industry, 3D animation, video games and AR/VR asset creation. Manually creating the 3D model of a character from scratch is tedious and time consuming, and it can only be completed by professionally trained artists. With the development of 3D geometry acquisition technology, it has become easier and cheaper to capture high-resolution and highly detailed 3D geometries. However, the scanned data are often incomplete or noisy and therefore cannot be employed directly. To deal with these two problems, one typical and efficient solution is to deform an existing high-quality model (template) to fit the scanned data (target). Shape registration, as an essential technique for doing so, has attracted considerable attention. In recent decades, various shape registration approaches have been proposed for accurate template fitting. However, some challenges remain. It is well known that the template can differ greatly from the target in size and pose. With the large (usually non-isometric) deformation between them, shear distortion can easily occur, which may lead to poor results such as degenerate triangles and fold-overs. Before deforming the template towards the target, reliable correspondences between them should be found first. Incorrect correspondences give the wrong deformation guidance, which can also easily produce fold-overs. As mentioned before, the target always comes with noise, which should be filtered out rather than fitted by the template. Hence, non-isometric shape registration that is robust to noise is highly desirable in the context of geometry modelling from scanned data. In this PhD research, we address existing challenges in shape registration, including how to prevent deformation distortion, how to reduce the occurrence of fold-overs and how to deal with noise in the target.
Novel methods including consistent as-similar-as-possible surface deformation and robust Huber-L1 surface registration are proposed, which are validated through experimental comparison with state-of-the-art methods. The deformation technique plays an important role in shape registration. In this research, a consistent as-similar-as-possible (CASAP) surface deformation approach is proposed. Starting from investigating the continuous deformation energy, we analyse the existing term to make the discrete energy converge to the continuous one, a property we call energy consistency. Based on the deformation method, a novel CASAP non-isometric surface registration method is proposed. The proposed registration method preserves the angles of triangles in the template surface well, so that the least distortion is introduced during the surface deformation, which reduces the risk of fold-over and self-intersection. To reduce the influence of noise, a Huber-L1 based non-isometric surface registration is proposed, in which a Huber-L1 regularized model is constrained on the transformation variation and position difference. The proposed method is robust to noise and produces piecewise smooth results while still preserving fine details on the target. We evaluate and validate our methods through extensive experiments, whose results demonstrate that the methods proposed in this thesis are more accurate and more robust to noise than state-of-the-art methods and enable us to produce high-quality models with little effort.
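The noise robustness attributed to the Huber-based registration above comes from how a Huber penalty treats large residuals. The sketch below shows only the standard scalar Huber function, not the thesis's Huber-L1 registration energy; the threshold `delta` and the helper names `huber` and `total_penalty` are assumptions chosen for illustration.

```python
from typing import Iterable


def huber(residual: float, delta: float = 1.0) -> float:
    """Standard Huber penalty: quadratic near zero, linear in the tails."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    magnitude = abs(residual)
    if magnitude <= delta:
        # Small residuals are penalised like least squares.
        return 0.5 * magnitude * magnitude
    # Large residuals (likely noise or outliers) grow only linearly.
    return delta * (magnitude - 0.5 * delta)


def total_penalty(residuals: Iterable[float], delta: float = 1.0) -> float:
    """Sum of Huber penalties over a set of fitting residuals."""
    return sum(huber(r, delta) for r in residuals)


if __name__ == "__main__":
    residuals = [0.5, 3.0]                      # one inlier, one outlier
    print(total_penalty(residuals))             # Huber penalty
    print(sum(0.5 * r * r for r in residuals))  # plain least squares, for comparison
```

With `delta = 1.0`, the outlier residual of 3.0 contributes 2.5 under the Huber penalty instead of 4.5 under least squares, so it pulls the fit less strongly.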
DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry
3D shape generation is a fundamental operation in computer graphics. While
significant progress has been made, especially with recent deep generative
models, it remains a challenge to synthesize high-quality geometric shapes with
rich detail and complex structure, in a controllable manner. To tackle this, we
introduce DSM-Net, a deep neural network that learns a disentangled structured
mesh representation for 3D shapes, where two key aspects of shapes, geometry
and structure, are encoded in a synergistic manner to ensure plausibility of
the generated shapes, while also being disentangled as much as possible. This
supports a range of novel shape generation applications with intuitive control,
such as interpolation of structure (geometry) while keeping geometry
(structure) unchanged. To achieve this, we simultaneously learn structure and
geometry through variational autoencoders (VAEs) in a hierarchical manner for
both, with bijective mappings at each level. In this manner we effectively
encode geometry and structure in separate latent spaces, while ensuring their
compatibility: the structure is used to guide the geometry and vice versa. At
the leaf level, the part geometry is represented using a conditional part VAE,
to encode high-quality geometric details, guided by the structure context as
the condition. Our method not only supports controllable generation
applications, but also produces high-quality synthesized shapes, outperforming
state-of-the-art methods.
A First Step Towards Cage-based Deformation in Virtual Reality
The advent of low-cost technologies makes the use of immersive virtual environments more attractive in several application contexts. 3D models are widely used in such environments to provide feelings of immersion and presence in the virtual world. 3D models are normally defined in dedicated authoring tools and then adapted for use in the virtual environments; thus, any change to the model requires looping back to the authoring tool to perform the desired modification and the subsequent adaptation. Making shape modification capabilities available within the virtual environment avoids this modification-adaptation loop. To this aim, we present our first step in the development of a 3D modelling system in Virtual Reality. The shape modification is achieved through a cage-based deformation approach, applied to semantically enriched meshes carrying annotated meaningful regions, thus allowing the direct selection and editing of significant object parts.
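The idea behind cage-based deformation can be sketched in 2D with the simplest possible cage, a single triangle: each interior point is written as a weighted combination of the cage vertices, and moving the cage moves the point. This is an illustrative assumption, not the system described above; practical cages have many vertices and use generalized barycentric coordinates, and the function names here are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]


def barycentric(p: Vec2, a: Vec2, b: Vec2, c: Vec2) -> Tuple[float, float, float]:
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("cage triangle is degenerate")
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (w_a, w_b, 1.0 - w_a - w_b)


def deform_with_cage(
    points: Sequence[Vec2], cage: Sequence[Vec2], new_cage: Sequence[Vec2]
) -> List[Vec2]:
    """Re-evaluate each point's cage coordinates on the edited cage."""
    if len(cage) != 3 or len(new_cage) != 3:
        raise ValueError("this sketch only supports a triangular cage")
    deformed: List[Vec2] = []
    for p in points:
        # Coordinates are computed once on the rest cage ...
        weights = barycentric(p, cage[0], cage[1], cage[2])
        # ... and reused as blending weights on the edited cage.
        x = sum(w * v[0] for w, v in zip(weights, new_cage))
        y = sum(w * v[1] for w, v in zip(weights, new_cage))
        deformed.append((x, y))
    return deformed


if __name__ == "__main__":
    rest_cage = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    edited_cage = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]  # cage scaled by 2
    print(deform_with_cage([(0.25, 0.25)], rest_cage, edited_cage))
```

The user only edits the few cage vertices; the enclosed geometry follows because its coordinates relative to the cage stay fixed.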
Image Understanding and Robotics Research at Columbia University
Over the past year, the research investigations of the Vision/Robotics Laboratory at Columbia University have reflected the interests of its four faculty members, two staff programmers, and 16 Ph.D. students. Several of the projects involve other faculty members in the department or the university, or researchers at AT&T, IBM, or Philips. We list below a summary of our interests and results, together with the principal researchers associated with them. Since it is difficult to separate those aspects of robotic research that are purely visual from those that are vision-like (for example, tactile sensing) or vision-related (for example, integrated vision-robotic systems), we have listed all robotic research that is not purely manipulative. The majority of our current investigations are deepenings of work reported last year; this was the second year of both our basic Image Understanding contract and our Strategic Computing contract. Therefore, the form of this year's report closely resembles last year's. Although there are a few new initiatives, mainly we report the new results we have obtained in the same five basic research areas. Much of this work is summarized on a video tape that is available on request. We also note two service contributions this past year. The Special Issue on Computer Vision of the Proceedings of the IEEE, August, 1988, was co-edited by one of us (John Kender [27]). And, the upcoming IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June, 1989, is co-program chaired by one of us (John Kender [23]).