Learning to Rig Characters
With the emergence of 3D virtual worlds, 3D social media, and massive online games, the need for diverse, high-quality, animation-ready characters and avatars is greater than ever. To animate characters, artists hand-craft articulation structures, such as animation skeletons and part deformers, which require a significant amount of laborious manual interaction with 2D/3D modeling interfaces. This thesis presents deep learning methods that significantly automate the process of character rigging.
First, the thesis introduces RigNet, a method capable of predicting an animation skeleton for an input static 3D shape in the form of a polygon mesh. The predicted skeletons match animator expectations in joint placement and topology. RigNet also estimates surface skin weights, which determine how the mesh is animated under different skeletal poses. In contrast to prior work that fits pre-defined skeletal templates with hand-tuned objectives, RigNet is able to automatically rig diverse characters, such as humanoids, quadrupeds, toys, and birds, with varying articulation structure and geometry. RigNet is based on a deep neural architecture that operates directly on the mesh representation. The architecture is trained on a diverse dataset of rigged models that we mined online and curated. The dataset includes 2.7K polygon meshes, along with their associated skeletons and corresponding skin weights.
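To make the role of the predicted skin weights concrete, the sketch below shows standard linear blend skinning, the common deformation model that such weights drive. It is a minimal illustration, not RigNet's code; all names and array shapes are assumptions.

```python
import numpy as np

def linear_blend_skinning(rest_verts, weights, bone_transforms):
    """Deform a mesh with linear blend skinning (LBS).

    rest_verts:      (V, 3) vertex positions in the rest pose
    weights:         (V, J) per-vertex skin weights, each row sums to 1
    bone_transforms: (J, 4, 4) rest-to-posed transform of each joint
    """
    V = rest_verts.shape[0]
    homo = np.hstack([rest_verts, np.ones((V, 1))])          # (V, 4)
    # Posed position of every vertex under every joint: (J, V, 4).
    per_joint = np.einsum('jab,vb->jva', bone_transforms, homo)
    # Blend the per-joint results with the skin weights: (V, 4).
    blended = np.einsum('vj,jva->va', weights, per_joint)
    return blended[:, :3]
```

In a rigging pipeline like the one described, `weights` would come from the skinning branch of the network and `bone_transforms` from the animator's pose of the predicted skeleton.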
Second, the thesis introduces MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters. Compared to RigNet, MoRig's rigging is motion-aware: its neural network encodes motion cues from the point clouds into compact feature representations that are informative about the articulated parts of the performing character. These motion-aware features guide the inference of an appropriate skeletal rig for the input mesh. Furthermore, MoRig is able to animate the rig according to the captured point cloud motion. MoRig can handle diverse characters with different morphologies (e.g., humanoids, quadrupeds, toy characters). It also accounts for occluded regions in the point clouds and for mismatches in part proportions between the input mesh and the captured character.
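MoRig's motion features are learned, but the kind of raw signal they distill can be illustrated with a nearest-neighbour scene-flow proxy between consecutive point cloud frames. The sketch below is a hypothetical stand-in for such a cue, not the method's actual encoder.

```python
import numpy as np
from scipy.spatial import cKDTree

def motion_cue(frame_a, frame_b):
    """Crude per-point motion cue between two point cloud frames:
    the displacement to the nearest neighbour in the next frame.

    frame_a: (N, 3) points at time t;  frame_b: (M, 3) points at t+1.
    Returns (N, 3) displacement vectors (a rough scene-flow proxy)."""
    tree = cKDTree(frame_b)
    _, idx = tree.query(frame_a)       # nearest neighbour in frame_b
    return frame_b[idx] - frame_a
```

Points that move rigidly together receive similar cues, which is exactly the grouping signal an articulation-aware encoder can exploit.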
Third, the thesis introduces APES, a method that takes as input 2D raster images depicting a small set of poses of a character shown in a sprite sheet, and identifies articulated parts useful for rigging the character. APES uses a combination of neural network inference and integer linear programming to identify a compact set of articulated body parts, e.g., head, torso, and limbs, that best reconstruct the input poses. Compared to MoRig and RigNet, which require a large collection of training models with associated skeletons and skinning weights, APES' neural architecture relies on less effortful supervision from (i) pixel correspondences readily available in existing large cartoon image datasets (e.g., Creative Flow), and (ii) a relatively small dataset of 57 cartoon characters segmented into moving parts.
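The part-selection step can be phrased as a set-cover-style integer linear program: choose as few candidate parts as possible while still explaining every character pixel. The sketch below is a simplified stand-in for APES' actual objective (which also scores how well the parts reconstruct the poses); the matrix layout and names are assumptions.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def select_parts(cover):
    """Choose a minimal set of candidate parts covering all pixels.

    cover: (num_pixels, num_parts) 0/1 matrix; cover[i, p] = 1 if
    candidate part p explains pixel i consistently across the poses.
    Returns the indices of the selected parts (assumes a feasible
    cover exists)."""
    num_parts = cover.shape[1]
    cost = np.ones(num_parts)                  # minimise #selected parts
    covered = LinearConstraint(cover, lb=1)    # each pixel covered >= once
    res = milp(cost, constraints=covered,
               integrality=np.ones(num_parts), # all variables binary
               bounds=Bounds(0, 1))
    return np.flatnonzero(res.x > 0.5)
```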
Finally, the thesis discusses future research directions related to combining neural rigging with 3D and 4D reconstruction of characters from point cloud data and 2D video, as well as automating the process of motion synthesis for 3D characters.
Surface Reconstruction and Evolution from Multiple Views
Applications like 3D Telepresence necessitate faithful 3D surface reconstruction of the object and 3D data compression in both the spatial and temporal domains. Such reconstruction makes users feel immersed in virtual environments, thereby making 3D Telepresence a powerful tool in many applications. Hence, 3D surface reconstruction and 3D compression are the two challenging problems addressed in this thesis.
Image-based human body rendering via regression & MRF energy minimization
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A machine learning method for synthesising human images is explored to create new images without relying on 3D modelling. Machine learning allows the creation of new images through prediction from existing data, based on the use of training images. In the present study, image synthesis is performed at two levels: contour and pixel. A class of learning-based methods is formulated to create object contours from the training images for the synthetic image, which then allow pixel synthesis within those contours at the second level. The methods rely on applying robust object descriptions, dynamic learning models after appropriate motion segmentation, and machine learning-based frameworks.
Image-based human image synthesis using machine learning is a research focus that has recently gained considerable attention in the field of computer graphics. It makes use of techniques from image/motion analysis in computer vision. The problem lies in the estimation of methods for image-based object configuration (i.e., segmentation and contour outlining). Using the results of these analysis methods as a basis, the research adopts a machine learning approach in which human images are synthesised at the contour and pixel levels by learning from training images.
Firstly, the thesis shows how an accurate silhouette is distilled using a background subtraction method developed for accuracy and efficiency. The traditional support vector machine approach is used to avoid ambiguities within the regression process. Images can be represented as a class of accurate and efficient vectors, for single images as well as sequences. Secondly, the framework is explored using a particular class of machine learning methods, namely support vector regression (SVR), to obtain the convergence result of vectors for contour allocation. The changing relationship between the synthetic image and the training images is expressed as a vector and represented by functions. Finally, pixel synthesis is performed based on belief propagation.
This thesis proposes a novel image-based rendering method for colour image synthesis, using SVR and belief propagation for generalisation, to enable the prediction of contour and colour information from input colour images. The methods rely on using appropriately defined and robust input colour images, and on optimising the input contour images within a sparse SVR framework. Firstly, the thesis shows how contours can be predicted effectively and efficiently from small numbers of input contour images. In addition, the thesis exploits the sparsity of SVR for efficiency and uses SVR to estimate the regression function. The image-based rendering method employed in this study enables contour synthesis from small numbers of input source images. This procedure avoids the use of complex models and geometry information. Secondly, the method used for colouring the human body contour is extended to define eight differently connected pixels and to construct a link distance field via belief propagation. The link distance, which acts as the message in propagation, is computed by adapting the lower-envelope method of the fast distance transform. Finally, the methodology is tested on human facial and human body clothing information. The accuracy of the test results for the human body model confirms the efficiency of the proposed method.
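As a rough illustration of the regression stage, the sketch below fits one epsilon-SVR per contour coordinate and predicts a fixed-length contour from an image descriptor. The descriptors, dimensions, and hyperparameters are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

# Illustrative data: each training silhouette is summarised by a
# 16-D descriptor, and its contour by 32 sampled (x, y) points.
rng = np.random.default_rng(0)
X_train = rng.random((40, 16))        # 40 training images
Y_train = rng.random((40, 2 * 32))    # flattened contour coordinates

# One sparse epsilon-SVR per output coordinate.
model = MultiOutputRegressor(SVR(kernel='rbf', C=10.0, epsilon=0.01))
model.fit(X_train, Y_train)

contour = model.predict(rng.random((1, 16))).reshape(-1, 2)  # (32, 2)
```

The epsilon-insensitive loss keeps only a sparse set of support vectors, which is the efficiency property the thesis exploits.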
XML in Motion from Genome to Drug
Information technology (IT) has emerged as central to the solution of contemporary genomics and drug discovery problems. Researchers involved in genomics, proteomics, transcriptional profiling, high-throughput structure determination, and other sub-disciplines of bioinformatics have a direct impact on this IT revolution. As the full genome sequences of many species, data from structural genomics, micro-arrays, and proteomics become available, integrating these data into a common platform requires sophisticated bioinformatics tools. Organizing these data into knowledge databases and developing appropriate software tools for analyzing them are going to be major challenges. XML (eXtensible Markup Language) forms the backbone of biological data representation and exchange over the internet, enabling researchers to aggregate data from various heterogeneous data resources. The present article gives a comprehensive account of the integration of XML with a particular type of biological database, mainly dealing with sequence-structure-function relationships, and of its application towards drug discovery. This e-medical science approach should be applied to other scientific domains; the latest trend in semantic web applications is also highlighted.
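As a toy illustration of the kind of self-describing record XML enables, the snippet below defines and parses a hypothetical sequence-structure-function entry. Real schemas such as UniProt XML or PDBML are far richer, and every identifier and value here is made up.

```python
import xml.etree.ElementTree as ET

# A purely illustrative record linking sequence, structure, and function.
record = """
<protein id="EXAMPLE_001">
  <sequence>MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ</sequence>
  <structure pdb="XXXX" method="X-ray"/>
  <function>illustrative transport protein</function>
</protein>
"""

root = ET.fromstring(record)
print(root.get("id"), "-", root.find("function").text)
print("structure:", root.find("structure").get("pdb"))
```

Because the markup is self-describing, records from heterogeneous resources can be merged or transformed without knowledge of each source's internal storage format.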
AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and other media using AI algorithms. Due to its wide range of applications and the demonstrated potential of recent works, AIGC has been attracting a great deal of attention, and AIGC methods have been developed for various data modalities, such as image, video, text, 3D shape (as voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human avatar (body and head), 3D motion, and audio -- each presenting different characteristics and challenges. Furthermore, there have also been many significant developments in cross-modality AIGC methods, where generative methods can receive conditioning input in one modality and produce outputs in another. Examples include going from various modalities to image, video, 3D shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar), and audio modalities. In this paper, we provide a comprehensive review of AIGC methods across different data modalities, including both single-modality and cross-modality methods, highlighting the various challenges, representative works, and recent technical directions in each setting. We also survey the representative datasets throughout the modalities and present comparative results for various modalities. Moreover, we discuss the challenges and potential future research directions.
Configurable Input Devices for 3D Interaction using Optical Tracking
Three-dimensional interaction with virtual objects is one of the aspects that needs to be addressed in order to increase the usability and usefulness of virtual reality. Human beings have difficulty understanding 3D spatial relationships and manipulating 3D user interfaces, which require the control of multiple degrees of freedom simultaneously. Conventional interaction paradigms known from the desktop computer, such as the mouse and keyboard, may be insufficient or even inappropriate for 3D spatial interaction tasks.
The aim of the research in this thesis is to develop the technology required to improve 3D user interaction. This can be accomplished by allowing interaction devices to be constructed such that their use is apparent from their structure, and by enabling efficient development of new input devices for 3D interaction.
The driving vision in this thesis is that for effective and natural direct 3D interaction, the structure of an interaction device should be specifically tuned to the interaction task. Two aspects play an important role in this vision. First, interaction devices should be structured such that interaction techniques are as direct and transparent as possible. Interaction techniques define the mapping between interaction task parameters and the degrees of freedom of interaction devices. Second, the underlying technology should enable developers to rapidly construct and evaluate new interaction devices.
The thesis is organized as follows. In Chapter 2, a review of the optical tracking field is given. The tracking pipeline is discussed, existing methods are reviewed, and improvement opportunities are identified.
In Chapters 3 and 4, the focus is on the development of optical tracking techniques for rigid objects. The goal of the tracking method presented in Chapter 3 is to reduce the occlusion problem. The method exploits projection-invariant properties of line pencil markers, and the fact that line features only need to be partially visible.
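The classic projection-invariant quantity for a pencil of lines is the cross-ratio: intersecting four concurrent lines with any transversal yields four collinear points whose cross-ratio survives every projective transform, so a marker can be re-identified from a single camera image even when partially occluded. A minimal sketch of the quantity (illustrative; the thesis's actual marker identification is not reproduced here):

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio (A, B; C, D) = (AC / BC) / (AD / BD) of four
    collinear 2-D points, e.g. the intersections of a line pencil
    with a transversal. Invariant under projective transforms."""
    ac = np.linalg.norm(c - a)
    bc = np.linalg.norm(c - b)
    ad = np.linalg.norm(d - a)
    bd = np.linalg.norm(d - b)
    return (ac / bc) / (ad / bd)
```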
In Chapter 4, the aim is to develop a tracking system that supports devices of arbitrary shapes, and allows for rapid development of new interaction devices. The method is based on subgraph isomorphism to identify point clouds. To support the development of new devices in the virtual environment, an automatic model estimation method is used, as sketched below.
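The matching idea can be sketched as follows: pairwise distances between a device's markers are pose-invariant, so the device model can be located inside the tracked point cloud as a distance-labelled subgraph. The toy version below uses networkx; the names, quantisation scheme, and brute-force matcher are assumptions, and the thesis's matcher is presumably tailored for speed and sensor noise.

```python
import networkx as nx
from networkx.algorithms import isomorphism

def build_graph(points, tol=1e-3):
    """Complete graph over 3-D marker points; each edge is labelled
    with the (pose-invariant) pairwise distance, quantised by tol."""
    g = nx.complete_graph(len(points))
    for i, j in g.edges:
        d = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
        g.edges[i, j]["dist"] = round(d / tol)
    return g

def find_device(model_pts, scene_pts, tol=1e-3):
    """Locate the device's marker model inside the tracked cloud via
    subgraph isomorphism with distance-labelled edges."""
    model = build_graph(model_pts, tol)
    scene = build_graph(scene_pts, tol)
    edge_eq = lambda e1, e2: e1["dist"] == e2["dist"]
    gm = isomorphism.GraphMatcher(scene, model, edge_match=edge_eq)
    return next(gm.subgraph_isomorphisms_iter(), None)  # scene -> model
```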
Chapter 5 provides an analysis of three optical tracking systems based on different principles. The first system is based on an optimization procedure that matches the 3D device model points to the 2D data points detected in the camera images. The other systems are the tracking methods discussed in Chapters 3 and 4.
In Chapter 6, an analysis of various filtering and prediction methods is given. These techniques can be used to make the tracking system more robust against noise, and to reduce the latency problem.
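A standard member of this family is the Kalman filter with a constant-velocity motion model, whose predict step can also be evaluated ahead of the next camera frame to mask latency. The sketch below is a generic single-coordinate version under assumed noise parameters, not necessarily one of the exact variants analyzed in the chapter.

```python
import numpy as np

def kalman_step(x, P, z, dt, q=1e-3, r=1e-2):
    """One predict/update cycle of a constant-velocity Kalman filter
    for one tracked coordinate. x = [position, velocity], P = 2x2
    covariance, z = new optical measurement of the position."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])              # only position is measured
    Q, R = q * np.eye(2), np.array([[r]])   # process / measurement noise
    # Predict; running this ahead of time gives a latency-masking pose.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measurement z.
    y = z - (H @ x)[0]                      # innovation
    S = H @ P @ H.T + R                     # innovation covariance, 1x1
    K = (P @ H.T) / S[0, 0]                 # Kalman gain, shape (2, 1)
    x = x + K[:, 0] * y
    P = (np.eye(2) - K @ H) @ P
    return x, P
```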
Chapter 7 focusses on optical tracking of composite input devices, i.e., input devices that consist of multiple rigid parts that can have combinations of rotational and translational degrees of freedom with respect to each other. Techniques are developed to automatically generate a 3D model of a segmented input device from motion data, and to use this model to track the device.
In Chapter 8, the presented techniques are combined to create a configurable input device, which supports direct and natural co-located interaction. In this chapter, the goal of the thesis is realized. The device can be configured such that its structure reflects the parameters of the interaction task.
In Chapter 9, the configurable interaction device is used to study the influence of spatial device structure with respect to the interaction task at hand. The driving vision of this thesis, that the spatial structure of an interaction device should match that of the task, is analyzed and evaluated by performing a user study.
The concepts and techniques developed in this thesis allow researchers to rapidly construct and apply new interaction devices for 3D interaction in virtual environments. Devices can be constructed such that their spatial structure reflects the 3D parameters of the interaction task at hand. The interaction technique then becomes a transparent one-to-one mapping that directly mediates the functions of the device to the task. The developed configurable interaction devices can be used to construct intuitive spatial interfaces, and allow researchers to rapidly evaluate new device configurations and to efficiently perform studies on the relation between the spatial structure of devices and the interaction task.
Intelligent visual media processing: when graphics meets vision
The computer graphics and computer vision communities have been working closely together in recent years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media around us. There are three major driving forces behind this phenomenon: i) the availability of big data from the Internet has created a demand for dealing with the ever-increasing, vast amount of resources; ii) powerful processing tools, such as deep neural networks, provide effective ways for learning how to deal with heterogeneous visual data; iii) new data capture devices, such as the Kinect, bridge the gap between algorithms for 2D image understanding and 3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics and computer vision communities are still at the beginning of their honeymoon phase. In this work we survey recent research on how computer vision techniques benefit computer graphics techniques and vice versa, covering research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest possible future research directions.