1,517 research outputs found
Unsupervised Learning of Complex Articulated Kinematic Structures combining Motion and Skeleton Information
In this paper we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view image sequence. In contrast to prior motion-information-based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology through a successive iterative merge process. The merge process is guided by a skeleton distance function derived from a novel method for generating object boundaries from sparse points. Our main contributions can be summarised as follows: (i) unsupervised learning of complex articulated kinematic structures by combining motion and skeleton information; (ii) an iterative fine-to-coarse merging strategy for adaptive motion segmentation and structure smoothing; (iii) skeleton estimation from sparse feature points; and (iv) a new highly articulated object dataset containing multi-stage complexity with ground truth. Our experiments show that the proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.
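The fine-to-coarse merging strategy described in the abstract can be illustrated with a generic agglomerative sketch; the skeleton distance itself is abstracted as a user-supplied function, and none of the names below come from the paper.

```python
# Hypothetical sketch of fine-to-coarse merging: start from many small
# motion clusters and repeatedly merge the closest pair under a supplied
# distance (standing in for the paper's skeleton distance function).

def merge_fine_to_coarse(clusters, distance, threshold):
    """Agglomeratively merge clusters until no pair is closer than threshold.

    clusters  -- iterable of sets of point indices
    distance  -- callable(set, set) -> float (e.g. a skeleton-based distance)
    threshold -- stop merging once the closest pair exceeds this value
    """
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break  # remaining clusters are treated as distinct articulated parts
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

With a nearest-point gap as the distance, points `{0, 1}` and `{10, 11}` merge into two well-separated clusters at `threshold=2`.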
Robust Real-time RGB-D Visual Odometry in Dynamic Environments via Rigid Motion Model
In this paper, we propose a robust real-time visual odometry algorithm for dynamic environments based on a rigid-motion model updated by scene flow. The proposed algorithm consists of spatial motion segmentation and temporal motion tracking. The spatial segmentation first generates several motion hypotheses using a grid-based scene flow and clusters the extracted hypotheses, separating objects that move independently of one another. We then use a dual-mode motion model to consistently distinguish between static and dynamic parts in the temporal motion tracking stage. Finally, the proposed algorithm estimates the camera pose using the regions classified as static. To evaluate visual odometry performance in the presence of dynamic rigid objects, we use a self-collected dataset containing RGB-D images and motion capture data as ground truth. We compare our algorithm with state-of-the-art visual odometry algorithms. The validation results suggest that the proposed algorithm estimates the camera pose robustly and accurately in dynamic environments.
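A per-cell temporal update in the spirit of the dual-mode motion model can be sketched as a simple exponential average of the per-frame segmentation output; this is a minimal stand-in, not the paper's exact formulation, and all names are hypothetical.

```python
# Minimal sketch: each grid cell keeps a running probability of being
# static, blended from the per-frame segmentation output, so a single
# misclassified frame does not flip the static/dynamic decision.

def update_static_probability(p_static, observed_static, alpha=0.2):
    """Blend the new per-frame observation into the running probability.

    p_static        -- dict cell -> probability the cell is static
    observed_static -- dict cell -> 1.0/0.0 label from this frame's segmentation
    alpha           -- learning rate; smaller alpha = more temporal smoothing
    """
    for cell, obs in observed_static.items():
        prev = p_static.get(cell, 0.5)          # uninformative prior
        p_static[cell] = (1 - alpha) * prev + alpha * obs
    return p_static

def static_cells(p_static, tau=0.5):
    """Cells currently trusted as static (the ones used for pose estimation)."""
    return {c for c, p in p_static.items() if p > tau}
```

Starting from the 0.5 prior, one static observation raises a cell to 0.6 while one dynamic observation lowers it to 0.4, so only consistently static cells feed the pose estimate.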
Median K-flats for hybrid linear modeling with many outliers
We describe the Median K-Flats (MKF) algorithm, a simple online method for hybrid linear modeling, i.e., for approximating data by a mixture of flats. The algorithm simultaneously partitions the data into clusters and finds their corresponding best-approximating l1 d-flats, so that the cumulative l1 error is minimized. The current implementation restricts the d-flats to be d-dimensional linear subspaces. It requires a negligible amount of storage, and its complexity, when modeling data consisting of N points in D-dimensional Euclidean space with K d-dimensional linear subspaces, is of order O(n K d D + n d^2 D), where n is the number of iterations required for convergence (empirically on the order of 10^4). Since it is an online algorithm, data can be supplied incrementally, and it can produce the corresponding output incrementally. The performance of the algorithm is carefully evaluated using synthetic and real data.
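The alternating partition-and-fit structure underlying K-flats can be sketched as follows. For simplicity this batch version refits each flat with an SVD (least-squares) step rather than MKF's online stochastic l1 update, so it illustrates the K-flats idea, not the MKF algorithm itself; only the residuals use the l1 norm, as in the paper's objective.

```python
import numpy as np

# Batch K-subspaces sketch: alternate between refitting each d-dimensional
# linear subspace to its current members and reassigning each point to the
# subspace with the smallest l1 projection residual.

def k_flats(X, K, d, iters=50, init_labels=None, seed=0):
    """X: (N, D) data. Returns labels and a list of K (D, d) orthonormal bases."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    labels = (np.asarray(init_labels) if init_labels is not None
              else rng.integers(K, size=N))
    bases = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    for _ in range(iters):
        # Refit each flat to its members (least-squares stand-in for l1).
        for k in range(K):
            pts = X[labels == k]
            if len(pts) >= d:
                # Top-d right singular vectors span the best-fitting subspace.
                _, _, Vt = np.linalg.svd(pts, full_matrices=False)
                bases[k] = Vt[:d].T
        # Reassign points by l1 residual to each subspace.
        res = np.stack([np.abs(X - (X @ B) @ B.T).sum(axis=1) for B in bases],
                       axis=1)
        labels = res.argmin(axis=1)
    return labels, bases
```

On two perpendicular lines through the origin in 2D, the alternation separates the points onto their respective one-dimensional subspaces.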
LEARNING TO RIG CHARACTERS
With the emergence of 3D virtual worlds, 3D social media, and massive online games, the need for diverse, high-quality, animation-ready characters and avatars is greater than ever. To animate characters, artists hand-craft articulation structures, such as animation skeletons and part deformers, which require a significant amount of manual, laborious interaction with 2D/3D modeling interfaces. This thesis presents deep learning methods that significantly automate the process of character rigging.
First, the thesis introduces RigNet, a method capable of predicting an animation skeleton for an input static 3D shape in the form of a polygon mesh. The predicted skeletons match animator expectations in joint placement and topology. RigNet also estimates surface skin weights, which determine how the mesh is animated given the different skeletal poses. In contrast to prior work that fits pre-defined skeletal templates with hand-tuned objectives, RigNet is able to automatically rig diverse characters, such as humanoids, quadrupeds, toys, and birds, with varying articulation structure and geometry. RigNet is based on a deep neural architecture that directly operates on the mesh representation. The architecture is trained on a diverse dataset of rigged models that we mined online and curated. The dataset includes 2.7K polygon meshes, along with their associated skeletons and corresponding skin weights.
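Linear blend skinning is the standard mechanism by which per-vertex skin weights such as those predicted here turn skeletal poses into mesh motion; a minimal sketch on toy data follows (RigNet's actual output format is not assumed).

```python
import numpy as np

# Linear blend skinning: each deformed vertex is the weight-blended result
# of applying every bone's rigid transform to the rest-pose vertex.

def linear_blend_skinning(vertices, weights, bone_transforms):
    """vertices: (N, 3), weights: (N, J) rows summing to 1,
    bone_transforms: (J, 4, 4) homogeneous rigid transforms."""
    N = vertices.shape[0]
    homo = np.concatenate([vertices, np.ones((N, 1))], axis=1)    # (N, 4)
    # Apply every bone transform to every vertex: result (N, J, 4).
    per_bone = np.einsum('jab,nb->nja', bone_transforms, homo)
    # Blend per-bone results by the skin weights: result (N, 4).
    blended = np.einsum('nj,nja->na', weights, per_bone)
    return blended[:, :3]
```

A vertex weighted half-and-half between a bone translated by +1 along x and an identity bone moves by +0.5 along x, which is the characteristic averaging behavior of skin weights.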
Second, the thesis introduces MoRig, a method that automatically rigs character meshes driven by single-view point cloud streams capturing the motion of performing characters. Compared to RigNet, MoRig's rigging is motion-aware: its neural network encodes motion cues from the point clouds into compact feature representations that are informative about the articulated parts of the performing character. These motion-aware features guide the inference of an appropriate skeletal rig for the input mesh. Furthermore, MoRig is able to animate the rig according to the captured point cloud motion. MoRig can handle diverse characters with different morphologies (e.g., humanoids, quadrupeds, toy characters). It also accounts for occluded regions in the point clouds and for mismatches in part proportions between the input mesh and the captured character.
Third, the thesis introduces APES, a method that takes as input 2D raster images depicting a small set of poses of a character shown in a sprite sheet and identifies articulated parts useful for rigging the character. APES uses a combination of neural network inference and integer linear programming to identify a compact set of articulated body parts (e.g., head, torso, and limbs) that best reconstruct the input poses. Compared to MoRig and RigNet, which require a large collection of training models with associated skeletons and skinning weights, APES' neural architecture relies on less effortful supervision: (i) pixel correspondences readily available in existing large cartoon image datasets (e.g., Creative Flow), and (ii) a relatively small dataset of 57 cartoon characters segmented into moving parts.
Finally, the thesis discusses future research directions related to combining neural rigging with 3D and 4D reconstruction of characters from point cloud data and 2D video, as well as automating the process of motion synthesis for 3D characters.
Development of a Motion-Classification-Based Visual Odometry Algorithm Robust to Dynamic Environments
Thesis (Master's) -- Graduate School of Seoul National University, College of Engineering, Department of Mechanical and Aerospace Engineering, 2017. 8. Advisor: H. Jin Kim.
In this thesis, we propose a robust visual odometry algorithm for dynamic environments via rigid motion segmentation using a grid-based optical flow. The algorithm first divides the image frame by a fixed-size grid, then calculates the three-dimensional motion of the grid cells, yielding a light computational load and uniformly distributed optical flow vectors. Next, it selects several adjacent points among the grid-based optical flow vectors based on a so-called entropy and generates motion hypotheses formed by three-dimensional rigid transformations. These spatial motion segmentation steps utilize the principle of randomized hypothesis generation together with an existing clustering algorithm, thus separating objects that move independently of each other. Moreover, we use a dual-mode simple Gaussian model in order to differentiate static and dynamic parts persistently. The model measures the output of the spatial motion segmentation and updates a probability vector consisting of the likelihood of representing each specific label. For the evaluation of the proposed algorithm, we use a self-made dataset captured by an ASUS Xtion Pro Live RGB-D camera and a Vicon motion capture system. We compare our algorithm with an existing motion segmentation algorithm and a current state-of-the-art visual odometry algorithm, respectively; the proposed algorithm estimates the ego-motion robustly and accurately in dynamic environments while showing competitive motion segmentation performance.
Most existing visual odometry algorithms have been developed under the assumption of a static environment, and their performance has been validated on static-scene datasets. However, the places where unmanned robots must carry out missions using visual odometry are likely to be dynamic environments in which people or vehicles are moving. Although some visual odometry algorithms that employ RANSAC can exclude abnormal motions within a frame from the pose estimation process, this is only applicable when dynamic objects occupy a small portion of the image frame. Therefore, to robustly estimate the ego-position in dynamic environments with uncertainty, this thesis proposes a visual odometry algorithm robust to dynamic environments. The proposed algorithm uses a grid-based optical flow to compute, at low computational cost, motion vectors uniformly distributed over the image. From the per-grid motion it performs three-dimensional spatial motion segmentation within a single frame, and then performs temporal motion segmentation to persistently distinguish dynamic objects from static elements. In particular, to maintain this distinction over time, we apply a dual-mode Gaussian model to each grid cell of the image, which makes the algorithm robust to transient noise in the spatial motion segmentation, and we maintain a probability vector that gives the likelihood of each grid cell belonging to each distinct element. To validate the developed algorithm, we used a dataset constructed with an ASUS Xtion RGB-D camera and a Vicon motion capture system; comparisons of recall and precision against an existing motion segmentation algorithm, and of estimation error against an existing visual odometry algorithm, confirmed superior motion detection and pose estimation performance.
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Chapter
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Thesis contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Background Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1 Rigid transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Grid-based optical flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Motion Spatial Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1 Motion hypothesis search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Motion hypothesis refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 Motion hypothesis clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4 Motion Temporal Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.1 Label matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Dual-mode simple Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.2.1 Update model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2.2 Compensate model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.2 Motion segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.3 Visual odometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
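The per-grid rigid-transformation hypotheses described in the thesis abstract above are typically obtained by aligning matched 3D points between frames; a minimal Kabsch-style sketch follows (a standard estimator, not necessarily the thesis' exact one).

```python
import numpy as np

# Kabsch/Umeyama alignment: turn matched 3D points from two frames into a
# rigid-motion hypothesis (R, t) minimizing the least-squares alignment error.

def rigid_transform(P, Q):
    """Rigid transform (R, t) mapping points P (N, 3) onto Q (N, 3)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Reflection correction so that det(R) = +1 (a proper rotation).
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = cQ - R @ cP
    return R, t
```

On noiseless, non-degenerate correspondences the estimator recovers the generating rotation and translation exactly, which is what makes it suitable for scoring randomized motion hypotheses.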
- …