7 research outputs found
Video alignment to a common reference
2015 Spring. Includes bibliographical references. Handheld videos often include unintentional motion (jitter) and intentional motion (pan and/or zoom). Human viewers prefer to see jitter removed, creating a smoothly moving camera. For video analysis, in contrast, aligning to a fixed stable background is sometimes preferable. This paper presents an algorithm that removes both forms of motion using a novel and efficient way of tracking background points while ignoring moving foreground points. The approach is related to image mosaicing, but the result is a video rather than an enlarged still image. It is also related to multiple object tracking approaches, but simpler since moving objects need not be explicitly tracked. The algorithm presented takes as input a video and returns one or several stabilized videos. Videos are broken into parts when the algorithm detects background change and it becomes necessary to fix upon a new background. We present two techniques in this thesis. One technique stabilizes the video with respect to the first available frame. Another technique stabilizes the video with respect to a best frame. Our approach assumes the person holding the camera is standing in one place and that objects in motion do not dominate the image. Our algorithm performs better than previously published approaches when compared on 1,401 handheld videos from the recently released Point and Shoot Face Recognition Challenge (PaSC).
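The alignment step described in the abstract above can be sketched as a least-squares registration: given background point tracks (assumed already separated from moving foreground points, as the abstract describes), each frame is fit to the reference frame by a 2D similarity transform. The function name and the choice of a similarity motion model are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def similarity_to_reference(ref_pts, cur_pts):
    """Least-squares 2D similarity (scale, rotation, translation)
    mapping cur_pts onto ref_pts. Both inputs are (N, 2) arrays of
    tracked background point locations; moving-foreground points are
    assumed to have been filtered out already."""
    mu_r, mu_c = ref_pts.mean(axis=0), cur_pts.mean(axis=0)
    r, c = ref_pts - mu_r, cur_pts - mu_c
    # Complex-number trick: a 2D similarity is multiplication by a
    # single complex coefficient s = scale * exp(i * angle).
    zr = r[:, 0] + 1j * r[:, 1]
    zc = c[:, 0] + 1j * c[:, 1]
    s = np.vdot(zc, zr) / np.vdot(zc, zc)   # least-squares fit of s
    A = np.array([[s.real, -s.imag],
                  [s.imag,  s.real]])
    t = mu_r - A @ mu_c
    return A, t
```

Warping each frame by its recovered (A, t) toward the chosen reference (first frame or best frame) would then produce the stabilized video.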
Development of a motion-classification-based visual odometry algorithm robust to dynamic environments
Thesis (M.S.) -- Graduate School of Seoul National University, College of Engineering, Department of Mechanical and Aerospace Engineering, August 2017. Advisor: H. Jin Kim. In this thesis, we propose a robust visual odometry algorithm for dynamic environments via rigid motion segmentation using a grid-based optical flow. The algorithm first divides the image frame into a fixed-size grid, then calculates the three-dimensional motion of the grids, which keeps the computational load light and the optical flow vectors uniformly distributed. Next, it selects several adjacent points among the grid-based optical flow vectors based on a so-called entropy and generates motion hypotheses formed by three-dimensional rigid transformations. These processes for spatial motion segmentation utilize the principle of randomized hypothesis generation and an existing clustering algorithm, thus separating objects that move independently of each other. Moreover, we use a dual-mode simple Gaussian model in order to differentiate static and dynamic parts persistently. The model measures the output of the spatial motion segmentation algorithm and updates a probability vector consisting of the likelihood of representing a specific label. For the evaluation of the proposed algorithm, we use a self-made dataset captured by an ASUS Xtion Pro Live RGB-D camera and a Vicon motion capture system. We compare our algorithm with an existing motion segmentation algorithm and a current state-of-the-art visual odometry algorithm, respectively, and the proposed algorithm estimates the ego-motion robustly and accurately in dynamic environments while showing competitive motion segmentation performance.
Most existing visual odometry algorithms have been developed under the assumption of a static environment, and their performance has been verified on static datasets. However, the places where an unmanned robot must carry out its mission using visual odometry are likely to be dynamic environments, with people walking or vehicles operating nearby. Although some algorithms that perform visual odometry using RANSAC can exclude abnormal motion within a frame from the pose estimation process, this is possible only when moving objects occupy a small portion of the frame. Therefore, in order to estimate ego-position robustly in dynamic environments with such uncertainty, this thesis proposes a vision-based odometry algorithm robust to dynamic environments. The proposed algorithm uses a grid-based optical flow to compute, at a sufficient execution speed, motion that is uniformly distributed over the image. Using the per-grid motion, it performs three-dimensional spatial motion segmentation within a single frame, and then performs temporal motion segmentation to distinguish dynamic objects from static elements continuously. In particular, to distinguish dynamic and static elements persistently, we apply a dual-mode Gaussian model to each grid in the image, which makes the algorithm robust to transient noise in the spatial motion segmentation, and we construct a probability vector to compute the probability that each grid belongs to each of the mutually distinguished elements. To verify the performance of the developed algorithm, a dataset built with an ASUS Xtion RGB-D camera and a Vicon motion capture system was used; by comparing recall and precision against an existing motion segmentation algorithm and estimation error against an existing visual odometry algorithm, we confirmed motion detection and pose estimation performance superior to the baseline algorithms.
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Literature review
1.2 Thesis contribution
1.3 Thesis outline
Chapter 2 Background Knowledge
2.1 Rigid transformation
2.2 Grid-based optical flow
Chapter 3 Motion Spatial Segmentation
3.1 Motion hypothesis search
3.2 Motion hypothesis refinement
3.3 Motion hypothesis clustering
Chapter 4 Motion Temporal Segmentation
4.1 Label matching
4.2 Dual-mode simple Gaussian model
4.2.1 Update model
4.2.2 Compensate model
Chapter 5 Evaluation Results
5.1 Dataset
5.2 Motion segmentation
5.3 Visual odometry
Chapter 6 Conclusion
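The dual-mode simple Gaussian model described in the abstract above (Chapter 4 in the outline) can be sketched roughly as follows. The structure — a trusted "apparent" mode plus a "candidate" mode per grid cell, with the candidate promoted once it has persisted long enough — follows the abstract's description; the concrete thresholds, variance floor, and update rules are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

class DualModeGaussian:
    """Minimal per-grid-cell dual-mode Gaussian model: the apparent
    mode absorbs consistent measurements while the candidate mode
    buffers outliers, so one noisy frame of the spatial segmentation
    cannot corrupt the static-scene model."""

    def __init__(self, n_cells, thresh=3.0, init_var=1.0, min_var=0.25):
        # Row 0 is the apparent (trusted) mode, row 1 the candidate.
        self.mean = np.zeros((2, n_cells))
        self.var = np.full((2, n_cells), init_var)
        self.age = np.zeros((2, n_cells))
        self.thresh = thresh
        self.min_var = min_var

    def update(self, obs):
        """obs: (n_cells,) per-cell scalar measurement for one frame."""
        m0 = (obs - self.mean[0]) ** 2 <= self.thresh ** 2 * self.var[0]
        m1 = ~m0 & ((obs - self.mean[1]) ** 2 <= self.thresh ** 2 * self.var[1])
        for m, match in ((0, m0), (1, m1)):
            a = self.age[m]
            # Age-weighted running mean and variance.
            new_mean = (a * self.mean[m] + obs) / (a + 1.0)
            new_var = (a * self.var[m] + (obs - new_mean) ** 2) / (a + 1.0)
            self.mean[m] = np.where(match, new_mean, self.mean[m])
            self.var[m] = np.where(match, np.maximum(new_var, self.min_var),
                                   self.var[m])
            self.age[m] = np.where(match, a + 1.0, a)
        # Neither mode explains the cell: restart the candidate there.
        reinit = ~m0 & ~m1
        self.mean[1] = np.where(reinit, obs, self.mean[1])
        self.var[1] = np.where(reinit, 1.0, self.var[1])
        self.age[1] = np.where(reinit, 1.0, self.age[1])
        # Promote a candidate that has outlived the apparent mode.
        swap = self.age[1] > self.age[0]
        for arr in (self.mean, self.var, self.age):
            arr[:, swap] = arr[::-1, swap]

    def is_static(self, obs):
        """Cells whose measurement fits the trusted apparent mode."""
        return (obs - self.mean[0]) ** 2 <= self.thresh ** 2 * self.var[0]
```

A cell that briefly disagrees with its apparent mode (a passing object) is caught by the candidate mode without disturbing the static model, which is what lets the segmentation stay persistent over time.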
Robust motion segmentation with subspace constraints
Motion segmentation is an important task in computer vision with
many applications such as dynamic scene understanding and
multi-body structure from motion. When the point correspondences
across frames are given, motion segmentation can be addressed as
a subspace clustering problem under an affine camera model. In
the first two parts of this thesis, we target the general
subspace clustering problem and propose two novel methods, namely
Efficient Dense Subspace Clustering (EDSC) and the Robust Shape
Interaction Matrix (RSIM) method.
Instead of following the standard compressive sensing approach,
in EDSC we formulate subspace clustering as a Frobenius norm
minimization problem, which inherently yields denser connections
between data points. While in the noise-free case we rely on the
self-expressiveness of the observations, in the presence of noise
we recover a clean dictionary to represent the data. Our
formulation lets us solve the subspace clustering problem
efficiently. More specifically, for outlier-free observations,
the solution can be obtained in closed-form, and in the presence
of outliers, we solve the problem by performing a series of
linear operations. Furthermore, we show that our Frobenius norm
formulation shares the same solution as the popular nuclear norm
minimization approach when the data is free of any noise.
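In the outlier-free case described above, the closed-form solution is a few lines of linear algebra. A minimal sketch, assuming the ridge-regression form of the objective min_C ||X - XC||_F^2 + lam ||C||_F^2 (the full EDSC model in the thesis contains additional terms, and function names here are illustrative):

```python
import numpy as np

def edsc_coefficients(X, lam=0.1):
    """Closed-form self-expressive coefficients for the Frobenius-norm
    objective  min_C ||X - X C||_F^2 + lam ||C||_F^2.
    X is D x N with one data point per column."""
    G = X.T @ X                                        # N x N Gram matrix
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

def affinity(C):
    """Symmetrized affinity matrix fed to spectral clustering."""
    return np.abs(C) + np.abs(C.T)
```

For points drawn from independent subspaces, the resulting affinity concentrates within each subspace, so a standard spectral clustering step on affinity(C) yields the segmentation.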
In RSIM, we revisit the Shape Interaction Matrix (SIM) method,
one of the earliest approaches for motion segmentation (or
subspace clustering), and reveal its connections to several
recent subspace clustering methods. We derive a simple, yet
effective algorithm to robustify the SIM method and make it
applicable to real-world scenarios where the data is corrupted by
noise. We validate the proposed method by intuitive examples and
justify it with the matrix perturbation theory. Moreover, we show
that RSIM can be extended to handle missing data with a
Grassmannian gradient descent method.
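For reference, the classical SIM affinity that RSIM builds on can be written in a few lines. This shows only the clean-data case; the robustifications that make RSIM practical on corrupted data are the thesis's contribution and are not reproduced here.

```python
import numpy as np

def shape_interaction_matrix(W, rank):
    """Classical SIM: given a trajectory matrix W (2F x N, one column
    per tracked point) and the total rank r of the motion subspaces,
    form Q = |V_r V_r^T| from the top-r right singular vectors of W.
    For independent motions and clean data, Q_ij is zero whenever
    points i and j belong to different motions."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    Vr = Vt[:rank].T            # N x r
    return np.abs(Vr @ Vr.T)
```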
The above subspace clustering methods work well for motion
segmentation, yet they require that point trajectories across
frames are known a priori. However, finding point
correspondences is in itself a challenging task. Existing
approaches tackle the correspondence estimation and motion
segmentation problems separately. In the third part of this
thesis, given a set of feature points detected in each frame of
the sequence, we develop an approach which simultaneously
performs motion segmentation and finds point correspondences
across the frames. We formulate this problem in terms of Partial
Permutation Matrices (PPMs) and aim to match feature descriptors
while simultaneously encouraging point trajectories to satisfy
subspace constraints. This lets us handle outliers in both point
locations and feature appearance. The resulting optimization
problem is solved via the Alternating Direction Method of
Multipliers (ADMM), where each subproblem has an efficient
solution. In particular, we show that most of the subproblems can
be solved in closed-form, and one binary assignment subproblem
can be solved by the Hungarian algorithm.
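The Hungarian subproblem mentioned above can be illustrated in isolation: a one-to-one matching of two descriptor sets that minimizes total distance. This sketch uses SciPy's linear_sum_assignment and omits the subspace-constraint terms and the surrounding ADMM loop that the thesis couples it with; the function name is illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_descriptors(desc_a, desc_b):
    """Globally optimal one-to-one matching between two equally sized
    (N, d) descriptor sets, minimizing total squared distance."""
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    cost = np.einsum('ijk,ijk->ij', diff, diff)   # cost[i, j] = ||a_i - b_j||^2
    rows, cols = linear_sum_assignment(cost)       # Hungarian algorithm
    return cols  # cols[i] is the index in desc_b matched to point i
```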
Obtaining reliable feature tracks in a frame-by-frame manner is
desirable in applications such as online motion segmentation. In
the final part of the thesis, we introduce a novel multi-body
feature tracker that exploits a multi-body rigidity assumption to
improve tracking robustness under a general perspective camera
model. A conventional approach to addressing this problem would
consist of alternating between solving two subtasks: motion
segmentation and feature tracking under rigidity constraints for
each segment. This approach, however, requires knowing the number
of motions, as well as assigning points to motion groups, which
is typically sensitive to motion estimates. By contrast, we
introduce a segmentation-free solution to multi-body feature
tracking that bypasses the motion assignment step and reduces to
solving a series of subproblems with closed-form solutions.
In summary, in this thesis, we exploit the powerful subspace
constraints and develop robust motion segmentation methods in
different challenging scenarios where the trajectories are either
given as input, or unknown beforehand. We also present a general
robust multi-body feature tracker which can be used as the first
step of motion segmentation to obtain reliable trajectories.