
    RELIABILITY AND VALIDITY OF A DEEP LEARNING ALGORITHM BASED MARKERLESS MOTION CAPTURE SYSTEM IN MEASURING SQUATS

    This study aimed to compare the performance of a traditional marker-based motion capture system and a video-based markerless system in analyzing squats, and to determine the reliability and validity of the markerless system. Twenty-one squats were recorded using a marker-based motion capture system and a 2D video camera. We analyzed the 2D video data using Sportip Motion 3D, a deep learning-based 3D human pose estimation algorithm developed specifically for sports activities, and the peak lower limb joint angles were calculated by both systems. There was excellent agreement between VICON and Sportip Motion 3D for all joint angles (hip intraclass correlation coefficient (ICC) = 0.96, knee ICC = 0.92, ankle ICC = 0.86), with average differences of less than 1.3°. These results indicate that squat analysis using Sportip Motion 3D is as reliable and accurate as the conventional marker-based method.
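    A minimal sketch of the agreement statistic reported above may help readers reproduce this kind of validation. The ICC variant shown (two-way random effects, single measure, absolute agreement, i.e. ICC(2,1)) and the sample data are assumptions for illustration; the abstract does not state which ICC model was used.

        import numpy as np

        def icc_2_1(data):
            # ICC(2,1): two-way random effects, single measure, absolute agreement
            # data: (n_subjects, k_raters), e.g. peak hip angle per squat
            # from the marker-based and markerless systems
            data = np.asarray(data, dtype=float)
            n, k = data.shape
            grand = data.mean()
            rows = data.mean(axis=1)   # per-squat means
            cols = data.mean(axis=0)   # per-system means
            ms_r = k * np.sum((rows - grand) ** 2) / (n - 1)
            ms_c = n * np.sum((cols - grand) ** 2) / (k - 1)
            resid = data - rows[:, None] - cols[None, :] + grand
            ms_e = np.sum(resid ** 2) / ((n - 1) * (k - 1))
            return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

        # hypothetical data: 21 squats measured by two systems
        rng = np.random.default_rng(0)
        vicon = rng.normal(100.0, 10.0, 21)
        markerless = vicon + rng.normal(0.0, 2.0, 21)
        print(icc_2_1(np.column_stack([vicon, markerless])))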

    Markerless Human Motion Analysis

    Measuring and understanding human motion is crucial in several domains, ranging from neuroscience to rehabilitation and sports biomechanics. Quantitative information about human motion is fundamental to studying how our Central Nervous System controls and organizes movement, and to functionally evaluating motor performance and deficits. In recent decades, research in this field has made considerable progress. State-of-the-art technologies that provide useful and accurate quantitative measures rely on marker-based systems. Unfortunately, markers are intrusive, and their number and location must be determined a priori. Marker-based systems also require expensive laboratory settings with several infrared cameras, which can alter the naturalness of a subject's movements and induce discomfort. Last but not least, they are computationally expensive in time and space. Recent advances in markerless pose estimation based on computer vision and deep neural networks are opening the possibility of adopting efficient video-based methods for extracting movement information from RGB video data. In this context, this thesis presents original contributions toward the following objectives: (i) the implementation of a video-based markerless pipeline to quantitatively characterize human motion; (ii) the assessment of its accuracy compared with a gold-standard marker-based system; (iii) the application of the pipeline to different domains to verify its versatility, with a special focus on characterizing the motion of preterm infants and on gait analysis. With the proposed approach, we show that, starting only from RGB videos and leveraging computer vision and machine learning techniques, it is possible to extract reliable information characterizing human motion, comparable to that obtained with gold-standard marker-based systems.
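    As an illustration of the kind of quantity such a pipeline extracts, the sketch below turns 3D keypoints from any markerless pose estimator into a knee flexion angle time series. The (n_frames, n_joints, 3) layout and the joint indices are hypothetical; skeleton definitions differ between estimators, and this is not the thesis pipeline itself.

        import numpy as np

        def joint_angle(a, b, c):
            # angle at joint b (degrees) formed by segments b->a and b->c
            v1, v2 = a - b, c - b
            cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
            return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

        def knee_flexion_series(keypoints, hip=11, knee=13, ankle=15):
            # keypoints: (n_frames, n_joints, 3); joint indices are illustrative
            # flexion measured as deviation from a straight (180 degree) leg
            return np.array([180.0 - joint_angle(f[hip], f[knee], f[ankle])
                             for f in keypoints])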

    ๋™์˜์ƒ ์† ์‚ฌ๋žŒ ๋™์ž‘์˜ ๋ฌผ๋ฆฌ ๊ธฐ๋ฐ˜ ์žฌ๊ตฌ์„ฑ ๋ฐ ๋ถ„์„

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2021. Advisor: Jehee Lee.

    In computer graphics, simulating and analyzing human movement have been active research topics since the 1960s. Still, simulating realistic human movements in a 3D virtual world remains a challenging task. Motion capture techniques have generally been used to acquire movement data. Although motion capture guarantees realistic, high-quality results, it requires a great deal of equipment and the process is complicated. Recently, techniques for estimating 3D human pose from 2D video have developed remarkably, and researchers in computer graphics and computer vision have attempted to reconstruct various human motions from video data. However, existing methods cannot robustly estimate dynamic actions and do not work on videos filmed with a moving camera. In this thesis, we propose methods to reconstruct dynamic human motions from in-the-wild videos and to control those motions. First, we developed a framework to reconstruct motion from videos using prior physics knowledge. For dynamic motions such as a backspin, the poses estimated by a state-of-the-art method are incomplete, with an unreliable root trajectory or missing intermediate poses. We designed a reward function for a deep reinforcement learning controller using poses and hints extracted from videos, and learned a policy that simultaneously reconstructs the motion and controls a virtual character. Second, we simulated figure skating movements from video. Skating sequences consist of fast, dynamic movements on ice, hindering the acquisition of motion data. Thus, we extracted 3D key poses from video and successfully replicated several figure skating movements using trajectory optimization and a deep reinforcement learning controller. Third, we devised an algorithm for gait analysis from 2D video of patients with movement disorders such as Parkinson's disease or cerebral palsy. After acquiring the patients' joint positions from 2D video processed by a deep learning network, the 3D absolute coordinates were estimated, and gait parameters such as gait velocity, cadence, and step length were calculated. Additionally, we analyzed the optimization criteria of human walking using a 3D musculoskeletal humanoid model and physics-based simulation. For two criteria, namely the minimization of muscle activation and the minimization of joint torque, we compared simulation data with real human data. To demonstrate the effectiveness of the first two research topics, we verified the reconstruction of dynamic human motions from 2D videos using physics-based simulations; the last two research topics were evaluated against real human data.

    Contents: 1 Introduction; 2 Background; 3 Human Dynamics from Monocular Video with Dynamic Camera Movements; 4 Figure Skating Simulation from Video; 5 Gait Analysis Using Pose Estimation Algorithm with 2D-Video of Patients; 6 Control Optimization of Human Walking; 7 Conclusion.
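    The gait-parameter step in the third contribution (velocity, cadence, step length from estimated joint positions) can be sketched as below. Detecting heel strikes as peaks of the inter-ankle distance is a common heuristic and an assumption here, not necessarily the method used in the thesis.

        import numpy as np
        from scipy.signal import find_peaks

        def gait_parameters(left_ankle, right_ankle, fps):
            # left_ankle, right_ankle: (n_frames, 3) positions in meters
            dist = np.linalg.norm(left_ankle - right_ankle, axis=1)
            # heel strikes roughly coincide with maxima of inter-ankle distance;
            # enforce a minimum step interval of 0.4 s between detections
            strikes, _ = find_peaks(dist, distance=int(0.4 * fps))
            duration = (strikes[-1] - strikes[0]) / fps
            n_steps = len(strikes) - 1
            mid = (left_ankle + right_ankle) / 2.0  # proxy for body position
            path = np.linalg.norm(mid[strikes[-1]] - mid[strikes[0]])
            return {
                "step_length_m": float(dist[strikes].mean()),
                "cadence_steps_per_min": 60.0 * n_steps / duration,
                "velocity_m_per_s": path / duration,
            }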

    Computational Modeling of Human Dorsal Pathway for Motion Processing

    Get PDF
    Reliable motion estimation in videos is of crucial importance for background identification, object tracking, action recognition, event analysis, self-navigation, etc. Reconstructing the motion field in the 2D image plane is very challenging due to variations in image quality, scene geometry, and lighting conditions, and, most importantly, camera jittering. Traditional optical flow models assume consistent image brightness and a smooth motion field, assumptions violated by the unstable illumination and motion discontinuities common in real-world videos. To recognize observer (or camera) motion robustly in complex, realistic scenarios, we propose a biologically inspired motion estimation system that overcomes the issues posed by real-world videos. The bottom-up model is inspired by the infrastructure and functionality of the human dorsal pathway, and the hierarchical processing stream can be divided into three stages: 1) spatio-temporal processing for local motion, 2) recognition of global motion patterns (camera motion), and 3) preemptive estimation of object motion. To extract effective and meaningful motion features, we apply a series of steerable, spatio-temporal filters to detect local motion at different speeds and directions, in a way that is selective of motion velocity. The intermediate response maps are calibrated and combined to estimate dense motion fields in local regions, and then local motions along two orthogonal axes are aggregated to recognize planar, radial, and circular patterns of global motion. We evaluate the model with an extensive, realistic video database collected by hand with a mobile device (iPad), in which the video content varies in scene geometry, lighting condition, view perspective, and depth. We achieved high-quality results and demonstrated that this bottom-up model is capable of extracting high-level semantic knowledge regarding self-motion in realistic scenes. Once the global motion is known, we segment objects from moving backgrounds by compensating for camera motion. For videos captured with non-stationary cameras, we consider global motion as a combination of camera motion (background) and object motion (foreground). To estimate foreground motion, we exploit the corollary discharge mechanism of biological systems and estimate motion preemptively. Since the background motion at each pixel is collectively introduced by camera movement, we apply spatio-temporal averaging to estimate the background motion at the pixel level, and an initial estimate of foreground motion is derived by comparing global motion and background motion at multiple spatial levels. The real frame signals are compared with those derived by forward prediction, refining the estimates of object motion. This motion detection system is applied to detect objects against cluttered, moving backgrounds and proves efficient at locating independently moving, non-rigid regions. The core contribution of this thesis is a robust motion estimation system for complicated real-world videos, with challenges posed by real sensor noise, complex natural scenes, variations in illumination and depth, and motion discontinuities. The overall system demonstrates biological plausibility and holds great potential for other applications, such as camera motion removal, heading estimation, obstacle avoidance, route planning, and vision-based navigational assistance.
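    A rough approximation of the background-compensation idea, using standard tools: estimate dense optical flow, take a robust global statistic as the camera motion, and threshold the residual to expose independently moving regions. This substitutes OpenCV's Farneback flow and a median for the paper's steerable spatio-temporal filter bank and planar/radial/circular pattern recognition, so it is a sketch of the concept rather than the proposed system.

        import cv2
        import numpy as np

        def moving_object_mask(prev_gray, curr_gray, thresh=1.5):
            # dense optical flow between consecutive grayscale frames
            flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            # robust global estimate of the field ~ camera (background) motion
            background = np.median(flow.reshape(-1, 2), axis=0)
            # residual flow after compensating for camera motion
            residual = np.linalg.norm(flow - background, axis=2)
            return residual > thresh  # mask of independently moving pixels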

    Six DOF Motion Estimation for Teleoperated Flexible Endoscopes Using Optical Flow: A Comparative Study

    Get PDF
    Colorectal cancer is one of the leading causes of cancer-related deaths worldwide, although it can be treated effectively if detected early. Teleoperated flexible endoscopes are an emerging technology to promote participation in these preventive screenings. Real-time pose estimation is therefore essential to provide feedback to the robotic endoscope's control system. Vision-based endoscope localization approaches are a promising avenue, since they do not require extra sensors on board the endoscope. In this work, we compare several state-of-the-art algorithms for computing the image motion (optical flow), which is then used with a supervised learning strategy to provide an accurate estimate of the six-degree-of-freedom endoscope motion. The method is validated using a robotically actuated endoscope in a human colon simulator and represents a preliminary effort toward testing with clinical video data.
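    A minimal sketch of the setup described, under stated assumptions: pool dense optical flow into a fixed-length feature vector per frame pair and regress the 6-DOF motion with a supervised model. The grid pooling and the random-forest regressor are illustrative choices; the paper's contribution is the comparison of optical flow algorithms, and its learning strategy may differ.

        import cv2
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def flow_features(prev_gray, curr_gray, grid=4):
            # mean optical flow per grid cell, flattened into one vector
            flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            h, w = flow.shape[:2]
            cells = [flow[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
                     .reshape(-1, 2).mean(axis=0)
                     for i in range(grid) for j in range(grid)]
            return np.concatenate(cells)

        # X: (n_samples, 2*grid*grid) features from consecutive frame pairs
        # y: (n_samples, 6) ground-truth translations and rotations
        # model = RandomForestRegressor(n_estimators=200).fit(X, y)
        # pose_delta = model.predict(X_new)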