Semi-hierarchical based motion estimation algorithm for the Dirac video encoder
Fast and efficient motion estimation is crucial in today's advanced video compression techniques, since it determines both the compression efficiency and the complexity of a video encoder. In this paper, a method which we call semi-hierarchical motion estimation is proposed for the Dirac video encoder. By applying fully hierarchical motion estimation only to a certain type of inter-frame encoding, the complexity of the motion estimation can be greatly reduced while maintaining the desired accuracy. The experimental results show that the proposed algorithm reduces the number of SAD calculations by a factor of two to three compared with the existing motion estimation algorithm of Dirac, for the same motion estimation accuracy, compression efficiency and PSNR performance. Moreover, depending on the complexity of the test sequence, the proposed algorithm can increase or decrease the search range in order to keep the accuracy of the motion estimation at a certain level.
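To make the cost measure in this abstract concrete, the following is a minimal sketch of plain full-search block matching that counts SAD (sum of absolute differences) evaluations. It is not the semi-hierarchical method or Dirac's own search; the function names `sad`, `get_block` and `full_search`, and the frame representation as lists of lists, are assumptions made for illustration only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(cur, ref, top, left, size, search_range):
    """Exhaustive block matching (illustrative baseline, not Dirac's search).

    Returns ((dy, dx), best_sad, n_sad), where n_sad is the number of SAD
    evaluations performed, i.e. the quantity a faster search tries to cut.
    """
    height, width = len(ref), len(ref[0])
    target = get_block(cur, top, left, size)
    best_vector, best_cost, n_sad = None, None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(ref, y, x, size))
            n_sad += 1
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost, n_sad
```

The number of SAD evaluations grows with the square of `search_range`, which is why restricting the full search to fewer frames or adapting the range, as the abstract describes, can lower the encoder's cost.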
Motion estimation using optical flow field
Over the last decade, many low-level vision algorithms have been devised for extracting depth from intensity images. Most of them are based on the motion of a rigid observer, where translation and rotation are constant with respect to the space coordinates. When multiple objects move and/or the objects change shape, these algorithms cannot be used.
In this dissertation, we develop a new robust framework for the determination of dense 3-D position and motion fields from a stereo image sequence. The framework is based on unified optical flow field (UOFF). In the UOFF approach, a four frame mode is used to compute six dense 3-D position and velocity fields. Their accuracy depends on the accuracy of optical flow field computation. The approach can estimate rigid and/or nonrigid motion as well as observer and/or object(s) motion.
Here, a novel approach to optical flow field computation is developed, named the correlation-feedback approach. Three features distinguish it from existing approaches: feedback, a rubber window, and special refinement. With these three features, error is reduced, boundaries are conserved, subpixel estimation accuracy is increased, and the system is robust. Convergence of the algorithm is proved in general.
Since the UOFF is computed per pixel, it is sensitive to noise or uncertainty at each pixel. In order to improve its performance, we applied two Kalman filters. Our analysis indicates that different image areas need different convergence rates; for instance, the areas along boundaries have a faster convergence rate than interior areas. The first Kalman filter is developed to conserve moving boundaries in optical flow determination by applying the needed nonhomogeneous iterations. The second Kalman filter is devised to compute 3-D motion and structure based on a stereo image sequence. Since multi-object motion is allowed, newly visible areas may be exposed in images. How to detect and handle the newly visible areas is addressed. The system and measurement noise covariance matrices, Q and R, in the two Kalman filters are analyzed in detail. Numerous experiments demonstrate the efficiency of our approach.
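As a rough illustration of how Q and R influence convergence, here is a minimal scalar Kalman filter step. It is a textbook one-dimensional sketch with a static state model, not either of the dissertation's filters; the function name `kalman_step` and its parameters are assumptions made for illustration.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter (illustrative only).

    x, p : previous state estimate and its variance
    z    : new measurement
    q, r : system and measurement noise variances (scalar stand-ins for Q, R)
    """
    # Predict: the state is assumed static, so only the variance grows by q.
    p_pred = p + q
    # Update: the gain weighs the measurement against the prediction.
    gain = p_pred / (p_pred + r)
    x_new = x + gain * (z - x)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, gain
```

A larger `q` relative to `r` raises the gain, so the estimate follows new measurements more quickly. This is one generic way in which a filter can be made to converge at different rates in different regions.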
Benchmarking and Comparing Popular Visual SLAM Algorithms
This paper contains the performance analysis and benchmarking of two popular
visual SLAM Algorithms: RGBD-SLAM and RTABMap. The dataset used for the
analysis is the TUM RGBD Dataset from the Computer Vision Group at TUM. The
dataset selected has a large set of image sequences from a Microsoft Kinect
RGB-D sensor with highly accurate and time-synchronized ground truth poses from
a motion capture system. The test sequences selected depict a variety of
problems and camera motions faced by Simultaneous Localization and Mapping
(SLAM) algorithms for the purpose of testing the robustness of the algorithms
in different situations. The evaluation metrics used for the comparison are
Absolute Trajectory Error (ATE) and Relative Pose Error (RPE). The analysis
involves comparing the Root Mean Square Error (RMSE) of the two metrics and the
processing time for each algorithm. This paper serves as an important aid in
the selection of a SLAM algorithm for different scenes and camera motions. The
analysis helps to identify the limitations of both SLAM methods. This paper also
points out some underlying flaws in the evaluation metrics used.
Comment: 7 pages, 4 figures
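The two metrics named in this abstract can be sketched as follows, using translation only. This is a simplified illustration: it assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame, it skips the rigid alignment step usually applied before computing ATE, and it ignores rotation in the RPE. The function names are hypothetical and are not taken from the TUM benchmark tools.

```python
import math


def rmse(errors):
    """Root mean square of a list of scalar errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def ate_rmse(estimated, ground_truth):
    """Translational Absolute Trajectory Error (RMSE).

    Assumes both trajectories are lists of (x, y, z) positions that are
    already matched by timestamp and aligned to a common frame.
    """
    return rmse([math.dist(p, q) for p, q in zip(estimated, ground_truth)])


def rpe_translation_rmse(estimated, ground_truth, delta=1):
    """Simplified translational Relative Pose Error (RMSE).

    Compares the displacement over `delta` steps in each trajectory;
    rotation is ignored, unlike in the full SE(3) definition.
    """
    errors = []
    for i in range(len(estimated) - delta):
        est_step = [b - a for a, b in zip(estimated[i], estimated[i + delta])]
        gt_step = [b - a for a, b in zip(ground_truth[i], ground_truth[i + delta])]
        errors.append(math.dist(est_step, gt_step))
    return rmse(errors)
```

A trajectory with a constant offset from the ground truth has a non-zero ATE in this sketch but a zero RPE, which shows that the two metrics capture different kinds of error.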
Four-dimensional tomographic reconstruction by time domain decomposition
Since the beginnings of tomography, the requirement that the sample does not
change during the acquisition of one tomographic rotation has remained unchanged. We
derived and successfully implemented a tomographic reconstruction method which
relaxes this decades-old requirement of static samples. In the presented
method, dynamic tomographic data sets are decomposed in the temporal domain
using basis functions and deploying an L1 regularization technique where the
penalty factor is taken for spatial and temporal derivatives. We implemented
the iterative algorithm for solving the regularization problem on modern GPU
systems to demonstrate its practical use.
Efficient MRF Energy Propagation for Video Segmentation via Bilateral Filters
Segmentation of an object from a video is a challenging task in multimedia
applications. Depending on the application, automatic or interactive methods
are desired; however, regardless of the application type, efficient computation
of video object segmentation is crucial for time-critical applications;
specifically, mobile and interactive applications require near real-time
efficiencies. In this paper, we address the problem of video segmentation from
the perspective of efficiency. We initially redefine the problem of video
object segmentation as the propagation of MRF energies along the temporal
domain. For this purpose, a novel and efficient method is proposed to propagate
MRF energies throughout the frames via bilateral filters without using any
global texture, color or shape model. A recently presented bi-exponential filter
is utilized for efficiency, and a novel technique is also developed to
dynamically solve graph-cuts for varying, non-lattice graphs in a general linear
filtering scenario. These improvements are evaluated in both automatic and
interactive video segmentation scenarios. Moreover, in addition to the
efficiency, segmentation quality is also tested both quantitatively and
qualitatively. Indeed, for some challenging examples, significant time
efficiency is observed without loss of segmentation quality.
Comment: Multimedia, IEEE Transactions on (Volume: 16, Issue: 5, Aug. 2014)