65,102 research outputs found

    Matching techniques to compute image motion

    This paper presents a thorough analysis of the pattern-matching techniques used to compute image motion from a sequence of two or more images. Several correlation and distance measures are tested, and problems in displacement estimation are investigated. As a byproduct of this analysis, several novel techniques are presented that improve the accuracy of flow vector estimation and reduce the computational cost by using filters, a multi-scale approach, and mask sub-sampling. Further, new algorithms to obtain sub-pixel accuracy of the flow are proposed. A large number of experimental tests were performed to compare all the proposed techniques and to identify which are most useful for practical applications. The results are very accurate, showing that correlation-based flow computation is suitable for practical and real-time applications. (pp. 247–260)
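As a rough illustration of the kind of technique this abstract discusses, the sketch below does an exhaustive block search with a sum-of-squared-differences (SSD) measure, then refines the integer displacement by fitting a parabola to the costs around the minimum. The function names (`ssd`, `parabola_offset`, `match_block`), the SSD measure, and the parabola fit are assumptions chosen for illustration; the abstract does not say which measures or sub-pixel schemes the paper actually uses.

```python
def ssd(frame0, frame1, y, x, dy, dx, size):
    """Sum of squared differences between the block at (y, x) in frame0
    and the block displaced by (dy, dx) in frame1."""
    total = 0.0
    for i in range(size):
        for j in range(size):
            diff = frame0[y + i][x + j] - frame1[y + dy + i][x + dx + j]
            total += diff * diff
    return total


def parabola_offset(c_minus, c_zero, c_plus):
    """Vertex of the parabola through the costs at offsets -1, 0, +1.
    The result lies in [-0.5, 0.5] when c_zero is the smallest cost."""
    denom = c_minus - 2.0 * c_zero + c_plus
    if denom <= 0.0:
        return 0.0  # flat or degenerate profile: keep the integer estimate
    return 0.5 * (c_minus - c_plus) / denom


def match_block(frame0, frame1, y, x, size, radius):
    """Return a sub-pixel displacement (dy, dx) for one block.
    Assumes the block itself lies fully inside both frames."""
    height, width = len(frame1), len(frame1[0])
    costs = {}
    # Exhaustive integer search over the window, skipping out-of-frame blocks.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy and yy + size <= height and 0 <= xx and xx + size <= width:
                costs[(dy, dx)] = ssd(frame0, frame1, y, x, dy, dx, size)
    dy, dx = min(costs, key=costs.get)
    c_zero = costs[(dy, dx)]
    sub_dy = sub_dx = 0.0
    # Refine each axis separately when both neighbours were evaluated.
    if (dy - 1, dx) in costs and (dy + 1, dx) in costs:
        sub_dy = parabola_offset(costs[(dy - 1, dx)], c_zero, costs[(dy + 1, dx)])
    if (dy, dx - 1) in costs and (dy, dx + 1) in costs:
        sub_dx = parabola_offset(costs[(dy, dx - 1)], c_zero, costs[(dy, dx + 1)])
    return dy + sub_dy, dx + sub_dx
```

Frames here are plain lists of rows of numbers. Calling `match_block(frame0, frame1, y, x, size, radius)` for each block of interest would give a sparse flow field, and the refinement never moves the estimate more than half a pixel from the integer minimum.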

    Motion estimation and video coding

    Over the last ten years, research on the analysis of visual motion has come to play a key role in data compression for visual communication as well as in computer vision. Enormous effort has gone into the design of motion estimation algorithms. One of the fundamental tasks in motion estimation is the accurate measurement of 2-D dense motion fields. For this purpose, we devise and present in this dissertation a multiattribute feedback computational framework. In this framework, for each pixel in an image, multiple image attributes are computed as conservation information instead of a single image intensity. To enhance estimation accuracy, a feedback technique is applied. In addition, the proposed algorithm needs less differentiation and is thus more robust to various kinds of noise. With these features, the estimation accuracy is improved considerably. Experiments have demonstrated that the proposed algorithm outperforms most existing techniques that compute 2-D dense motion fields in terms of accuracy. The estimation of 2-D block motion vector fields has been dominant among techniques for exploiting temporal redundancy in video coding, owing to its straightforward implementation and reasonable performance. But block matching is still a computational burden in real-time video compression; hence, efficient block matching techniques remain in demand. Existing block matching methods, including full search and multiresolution techniques, treat every region of an image indiscriminately, whether or not the region contains complicated motion. Motivated by this observation, we have developed two thresholding techniques for block matching in video coding, in which regions experiencing relatively uniform motion are withheld from further processing via thresholding, thus saving computation drastically. One is thresholding multiresolution block matching. Extensive experiments show that this algorithm performs consistently for sequences with different motion complexities. Compared with the fastest existing multiresolution technique, it reduces processing time by 14% to 20% while maintaining almost the same quality of the reconstructed image (only about 0.1 dB loss in PSNR). The other is thresholding hierarchical block matching, in which no pyramid is actually formed. Experiments indicate that for sequences with less motion, such as videoconferencing sequences, this algorithm works faster and produces far fewer motion vectors than the thresholding multiresolution block matching method.
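One possible reading of the thresholding idea in this abstract is sketched below: a block is first matched on a half-resolution copy of the frames, and the fine-level refinement is skipped when the coarse matching error is already below a threshold. The two-level pyramid, the mean-absolute-difference test, the radius-1 refinement, and all function names are assumptions made for illustration; the dissertation's actual criteria are not given in the abstract.

```python
def sad(ref, cur, y, x, dy, dx, size):
    """Sum of absolute differences between the block at (y, x) in cur
    and the block displaced by (dy, dx) in ref."""
    return sum(abs(cur[y + i][x + j] - ref[y + dy + i][x + dx + j])
               for i in range(size) for j in range(size))


def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
              + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(len(img[0]) // 2)]
            for i in range(len(img) // 2)]


def search(ref, cur, y, x, size, center, radius):
    """Full search in a window around `center`; returns (cost, dy, dx).
    Assumes at least one candidate block lies inside the reference frame."""
    best = None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy and yy + size <= len(ref) and 0 <= xx and xx + size <= len(ref[0]):
                cost = sad(ref, cur, y, x, dy, dx, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def threshold_match(ref, cur, y, x, size, radius, threshold):
    """Two-level match; returns (dy, dx, refined).
    `threshold` is a mean absolute difference per coarse pixel (assumed test)."""
    ref_lo, cur_lo = downsample(ref), downsample(cur)
    half = size // 2
    cost, dy, dx = search(ref_lo, cur_lo, y // 2, x // 2, half, (0, 0), radius)
    if cost / (half * half) <= threshold:
        # Coarse match is good enough: scale the vector and skip refinement.
        return 2 * dy, 2 * dx, False
    # Otherwise refine at full resolution around the scaled coarse vector.
    cost, dy, dx = search(ref, cur, y, x, size, (2 * dy, 2 * dx), 1)
    return dy, dx, True
```

With even block positions and sizes, a static region returns immediately with `refined == False`, which is where the saving in this sketch comes from. A larger threshold skips more refinements at the cost of coarser vectors.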

    Non-rigid registration of 2-D/3-D dynamic data with feature alignment

    In this work, we compute the matching between 2D manifolds and 3D manifolds with temporal constraints; that is, we compute the matching among a time sequence of 2D/3D manifolds. The problem is solved by mapping all the manifolds to a common domain and then building their matching by composing the forward mapping and the inverse mapping. First, we solve the matching problem between 2D manifolds with temporal constraints using a mesh-based registration method. We propose a surface parameterization method to compute the mapping between the 2D manifold and the common 2D planar domain, and we compute the matching among the time sequence of deforming geometry data through this common domain. Compared with previous work, our method is independent of the quality of the mesh elements and more efficient for time-sequence data. We then develop a global intensity-based registration method to solve the matching problem between 3D manifolds with temporal constraints. Our method is based on a 4D (3D+T) free-form B-spline deformation model that has both spatial and temporal smoothness. Compared with previous 4D image registration techniques, our method avoids some local minima; thus it can be solved faster and achieves better accuracy in landmark point prediction. We demonstrate the efficiency of these methods in real applications. The first is applied to dynamic face registration and texture mapping. The second is applied to lung tumor motion tracking in medical image analysis. In future work, we will develop a more efficient mesh-based 4D registration method. It can be applied to tumor motion estimation and tracking, which can be used to calculate the real dose delivered to the lung and surrounding tissues, and can thus support online treatment in lung cancer radiotherapy.

    Vector Quantization Video Encoder Using Hierarchical Cache Memory Scheme

    A system compresses image blocks via successive hierarchical stages and motion encoders which employ caches updated by stack replacement algorithms. Initially, a background detector compares the present image block with a corresponding previously encoded image block and, if they are similar, terminates the encoding procedure by setting a flag bit. Otherwise, the image block is decomposed into smaller present image subblocks. The smaller present image subblocks are each compared with a corresponding previously encoded image subblock of comparable size within the present image block. When a present image subblock is similar to a corresponding previously encoded image subblock, the procedure is terminated by setting a flag bit. Alternatively, the present image subblock is forwarded to a motion encoder where it is compared with displaced image subblocks, which are formed by displacing previously encoded image subblocks by motion vectors that are stored in a cache, to derive a first distortion vector. When the first distortion vector is below a first threshold TM, the procedure is terminated and the present image subblock is encoded by setting a flag bit and a cache index corresponding to the first distortion vector. Alternatively, the present image subblock is passed to a block matching encoder where it is compared with other previously encoded image subblocks to derive a second distortion vector. When the second distortion vector is below a second threshold Tm, the procedure is terminated by setting a flag bit, by generating the second distortion vector, and by updating the cache. Georgia Tech Research Corporatio
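The last two stages described above (cached motion vectors first, then a block-matching search that updates the cache) can be sketched as a small decision cascade. Everything named here is hypothetical: `MotionVectorCache`, `encode_subblock`, the scalar `distortion` callback, the move-to-front policy standing in for the "stack replacement" rule, and the `"unresolved"` outcome for subblocks that pass neither threshold. The abstract does not specify these details.

```python
class MotionVectorCache:
    """Fixed-size stack of motion vectors, most recently used first
    (an assumed move-to-front reading of 'stack replacement')."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def touch(self, index):
        # Move the entry that was just used to the top of the stack.
        self.entries.insert(0, self.entries.pop(index))

    def push(self, vector):
        # Insert a new vector on top and drop the least recently used one.
        if vector in self.entries:
            self.entries.remove(vector)
        self.entries.insert(0, vector)
        del self.entries[self.capacity:]


def encode_subblock(distortion, cache, candidates, t_cache, t_search):
    """Return a (mode, payload) pair for one subblock.

    distortion: hypothetical callback mapping a motion vector to a scalar
    distortion between the present subblock and the displaced subblock.
    """
    # Stage 1: try only the motion vectors already held in the cache.
    if cache.entries:
        index = min(range(len(cache.entries)),
                    key=lambda i: distortion(cache.entries[i]))
        if distortion(cache.entries[index]) < t_cache:
            cache.touch(index)
            return ("cache_hit", index)  # index as it was before the move
    # Stage 2: search the wider candidate set and remember the winner.
    best = min(candidates, key=distortion)
    if distortion(best) < t_search:
        cache.push(best)
        return ("block_match", best)
    # Neither threshold met: left to later stages not covered in this sketch.
    return ("unresolved", None)
```

A cache hit costs only an index, whereas a block-matching result has to carry the full vector, which is presumably why the cache stage comes first.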

    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108
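To make the link between optical flow, disparity, and 3D motion concrete, the sketch below uses the standard rectified pinhole stereo model: depth is focal length times baseline divided by disparity, and the scene flow of a pixel is the difference between its back-projected 3D positions at two time steps. This is only the deterministic geometry; it is not the probabilistic, multi-scale method the abstract describes, and the function and parameter names are assumptions.

```python
def backproject(x, y, disparity, focal, baseline, cx, cy):
    """3D point (X, Y, Z) for pixel (x, y) in a rectified stereo pair.
    focal is in pixels, baseline in metres, (cx, cy) is the principal point."""
    z = focal * baseline / disparity
    return ((x - cx) * z / focal, (y - cy) * z / focal, z)


def scene_flow(x, y, d0, u, v, d1, focal, baseline, cx, cy):
    """3D motion of the point seen at pixel (x, y).

    d0: disparity at time t
    (u, v): optical flow of the pixel from t to t+1
    d1: disparity at the flowed position (x + u, y + v) at time t+1
    """
    p0 = backproject(x, y, d0, focal, baseline, cx, cy)
    p1 = backproject(x + u, y + v, d1, focal, baseline, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))
```

Because depth depends on the reciprocal of disparity, small disparity errors on distant points turn into large depth errors, which is one reason for carrying uncertainty through the intermediate estimates.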

    GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion

    We present GraphMatch, an approximate yet efficient method for building the matching graph for large-scale structure-from-motion (SfM) pipelines. Unlike modern SfM pipelines that use vocabulary (Voc.) trees to quickly build the matching graph and avoid a costly brute-force search of matching image pairs, GraphMatch does not require an expensive offline pre-processing phase to construct a Voc. tree. Instead, GraphMatch leverages two priors that can predict which image pairs are likely to match, thereby making the matching process for SfM much more efficient. The first is a score computed from the distance between the Fisher vectors of any two images. The second prior is based on the graph distance between vertices in the underlying matching graph. GraphMatch combines these two priors into an iterative "sample-and-propagate" scheme similar to the PatchMatch algorithm. Its sampling stage uses Fisher similarity priors to guide the search for matching image pairs, while its propagation stage explores neighbors of matched pairs to find new ones with a high image similarity score. Our experiments show that GraphMatch finds the most image pairs as compared to competing, approximate methods while at the same time being the most efficient. Comment: Published at IEEE 3DV 201

    Low Power Depth Estimation of Rigid Objects for Time-of-Flight Imaging

    Depth sensing is useful in a variety of applications that range from augmented reality to robotics. Time-of-flight (TOF) cameras are appealing because they obtain dense depth measurements with minimal latency. However, for many battery-powered devices, the illumination source of a TOF camera is power hungry and can limit the battery life of the device. To address this issue, we present an algorithm that lowers the power for depth sensing by reducing the usage of the TOF camera and estimating depth maps using concurrently collected images. Our technique also adaptively controls the TOF camera and enables it when an accurate depth map cannot be estimated. To ensure that the overall system power for depth sensing is reduced, we design our algorithm to run on a low power embedded platform, where it outputs 640x480 depth maps at 30 frames per second. We evaluate our approach on several RGB-D datasets, where it produces depth maps with an overall mean relative error of 0.96% and reduces the usage of the TOF camera by 85%. When used with commercial TOF cameras, we estimate that our algorithm can lower the total power for depth sensing by up to 73%