
    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras, by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in these intermediate stages allows more reliable estimation of the 3D scene flow than previous methods. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. Funding: National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108).
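    The geometry underlying this fusion can be sketched with a standard pinhole stereo model: triangulate each pixel from its disparity at time t and again at time t+1 (the latter position found by following the optical flow), then subtract the two 3D points. The camera parameters below (focal length, baseline, principal point) are illustrative placeholders, and the sketch omits the paper's probabilistic treatment of flow and disparity uncertainty.

    ```python
    import numpy as np

    def triangulate(u, v, d, f, B, cx, cy):
        """Back-project pixel (u, v) with disparity d into 3D (pinhole stereo)."""
        Z = f * B / d                  # depth from disparity
        X = (u - cx) * Z / f
        Y = (v - cy) * Z / f
        return np.array([X, Y, Z])

    def scene_flow(u, v, d0, flow, d1, f=700.0, B=0.12, cx=320.0, cy=240.0):
        """3D motion vector for one pixel: triangulate at t and t+1, subtract.

        d0: disparity at time t; flow: (du, dv) optical flow vector; d1:
        disparity at time t+1 at the flowed position.  Camera parameters
        are illustrative defaults, not values from the paper.
        """
        p0 = triangulate(u, v, d0, f, B, cx, cy)
        p1 = triangulate(u + flow[0], v + flow[1], d1, f, B, cx, cy)
        return p1 - p0
    ```

    A static point (zero flow, unchanged disparity) yields a zero motion vector, while a disparity increase alone yields pure motion toward the camera.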

    Vision-based techniques for gait recognition

    Global security concerns have driven a proliferation of video surveillance devices. Intelligent surveillance systems seek to discover possible threats automatically and raise alerts. Being able to identify the surveyed subject can help determine its threat level. The current generation of devices provides digital video data that can be analysed for time-varying features to assist in the identification process. Commonly, people queue up to access a facility and approach a video camera in full frontal view. In this environment, a variety of biometrics are available, for example gait, which includes temporal features such as the stride period. Gait can be measured unobtrusively at a distance. The video data will also include face features, which are short-range biometrics. In this way, one can combine biometrics naturally using a single set of data. In this paper we survey current techniques for gait recognition and modelling, together with the environments in which the research was conducted. We also discuss in detail the issues arising from deriving gait data, such as perspective and occlusion effects, together with the associated computer vision challenges of reliably tracking human movement. After highlighting these issues and challenges related to gait processing, we discuss frameworks that combine gait with other biometrics. We then motivate a novel paradigm for biometrics-based human recognition: the use of the fronto-normal view of gait as a far-range biometric combined with biometrics operating at a near distance.
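    As a minimal illustration of one temporal gait feature mentioned above, the stride period can be estimated from any periodic gait signal, such as the width of the walker's silhouette over time, by locating the dominant peak of its autocorrelation. This is a generic sketch on a synthetic signal, not a method from any of the surveyed papers.

    ```python
    import numpy as np

    def stride_period(signal):
        """Estimate the dominant period (in frames) of a 1-D gait signal,
        e.g. silhouette width over time, from its autocorrelation: skip to
        the first negative lag, then take the strongest peak after it."""
        x = signal - signal.mean()
        ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
        ac /= ac[0]                                        # normalise to lag 0
        start = int(np.argmax(ac < 0))       # first lag with negative correlation
        return start + int(np.argmax(ac[start:]))

    # synthetic silhouette-width signal: a 30-frame stride cycle over 10 cycles
    t = np.arange(300)
    width = 50 + 5 * np.sin(2 * np.pi * t / 30)
    ```

    Skipping past the first zero crossing avoids mistaking the slowly decaying correlation near lag 0 for the stride peak.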

    Registration using Graphics Processor Unit

    Data point set registration is an important operation in coordinate metrology. Registration is the operation by which sampled point clouds are aligned with a CAD model by a 4x4 homogeneous transformation (i.e., a rotation and translation). This alignment permits validation of the produced artifact's geometry. State-of-the-art metrology systems are now capable of generating thousands, if not millions, of data points during an inspection operation, requiring increased computational power to fully utilize these larger data sets. The registration process is an iterative nonlinear optimization whose execution time is directly related to the number of points processed and the CAD model complexity. The objective function to be minimized is the sum of the squared distances between each point in the point cloud and the closest surface in the CAD model. A brute-force approach to registration, which is often used, is to compute the minimum distance between each point and each surface in the CAD model. As point cloud sizes and CAD model complexity increase, this approach becomes intractable and inefficient. Highly efficient numerical and analytical gradient-based algorithms exist, whose goal is to converge to an optimal solution in minimum time. This thesis presents a new approach that performs the registration process efficiently by employing readily available computer hardware, the graphics processor unit (GPU). The data point set registration time on the GPU shows a significant improvement (around 15-20 times) over typical CPU performance. Efficient GPU programming decreases the complexity of the steps and improves the rate of convergence of the existing algorithms. The experiments reveal the exponentially increasing execution time of the CPU and the linear performance of the GPU in various aspects of the algorithm. The importance of the CPU in GPU programming is also highlighted. Future work discusses possible extensions of the GPU approach to higher-order and more complex coordinate metrology algorithms.
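    The objective function described above can be sketched directly; here the CAD surfaces are stood in for by a dense point sampling of the model, so the closest-surface query becomes a closest-point query. This is the brute-force N x M formulation the thesis identifies as intractable for large inputs, shown on the CPU with NumPy; function and parameter names are illustrative.

    ```python
    import numpy as np

    def registration_cost(points, model, R, t):
        """Sum of squared distances from the transformed point cloud to the
        nearest model sample.  `R` is a 3x3 rotation, `t` a translation;
        `model` is a dense point sampling standing in for CAD surfaces.

        Brute force: builds the full N x M squared-distance matrix, which
        is exactly the cost the GPU implementation parallelizes away.
        """
        moved = points @ R.T + t                              # rigid transform
        d2 = ((moved[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).sum()                           # closest-point SSD
    ```

    An optimizer would minimize this cost over the six rigid-motion parameters; at the correct alignment the cost drops to (near) zero.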

    Characterization of multiphase flows integrating X-ray imaging and virtual reality

    Multiphase flows are used in a wide variety of industries, from energy production to pharmaceutical manufacturing. However, because of the complexity of the flows and difficulty measuring them, it is challenging to characterize the phenomena inside a multiphase flow. To help overcome this challenge, researchers have used numerous types of noninvasive measurement techniques to record the phenomena that occur inside the flow. One technique that has shown much success is X-ray imaging. While capable of high spatial resolutions, X-ray imaging generally has poor temporal resolution. This research improves the characterization of multiphase flows in three ways. First, an X-ray image intensifier is modified to use a high-speed camera to push the temporal limits of what is possible with current tube source X-ray imaging technology. Using this system, sample flows were imaged at 1000 frames per second without a reduction in spatial resolution. Next, the sensitivity of X-ray computed tomography (CT) measurements to changes in acquisition parameters is analyzed. While in theory CT measurements should be stable over a range of acquisition parameters, previous research has indicated otherwise. The analysis of this sensitivity shows that, while raw CT values are strongly affected by changes to acquisition parameters, if proper calibration techniques are used, acquisition parameters do not significantly influence the results for multiphase flow imaging. Finally, two algorithms are analyzed for their suitability to reconstruct an approximate tomographic slice from only two X-ray projections. These algorithms increase the spatial error in the measurement, as compared to traditional CT; however, they allow for very high temporal resolutions for 3D imaging. The only limit on the speed of this measurement technique is the image intensifier-camera setup, which was shown to be capable of imaging at a rate of at least 1000 FPS. 
    While advances in measurement techniques for multiphase flows are one part of improving multiphase flow characterization, the challenge extends beyond measurement techniques. For improved measurement techniques to be useful, the data must be accessible to scientists in a way that maximizes comprehension of the phenomena. To this end, this work also presents a system for using the Microsoft Kinect sensor to provide natural, non-contact interaction with multiphase flow data. Furthermore, this system is constructed so that it is trivial to add natural, non-contact interaction to immersive visualization applications. Therefore, multiple visualization applications can be built that are optimized for specific types of data, yet all leverage the same natural interaction. Finally, the research concludes by proposing a system that integrates the improved X-ray measurements with the Kinect interaction system and a cave automatic virtual environment (CAVE) to present scientists with multiphase flow measurements in an intuitive and inherently three-dimensional manner.
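    As a rough illustration of reconstructing a slice from only two projections, the sketch below recovers a 2-D image from its row and column sums (two orthogonal parallel-beam projections) by iterative proportional fitting, a simple algebraic scheme. This is not one of the two algorithms analyzed in the dissertation, but it exhibits the same fundamental ambiguity: many images share the same two projections, which is the source of the increased spatial error noted above.

    ```python
    import numpy as np

    def reconstruct_two_proj(row_sums, col_sums, iters=50):
        """Approximate a 2-D slice from two orthogonal projections (its row
        and column sums).  Start from a uniform image and alternately
        rescale rows and columns to match each projection; the result
        matches both projections but is generally not the true image."""
        n, m = len(row_sums), len(col_sums)
        img = np.full((n, m), row_sums.sum() / (n * m))    # uniform start
        for _ in range(iters):
            img *= (row_sums / img.sum(axis=1))[:, None]   # enforce row sums
            img *= (col_sums / img.sum(axis=0))[None, :]   # enforce column sums
        return img
    ```

    For the true slice [[1, 2], [3, 4]] the scheme returns [[1.2, 1.8], [2.8, 4.2]]: both projections are matched exactly, yet the pixel values differ from the truth, which is precisely the spatial error traded for temporal resolution.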

    Optical flow estimation via steered-L1 norm

    Global variational methods for estimating optical flow are among the best performing methods due to the subpixel accuracy and the ‘fill-in’ effect they provide. The fill-in effect allows optical flow displacements to be estimated even in low-textured and untextured areas of the image; the estimation of such displacements is induced by the smoothness term. The L1 norm provides a robust regularisation term for the optical flow energy function with very good edge-preserving performance. However, this norm suffers from several issues, among them its isotropic nature, which reduces the fill-in effect and ultimately the accuracy of estimation in areas near motion boundaries. In this paper we propose an enhancement of the L1 norm that improves the fill-in effect of the smoothness term. To do this, we analyse the structure tensor matrix and use its eigenvectors to steer the smoothness term into components that are ‘orthogonal to’ and ‘aligned with’ image structures. This is done in a primal-dual formulation. Results show a reduced end-point error and improved accuracy compared to the conventional L1 norm.
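    The steering idea can be sketched as follows: form the per-pixel structure tensor from image gradients and decompose it in closed form. The eigenvector of the larger eigenvalue points across local structure and that of the smaller eigenvalue along it, giving the ‘orthogonal to’ and ‘aligned with’ directions. This is a minimal illustration (without the Gaussian smoothing usually applied to the tensor, and without the primal-dual solver), not the paper's implementation.

    ```python
    import numpy as np

    def structure_tensor_directions(img):
        """Per-pixel structure tensor J = [[Ix^2, IxIy], [IxIy, Iy^2]] and
        its closed-form eigen-decomposition.  lam1's eigenvector (angle
        theta) points across image structure; lam2's points along it."""
        Iy, Ix = np.gradient(img.astype(float))        # axis 0 is y, axis 1 is x
        Jxx, Jxy, Jyy = Ix * Ix, Ix * Iy, Iy * Iy
        trace = Jxx + Jyy
        diff = Jxx - Jyy
        disc = np.sqrt(diff ** 2 + 4 * Jxy ** 2)
        lam1 = 0.5 * (trace + disc)    # larger eigenvalue: across structure
        lam2 = 0.5 * (trace - disc)    # smaller eigenvalue: along structure
        theta = 0.5 * np.arctan2(2 * Jxy, diff)  # orientation of lam1's eigenvector
        return lam1, lam2, theta
    ```

    On a vertical step edge, lam2 is zero everywhere and theta is zero where the gradient is nonzero, i.e. the dominant eigenvector points horizontally, across the edge, as expected.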

    Cognitive Robots for Social Interactions

    One of my goals is to work towards developing cognitive robots, especially with regard to improving the functionalities that facilitate interaction with human beings and the objects around them. Any cognitive system designed to serve human beings must be capable of processing social signals and ultimately enabling efficient prediction and planning of appropriate responses. The main focus of my PhD study is to bridge the gap between the motoric space and the visual space. The discovery of mirror neurons [RC04] shows that the visual perception of human motion (visual space) is directly associated with the motor control of the human body (motor space). This discovery poses a large number of challenges in different fields such as computer vision, robotics and neuroscience. One of the fundamental challenges is understanding the mapping between the 2D visual space and 3D motoric control, and further developing building blocks (primitives) of human motion in the visual space as well as in the motor space. First, I present my study on the visual-motoric mapping of human actions, which aims at mapping human actions in 2D videos to a 3D skeletal representation. Second, I present an automatic algorithm to decompose motion capture (MoCap) sequences into synergies, along with the times at which they are executed (or "activated") for each joint. Third, I propose to use Granger causality as a tool to study coordinated actions performed by at least two agents; recent scientific studies suggest that this "action mirroring circuit" might be tuned to action coordination rather than single-action mirroring. Fourth, I present the extraction of key poses in the visual space, which facilitates further study of the "action mirroring circuit". I conclude the dissertation by describing future directions in cognitive robotics research.
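    The Granger-causality tool mentioned above can be sketched in its generic textbook form: a signal x "Granger-causes" y if adding the past of x to an autoregressive model of y reduces the prediction error. The sketch below compares residual variances of the restricted and full models with ordinary least squares; it is a standard formulation, not the dissertation's implementation.

    ```python
    import numpy as np

    def granger_stat(x, y, lag=2):
        """Log-ratio of residual sums of squares: AR model of y from its own
        past versus a model that also includes lagged x.  Larger positive
        values mean the past of x helps predict y (x 'Granger-causes' y)."""
        n = len(y)
        Y = y[lag:]
        past_y = np.column_stack([y[lag - k - 1:n - k - 1] for k in range(lag)])
        past_x = np.column_stack([x[lag - k - 1:n - k - 1] for k in range(lag)])
        ones = np.ones((n - lag, 1))
        restricted = np.hstack([ones, past_y])          # y's own past only
        full = np.hstack([ones, past_y, past_x])        # plus x's past

        def rss(A):
            coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
            r = Y - A @ coef
            return r @ r

        return np.log(rss(restricted) / rss(full))
    ```

    Applied to two interacting agents' motion signals, an asymmetry between granger_stat(x, y) and granger_stat(y, x) indicates who is driving the coordination.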