364 research outputs found

    Object Detection and Tracking Based on Visual Saliency

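    Video-based object detection and tracking is an active research topic in computer vision, with wide and important applications in intelligent video surveillance, military reconnaissance and surveillance, traffic management, and autonomous driving. In machine vision, video tracking generally requires the moving object to be labeled manually in the first frame. To address this problem, this thesis studies how a machine can discover salient objects automatically and track them: it detects the object using visual saliency, forms an observation of the moving object with a bag-of-words model, and tracks the object with a particle filter. The main work and contributions are as follows: 1. A moving-object detection algorithm based on the fusion of multi-cue visual saliency is proposed. Center-surround difference saliency is used to detect salient regions with strong local contrast, spectral residual saliency is used to detect salient regions of the image in the spatial domain, and dynamic saliency is used to detect... Degree: Master of Engineering. Department and major: School of Information Science and Technology, Department of Computer Science, Computer Application Technology. Student ID: 2302007115126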
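
As a rough illustration of the center-surround difference cue named in the abstract above, the sketch below scores each pixel by how much a small center window differs from a larger surrounding window. It is a simplified assumption for illustration only, not the thesis's algorithm: the function names, the box-shaped windows, and the radii are all hypothetical, and practical saliency models typically use multi-scale filtering instead.

```python
def box_mean(img, r, c, radius):
    """Mean intensity in a square window clipped to the image bounds."""
    rows, cols = len(img), len(img[0])
    r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
    c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
    vals = [img[i][j] for i in range(r0, r1) for j in range(c0, c1)]
    return sum(vals) / len(vals)


def center_surround_saliency(img, center_radius=0, surround_radius=2):
    """Per-pixel |center mean - surround mean| as a crude local-contrast map.

    The radii are illustrative defaults, not values from the thesis.
    """
    return [
        [
            abs(box_mean(img, r, c, center_radius)
                - box_mean(img, r, c, surround_radius))
            for c in range(len(img[0]))
        ]
        for r in range(len(img))
    ]


def most_salient(img):
    """Return (row, col) of the pixel with the highest saliency score."""
    sal = center_surround_saliency(img)
    return max(
        ((r, c) for r in range(len(img)) for c in range(len(img[0]))),
        key=lambda rc: sal[rc[0]][rc[1]],
    )


# A dark 7x7 image with a single bright pixel at row 3, column 4.
image = [[0.0] * 7 for _ in range(7)]
image[3][4] = 1.0
peak = most_salient(image)
```

On this toy image the bright pixel stands out from its dark neighborhood, so `peak` is the pixel a detector could hand to a tracker as the initial object location in place of a manual label.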

    Cognitive architecture for an Attention-based and Bidirectional Loop-closing Domain (CABILDO)

    This Ph.D. thesis presents a novel attention-based cognitive architecture for social robots. The architecture aims to join perception and reasoning through a double and simultaneous imbrication: the ongoing task biases the perceptual process so that only useful elements are obtained, whereas perceived items determine the behaviours to be accomplished. The proposed architecture therefore represents a bidirectional solution to the problem of closing the perception-reasoning-action loop. The basis of the architecture is an Object-Based Visual Attention model. This perception system draws attention to perceptual units of visual information, called proto-objects. To highlight relevant elements, it considers not only several intrinsic basic features (such as colour, location or shape) but also the constraints provided by the ongoing behaviour and context. The proposed architecture is divided into two levels of performance. The lower level is concerned with quantitative models of execution, namely tasks that are suitable for the current working conditions, whereas a qualitative framework that describes and defines task relationships and coverage is placed at the top level. Perceived items determine the tasks that can be executed at each moment, following a need-based approach. Thus, the tasks that best fit the perceived environment are more likely to be executed. Finally, the cognitive architecture has been tested in a real, unrestricted scenario involving a real robot, time-varying tasks and daily-life situations, in order to demonstrate that the proposal can efficiently address time- and behaviour-varying environments, overcoming the main drawbacks of existing models.

    Robust Methods for Visual Tracking and Model Alignment

    The ubiquitous presence of cameras and camera networks calls for the development of robust visual analytics algorithms. As the building block of many video surveillance tasks, a robust visual tracking algorithm plays an important role in achieving automatic and robust surveillance. In practice, it is critical to know when and where the tracking algorithm fails so that remedial measures can be taken to resume tracking. We propose a novel performance evaluation strategy for tracking systems using a time-reversed Markov chain. We also present a novel bidirectional tracker to achieve better robustness. Instead of looking only forward in the time domain, we incorporate both forward and backward processing of video frames using a time-reversibility constraint. When the objects of interest in surveillance applications have relatively stable structures, a parameterized shape model of the objects can usually be built or learned from sample images, which allows us to perform more accurate tracking. We present a machine learning method to learn a scoring function without local extrema to guide the gradient descent/ascent algorithm and find the optimal parameters of the shape model. These algorithms greatly improve the robustness of video analysis systems in practice.
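
The idea of processing frames both forward and backward in time can be illustrated with a simple forward-backward consistency check: track forward through the sequence, track backward from the end point, and measure how far the backward track lands from the starting position. The sketch below is only an assumed, simplified stand-in for the time-reversed Markov chain evaluation described above; the 1-D "frames", the toy brightest-sample tracker, and all function names are hypothetical.

```python
def make_frame(peak, length=10):
    """Toy 1-D frame: a single bright sample at `peak`, or all dark if None."""
    frame = [0.0] * length
    if peak is not None:
        frame[peak] = 1.0
    return frame


def track_step(signal, prev_pos, search_radius=2):
    """Toy tracker: move to the brightest sample near the previous position."""
    lo = max(0, prev_pos - search_radius)
    hi = min(len(signal), prev_pos + search_radius + 1)
    return max(range(lo, hi), key=lambda i: signal[i])


def track(frames, start_pos):
    """Track through frames[1:], starting from start_pos in frames[0]."""
    path = [start_pos]
    for frame in frames[1:]:
        path.append(track_step(frame, path[-1]))
    return path


def forward_backward_error(frames, start_pos):
    """Distance between the start and where the time-reversed track ends."""
    forward = track(frames, start_pos)
    backward = track(frames[::-1], forward[-1])
    return abs(backward[-1] - start_pos)


# The target moves smoothly, so the backward pass retraces the forward pass.
good_frames = [make_frame(p) for p in (2, 3, 4, 5)]
good_error = forward_backward_error(good_frames, 2)

# The target vanishes after two frames, so the toy tracker drifts away.
lost_frames = [make_frame(p) for p in (6, 7, None, None)]
lost_error = forward_backward_error(lost_frames, 6)
```

A small error suggests the track is consistent under time reversal, while a large one flags a likely failure without needing ground-truth annotations.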

    Reconstruction and motion estimation of sparsely sampled ionospheric data

    COPYRIGHT Attention is drawn to the fact that copyright of this thesis rests with its author. A copy of this thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and they must not copy it or use material from it except as permitted by law or with the consent of the author. This thesis may be made available for consultation within the University Library and may be photocopied or lent to other libraries for the purposes of consultation

    Template-based Monocular 3-D Shape Reconstruction And Tracking Using Laplacian Meshes

    This thesis addresses the problem of recovering the 3-D shape of a deformable object from single images, or from image sequences acquired by a monocular video camera, given that a 3-D template shape and a template image of the object are available. Although this is a very challenging problem in computer vision, being able to reconstruct and track 3-D deformable objects in videos enables many potential applications, ranging from sports and entertainment to engineering and medical imaging. This thesis extends the scope of deformable object modeling to real-world applications of fully 3-D modeling of deformable objects from video streams with a number of contributions. We show that by extending the Laplacian formalism, which was first introduced in the Graphics community to regularize 3-D meshes, we can turn the monocular 3-D shape reconstruction of a deformable object given correspondences with a reference image into a much better-posed problem with far fewer degrees of freedom than the original one. This has proved key to achieving real-time performance while preserving both sufficient flexibility and robustness. Our real-time 3-D reconstruction and tracking system for deformable objects can very quickly reject outlier correspondences and accurately reconstruct the object shape in 3-D. Frame-to-frame tracking is exploited to track the object under difficult settings such as large deformations, occlusions, illumination changes, and motion blur. We present an approach to solving the problem of dense image registration and 3-D shape reconstruction of deformable objects in the presence of occlusions and minimal texture. A main ingredient is the pixel-wise relevancy score that we use to weight the influence of the image information from a pixel in the image energy cost function. A careful design of the framework is essential for obtaining state-of-the-art results in recovering 3-D deformations of both well- and poorly-textured objects in the presence of occlusions.
We study the problem of reconstructing 3-D deformable objects interacting with rigid ones. Imposing real physical constraints allows us to model the interactions of objects in the real world more accurately and more realistically. In particular, we study the problem of a ball colliding with a bat observed by high-speed cameras. We provide quantitative measurements of the impact that are compared with simulation-based methods to evaluate which simulation predictions most accurately describe a physical quantity of interest and to improve the models. Based on the diffuse property of the tracked deformable object, we propose a method to estimate the environment irradiance map represented by a set of low-frequency spherical harmonics. The obtained irradiance map can be used to realistically illuminate 2-D and 3-D virtual content in the context of augmented reality on deformable objects. The results compare favorably with baseline methods. In collaboration with Disney Research, we develop an augmented reality coloring book application that runs in real time on mobile devices. The app lets children view their coloring work by showing animated characters with textures lifted from the colors of their drawing. Deformations of the book page are explicitly modeled by our 3-D tracking and reconstruction method. As a result, accurate color information is extracted to synthesize the character's texture.
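
To give a feel for how a Laplacian can regularize a mesh, the sketch below computes uniform Laplacian coordinates (each vertex minus the mean of its neighbors) and penalizes how much they change between a reference shape and a deformed one. This is an assumed, minimal variant for illustration, not the extended formalism developed in the thesis: the uniform weights, the 4-vertex chain, and the function names are all hypothetical, and this simple energy is invariant to translation only.

```python
def laplacian_coords(vertices, neighbors):
    """Uniform Laplacian: each vertex minus the mean of its neighbors."""
    coords = []
    for i, v in enumerate(vertices):
        nbrs = [vertices[j] for j in neighbors[i]]
        mean = [sum(axis) / len(nbrs) for axis in zip(*nbrs)]
        coords.append([a - b for a, b in zip(v, mean)])
    return coords


def laplacian_energy(reference, deformed, neighbors):
    """Sum of squared changes in Laplacian coordinates between two shapes."""
    ref_l = laplacian_coords(reference, neighbors)
    def_l = laplacian_coords(deformed, neighbors)
    return sum(
        (a - b) ** 2
        for r, d in zip(ref_l, def_l)
        for a, b in zip(r, d)
    )


# A chain of four vertices; neighbors[i] lists the vertices adjacent to i.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

# Moving the whole shape rigidly leaves the Laplacian coordinates unchanged.
translated = [(x + 5.0, y + 2.0, z + 1.0) for x, y, z in reference]
energy_translated = laplacian_energy(reference, translated, neighbors)

# Lifting one interior vertex bends the chain and is penalized.
bent = list(reference)
bent[2] = (2.0, 0.0, 1.0)
energy_bent = laplacian_energy(reference, bent, neighbors)
```

Because the energy ignores global motion and only reacts to local bending, adding it to a data term discourages implausible shapes without fixing the object's position.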

    Visual motion: algorithms for analysis and application

    Thesis (M.S.), Massachusetts Institute of Technology, Dept. of Architecture, 1990. Includes bibliographical references (leaves 71-73). By Michael Adam Sokolov. M.S.