5,513 research outputs found

    Extended Object Tracking: Introduction, Overview and Applications

    This article provides an elaborate overview of current research in extended object tracking. We provide a clear definition of the extended object tracking problem and discuss its delimitation from other types of object tracking. Next, different aspects of extended object modelling are extensively discussed. Subsequently, we give a tutorial introduction to two basic and widely used extended object tracking approaches: the random matrix approach and the Kalman filter-based approach for star-convex shapes. The next part treats the tracking of multiple extended objects and elaborates how the large number of feasible association hypotheses can be tackled using both Random Finite Set (RFS) and Non-RFS multi-object trackers. The article concludes with a summary of current applications, where four example applications involving camera, X-band radar, light detection and ranging (lidar), and red-green-blue-depth (RGB-D) sensors are highlighted. Comment: 30 pages, 19 figures

    The Visual Centrifuge: Model-Free Layered Video Representations

    True video understanding requires making sense of non-Lambertian scenes where the color of light arriving at the camera sensor encodes information about not just the last object it collided with, but about multiple mediums -- colored windows, dirty mirrors, smoke or rain. Layered video representations have the potential of accurately modelling realistic scenes but have so far required stringent assumptions on motion, lighting and shape. Here we propose a learning-based approach for multi-layered video representation: we introduce novel uncertainty-capturing 3D convolutional architectures and train them to separate blended videos. We show that these models then generalize to single videos, where they exhibit interesting abilities: color constancy, factoring out shadows and separating reflections. We present quantitative and qualitative results on real world videos. Comment: Appears in: 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019). This arXiv contains the CVPR Camera Ready version of the paper (although we have included larger figures) as well as an appendix detailing the model architecture.

    Real-time visual tracking using image processing and filtering methods

    The main goal of this thesis is to develop real-time computer vision algorithms in order to detect and to track targets in uncertain complex environments purely based on a visual sensor. Two major subjects addressed by this work are: 1. The development of fast and robust image segmentation algorithms that are able to search and automatically detect targets in a given image. 2. The development of sound filtering algorithms to reduce the effects of noise in signals from the image processing. The main constraint of this research is that the algorithms should work in real-time with limited computing power on an onboard computer in an aircraft. In particular, we focus on contour tracking which tracks the outline of the target represented by contours in the image plane. This thesis is concerned with three specific categories, namely image segmentation, shape modeling, and signal filtering. We have designed image segmentation algorithms based on geometric active contours implemented via level set methods. Geometric active contours are deformable contours that automatically track the outlines of objects in images. In this approach, the contour in the image plane is represented as the zero-level set of a higher dimensional function. (One example of the higher dimensional function is a three-dimensional surface for a two-dimensional contour.) This approach handles the topological changes (e.g., merging, splitting) of the contour naturally. Although geometric active contours prevail in many fields of computer vision, they suffer from the high computational costs associated with level set methods. Therefore, simplified versions of level set methods such as fast marching methods are often used in problems of real-time visual tracking. This thesis presents the development of a fast and robust segmentation algorithm based on up-to-date extensions of level set methods and geometric active contours, namely a fast implementation of Chan-Vese's (active contour) model (FICVM). 
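The zero-level-set representation described above can be illustrated with a minimal sketch. It is not the thesis's FICVM algorithm: it only assumes a signed distance function for a circle (the helper name `circle_level_set` and the grid size are hypothetical) and shows that merging two contours reduces to a pointwise minimum, with no explicit bookkeeping of contour points.

```python
import numpy as np

def circle_level_set(size, center, radius):
    """Signed distance to a circle on a size x size grid.

    Negative inside the circle, zero on the contour, positive outside.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    return np.hypot(xs - center[0], ys - center[1]) - radius

# One contour, stored implicitly as the zero-level set of phi.
phi = circle_level_set(size=21, center=(10, 10), radius=5.0)
inside = phi < 0  # region enclosed by the contour

# A second, overlapping contour; the union of the two regions is the
# pointwise minimum of the two functions, so the merge needs no
# special handling of the contour itself.
phi_other = circle_level_set(size=21, center=(14, 10), radius=5.0)
merged = np.minimum(phi, phi_other)
```

The contour itself is never stored as a list of points; it is recovered wherever the function changes sign, which is why merging and splitting come for free in this representation.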
The shape prior is a useful cue in the recognition of the true target. For the contour tracker, the outline of the target can be easily disrupted by noise. In geometric active contours, to cope with deviations from the true outline of the target, a higher dimensional function is constructed based on the shape prior, and the contour tracks the outline of an object by considering the difference between the higher dimensional functions obtained from the shape prior and from a measurement in a given image. The higher dimensional function is often a distance map, which is computationally expensive to construct. This thesis focuses on the extraction of shape information from only the zero-level set of the higher dimensional function. This strategy compensates for inaccuracies in the calculation of the shape difference that occur when a simplified higher dimensional function is used. This approach is called contour-based shape modeling. Filtering is an essential element in tracking problems because of the presence of noise in system models and measurements. The well-known Kalman filter provides an exact solution only for problems which have linear models and Gaussian distributions (linear/Gaussian problems). For nonlinear/non-Gaussian problems, particle filters have received much attention in recent years. Particle filtering is useful in the approximation of complicated posterior probability distribution functions. However, the computational burden of particle filtering prevents it from performing at full capacity in real-time applications. This thesis concentrates on improving the processing time of particle filtering for real-time applications. In principle, we follow the particle filter in the geometric active contour framework. This thesis proposes an advanced blob tracking scheme in which a blob contains shape prior information of the target. 
This scheme simplifies the sampling process and quickly suggests the samples which have a high probability of being the target. Only for these samples is the contour tracking algorithm applied to obtain a more detailed state estimate. Curve evolution in the contour tracking is realized by the FICVM. The dissimilarity measure is calculated by the contour-based shape modeling method and the shape prior is updated when it satisfies certain conditions. The new particle filter is applied to the problems of low contrast and severe daylight conditions, to cluttered environments, and to the tracking of appearing/disappearing targets. We have also demonstrated the utility of the filtering algorithm for multiple target tracking in the presence of occlusions. This thesis presents several test results from simulations and flight tests. In these tests, the proposed algorithms demonstrated promising results in a variety of tracking situations. Ph.D. Committee Chair: Eric N. Johnson; Committee Co-Chair: Allen R. Tannenbaum; Committee Member: Anthony J. Calise; Committee Member: Eric Feron; Committee Member: Patricio A. Vel

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
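The event stream described in this abstract (time, location and sign of each brightness change) can be sketched as follows. This is only a frame-based approximation under stated assumptions: a real event camera fires asynchronously per pixel, whereas this sketch compares two frames, and the function name `events_between` and the contrast threshold of 0.2 are hypothetical choices for illustration.

```python
import numpy as np

def events_between(prev_frame, next_frame, t, threshold=0.2, eps=1e-6):
    """Emit (t, x, y, polarity) for each pixel whose log-brightness
    change between the two frames reaches the contrast threshold.

    polarity is +1 for a brightness increase and -1 for a decrease.
    """
    delta = np.log(next_frame + eps) - np.log(prev_frame + eps)
    ys, xs = np.nonzero(np.abs(delta) >= threshold)
    return [
        (t, int(x), int(y), 1 if delta[y, x] > 0 else -1)
        for y, x in zip(ys, xs)
    ]

# Two tiny frames: one pixel gets brighter, one gets darker.
prev = np.full((2, 3), 0.5)
nxt = prev.copy()
nxt[0, 1] = 1.0   # brightness doubles at x=1, y=0
nxt[1, 2] = 0.25  # brightness halves at x=2, y=1

events = events_between(prev, nxt, t=0.001)
```

Pixels whose brightness does not change produce no output at all, which is the source of the sparse, low-redundancy data stream that distinguishes these sensors from frame cameras.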

    Trying to break new ground in aerial archaeology

    Aerial reconnaissance continues to be a vital tool for landscape-oriented archaeological research. Although a variety of remote sensing platforms operate within the earth’s atmosphere, the majority of aerial archaeological information is still derived from oblique photographs collected during observer-directed reconnaissance flights, a prospection approach which has dominated archaeological aerial survey for the past century. The resulting highly biased imagery is generally catalogued in sub-optimal (spatial) databases, if at all, after which a small selection of images is orthorectified and interpreted. For decades, this has been the standard approach. Although many innovations, including digital cameras, inertial units, photogrammetry and computer vision algorithms, geographic(al) information systems and computing power have emerged, their potential has not yet been fully exploited in order to re-invent and highly optimise this crucial branch of landscape archaeology. The authors argue that a fundamental change is needed to transform the way aerial archaeologists approach data acquisition and image processing. By addressing the very core concepts of geographically biased aerial archaeological photographs and proposing new imaging technologies, data handling methods and processing procedures, this paper gives a personal opinion on how the methodological components of aerial archaeology, and specifically aerial archaeological photography, should evolve during the next decade if developing a more reliable record of our past is to be our central aim. In this paper, a possible practical solution is illustrated by outlining a turnkey aerial prospection system for total coverage survey together with a semi-automated back-end pipeline that takes care of photograph correction and image enhancement as well as the management and interpretative mapping of the resulting data products. 
In this way, the proposed system addresses one of many bias issues in archaeological research: the bias we impart to the visual record as a result of selective coverage. While the total coverage approach outlined here may not altogether eliminate survey bias, it can vastly increase the amount of useful information captured during a single reconnaissance flight while mitigating the discriminating effects of observer-based, on-the-fly target selection. Furthermore, the information contained in this paper should make it clear that with current technology it is feasible to do so. This can radically alter the basis for aerial prospection and move landscape archaeology forward, beyond the inherently biased patterns that are currently created by airborne archaeological prospection.

    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives. Comment: 29 pages, 16 figures