458 research outputs found

    Features for matching people in different views

    There have been significant advances in the computer vision field during the last decade. During this period, many methods have been developed that have been successful in solving challenging problems including Face Detection, Object Recognition and 3D Scene Reconstruction. The solutions developed by computer vision researchers have been widely adopted and used in many real-life applications such as those faced in the medical and security industries. Among the different branches of computer vision, Object Recognition has been an area that has advanced rapidly in recent years. The successful introduction of approaches such as feature extraction and description has been an important factor in the growth of this area. In recent years, researchers have attempted to apply these approaches to other problems such as Content Based Image Retrieval and Tracking. In this work, we present a novel system that finds correspondences between people seen in different images. Unlike other approaches that rely on a video stream to track the movement of people between images, here we present a feature-based approach where we locate a target’s new location in an image, based only on its visual appearance. Our proposed system comprises three steps. In the first step, a set of features is extracted from the target’s appearance. A novel algorithm is developed that extracts features from a target that are particularly suitable for the modelling task. In the second step, each feature is characterised using a combined colour and texture descriptor. Including information about both the colour and the texture of a feature adds to the descriptor’s distinctiveness. Finally, the target’s appearance and pose are modelled as a collection of such features and descriptors. This collection is then used as a template that allows us to search for a similar combination of features in other images that corresponds to the target’s new location.
We have demonstrated the effectiveness of our system in locating a target’s new position in an image, despite differences in viewpoint, scale or elapsed time between the images. The characterisation of a target as a collection of features also allows our system to deal robustly with partial occlusion of the target.

    A Review of Codebook Models in Patch-Based Visual Object Recognition

    The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties. Clustering is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook for constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade, covering their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and the datasets that were used in evaluating the proposed methods.
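
The mapping from low-level features to a fixed-length histogram can be sketched roughly as follows. This is a minimal illustration, not the reviewed authors' implementation: the function names, the 2-D toy descriptors, and the distance threshold `radius` used for the one-pass construction are all assumptions made for this example (the abstract does not state the allocation rule).

```python
from math import dist


def one_pass_codebook(descriptors, radius):
    """Build a codebook in a single pass over the descriptors.

    Assumed rule (not taken from the abstract): a descriptor becomes a new
    codeword when it lies farther than `radius` from every existing codeword.
    """
    codebook = []
    for d in descriptors:
        if not codebook or min(dist(d, c) for c in codebook) > radius:
            codebook.append(d)
    return codebook


def encode_histogram(descriptors, codebook):
    """Map a variable-size set of descriptors to a fixed-length histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the nearest codeword (hard assignment).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy 2-D descriptors standing in for real local features.
training = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
codebook = one_pass_codebook(training, radius=1.0)

image_descriptors = [(0.2, -0.1), (4.9, 5.2), (5.0, 4.8), (0.0, 0.1)]
histogram = encode_histogram(image_descriptors, codebook)
```

Whatever the number of descriptors in an image, the histogram length equals the codebook size, which is what allows a standard classifier to be applied directly.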

    Vehicle Classification For Automatic Traffic Density Estimation

    Automatic traffic light control at intersections has recently become one of the most active research areas related to the development of intelligent transportation systems (ITS). Due to the massive growth in urbanization and traffic congestion, an intelligent vision-based traffic light controller is needed to reduce traffic delay and travel time, especially in developing countries, where the current automatic time-based control is not realistic and sensor-based traffic light controllers are not reliable. A vision-based traffic light controller depends mainly on traffic congestion estimation at crossroads, because these are the main road junctions of a city, where most of the road-beds are lost. Most of the previous studies related to this topic do not take unattended vehicles into consideration when estimating the traffic density or traffic flow. In this study we aim to improve the performance of vision-based traffic light control by detecting stationary and unattended vehicles and giving them higher weights, using image processing and pattern recognition techniques for more effective and efficient traffic congestion estimation.
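
The idea of giving stationary vehicles a higher weight in the density estimate could look roughly like the sketch below. The weight value, the dictionary layout of a detection, and the per-area normalisation are assumptions chosen for illustration; the abstract does not specify them.

```python
def weighted_density(vehicles, road_area_m2, stationary_weight=2.0):
    """Return a weighted vehicle count per square metre of monitored road.

    Each vehicle is a dict with a boolean "stationary" flag (hypothetical
    detector output). Moving vehicles count as 1.0; stationary ones count
    as `stationary_weight`, an assumed value.
    """
    weighted_count = sum(
        stationary_weight if v["stationary"] else 1.0 for v in vehicles
    )
    return weighted_count / road_area_m2


detections = [
    {"id": 1, "stationary": False},
    {"id": 2, "stationary": False},
    {"id": 3, "stationary": True},  # e.g. an unattended vehicle
]
density = weighted_density(detections, road_area_m2=100.0)
```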

    Review of Person Re-identification Techniques

    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher accuracy rate and lower computational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared. Comment: Published 201

    Real-time, long-term hand tracking with unsupervised initialization

    This paper proposes a complete tracking system that is capable of long-term, real-time hand tracking with unsupervised initialization and error recovery. Initialization is steered by a three-stage hand detector, combining spatial and temporal information. Hand hypotheses are generated by a random forest detector in the first stage, whereas a simple linear classifier eliminates false positive detections. Resulting detections are tracked by particle filters that gather temporal statistics in order to make a final decision. The detector is scale and rotation invariant, and can detect hands in any pose in unconstrained environments. The resulting discriminative confidence map is combined with a generative particle-filter-based observation model to enable robust, long-term hand tracking in real time. The proposed solution is evaluated using several challenging, publicly available datasets, and is shown to clearly outperform other state-of-the-art object tracking methods.

    Object Tracking: Appearance Modeling And Feature Learning

    Object tracking in real scenes is an important problem in computer vision due to the increasing use of tracking systems in applications such as surveillance, security, monitoring and robotic vision. Object tracking is the process of locating objects of interest in every frame of a video. Many systems have been proposed to address the tracking problem, where the major challenges come from handling appearance variation during tracking caused by changing scale, pose, rotation, illumination and occlusion. In this dissertation, we address these challenges by introducing several novel tracking techniques. First, we developed a multiple object tracking system that deals specifically with occlusion issues. The system depends on our improved KLT tracker for accurate and robust tracking during partial occlusion. In full occlusion, we applied a Kalman filter to predict the object's new location and connect the trajectory parts. Many tracking methods depend on a rectangle or an ellipse mask to segment and track objects. Typically, using a larger or smaller mask will lead to loss of tracked objects. Second, we present an object tracking system (SegTrack) that deals with partial and full occlusions by employing improved segmentation methods: mixture of Gaussians and a silhouette segmentation algorithm. For re-identification, one or more feature vectors for each tracked object are used after the target reappears. Third, we propose a novel Bayesian Hierarchical Appearance Model (BHAM) for robust object tracking. Our idea is to model the appearance of a target as a combination of multiple appearance models, each covering the target appearance changes under a certain situation (e.g. view angle). In addition, we built an object tracking system by integrating BHAM with background subtraction and the KLT tracker for static camera videos. For moving camera videos, we applied BHAM to cluster negative and positive target instances.
As tracking accuracy depends mainly on finding good discriminative features to estimate the target location, finally, we propose to learn good features for generic object tracking using online convolutional neural networks (OCNN). In order to learn discriminative and stable features for tracking, we propose a novel objective function to train OCNN by penalizing the feature variations in consecutive frames, and the tracker is built by integrating OCNN with a color-based multi-appearance model. Our experimental results on real-world videos show that our tracking systems have superior performance when compared with several state-of-the-art trackers. In the future, we plan to apply the Bayesian Hierarchical Appearance Model (BHAM) to multiple object tracking.
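
Using a Kalman filter to predict an object's location through a full occlusion can be illustrated with the sketch below. It is a generic constant-velocity filter for one image axis, not the dissertation's tracker: the class name, the noise variances, the initial covariance, and the toy measurements are all assumptions made for this example.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state (position, velocity) along one image axis."""

    def __init__(self, position, process_var=0.01, measurement_var=1.0):
        self.x = position  # position estimate
        self.v = 0.0       # velocity estimate (unknown at start)
        self.p = [[1.0, 0.0], [0.0, 100.0]]  # state covariance (assumed prior)
        self.q = process_var
        self.r = measurement_var

    def predict(self, dt=1.0):
        """Propagate the state one step; P <- F P F^T + Q with F = [[1, dt], [0, 1]]."""
        p00, p01 = self.p[0]
        p10, p11 = self.p[1]
        self.x += self.v * dt
        self.p = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x

    def update(self, measured_position):
        """Correct the state with a position measurement (H = [1, 0])."""
        p00, p01 = self.p[0]
        p10, p11 = self.p[1]
        innovation = measured_position - self.x
        s = p00 + self.r              # innovation variance
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        self.x += k0 * innovation
        self.v += k1 * innovation
        self.p = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


# Object observed at x = 0, 1, 2, 3, 4 (one pixel per frame), then fully occluded.
track = ConstantVelocityKalman1D(position=0.0)
for z in [1.0, 2.0, 3.0, 4.0]:
    track.predict()
    track.update(z)

variance_before_occlusion = track.p[0][0]
for _ in range(3):  # three occluded frames: predict only, no update
    predicted_x = track.predict()
variance_after_occlusion = track.p[0][0]
```

During the occluded frames only `predict` is called, so the position estimate continues along the learned velocity while its variance grows; when the object reappears, a detection near the predicted position can be used to reconnect the two trajectory parts.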

    Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions

    Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging because it must cope with changing illumination conditions, variabilities in face orientation and in appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method that combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method with three publicly available datasets and we thoroughly benchmark four variants of the proposed algorithm against several state-of-the-art head-pose estimation methods. Comment: 12 pages, 5 figures, 3 table

    Automatic Test Methods for Image and Video Verification

    In this thesis, four methods for automatic verification of images and video on mobile platforms are developed. Both the case of recording images and video and the case of viewing images and video on the mobile LCD screen are considered. The first method is used to test the zoom function of the camera. It uses SURF descriptors along with clustering and histograms to determine which of six discrete zoom levels the current frame belongs to. The second method identifies color effects and color anomalies using histograms. The third method determines whether the autofocus works correctly by measuring the average length of edges in the image. The fourth method is an artifact detection scheme using a non-reference implementation of the SSIM metric, used in conjunction with a test setup designed specifically for this purpose. Together these methods form a tool kit for detecting the most common errors that occur in images and video during the development stage of mobile platforms.