
    Partially obscured human detection based on component detectors using multiple feature descriptors

    This paper presents a human detection system based on component detectors using multiple feature descriptors. The contribution addresses two issues in dealing with the problem of partially obscured humans. First, it presents an extension of the feature descriptors using multi-scale Histograms of Oriented Gradients (HOG) and a parallelogram-based Haar-like feature (PHF) to improve the accuracy of the system. The multi-scale HOG provides an extensive feature space from which highly discriminative features can be obtained, while the PHF adapts to human limb shapes and is fast to compute. Second, a boosting-based learning system is used for training and detecting partially obscured humans. The advantage of boosting is that it constructs a strong classifier by combining a set of weak classifiers; however, the performance of boosting depends on the kernel of the weak classifier. Therefore, a hybrid algorithm based on AdaBoost and SVM, using the proposed feature descriptors, is one solution for robust human detection.
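
    As a rough illustration of the hybrid this abstract describes, the sketch below concatenates HOG descriptors computed at two cell scales and boosts linear-SVM weak learners with AdaBoost. It assumes scikit-image and scikit-learn (>= 1.2 for the `estimator` argument); the parallelogram-based Haar-like features and the component-detector structure are not reproduced here.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

def multiscale_hog(image, cell_sizes=((8, 8), (16, 16))):
    """Concatenate HOG descriptors computed at several cell scales."""
    feats = [hog(image, orientations=9, pixels_per_cell=c,
                 cells_per_block=(2, 2)) for c in cell_sizes]
    return np.concatenate(feats)

# X: rows of multi-scale HOG vectors for training windows,
# y: 1 = human component visible, 0 = background.
# AdaBoost over linear-SVM weak learners -- the AdaBoost/SVM hybrid
# in its simplest form; PHF responses would be appended to X.
clf = AdaBoostClassifier(
    estimator=SVC(kernel="linear", C=1.0),  # the weak-learner "kernel"
    algorithm="SAMME",   # needs only hard predictions from each SVM
    n_estimators=20,
)
# clf.fit(X, y); clf.predict(multiscale_hog(window)[None, :])
```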

    Assessment of algorithms for mitosis detection in breast cancer histopathology images

    The proliferative activity of breast tumors, which is routinely estimated by counting mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues. In this paper, the results from the Assessment of Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The challenge was based on a data set consisting of 12 training and 11 testing subjects, with more than one thousand mitotic figures annotated by multiple observers. Short descriptions and results from the evaluation of eleven methods are presented. The top performing method has an error rate that is comparable to the inter-observer agreement among pathologists.
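
    For context on how detection challenges of this kind are typically scored, the sketch below greedily matches predicted mitosis centroids to ground-truth annotations within a distance threshold and reports precision, recall, and F1. The threshold value and the greedy strategy are illustrative assumptions, not the AMIDA13 protocol verbatim.

```python
import numpy as np

def score_detections(pred, gt, max_dist=30.0):
    """Greedy matching of predicted centroids to ground truth.

    A prediction within `max_dist` pixels of an unmatched ground-truth
    figure counts as a true positive (the threshold here is an assumed
    value; consult the challenge protocol for the exact criterion).
    """
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    matched = np.zeros(len(gt), dtype=bool)
    tp = 0
    for p in pred:
        if len(gt) == 0:
            break
        d = np.linalg.norm(gt - p, axis=1)
        d[matched] = np.inf          # each ground truth matched once
        j = int(np.argmin(d))
        if d[j] <= max_dist:
            matched[j] = True
            tp += 1
    fp, fn = len(pred) - tp, len(gt) - tp
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return precision, recall, f1
```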

    Analysis of infrared polarisation signatures for vehicle detection

    Thermal radiation emitted from objects within a scene tends to be partially polarised in a direction parallel to the surface normal, to an extent governed by properties of the surface material. This thesis investigates whether vehicle detection algorithms can be improved by the additional measurement of polarisation state as well as intensity in the long wave infrared. Knowledge about the polarimetric properties of scenes guides the development of histogram-based and cluster-based descriptors, which are used in a traditional classification framework. The best performing histogram-based method, the Polarimetric Histogram, which forms a descriptor from the polarimetric vehicle signature, is shown to outperform the standard Histogram of Oriented Gradients descriptor, which uses intensity imagery alone. These descriptors then lead to a novel clustering algorithm which, at a false positive rate of 10⁻², is shown to improve upon the Polarimetric Histogram descriptor, increasing the true positive rate from 0.19 to 0.63. In addition, a multi-modal detection framework which combines thermal intensity hotspot and polarimetric hotspot detections with a local motion detector is presented. Through the combination of these detectors, the false positive rate is shown to be reduced when compared to the results of the individual detectors in isolation.
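
    A minimal sketch of a polarimetric descriptor in this spirit: from Stokes-parameter images S0, S1, S2, the degree of linear polarisation (DoLP) weights a histogram over the angle of polarisation (AoP). The binning and normalisation choices below are assumptions for illustration, not the thesis's Polarimetric Histogram implementation.

```python
import numpy as np

def polarimetric_histogram(s0, s1, s2, bins=16):
    """DoLP-weighted histogram of the angle of polarisation.

    s0, s1, s2: Stokes-parameter images from an LWIR polarimeter.
    """
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-9)
    aop = 0.5 * np.arctan2(s2, s1)        # angle in [-pi/2, pi/2]
    hist, _ = np.histogram(aop, bins=bins,
                           range=(-np.pi / 2, np.pi / 2),
                           weights=dolp)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist    # L2-normalised descriptor
```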

    Exploratory search through large video corpora

    Activity retrieval is a growing field in electrical engineering that specializes in the search and retrieval of relevant activities and events in video corpora. With the affordability and popularity of cameras for government, personal and retail use, the quantity of available video data is rapidly outscaling our ability to reason over it. To empower users to navigate and interact with the contents of these video corpora, we propose a framework for exploratory search that emphasizes activity structure and search space reduction over complex feature representations. Exploratory search is a user-driven process wherein a person provides a system with a query describing the activity, event, or object of interest. Typically, this description takes the implicit form of one or more exemplar videos, but it can also involve an explicit description. The system returns candidate matches, followed by query refinement and iteration. System performance is judged by the run-time of the system and the precision/recall curve of the query matches returned. Scaling is one of the primary challenges in video search. From vast web-video archives like YouTube (1 billion videos and counting) to the 30 million active surveillance cameras shooting an estimated 4 billion hours of footage every week in the United States, trying to find a set of matches can be like looking for a needle in a haystack. Our goal is to create an efficient archival representation of video corpora that can be calculated in real time as video streams in, and that then enables a user to quickly get a set of matching results.

    First, we design a system for rapidly identifying simple queries in large-scale video corpora. Instead of focusing on feature design, our system focuses on the spatiotemporal relationships between those features as a means of disambiguating an activity of interest from background. We define a semantic feature vocabulary of concepts that are both readily extracted from video and easily understood by an operator. As data streams in, features are hashed to an inverted index and retrieved in constant time after the system is presented with a user's query. We take a zero-shot approach to exploratory search: the user manually assembles vocabulary elements like color, speed, size and type into a graph. Given that information, we perform an initial downsampling of the archived data, and design a novel dynamic programming approach based on genome sequencing to search for similar patterns. Experimental results indicate that this approach outperforms other methods for detecting activities in surveillance video datasets.

    Second, we address the problem of representing complex activities that take place over long spans of space and time. Subgraph and graph matching methods have seen limited use in exploratory search because both problems are provably NP-hard. In this work, we render these problems computationally tractable by identifying the maximally discriminative spanning tree (MDST), and using dynamic programming to optimally reduce the archive data based on a custom algorithm for tree matching in attributed relational graphs. We demonstrate the efficacy of this approach on popular surveillance video datasets in several modalities.

    Finally, we design an approach for successive search space reduction in subgraph matching problems. Given a query graph and archival data, our algorithm iteratively selects spanning trees from the query graph that optimize the expected search space reduction at each step until the archive converges. We use this approach to efficiently reason over video surveillance datasets, simulated data, as well as large graphs of protein data.
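
    The constant-time retrieval idea from the first contribution can be sketched as an inverted index mapping vocabulary tokens to frame identifiers. The token names below are hypothetical placeholders, and the downstream spatiotemporal graph matching is not shown.

```python
from collections import defaultdict

class VideoInvertedIndex:
    """Inverted index: semantic token -> set of (video, frame) ids.

    Tokens come from an operator-readable vocabulary, e.g.
    "color:red", "size:large", "speed:fast" -- illustrative names,
    not the thesis's actual vocabulary.
    """
    def __init__(self):
        self.postings = defaultdict(set)

    def ingest(self, video_id, frame_id, tokens):
        # O(1) amortized per token as video streams in.
        for t in tokens:
            self.postings[t].add((video_id, frame_id))

    def query(self, tokens):
        # Candidate frames containing every queried concept; graph
        # matching would then rank these against the query structure.
        sets = [self.postings[t] for t in tokens]
        return set.intersection(*sets) if sets else set()
```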

    Detection and Recognition of Traffic Signs Inside the Attentional Visual Field of Drivers

    Traffic sign detection and recognition systems are essential components of Advanced Driver Assistance Systems and self-driving vehicles. In this contribution we present a vision-based framework which detects and recognizes traffic signs inside the attentional visual field of drivers. This technique takes advantage of the driver's 3D absolute gaze point obtained through the combined use of a front-view stereo imaging system and a non-contact 3D gaze tracker. We used a linear Support Vector Machine as a classifier and Histograms of Oriented Gradients as features for detection. Recognition is performed by using Scale Invariant Feature Transforms and color information. Our technique detects and recognizes signs which are in the field of view of the driver, and also provides an indication when one or more signs have been missed by the driver.
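
    One plausible way to realise the "inside the attentional visual field" test is to compare the angle between the tracked 3D gaze direction and the direction to a detected sign against a cone half-angle. The sketch below assumes such a cone model with an illustrative threshold; the paper defines its own field model, which is not reproduced here.

```python
import numpy as np

def inside_attentional_field(gaze_dir, sign_dir, half_angle_deg=10.0):
    """Was a detected sign inside the driver's attentional field?

    gaze_dir: unit 3D gaze direction from the gaze tracker.
    sign_dir: unit 3D direction from the driver to the sign,
              triangulated with the front-view stereo rig.
    half_angle_deg: assumed cone half-angle, for illustration only.
    """
    cosang = np.clip(np.dot(gaze_dir, sign_dir), -1.0, 1.0)
    return np.degrees(np.arccos(cosang)) <= half_angle_deg

# Signs found by the HOG + linear-SVM detection stage that never fall
# inside this cone over the approach sequence would be reported as
# "missed by the driver".
```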

    Cloud-Induced Uncertainty for Visual Navigation

    This research addresses the numerical distortion of features due to the presence of clouds in an image. The research aims to quantify the probability of a mismatch between two features in a single image, which describes the likelihood that a visual navigation system incorrectly tracks a feature throughout an image sequence, leading to position miscalculations. First, an algorithm is developed for calculating the transparency of clouds in images at the pixel level. The algorithm determines transparency based on the distance between each pixel color and the average pixel color of the clouds. The algorithm is used to create a dataset of cloudy aerial images. Matching features are then detected between the original and cloudy images, which allows a direct comparison between features with and without clouds. The transparency values are used to segment the detected features into three categories, based on whether the features are located in cloud-free regions, along cloud edges, or within clouds. The error between features on the cloudy and cloud-free images is determined and used as a basis for generating a synthetic dataset with statistically similar properties. Lastly, Monte Carlo techniques are used to find the probability of a mismatch.
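
    A minimal sketch of the pixel-level transparency idea, assuming a linear mapping from color distance to transparency and a known cloud mask; both the mapping and the mask are illustrative assumptions rather than the thesis's algorithm.

```python
import numpy as np

def cloud_transparency(image, cloud_mask):
    """Per-pixel cloud transparency from color distance.

    image: H x W x 3 float array in [0, 1].
    cloud_mask: boolean H x W array of known cloud pixels,
                used to estimate the average cloud color.
    """
    mean_cloud = image[cloud_mask].mean(axis=0)
    dist = np.linalg.norm(image - mean_cloud, axis=2)
    # Pixels close to the cloud color -> opaque cloud (value near 0);
    # pixels far from it -> transparent / cloud-free (value near 1).
    return np.clip(dist / max(dist.max(), 1e-9), 0.0, 1.0)
```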

    Active object recognition for 2D and 3D applications

    Active object recognition provides a mechanism for selecting informative viewpoints to complete recognition tasks as quickly and accurately as possible. One can manipulate the position of the camera or the object of interest to obtain more useful information. This approach can improve the computational efficiency of the recognition task by only processing viewpoints selected based on the amount of relevant information they contain. Active object recognition methods are based around how to select the next best viewpoint and how to integrate the extracted information. Most active recognition methods do not use local interest points, which have been shown to work well in other recognition tasks, and are tested on images containing a single object with no occlusions or clutter. In this thesis we investigate using local interest points (SIFT) in probabilistic and non-probabilistic settings for active single and multiple object and viewpoint/pose recognition. The test images used contain objects that are occluded and occur in significant clutter. Visually similar objects are also included in our dataset.

    Initially we introduce a non-probabilistic 3D active object recognition system which consists of a mechanism for selecting the next best viewpoint and an integration strategy to provide feedback to the system. A novel approach to weighting the uniqueness of extracted features is presented, using a vocabulary tree data structure. This process is then used to determine the next best viewpoint by selecting the one with the highest number of unique features. A Bayesian framework uses the modified statistics from the vocabulary structure to update the system's confidence in the identity of the object. New test images are only captured when the belief hypothesis is below a predefined threshold. This vocabulary tree method is tested against randomly selecting the next viewpoint and against a state-of-the-art active object recognition method by Kootstra et al. Our approach outperforms both methods by correctly recognizing more objects with less computational expense.

    This vocabulary tree method is extended for use in a probabilistic setting to improve the object recognition accuracy. We introduce Bayesian approaches for object recognition and for object and pose recognition. Three likelihood models are introduced which incorporate various parameters and levels of complexity. The occlusion model, which includes geometric information and variables that cater for the background distribution and occlusion, correctly recognizes all objects in our challenging database. This probabilistic approach is further extended for recognizing multiple objects and poses in a test image. We show through experiments that this model can recognize multiple objects which occur in close proximity to distractor objects. Our viewpoint selection strategy is also extended to the multiple object application and performs well when compared to randomly selecting the next viewpoint, the activation model and mutual information.

    We also study the impact of using active vision for shape recognition. Fourier descriptors are used as input to our shape recognition system, with mutual information as the active vision component. We build multinomial and Gaussian distributions using this information, which correctly recognize a sequence of objects. We demonstrate the effectiveness of active vision in object recognition systems. We show that even in different recognition applications using different low-level inputs, incorporating active vision improves the overall accuracy and decreases the computational expense of object recognition systems.
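
    The core loop of such a system can be sketched as a Bayesian belief revision over candidate object identities, with new views captured only while the belief stays below a threshold. The likelihood values below are placeholders; the thesis's vocabulary-tree weighting and occlusion model are not reproduced here.

```python
import numpy as np

def update_belief(prior, likelihoods):
    """One Bayesian update of the object-identity belief.

    prior: current belief over the K candidate objects (sums to 1).
    likelihoods: p(observed features | object k) for the new view,
    e.g. derived from vocabulary-tree-weighted SIFT matches
    (a placeholder likelihood model, for illustration only).
    """
    posterior = prior * likelihoods
    s = posterior.sum()
    return posterior / s if s > 0 else prior

belief = np.full(5, 0.2)      # uniform prior over 5 candidate objects
belief = update_belief(belief, np.array([0.9, 0.05, 0.3, 0.1, 0.2]))
# while belief.max() < threshold: pick the next viewpoint (e.g. the
# one expected to yield the most unique features) and update again.
```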

    Methods for efficient object categorization, detection, scene recognition, and image search

    In the past few years there has been a tremendous growth in the usage of digital images. Users can now access millions of photos, a fact that creates the need for methods that can efficiently and effectively search for the visual information of interest. In this thesis, we propose methods to learn image representations that compactly represent a large collection of images, enabling accurate image recognition with linear classification models, which offer the advantage of being efficient to both train and test. The entries of our descriptors are the outputs of a set of basis classifiers evaluated on the image, which capture the presence or absence of a set of high-level visual concepts. We propose two different techniques to automatically discover the visual concepts and learn the basis classifiers from a given labeled dataset of pictures, producing descriptors that are highly discriminative for the original categories of the dataset. We empirically show that these descriptors are able to encode new unseen pictures, and produce state-of-the-art results in conjunction with cheap linear classifiers. We describe several strategies to aggregate the outputs of basis classifiers evaluated on multiple subwindows of the image in order to handle cases when the photo contains multiple objects and large amounts of clutter. We extend this framework to the task of object detection, where the goal is to spatially localize an object within an image. We use the output of a collection of detectors trained in an offline stage as features for new detection problems, showing competitive results with the current state of the art. Since generating rich manual annotations for an image dataset is a crucial limitation of modern methods in object localization and detection, in this thesis we also propose a method to automatically generate training data for an object detector in a weakly-supervised fashion, yielding considerable savings in human annotation efforts. We show that our automatically-generated regions can be used to train object detectors with recognition results remarkably close to those obtained by training on manually annotated bounding boxes.
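
    A minimal sketch of the descriptor construction this abstract describes, assuming an already-fitted bank of scikit-learn basis classifiers; the concept-discovery step and the subwindow aggregation strategies are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def basis_descriptor(x, basis_classifiers):
    """Image descriptor = scores of a bank of basis classifiers.

    Each entry encodes the presence/absence of one discovered visual
    concept. `basis_classifiers` is any list of fitted sklearn
    classifiers exposing decision_function -- an assumption made
    here for illustration.
    """
    return np.array([c.decision_function(x[None, :])[0]
                     for c in basis_classifiers])

# D: basis descriptors for a new labeled image set, y: its labels.
# final = LogisticRegression(max_iter=1000).fit(D, y)
# A linear model on top keeps both training and testing cheap,
# which is the efficiency argument the abstract makes.
```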