15,441 research outputs found

    Object detection for big data

    Get PDF
    "May 2014."Dissertation supervisor: Dr. Tony X. Han.Includes vita.We have observed significant advances in object detection over the past few decades and gladly seen the related research has began to contribute to the world: Vehicles could automatically stop before hitting any pedestrian; Face detectors have been integrated into smart phones and tablets; Video surveillance systems could locate the suspects and stop crimes. All these applications demonstrate the substantial research progress on object detection. However learning a robust object detector is still quite challenging due to the fact that object detection is a very unbalanced big data problem. In this dissertation, we aim at improving the object detector's performance from different aspects. For object detection, the state-of-the-art performance is achieved through supervised learning. The performances of object detectors of this kind are mainly determined by two factors: features and underlying classification algorithms. We have done thorough research on both of these factors. Our contribution involves model adaption, local learning, contextual boosting, template learning and feature development. Since the object detection is an unbalanced problem, in which positive examples are hard to be collected, we propose to adapt a general object detector for a specific scenario with a few positive examples; To handle the large intra-class variation problem lying in object detection task, we propose a local adaptation method to learn a set of efficient and effective detectors for a single object category; To extract the effective context from the huge amount of negative data in object detection, we introduce a novel contextual descriptor to iteratively improve the detector; To detect object with a depth sensor, we design an effective depth descriptor; To distinguish the object categories with the similar appearance, we propose a local feature embedding and template selection algorithm, which has been successfully incorporated into a real-world fine-grained object recognition application. All the proposed algorithms and featuIncludes bibliographical references (pages 117-130)

    Online Domain Adaptation for Multi-Object Tracking

    Full text link
    Automatically detecting, labeling, and tracking objects in videos depends first and foremost on accurate category-level object detectors. These might, however, not always be available in practice, as acquiring high-quality large-scale labeled training datasets is either too costly or impractical for all possible real-world application scenarios. A scalable solution consists in re-using object detectors pre-trained on generic datasets. This work is the first to investigate the problem of on-line domain adaptation of object detectors for causal multi-object tracking (MOT). We propose to alleviate the dataset bias by adapting detectors from category to instances, and back: (i) we jointly learn all target models by adapting them from the pre-trained one, and (ii) we also adapt the pre-trained model on-line. We introduce an on-line multi-task learning algorithm to efficiently share parameters and reduce drift, while gradually improving recall. Our approach is applicable to any linear object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive "off-the-shelf" ConvNet features. We quantitatively measure the benefit of our domain adaptation strategy on the KITTI tracking benchmark and on a new dataset (PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.
    Comment: To appear at BMVC 2015.
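    The paper's precise update rules are not given in the abstract; the following is a minimal sketch, under our own assumptions, of the shared-plus-residual parameterization commonly used for this kind of category-to-instance adaptation: each tracked target i scores detections with w0 + v_i, where w0 is the category detector (initialized from the pre-trained one and still updated on-line) and v_i is a small per-target residual whose shrinkage limits drift. All names are hypothetical.

```python
import numpy as np

class OnlineMultiTaskDetectors:
    """Per-target linear detectors parameterized as w_i = w0 + v_i.

    w0 is the shared category model (initialized from a pre-trained
    detector and still updated on-line); v_i is a small per-target
    residual. Shrinking v_i toward zero ties every instance model to
    the category model, which is one simple way to limit drift.
    """

    def __init__(self, w_pretrained, lr=0.05, tie=0.01):
        self.w0 = np.asarray(w_pretrained, dtype=float).copy()
        self.v = {}                                # target_id -> residual weights
        self.lr, self.tie = lr, tie

    def score(self, target_id, x):
        v = self.v.setdefault(target_id, np.zeros_like(self.w0))
        return float((self.w0 + v) @ x)

    def update(self, target_id, x, y):
        """One on-line step on feature vector x with label y in {-1, +1}."""
        v = self.v.setdefault(target_id, np.zeros_like(self.w0))
        if y * ((self.w0 + v) @ x) < 1.0:          # hinge violation
            self.w0 += self.lr * y * x             # adapt the shared model
            v += self.lr * y * x                   # adapt this target's residual
        v *= 1.0 - self.lr * self.tie              # shrink residual toward zero

# Toy usage with a random 64-D "pre-trained" detector.
rng = np.random.default_rng(1)
det = OnlineMultiTaskDetectors(rng.normal(size=64))
x = rng.normal(size=64)
det.update("target-3", x, y=+1)
print(det.score("target-3", x))
```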

    Activity-driven content adaptation for effective video summarisation

    Get PDF
    In this paper, we present a novel method for content adaptation and video summarization fully implemented in the compressed domain. Firstly, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified via fuzzy decision into five categories, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion), and others, by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and the attained frame categories, activity levels for each frame are determined to adapt to the video content. Continuous frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summarization for efficient streaming. Under this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have demonstrated the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic, as domain-specific tasks such as accurate recognition of objects can be avoided.
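    The abstract's quota-driven selection step (evenly sampling the accumulated activity levels) is concrete enough to sketch. The code below is our illustration of that sampling idea, not the paper's implementation; the per-frame activity values are toy numbers.

```python
import numpy as np

def summarize_by_activity(activity, quota):
    """Pick up to `quota` frame indices by evenly sampling the cumulative
    activity curve, so high-activity spans contribute more frames."""
    cum = np.cumsum(np.asarray(activity, dtype=float))
    # Evenly spaced targets along the accumulated-activity axis
    # (offset by half a step so each pick sits mid-segment).
    step = cum[-1] / quota
    targets = np.arange(quota) * step + step / 2.0
    # Map each target back to the first frame whose cumulative activity
    # reaches it; np.unique collapses duplicate picks over flat spans.
    return np.unique(np.searchsorted(cum, targets)).tolist()

# Toy per-frame activity levels: the summary favours frames 2-3 and 6.
levels = [0.1, 0.1, 2.0, 2.5, 0.2, 0.1, 1.8, 0.1]
print(summarize_by_activity(levels, quota=4))      # -> [2, 3, 6]
```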

    Domain adaptation for pedestrian detection

    Get PDF
    Object detection is an essential component of many computer vision systems. The increase in the amount of collected digital data and new applications of computer vision have generated a demand for object detectors for many different types of scenes digitally captured in diverse settings. The appearance of objects captured across these different scenarios can vary significantly, causing readily available state-of-the-art object detectors to perform poorly in many of the scenes. One solution is to collect and annotate labelled data for each new scene and train a scene-specific object detector that is specialised to perform well for that scene, but such a method is labour-intensive and impractical. In this thesis, we propose three novel contributions for learning scene-specific pedestrian detectors with minimal human supervision effort. In the first and second contributions, we formulate the problem as unsupervised domain adaptation, in which a readily available generic pedestrian detector is automatically adapted to specific scenes (without any labelled data from those scenes). In the third contribution, we formulate it as a weakly supervised learning algorithm requiring annotations of only pedestrian centres. The first contribution is a detector adaptation algorithm using joint dataset feature learning. We use state-of-the-art deep learning for the purpose of detector adaptation by exploiting the assumption that the data lies on a low-dimensional manifold. The algorithm significantly outperforms a state-of-the-art approach that makes use of a similar manifold assumption. The second contribution presents an efficient detector adaptation algorithm that makes effective use of cues (e.g. spatio-temporal constraints) available in video. We show that, for videos, such cues can dramatically help with detector adaptation. We extensively compare our approach with state-of-the-art algorithms and show that our algorithm outperforms the competing approaches despite being simpler to implement and apply. In the third contribution, we approach the task of reducing manual annotation effort by formulating the problem as a weakly supervised learning algorithm that requires annotation of only the approximate centres of pedestrians (instead of the usual precise bounding boxes). Instead of assuming the availability of a generic detector and adapting it to new scenes as in the first two contributions, we collect manual annotations for new scenes but make the annotation task easier and faster. Our algorithm reduces manual annotation effort by approximately four times while maintaining detection performance similar to that of standard training methods. We evaluate each of the proposed algorithms on two challenging, publicly available video datasets.
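    The thesis's weakly supervised algorithm is not spelled out in the abstract, but its input format suggests a simple first step one might take: expanding approximate pedestrian-centre annotations into pseudo bounding boxes using a canonical pedestrian shape, which standard detector training can then consume. The fixed height, the 0.41 width-to-height aspect ratio (a common pedestrian prior, used e.g. in the Caltech pedestrian benchmark), and all names below are our assumptions.

```python
def boxes_from_centres(centres, box_h=96, aspect=0.41, img_w=1920, img_h=1080):
    """Expand approximate pedestrian-centre clicks into pseudo bounding
    boxes with a fixed canonical height and width/height aspect ratio,
    clipped to the image bounds (illustrative, not the thesis's method)."""
    box_w = int(round(box_h * aspect))
    boxes = []
    for cx, cy in centres:
        x0 = max(0, int(cx - box_w / 2))
        y0 = max(0, int(cy - box_h / 2))
        boxes.append((x0, y0, min(img_w, x0 + box_w), min(img_h, y0 + box_h)))
    return boxes

# Toy usage: two clicked centres on a 1920x1080 frame.
print(boxes_from_centres([(640, 360), (1200, 500)]))
```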