
    Enhanced target recognition employing spatial correlation filters and affine scale invariant feature transform

    A spatial domain optimal trade-off Maximum Average Correlation Height (SPOT-MACH) filter has been shown to have advantages over frequency domain implementations of the Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter, as it can be made locally adaptive to spatial variations in the input image background clutter and normalized for local intensity changes. This makes the spatial domain implementation resistant to illumination changes. The Affine Scale Invariant Feature Transform (ASIFT) is an extension of previous feature transform algorithms; its features are invariant to six affine parameters: translation (2 parameters), zoom, rotation and two camera axis orientations. As a result it accurately matches larger numbers of key points, which can then be used for matching between different images of the object being tested. In this paper a novel approach is adopted for enhancing the performance of the spatial correlation filter (SPOT-MACH filter) using ASIFT in a pre-processing stage, enabling fully invariant object detection and recognition in images with geometric distortions. An optimization criterion is also developed to overcome the temporal overhead of the spatial domain approach. To evaluate the effectiveness of the algorithm, experiments were conducted on two different data sets. Several test cases were created based on illumination, rotational and scale changes in the target object. The performance of the correlation algorithms was also tested against composite images as references, and it was found that this results in a well-trained filter with better detection ability even when the target object has undergone large rotational changes.
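    The illumination resistance attributed to the spatial-domain filter comes from normalizing each local window for mean and energy. A minimal sketch of that idea (plain normalized cross-correlation, not the full SPOT-MACH derivation; all names below are illustrative, not from the paper):

    ```python
    import numpy as np

    def local_normalized_correlation(image, template):
        """Slide a template over an image, normalizing each local window for
        mean and energy so the response is insensitive to affine intensity
        changes -- the property the abstract attributes to the spatial-domain
        implementation."""
        th, tw = template.shape
        t = template - template.mean()
        t_norm = np.linalg.norm(t)
        out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                win = image[i:i + th, j:j + tw]
                w = win - win.mean()              # local intensity normalization
                denom = np.linalg.norm(w) * t_norm
                if denom > 0:
                    out[i, j] = float((w * t).sum() / denom)
        return out

    # A dimmed, brightness-shifted copy of the template still peaks at ~1.
    rng = np.random.default_rng(0)
    template = rng.random((8, 8))
    scene = rng.random((32, 32))
    scene[10:18, 5:13] = 0.5 * template + 0.3     # illumination-changed target
    response = local_normalized_correlation(scene, template)
    peak = np.unravel_index(np.argmax(response), response.shape)
    ```

    The nested loop also illustrates why the abstract needs an optimization criterion: the spatial-domain formulation pays a per-window cost that frequency-domain correlation avoids.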

    Car Detecting Method using high Resolution images

    A car detection method is implemented using high resolution images, since these provide a higher level of object detail than satellite images. Two feature extraction algorithms are used in the implementation: SIFT (Scale Invariant Feature Transform) and HOG (Histogram of Oriented Gradients). SIFT keypoints of objects are first extracted from a set of reference images and stored in a database. HOG descriptors are feature descriptors used in image processing for object detection; the HOG technique counts occurrences of gradient orientation in localized portions of an image. The HOG algorithm is used to extract HOG features, which are then used for classification and object recognition. The classification process is performed using an SVM (Support Vector Machine) classifier. The SVM builds a model from the training set presented to it and assigns test samples based on that model. Finally, the SIFT results and the HOG results are compared to determine which gives better accuracy. The proposed method detects the number of cars more accurately.
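    The core HOG operation the abstract describes -- counting gradient orientations in a localized portion of the image -- can be sketched for a single cell as follows (a toy single-cell histogram, not the paper's full block-normalized descriptor or SVM pipeline):

    ```python
    import numpy as np

    def hog_cell_histogram(patch, n_bins=9):
        """Magnitude-weighted histogram of gradient orientations for one
        localized cell, the building block of the HOG descriptor."""
        gy, gx = np.gradient(patch.astype(float))
        mag = np.hypot(gx, gy)
        ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
        bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
        hist = np.zeros(n_bins)
        for b in range(n_bins):
            hist[b] = mag[bins == b].sum()             # each pixel votes its magnitude
        return hist

    # A vertical step edge produces horizontal gradients, so all the
    # histogram mass lands in the 0-degree bin.
    patch = np.zeros((8, 8))
    patch[:, 4:] = 1.0
    hist = hog_cell_histogram(patch)
    ```

    A full HOG descriptor tiles the detection window into such cells, normalizes them over overlapping blocks, and concatenates the histograms into the feature vector fed to the SVM.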

    Extraction of Exclusive Video Content from One Shot Video

    With the popularity of personal digital devices, the amount of home video data is growing explosively. Many videos contain only a single shot, are very short, and have diverse contents related to a few major subjects or events. Users often need to maintain their own video clip collections captured at different locations and times. These unedited and unorganized videos make management and manipulation difficult. This video composition system is used to generate aesthetically enhanced long-shot videos from short video clips. The proposed system extracts the video contents about a specific topic and composes them into a virtual one-shot presentation. All input short video clips are pre-processed and converted into a one-shot video. Video frames are detected and categorized using transition clues such as humans and objects. Human and object frames are separated by applying a face detection algorithm to the input one-shot video. The Viola-Jones face detection algorithm is used for separating human and object frames. Three ingredients in this algorithm work in concert to enable fast and accurate detection: the integral image for feature computation, AdaBoost for feature selection, and an attentional cascade for efficient allocation of computational resources. Objects are then categorized using the SIFT (Scale Invariant Feature Transform) and SURF (Speeded-Up Robust Features) algorithms.
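    The first Viola-Jones ingredient, the integral image, is small enough to sketch directly: after one cumulative-sum pass, the sum of any rectangle takes at most four lookups, which is what makes Haar-like feature evaluation fast (a minimal sketch; function names are illustrative):

    ```python
    import numpy as np

    def integral_image(img):
        """Summed-area table: ii[r, c] holds the sum of img[:r+1, :c+1]."""
        return img.cumsum(axis=0).cumsum(axis=1)

    def rect_sum(ii, r0, c0, r1, c1):
        """Sum of img[r0:r1, c0:c1] in O(1) from the integral image."""
        total = ii[r1 - 1, c1 - 1]
        if r0 > 0:
            total -= ii[r0 - 1, c1 - 1]
        if c0 > 0:
            total -= ii[r1 - 1, c0 - 1]
        if r0 > 0 and c0 > 0:
            total += ii[r0 - 1, c0 - 1]
        return total

    img = np.arange(25, dtype=float).reshape(5, 5)
    ii = integral_image(img)
    area = rect_sum(ii, 1, 1, 4, 4)   # same as img[1:4, 1:4].sum()
    ```

    A Haar-like feature is then just a signed combination of a few such rectangle sums, and the attentional cascade rejects most windows after evaluating only a handful of them.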

    Analysis of a biologically-inspired system for real-time object recognition

    We present a biologically-inspired system for real-time, feed-forward object recognition in cluttered scenes. Our system utilizes a vocabulary of very sparse features that are shared between and within different object models. To detect objects in a novel scene, these features are located in the image, and each detected feature votes for all objects that are consistent with its presence. Due to the sharing of features between object models, our approach is more scalable to large object databases than traditional methods. To demonstrate the utility of this approach, we train our system to recognize any of 50 objects in everyday cluttered scenes with substantial occlusion. Without further optimization we also demonstrate near-perfect recognition on a standard 3-D recognition problem. Our system has an interpretation as a sparsely connected feed-forward neural network, making it a viable model for fast, feed-forward object recognition in the primate visual system.
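    The voting scheme described above -- each detected feature votes for every object model consistent with its presence -- can be sketched with a shared feature-to-object map (the mapping below is a toy example, not the paper's learned vocabulary):

    ```python
    from collections import Counter

    def vote_for_objects(detected_features, feature_to_objects):
        """Accumulate votes: each feature found in the scene votes for all
        object models that share it."""
        votes = Counter()
        for f in detected_features:
            for obj in feature_to_objects.get(f, ()):
                votes[obj] += 1
        return votes

    # Sharing features across models keeps the vocabulary sparse: three
    # features here cover three object models.
    feature_to_objects = {
        "corner_a": ["mug", "stapler"],
        "edge_b":   ["mug"],
        "blob_c":   ["stapler", "phone"],
    }
    votes = vote_for_objects(["corner_a", "edge_b"], feature_to_objects)
    best = votes.most_common(1)[0][0]
    ```

    Because a feature is matched once and then votes for many models, adding a new object mostly reuses existing vocabulary entries, which is the source of the scalability claim.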

    Robust Object-Based Watermarking Using SURF Feature Matching and DFT Domain

    In this paper we propose a robust object-based watermarking method, in which the watermark is embedded into the middle frequency band of the Discrete Fourier Transform (DFT) magnitude of the selected object region, together with the Speeded Up Robust Features (SURF) algorithm to allow correct watermark detection even if the watermarked image has been distorted. To recognize the selected object region after geometric distortions, the SURF features are estimated during the embedding process and stored in advance for use during the detection process. In the detection stage, the SURF features of the distorted image are estimated and matched with the stored ones. From the matching result, the SURF features are used to compute the affine-transformation parameters, and the object region is recovered. The quality of the watermarked image is measured using the Peak Signal to Noise Ratio (PSNR), the Structural Similarity Index (SSIM) and the Visual Information Fidelity (VIF). The experimental results show that the proposed method provides robustness against several geometric distortions, signal processing operations and combined distortions. The receiver operating characteristic (ROC) curves also show the desirable detection performance of the proposed method. A comparison with previously reported methods based on different techniques is also provided.
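    The recovery step -- computing affine-transformation parameters from matched keypoint pairs -- reduces to a least-squares fit once correspondences are known. A minimal sketch (plain least squares over synthetic matches; the paper's actual matching and any robust-estimation details are not reproduced here):

    ```python
    import numpy as np

    def estimate_affine(src, dst):
        """Least-squares affine fit from matched point pairs.
        src, dst: (N, 2) arrays of corresponding points, N >= 3.
        Returns the 3x2 matrix M such that [x, y, 1] @ M ~= (x', y')."""
        n = src.shape[0]
        A = np.hstack([src, np.ones((n, 1))])     # homogeneous coordinates
        M, *_ = np.linalg.lstsq(A, dst, rcond=None)
        return M

    # Recover a known rotation + translation from four exact matches.
    theta = np.pi / 6
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
    dst = src @ R.T + np.array([2.0, -1.0])       # distorted positions
    M = estimate_affine(src, dst)
    recovered = np.hstack([src, np.ones((4, 1))]) @ M
    ```

    In practice SURF matches contain outliers, so a robust wrapper (e.g. RANSAC over such fits) is typically used before inverting the transform to recover the object region.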

    Oriented Response Networks

    Deep Convolutional Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability to handle significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation tasks. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to significant reduction in the number of network parameters and improvement in classification performance. We report the best results on several commonly used benchmarks.
    Comment: Accepted in CVPR 2017. Source code available at http://yzhou.work/OR
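    The "virtual filter bank" idea can be sketched at a single location: the response is taken over rotated copies of one filter, yielding both a strength and the orientation that fired. This toy version uses only 90-degree rotations via np.rot90; the paper's ARFs use finer, interpolated rotations and learn the filter end-to-end:

    ```python
    import numpy as np

    def oriented_response(patch, filt):
        """Response of one filter and its rotated copies at one location.
        Returns (strength, orientation in degrees) of the best-firing copy."""
        responses = [float((patch * np.rot90(filt, k)).sum()) for k in range(4)]
        k_best = int(np.argmax(responses))
        return responses[k_best], k_best * 90

    # An edge filter applied to a rotated edge fires in the rotated copy,
    # so the orientation is recovered along with the response strength.
    filt = np.array([[1., -1.],
                     [1., -1.]])          # sensitive to a vertical contrast edge
    patch = np.array([[0., 0.],
                      [1., 1.]])          # the same edge pattern, rotated
    strength, angle = oriented_response(patch, filt)
    ```

    Encoding the winning orientation in the feature map is what lets later layers stay discriminative while the per-class response becomes rotation-invariant; sharing one set of weights across all rotated copies is also why ARFs reduce the parameter count.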

    Deformable Convolutional Networks

    Convolutional neural networks (CNNs) are inherently limited in modeling geometric transformations due to the fixed geometric structures in their building modules. In this work, we introduce two new modules to enhance the transformation modeling capacity of CNNs, namely, deformable convolution and deformable RoI pooling. Both are based on the idea of augmenting the spatial sampling locations in the modules with additional offsets and learning the offsets from target tasks, without additional supervision. The new modules can readily replace their plain counterparts in existing CNNs and can be easily trained end-to-end by standard back-propagation, giving rise to deformable convolutional networks. Extensive experiments validate the effectiveness of our approach on sophisticated vision tasks of object detection and semantic segmentation. The code will be released.
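    The mechanism behind both modules is sampling the feature map at the regular grid positions plus fractional learned offsets, which requires bilinear interpolation. A minimal sketch of one deformable tap set at one output location (toy fixed offsets and weights; in the paper the offsets come from a separate learned conv branch):

    ```python
    import numpy as np

    def bilinear_sample(img, y, x):
        """Value of img at a fractional location via bilinear interpolation."""
        y0, x0 = int(np.floor(y)), int(np.floor(x))
        y1, x1 = min(y0 + 1, img.shape[0] - 1), min(x0 + 1, img.shape[1] - 1)
        dy, dx = y - y0, x - x0
        return ((1 - dy) * (1 - dx) * img[y0, x0] + (1 - dy) * dx * img[y0, x1]
                + dy * (1 - dx) * img[y1, x0] + dy * dx * img[y1, x1])

    def deformable_point(img, weights, grid, offsets):
        """One output value: each tap samples at its regular grid position
        shifted by its own (dy, dx) offset, then the weighted sum is taken."""
        return sum(w * bilinear_sample(img, gy + oy, gx + ox)
                   for w, (gy, gx), (oy, ox) in zip(weights, grid, offsets))

    img = np.arange(16, dtype=float).reshape(4, 4)
    grid = [(1, 1), (1, 2), (2, 1), (2, 2)]        # 2x2 grid of taps
    weights = [0.25] * 4
    zero = [(0.0, 0.0)] * 4                        # plain convolution tap set
    shifted = [(0.5, 0.0)] * 4                     # every tap samples half a row down
    plain = deformable_point(img, weights, grid, zero)
    deformed = deformable_point(img, weights, grid, shifted)
    ```

    Because bilinear interpolation is differentiable in the offsets, gradients flow back into the offset-predicting branch, which is why no extra supervision is needed.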