Enhanced target recognition employing spatial correlation filters and affine scale invariant feature transform
A spatial domain optimal trade-off Maximum Average Correlation Height (SPOT-MACH) filter has been shown to have advantages over frequency domain implementations of the Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter, as it can be made locally adaptive to spatial variations in the input image background clutter and normalized for local intensity changes. This makes the spatial domain implementation resistant to illumination changes. The Affine Scale Invariant Feature Transform (ASIFT) is an extension of previous feature transform algorithms; its features are invariant to six affine parameters: translation (two parameters), zoom, rotation and two camera-axis orientations. As a result it accurately matches a larger number of key points, which can then be used for matching between different images of the object being tested. In this paper a novel approach is adopted for enhancing the performance of the spatial correlation (SPOT-MACH) filter by using ASIFT in a pre-processing stage, enabling fully invariant object detection and recognition in images with geometric distortions. An optimization criterion is also developed to overcome the temporal overhead of the spatial domain approach. To evaluate the effectiveness of the algorithm, experiments were conducted on two different data sets. Several test cases were created based on illumination, rotational and scale changes in the target object. The performance of the correlation algorithms was also tested against composite images as references; this was found to produce a well-trained filter with better detection ability even when the target object has undergone large rotational changes.
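The key property claimed for the spatial-domain filter, local normalization for intensity changes, can be sketched with a plain normalized cross-correlation in numpy. This is a minimal illustration, not the authors' SPOT-MACH implementation; the image sizes and values are invented:

```python
import numpy as np

def normalized_correlation(image, template):
    """Slide `template` over `image`, normalizing each local window for
    mean and energy, so the response is invariant to local illumination
    changes (the advantage claimed for spatial-domain correlation)."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            win = image[y:y + th, x:x + tw]
            w = win - win.mean()
            denom = np.linalg.norm(w) * t_norm
            out[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return out

# A brightened, contrast-stretched copy of the template should still
# give a correlation peak near 1.0 at its true location.
rng = np.random.default_rng(0)
template = rng.random((8, 8))
scene = rng.random((32, 32))
scene[10:18, 5:13] = 2.0 * template + 0.5   # scaled + offset intensity
resp = normalized_correlation(scene, template)
peak = np.unravel_index(resp.argmax(), resp.shape)
print(peak)  # (10, 5)
```

Because each window is mean-subtracted and energy-normalized independently, the affine intensity change (gain 2.0, offset 0.5) does not move or weaken the peak.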
Car Detecting Method using high Resolution images
A car detection method is implemented using high resolution images, because these images give a higher level of object detail than satellite images. Two feature extraction algorithms are used: SIFT (Scale Invariant Feature Transform) and HOG (Histogram of Oriented Gradients). SIFT keypoints of objects are first extracted from a set of reference images and stored in a database. HOG descriptors are feature descriptors used in image processing for object detection; the HOG technique counts occurrences of gradient orientations in localized portions of an image. The extracted HOG features are then used for classification and object recognition. Classification is performed with an SVM (Support Vector Machine) classifier: the SVM builds a model from the training set presented to it and assigns test samples based on that model. Finally, the SIFT and HOG results are compared to determine which achieves better accuracy. The proposed method detects the number of cars more accurately.
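The HOG step described above, counting gradient orientations per local cell, can be sketched in a few lines of numpy. This is an illustrative simplification (real HOG implementations add overlapping block normalization, and the cell/bin sizes here are assumptions, not the paper's settings):

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Minimal HOG sketch: a magnitude-weighted histogram of unsigned
    gradient orientations for each non-overlapping cell, concatenated
    and L2-normalized into one feature vector."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    feats = []
    for cy in range(0, h - cell + 1, cell):
        for cx in range(0, w - cell + 1, cell):
            a = ang[cy:cy + cell, cx:cx + cell].ravel()
            m = mag[cy:cy + cell, cx:cx + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    v = np.concatenate(feats).astype(float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

img = np.tile(np.arange(16.0), (16, 1))   # pure horizontal intensity ramp
f = hog_features(img)
print(f.shape)  # (36,): 2x2 cells x 9 orientation bins
```

A vector like this would then be fed to the SVM classifier; for the ramp image above, all the gradient energy falls into the 0-degree bin of every cell, as expected for a horizontal gradient.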
Extraction of Exclusive Video Content from One Shot Video
With the popularity of personal digital devices, the amount of home video data is growing explosively. Many videos contain only a single shot, are very short, and have diverse contents related to a few major subjects or events. Users often need to maintain their own video clip collections captured at different locations and times. These unedited and unorganized videos are difficult to manage and manipulate. This video composition system generates aesthetically enhanced long-shot videos from short video clips. The proposed system extracts the video content about a specific topic and composes it into a virtual one-shot presentation. All input short video clips are pre-processed and converted into a one-shot video. Video frames are detected and categorized using transition clues such as human and object. Human and object frames are separated by applying a face detection algorithm to the input one-shot video; the Viola-Jones face detection algorithm is used for this purpose. Three ingredients in this algorithm work in concert to enable fast and accurate detection: the integral image for feature computation, AdaBoost for feature selection, and an attentional cascade for efficient allocation of computational resources. Objects are then categorized using the SIFT (Scale Invariant Feature Transform) and SURF (Speeded Up Robust Features) algorithms.
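The first of the three Viola-Jones ingredients, the integral image, is simple enough to sketch directly. With a summed-area table, the sum of any rectangular region (and hence any Haar-like feature) costs only four lookups, regardless of the rectangle's size:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a leading row/column of zeros, so that
    ii[y, x] = sum of img[:y, :x]. This is the structure Viola-Jones
    uses for constant-time Haar-like feature evaluation."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in O(1): four table lookups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))  # 5 + 6 + 9 + 10 = 30
```

A Haar-like feature is then just the difference of two or more such box sums, which is why the detector can evaluate thousands of candidate features per window in real time.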
Analysis of a biologically-inspired system for real-time object recognition
We present a biologically-inspired system for real-time, feed-forward object recognition in cluttered scenes. Our system utilizes a vocabulary of very sparse features that are shared between and within different object models. To detect objects in a novel scene, these features are located in the image, and each detected feature votes for all objects that are consistent with its presence. Due to the sharing of features between object models our approach is more scalable to large object databases than traditional methods. To demonstrate the utility of this approach, we train our system to recognize any of 50 objects in everyday cluttered scenes with substantial occlusion. Without further optimization we also demonstrate near-perfect recognition on a standard 3-D recognition problem. Our system has an interpretation as a sparsely connected feed-forward neural network, making it a viable model for fast, feed-forward object recognition in the primate visual system.
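The shared-feature voting scheme described above can be sketched as follows. The feature table and object names are invented for illustration; in the actual system the table would come from training:

```python
from collections import Counter

# Hypothetical shared-feature vocabulary: each sparse feature id maps to
# every object model consistent with its presence. Sharing is what makes
# the approach scale: one feature serves several models at once.
feature_to_objects = {
    "f1": ["mug", "bowl"],
    "f2": ["mug"],
    "f3": ["bowl", "plate"],
    "f4": ["mug", "plate"],
}

def vote(detected_features):
    """Each feature detected in the scene votes for all object models
    that share it; the highest-scoring model is the best hypothesis."""
    votes = Counter()
    for f in detected_features:
        for obj in feature_to_objects.get(f, []):
            votes[obj] += 1
    return votes

votes = vote(["f1", "f2", "f4"])
print(votes.most_common(1))  # [('mug', 3)]
```

Adding a new object model only extends the lists in the table rather than adding a full independent detector, which is the scalability argument the abstract makes.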
Robust Object-Based Watermarking Using SURF Feature Matching and DFT Domain
In this paper we propose a robust object-based watermarking method in which the watermark is embedded into the middle frequency band of the Discrete Fourier Transform (DFT) magnitude of the selected object region, together with the Speeded Up Robust Features (SURF) algorithm to allow correct watermark detection even if the watermarked image has been distorted. To recognize the selected object region after geometric distortions, the SURF features are estimated during the embedding process and stored in advance for use during detection. In the detection stage, the SURF features of the distorted image are estimated and matched with the stored ones. From the matching result, the SURF features are used to compute the affine transformation parameters, and the object region is recovered. The quality of the watermarked image is measured using the Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM) and Visual Information Fidelity (VIF). The experimental results show that the proposed method provides robustness against several geometric distortions, signal processing operations and combined distortions. The receiver operating characteristic (ROC) curves also show the desirable detection performance of the proposed method. A comparison with previously reported methods based on different techniques is also provided.
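The embedding side of such a scheme, modifying mid-frequency DFT magnitudes while keeping the phase, can be sketched with numpy's FFT. The coefficient positions and embedding strength below are assumptions for illustration, not the paper's exact scheme:

```python
import numpy as np

def embed_watermark(region, bits, strength=40.0):
    """Illustrative mid-band DFT-magnitude embedding: each 1-bit bumps
    the magnitude of one mid-frequency coefficient and of its
    conjugate-symmetric partner (so the inverse FFT stays real),
    leaving the phase untouched."""
    F = np.fft.fft2(region.astype(float))
    h, w = region.shape
    mag, phase = np.abs(F), np.angle(F)
    # hypothetical mid-band positions, one per watermark bit
    coords = [(h // 4 + i, w // 4) for i in range(len(bits))]
    for bit, (u, v) in zip(bits, coords):
        if bit:
            mag[u, v] += strength
            mag[-u % h, -v % w] += strength   # symmetric partner
    return np.real(np.fft.ifft2(mag * np.exp(1j * phase)))

def psnr(a, b, peak=255.0):
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(1)
region = rng.integers(0, 256, (64, 64)).astype(float)
wm = embed_watermark(region, [1, 0, 1, 1])
print(psnr(region, wm) > 40.0)  # True: the embedding is imperceptible
```

Because the change is spread across the whole region by the inverse FFT, a few mid-band magnitude bumps cost very little PSNR, which is why DFT-magnitude methods score well on the fidelity metrics the abstract lists.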
Oriented Response Networks
Deep Convolution Neural Networks (DCNNs) are capable of learning
unprecedentedly effective image representations. However, their ability in
handling significant local and global image rotations remains limited. In this
paper, we propose Active Rotating Filters (ARFs) that actively rotate during
convolution and produce feature maps with location and orientation explicitly
encoded. An ARF acts as a virtual filter bank containing the filter itself and
its multiple unmaterialised rotated versions. During back-propagation, an ARF
is collectively updated using errors from all its rotated versions. DCNNs using
ARFs, referred to as Oriented Response Networks (ORNs), can produce
within-class rotation-invariant deep features while maintaining inter-class
discrimination for classification tasks. The oriented response produced by ORNs
can also be used for image and object orientation estimation tasks. Over
multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we
consistently observe that replacing regular filters with the proposed ARFs
leads to significant reduction in the number of network parameters and
improvement in classification performance. We report the best results on
several commonly used benchmarks. Comment: Accepted in CVPR 2017. Source
code available at http://yzhou.work/OR
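The core idea of an Active Rotating Filter, one set of weights materialized at several orientations, each producing its own response map, can be sketched in numpy. Rotation here is restricted to 90-degree steps via np.rot90 for simplicity; the paper uses finer rotations and learns the weights:

```python
import numpy as np

def oriented_responses(image, filt, n_orient=4):
    """Sketch of an oriented response: the same filter weights are
    rotated into n_orient versions, and each version is convolved
    (valid correlation) with the image, so orientation is explicitly
    encoded in the output channels."""
    fh, fw = filt.shape
    H, W = image.shape[0] - fh + 1, image.shape[1] - fw + 1
    maps = np.zeros((n_orient, H, W))
    for k in range(n_orient):
        fk = np.rot90(filt, k)            # rotated copy of the same weights
        for y in range(H):
            for x in range(W):
                maps[k, y, x] = (image[y:y + fh, x:x + fw] * fk).sum()
    return maps

# A vertical-edge filter; its rotated versions respond to other edge
# orientations without any extra parameters.
filt = np.array([[-1.0, 0.0, 1.0]] * 3)
image = np.zeros((8, 8))
image[:, 4:] = 1.0                        # vertical step edge
maps = oriented_responses(image, filt)
strongest = maps.reshape(4, -1).max(axis=1).argmax()
print(strongest)  # 0: the unrotated (vertical-edge) orientation wins
```

The parameter saving the abstract reports comes from exactly this sharing: the four orientation channels above reuse one 3x3 weight set instead of storing four independent filters.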
Deformable Convolutional Networks
Convolutional neural networks (CNNs) are inherently limited in modeling
geometric transformations due to the fixed geometric structures in their
building modules. In this work, we introduce two new modules to enhance the
transformation modeling capacity of CNNs, namely, deformable convolution and
deformable RoI pooling. Both are based on the idea of augmenting the spatial
sampling locations in the modules with additional offsets and learning the
offsets from target tasks, without additional supervision. The new modules can
readily replace their plain counterparts in existing CNNs and can be easily
trained end-to-end by standard back-propagation, giving rise to deformable
convolutional networks. Extensive experiments validate the effectiveness of our
approach on sophisticated vision tasks of object detection and semantic
segmentation. The code will be released.
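The augmented-sampling idea behind deformable convolution can be sketched for a single output location: each tap of a 3x3 kernel samples at its regular grid position plus a 2D offset, with bilinear interpolation handling the fractional coordinates. In the paper the offsets are learned from the task; here they are simply given:

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinear interpolation of img at a fractional (y, x) location."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, img.shape[0] - 1), min(x0 + 1, img.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (img[y0, x0] * (1 - dy) * (1 - dx) + img[y0, x1] * (1 - dy) * dx
            + img[y1, x0] * dy * (1 - dx) + img[y1, x1] * dy * dx)

def deformable_conv_point(img, weights, cy, cx, offsets):
    """One output value of a 3x3 deformable convolution centered at
    (cy, cx): tap k reads from its grid position plus offsets[k]."""
    out, k = 0.0, 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            oy, ox = offsets[k]
            out += weights[i + 1, j + 1] * bilinear(img, cy + i + oy, cx + j + ox)
            k += 1
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
w = np.full((3, 3), 1.0 / 9.0)            # averaging kernel
zero = [(0.0, 0.0)] * 9                   # plain convolution
shift = [(0.0, 0.5)] * 9                  # every tap half a pixel right
print(deformable_conv_point(img, w, 2, 2, zero))   # 12.0
print(deformable_conv_point(img, w, 2, 2, shift))  # 12.5
```

With all offsets at zero the module reduces exactly to a regular convolution, which is why the abstract notes these modules can replace their plain counterparts in existing CNNs without changing anything else.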