
    Quantization Selection of Colour Histogram Bins to Categorize the Colour Appearance of Landscape Paintings for Image Retrieval

    In the world of today, most images are digitized and kept in digital libraries for better organization and management. With the growth of information and communication technology, collection holders such as museums and cultural institutions have become increasingly interested in making their collections available anytime and anywhere for Image Retrieval (IR) activities such as browsing and searching. In a colour image retrieval application, images are retrieved according to user specifications, which can be based on many different concepts. We suggest an approach to categorize the colour appearance of whole-scene landscape painting images based on human colour perception. The colour features of an image are represented using a colour histogram. We then find the quantization bins that generate optimum colour histograms for all categories of colour appearance, selected as the quantization with the highest F-Score percentage, where the F-Score is the harmonic mean of precision and recall. Colour appearance attributes in the CIELab colour model (L for lightness; a and b for the colour-opponent dimensions) are used to generate colour appearance feature vectors, namely the saturation metric, the lightness metric and the multicoloured metric. For categorization, we use the Nearest Neighbour (NN) method to detect classes using the predefined colour appearance descriptor measures and pre-set thresholds. The experimental results show that quantizing each component of the CIELab colour model into 11 uniform bins achieved the optimum result for all colour appearance categories.
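    As a concrete illustration of the pipeline described above, the following is a minimal sketch (not the authors' code) of an 11-bin-per-channel CIELab histogram with nearest-neighbour categorization. The function names and the Euclidean distance are illustrative assumptions, and OpenCV's 8-bit Lab encoding (all channels scaled to 0-255) is assumed.

```python
import cv2
import numpy as np

def lab_histogram(image_bgr, bins_per_channel=11):
    """Normalized CIELab colour histogram with uniform bins per channel."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)  # 8-bit Lab, 0-255 per channel
    hist = cv2.calcHist([lab], [0, 1, 2], None,
                        [bins_per_channel] * 3,
                        [0, 256, 0, 256, 0, 256]).flatten()
    return hist / hist.sum()  # normalize so images of different sizes compare

def categorize(query_hist, exemplar_hists, exemplar_labels):
    """Nearest-neighbour categorization against labelled exemplar histograms."""
    dists = [np.linalg.norm(query_hist - h) for h in exemplar_hists]
    return exemplar_labels[int(np.argmin(dists))]
```

    In practice the exemplar histograms would come from images whose colour appearance category has already been determined by the descriptor measures and thresholds the abstract mentions.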

    Realtime Color Stereovision Processing

    Recent developments in aviation have made micro air vehicles (MAVs) a reality. These featherweight, palm-sized, radio-controlled flying saucers embody the future of air-to-ground combat. No one has ever successfully implemented an autonomous control system for MAVs. Because MAVs are physically small with limited energy supplies, video signals offer superiority over radar for navigational applications. This research takes a step forward in real-time machine vision processing. It investigates techniques for implementing a real-time stereovision processing system using two miniature color cameras. The effects of poor-quality optics are overcome by a robust algorithm, which operates in real time and achieves frame rates up to 10 fps under ideal conditions. The vision system implements innovative work in the following five areas of vision processing: fast image registration preprocessing, object detection, feature correspondence, distortion-compensated ranging, and multi-scale nominal frequency-based object recognition. Results indicate that the system can provide adequate obstacle-avoidance feedback for autonomous vehicle control. However, typical relative position errors are about 10%, too high for surveillance applications. The range of operation is also limited to between 6 and 30 m. The root of this limitation is imprecise feature correspondence: with perfect feature correspondence, the range would extend to between 0.5 and 30 m. Stereo camera separation limits the near range, while optical resolution limits the far range. Image frame sizes are 160x120 pixels. Increasing this size will improve far-range characteristics but will also decrease the frame rate. Image preprocessing proved less appropriate than precision camera alignment in this application. A proof of concept for object recognition shows promise for applications with more precise object detection. Future recommendations are offered in all five areas of vision processing.
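    The ranging step can be sketched as follows. This is a hedged illustration using OpenCV's block-matching stereo rather than the thesis's own algorithm; the focal length, baseline and file names are placeholder assumptions, chosen only to show how disparity converts to range and why camera separation bounds the near range.

```python
import cv2
import numpy as np

FOCAL_PX = 300.0    # focal length in pixels (placeholder assumption)
BASELINE_M = 0.12   # stereo camera separation in metres (placeholder assumption)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block-matching disparity; OpenCV returns fixed-point values scaled by 16.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Triangulation: range Z = f * B / d, valid only where a disparity was found.
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = FOCAL_PX * BASELINE_M / disparity[valid]
```

    The Z = f * B / d relation also explains the reported limits: larger disparities (near objects) exceed the matcher's search range, while far objects produce sub-pixel disparities the optics cannot resolve.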

    Matching and Predicting Street Level Images

    The paradigm of matching images to a very large dataset has been used for numerous vision tasks and is a powerful one. If the image dataset is large enough, one can expect to find good matches of almost any image to the database, allowing label transfer [3, 15], and image editing or enhancement [6, 11]. Users of this approach will want to know how many images are required, and what features to use for finding semantically relevant matches. Furthermore, for navigation tasks or to exploit context, users will want to know the predictive quality of the dataset: can we predict the image that would be seen under changes in camera position? We address these questions in detail for one category of images: street level views. We have a dataset of images taken from an enumeration of positions and viewpoints within Pittsburgh. We evaluate how well we can match those images, using images from non-Pittsburgh cities, and how well we can predict the images that would be seen under changes in camera position. We compare performance for these tasks for eight different feature sets, finding a feature set that outperforms the others (HOG). A combination of all the features performs better in the prediction task than any individual feature. We used Amazon Mechanical Turk workers to rank the matches and predictions of different algorithm conditions by comparing each one to the selection of a random image. This approach can evaluate the efficacy of different feature sets and parameter settings for the matching paradigm with other image categories.
    Funding: United States Dept. of Defense (ARDA VACE); United States National Geospatial-Intelligence Agency (NEGI-1582-04-0004); United States National Geospatial-Intelligence Agency (MURI Grant N00014-06-1-0734); France, Agence nationale de la recherche (project HFIBMR, ANR-07-BLAN-0331-01); Institut national de recherche en informatique et en automatique (France); Xerox Fellowship Program
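    A minimal sketch of the matching paradigm with the best-performing feature (HOG) is shown below. The descriptor parameters, image size and nearest-neighbour Euclidean distance are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize

def hog_descriptor(path, shape=(128, 128)):
    """Global HOG descriptor of a grayscale, resized image."""
    img = resize(imread(path, as_gray=True), shape)
    return hog(img, orientations=8, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2))

def best_match(query_path, dataset_paths):
    """Return the dataset image whose HOG descriptor is closest to the query's."""
    q = hog_descriptor(query_path)
    feats = np.stack([hog_descriptor(p) for p in dataset_paths])
    return dataset_paths[int(np.argmin(np.linalg.norm(feats - q, axis=1)))]
```

    At the dataset scales the paper considers, the brute-force scan above would be replaced by an approximate nearest-neighbour index, but the matching principle is the same.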

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    We introduce a generic visual descriptor, termed the distribution aware retinal transform (DART), that encodes structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) to overcome the low-sample problem in the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) to achieve tracker robustness, the scale and rotation equivariance of the DART descriptors is exploited for the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to yield a high intersection-over-union score with augmented ground-truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain.
    Comment: 12 pages, revision submitted to TPAMI in Nov 201
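    In the spirit of the log-polar encoding described above, the following is a minimal sketch of binning events around a centre by log-radius and angle. The grid sizes, radius cut-off and normalization are assumptions for illustration, not the published DART parameters.

```python
import numpy as np

def log_polar_descriptor(events_xy, centre, n_rings=8, n_wedges=16, r_max=31.0):
    """Histogram of events over a log-polar grid centred at `centre`."""
    dx = events_xy[:, 0] - centre[0]
    dy = events_xy[:, 1] - centre[1]
    r = np.hypot(dx, dy)
    theta = np.arctan2(dy, dx)                 # angle in [-pi, pi)
    keep = (r > 0) & (r <= r_max)
    # Logarithmic radial bins, uniform angular bins.
    ring = np.floor(n_rings * np.log(r[keep]) / np.log(r_max)).astype(int)
    ring = np.clip(ring, 0, n_rings - 1)
    wedge = ((theta[keep] + np.pi) / (2 * np.pi) * n_wedges).astype(int) % n_wedges
    hist = np.zeros((n_rings, n_wedges))
    np.add.at(hist, (ring, wedge), 1)          # accumulate event counts per cell
    return hist / max(hist.sum(), 1)           # normalized descriptor
```

    The logarithmic spacing is what gives such descriptors their tolerance to scale change: a zoom shifts mass along the ring axis rather than scattering it.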

    Free-hand sketch recognition by multi-kernel feature learning

    Free-hand sketch recognition has become increasingly popular due to the recent expansion of portable touchscreen devices. However, the problem is non-trivial due to the complexity of internal structures, which leads to intra-class variation, coupled with the sparsity of visual cues, which results in inter-class ambiguity. To address the structural complexity, a novel structured representation for sketches is proposed to capture the holistic structure of a sketch. Moreover, to overcome the visual-cue sparsity problem and thereby achieve state-of-the-art recognition performance, we propose a Multiple Kernel Learning (MKL) framework for sketch recognition, fusing several features common to sketches. We evaluate the performance of all the proposed techniques on the most diverse sketch dataset to date (Mathias et al., 2012), and offer detailed and systematic analyses of the performance of different features and representations, including a breakdown by sketch super-category. Finally, we investigate the use of attributes as a high-level feature for sketches and show how this complements low-level features in improving recognition performance under the MKL framework, and consequently explore novel applications such as attribute-based retrieval.
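    The fusion step can be sketched with precomputed kernels, as below. This is a hedged illustration in which the per-feature kernel weights are fixed by hand and the features are random toy stand-ins; a full MKL solver would learn the weights jointly with the classifier, and the paper's actual features and kernels may differ.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def combined_kernel(feats_a, feats_b, weights):
    """Weighted sum of one RBF kernel per feature type (weights assumed fixed)."""
    return sum(w * rbf_kernel(Xa, Xb)
               for w, Xa, Xb in zip(weights, feats_a, feats_b))

# Toy stand-ins: three feature types of different dimensionality per sample.
rng = np.random.default_rng(0)
train_feats = [rng.normal(size=(20, d)) for d in (32, 64, 16)]
test_feats = [rng.normal(size=(5, d)) for d in (32, 64, 16)]
y_train = rng.integers(0, 2, size=20)

weights = [0.5, 0.3, 0.2]  # assumed; a real MKL solver learns these
clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(train_feats, train_feats, weights), y_train)
predictions = clf.predict(combined_kernel(test_feats, train_feats, weights))
```

    Feeding the SVM a weighted Gram matrix is what lets heterogeneous features (of different dimensionalities) contribute to a single decision boundary.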

    Identity verification using computer vision for automatic garage door opening

    We present a novel system for automatic identification of vehicles as part of an intelligent access control system for a garage entrance. Using a camera in the door, cars are detected and matched against a database of authenticated cars. Once a car is detected, License Plate Recognition (LPR) is applied using character detection and recognition. The recognized license plate number is matched against a database of authenticated plates. If the car is allowed access, the door opens automatically. The recognition of both cars and characters (LPR) is performed using state-of-the-art shape descriptors and a linear classifier. Experiments have revealed that 90% of all cars are correctly authenticated from a single image. Analysis of the computational complexity shows that an embedded implementation allows user authentication within approximately 300 ms, which is well within the application constraints.
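    The character-recognition stage can be sketched as follows. HOG stands in here for the paper's shape descriptors and LinearSVC for its linear classifier; the crop size and helper names are illustrative assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def char_features(char_img_32x32):
    """Shape descriptor of one segmented character crop (HOG as a stand-in)."""
    return hog(char_img_32x32, orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_classifier(train_imgs, train_labels):
    """Fit a linear classifier on labelled character crops ('A'-'Z', '0'-'9')."""
    X = np.stack([char_features(im) for im in train_imgs])
    return LinearSVC().fit(X, train_labels)

def read_plate(clf, char_crops):
    """Classify each detected character crop and join them into a plate string."""
    X = np.stack([char_features(c) for c in char_crops])
    return "".join(clf.predict(X))
```

    The resulting plate string would then be matched against the database of authenticated plates before the door is opened.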

    Development of Vision-based Response of Autonomous Vehicles Towards Emergency Vehicles Using Infrastructure Enabled Autonomy

    The effectiveness of law enforcement and public safety is directly dependent on the time taken by first responders to arrive at the scene of an emergency. The primary objective of this thesis is to develop techniques and response actions for an autonomous vehicle in emergency scenarios. This work discusses the methods developed to identify Emergency Vehicles (EVs) and use their localized information to develop response actions for autonomous vehicles in emergency scenarios using an Infrastructure-Enabled Autonomy (IEA) setup. IEA is a new paradigm in autonomous vehicle research that aims at a distributed intelligence architecture by transferring the core functionalities of sensing and localization to a roadside infrastructure setup. In this work, two independent frameworks were developed to identify EVs in a video feed using computer vision techniques: (1) a one-stage framework, where an object detection algorithm is trained on a custom dataset to detect EVs; (2) a two-stage framework, where an object classifier is implemented in series with an object detection pipeline to classify vehicles into EVs and non-EVs. The performance of several popular classification models was compared on combinations of multi-spectral feature vectors of an image to identify the ideal combination for the EV identification rule. Localized position coordinates of an EV are obtained by deploying the classification routine on the IEA. This position information is used as an input to an autonomous vehicle, and an ideal response action is developed.
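    The two-stage framework can be sketched as below. `vehicle_detector` and `ev_classifier` are hypothetical stand-ins for the trained detection and classification models, which the abstract does not specify; the box format and label string are likewise assumptions.

```python
def identify_evs(frame, vehicle_detector, ev_classifier):
    """Return bounding boxes of emergency vehicles in one video frame."""
    ev_boxes = []
    for box in vehicle_detector.detect(frame):       # stage 1: find all vehicles
        x, y, w, h = box
        crop = frame[y:y + h, x:x + w]               # crop the detected vehicle
        if ev_classifier.predict(crop) == "EV":      # stage 2: EV vs non-EV
            ev_boxes.append(box)
    return ev_boxes
```

    Decoupling the stages in this way lets the generic detector be reused while only the lightweight EV/non-EV classifier needs the custom emergency-vehicle training data.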