251 research outputs found

    SPARCNN: SPAtially Related Convolutional Neural Networks

    Full text link
    The ability to accurately detect and classify objects at varying pixel sizes in cluttered scenes is crucial to many Navy applications. However, the detection performance of existing state-of-the-art approaches such as convolutional neural networks (CNNs) degrades when applied to such cluttered, multi-object detection tasks. We conjecture that spatial relationships between objects in an image can be exploited to significantly improve detection accuracy, an approach that, to the best of our knowledge, no existing technique had considered at the time this research was conducted. We introduce a detection and classification technique called Spatially Related Detection with Convolutional Neural Networks (SPARCNN) that learns and exploits a probabilistic representation of inter-object spatial configurations, estimated from training images, to produce more effective region proposals for state-of-the-art CNNs. Our empirical evaluation of SPARCNN on the VOC 2007 dataset shows that it increases classification accuracy by 8% compared to a region proposal technique that does not exploit spatial relations. More importantly, we obtained a larger performance boost of 18.8% when test-set difficulty is increased by including heavily obscured objects and additional image clutter. Comment: 6 pages, AIPR 2016 submission
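    The core SPARCNN idea, re-scoring each region proposal by how well its spatial configuration with neighboring proposals matches statistics learned from training data, can be sketched as follows. This is an illustrative toy, not the paper's implementation: the probability table, class names, and boost formula are all made up for the example.

```python
# Hypothetical sketch of spatially informed proposal re-scoring.
# spatial_prob approximates P(relation | class_a, class_b), which the
# paper would estimate from a training set; values here are invented.
spatial_prob = {
    ("person", "bicycle", "above"): 0.7,
    ("person", "bicycle", "left_of"): 0.1,
}

def relation(box_a, box_b):
    """Coarse position of box_a relative to box_b.
    Boxes are (x1, y1, x2, y2) in image coordinates (y grows downward)."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    dx, dy = ax - bx, ay - by
    if abs(dy) >= abs(dx):
        return "above" if dy < 0 else "below"
    return "left_of" if dx < 0 else "right_of"

def rescore(proposals):
    """Boost each (class, box, cnn_score) proposal by how strongly its
    spatial relations to the other proposals match the learned table."""
    out = []
    for i, (cls_i, box_i, score_i) in enumerate(proposals):
        boost = 1.0
        for j, (cls_j, box_j, _) in enumerate(proposals):
            if i != j:
                boost += spatial_prob.get(
                    (cls_i, cls_j, relation(box_i, box_j)), 0.0)
        out.append((cls_i, box_i, score_i * boost))
    return out
```

    A person proposal sitting above a bicycle proposal would have its score multiplied by 1.7 under this toy table, while configurations unseen in training are left unchanged.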

    FINE-GRAINED OBJECT DETECTION

    Get PDF
    Object detection plays a vital role in many real-world computer vision applications such as self-driving cars, unstaffed stores, and general-purpose robotic systems. Convolutional Neural Network (CNN) based deep learning has evolved to become the backbone of most computer vision algorithms, including object detection. Most research has focused on detecting objects that differ significantly, e.g. a car, a person, and a bird. A natural next step beyond general object detection is fine-grained object detection: distinguishing different types within a single class of objects. Fine-grained object detection is crucial to tasks such as automated retail checkout. This research developed deep learning models to detect 200 types of birds of similar size and shape. The models were trained and tested on the CUB-200-2011 dataset. To the best of our knowledge, by attaining a mean Average Precision (mAP) of 71.5%, we achieved an improvement of 5 percentage points over the previous best mAP of 66.2%.
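    The mAP figures quoted above come from averaging per-class Average Precision. A minimal sketch of the computation, assuming detections have already been matched to ground truth and sorted by descending confidence (this uses the raw area under the precision-recall curve; evaluation toolkits often apply interpolation on top of this):

```python
# Illustrative AP/mAP computation on pre-matched detections.
def average_precision(matches, num_gt):
    """matches: list of bools (True = correct detection), sorted by
    descending confidence. num_gt: number of ground-truth objects.
    Returns area under the precision-recall curve."""
    tp = fp = 0
    ap = prev_recall = 0.0
    for is_correct in matches:
        tp += is_correct
        fp += not is_correct
        recall = tp / num_gt
        precision = tp / (tp + fp)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap

def mean_average_precision(per_class):
    """per_class: list of (matches, num_gt) pairs, one per class.
    mAP is the unweighted mean of the per-class APs."""
    aps = [average_precision(m, n) for m, n in per_class]
    return sum(aps) / len(aps)
```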

    RadioGalaxyNET: Dataset and Novel Computer Vision Algorithms for the Detection of Extended Radio Galaxies and Infrared Hosts

    Full text link
    Creating radio galaxy catalogues from next-generation deep surveys requires automated identification of associated components of extended sources and their corresponding infrared hosts. In this paper, we introduce RadioGalaxyNET, a multimodal dataset, and a suite of novel computer vision algorithms designed to automate the detection and localization of multi-component extended radio galaxies and their corresponding infrared hosts. The dataset comprises 4,155 instances of galaxies in 2,800 images with both radio and infrared channels. Each instance provides information about the extended radio galaxy class, its corresponding bounding box encompassing all components, the pixel-level segmentation mask, and the keypoint position of its corresponding infrared host galaxy. RadioGalaxyNET is the first dataset to include images from the highly sensitive Australian Square Kilometre Array Pathfinder (ASKAP) radio telescope, corresponding infrared images, and instance-level annotations for galaxy detection. We benchmark several object detection algorithms on the dataset and propose a novel multimodal approach to simultaneously detect radio galaxies and the positions of infrared hosts. Comment: Accepted for publication in PASA. The paper has 17 pages, 6 figures, 5 tables
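    An instance annotation of the kind described above, combining a class label, a bounding box over all components, a segmentation mask, and an infrared-host keypoint, might be laid out as follows. The field names and values are illustrative guesses, not the dataset's actual schema:

```python
# Hypothetical shape of one RadioGalaxyNET-style instance annotation.
instance = {
    "image_id": 17,
    "category": "FR-II",                 # example radio galaxy class label
    "bbox": [112.0, 80.5, 64.0, 48.0],   # x, y, width, height over all components
    "segmentation": "pixel-level mask (e.g. polygon or RLE)",
    "ir_host_keypoint": [141.2, 103.7],  # infrared host position (x, y)
}

def keypoint_inside_bbox(ann):
    """Sanity check: the infrared host keypoint would normally fall
    inside the radio bounding box that spans all components."""
    x, y, w, h = ann["bbox"]
    kx, ky = ann["ir_host_keypoint"]
    return x <= kx <= x + w and y <= ky <= y + h
```

    A check like this is useful when benchmarking multimodal detectors, since the radio bounding box and infrared keypoint come from different channels of the same image.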

    Learning visual tasks with selective attention

    Get PDF
    Knowing where to look in an image can significantly improve performance in computer vision tasks by eliminating irrelevant information from the rest of the input image, and by breaking down complex scenes into simpler and more familiar sub-components. We show that a framework for identifying multiple task-relevant regions can be learned in current state-of-the-art deep network architectures, resulting in significant gains in several visual prediction tasks. We demonstrate both directly and indirectly supervised models for selecting image regions and show how they improve performance over baselines by focusing on the right areas.

    Deep learning for real-world object detection


    An Approach Of Features Extraction And Heatmaps Generation Based Upon Cnns And 3D Object Models

    Get PDF
    The rapid advancements in artificial intelligence have enabled recent progress in self-driving vehicles. However, the dependence on 3D object models and annotations collected and owned by individual companies has become a major obstacle to the development of new algorithms. This thesis proposes an approach that directly uses graphics models created from open-source datasets as the virtual representation of real-world objects. The approach uses machine learning techniques to extract 3D feature points and to create annotations from graphics models for the recognition of dynamic objects, such as cars, and for the verification of stationary and variable objects, such as buildings and trees. Moreover, it generates heat maps for the elimination of stationary/variable objects in real-time images before working on the recognition of dynamic objects. The proposed approach helps to bridge the gap between the virtual and physical worlds and to facilitate the development of new algorithms for self-driving vehicles.

    Computer vision for plant and animal inventory

    Get PDF
    The population, composition, and spatial distribution of the plants and animals in a region are important data for natural resource management, conservation, and farming. Traditional ways of acquiring such data require human participation, and manual data processing is usually cumbersome, expensive, and time-consuming. Algorithms for automatic animal and plant inventory have therefore become a hot topic. We propose a series of computer vision methods for automated plant and animal inventory, to recognize, localize, categorize, track, and count different objects of interest, including vegetation, trees, fish, and livestock. We make use of different sensors, hardware platforms, neural network architectures, and pipelines to deal with the varied properties and challenges of these objects. (1) For vegetation analysis, we propose a fast multistage method to estimate coverage. The reference board is localized based on its edge and texture features, a K-means color model of the board is then generated, and finally the vegetation is segmented at pixel level using the color model. The proposed method is robust to lighting changes. (2) For tree counting in aerial images, we propose a novel method called the density transformer, or DENT, to learn and predict the density of trees at different positions. DENT uses an efficient multi-receptive-field network to extract visual features from different positions, and a transformer encoder filters and transfers useful contextual information across spatial positions. DENT significantly outperformed existing state-of-the-art CNN detectors and regressors both on our own dataset and on an existing cross-site dataset. (3) We propose a framework for fish classification using boat cameras. The framework contains two branches: one extracts contextual information from the whole image, and the other localizes the individual fish and normalizes their poses. The classification results from the two branches are weighted based on the clearness of the image and the familiarity of the context. Our system ranked in the top 1 percent in The Nature Conservancy Fisheries Monitoring competition. (4) We also propose a video-based pig counting algorithm using an inspection robot. We adopt a novel bottom-up keypoint tracking method and a novel spatial-aware temporal response filtering method to count the pigs. The proposed approach outperformed the other methods, and even human competitors, in our experiments. Includes bibliographical references.
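    The vegetation-coverage step in (1) can be sketched as a small K-means color model followed by per-pixel nearest-center labeling. This is a pure-Python toy under invented colors; the reference-board calibration described above is omitted:

```python
# Toy vegetation-coverage estimate: cluster RGB colors with K-means,
# then report the fraction of pixels nearest the vegetation center.
def dist2(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of RGB triples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def kmeans(colors, centers, iters=10):
    """Plain K-means on RGB triples with fixed initial centers."""
    for _ in range(iters):
        assign = [min(range(len(centers)), key=lambda k: dist2(c, centers[k]))
                  for c in colors]
        centers = [mean([c for c, a in zip(colors, assign) if a == k])
                   if k in assign else centers[k]
                   for k in range(len(centers))]
    return centers

def coverage(pixels, veg_center, other_center):
    """Fraction of pixels labeled vegetation (nearest the green center)."""
    veg = sum(1 for p in pixels
              if dist2(p, veg_center) < dist2(p, other_center))
    return veg / len(pixels)
```

    Seeding one center near green and one near the background color makes the pixel-level segmentation a simple nearest-center test, which is what gives the method its speed.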