251 research outputs found

    SPARCNN: SPAtially Related Convolutional Neural Networks

    Full text link
    The ability to accurately detect and classify objects at varying pixel sizes in cluttered scenes is crucial to many Navy applications. However, the detection performance of existing state-of-the-art approaches such as convolutional neural networks (CNNs) degrades when applied to such cluttered, multi-object detection tasks. We conjecture that spatial relationships between objects in an image can be exploited to significantly improve detection accuracy, an approach that, to the best of our knowledge, no existing technique had considered at the time this research was conducted. We introduce a detection and classification technique called Spatially Related Detection with Convolutional Neural Networks (SPARCNN) that learns and exploits a probabilistic representation of inter-object spatial configurations, estimated from training images, to produce more effective region proposals for state-of-the-art CNNs. Our empirical evaluation of SPARCNN on the VOC 2007 dataset shows that it increases classification accuracy by 8% compared to a region proposal technique that does not exploit spatial relations. More importantly, we obtained a larger performance boost of 18.8% when test-set difficulty is increased by including heavily obscured objects and additional image clutter. Comment: 6 pages, AIPR 2016 submission
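    The core SPARCNN idea, re-scoring each region proposal by how well its spatial configuration with neighboring proposals matches statistics learned from training data, can be sketched as follows. This is an illustrative toy, not the paper's implementation: the probability table, class names, and boost formula are all made up for the example.

```python
# Hypothetical sketch of spatially informed proposal re-scoring.
# spatial_prob approximates P(relation | class_a, class_b), which the
# paper would estimate from a training set; values here are invented.
spatial_prob = {
    ("person", "bicycle", "above"): 0.7,
    ("person", "bicycle", "left_of"): 0.1,
}

def relation(box_a, box_b):
    """Coarse position of box_a relative to box_b.
    Boxes are (x1, y1, x2, y2) in image coordinates (y grows downward)."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    dx, dy = ax - bx, ay - by
    if abs(dy) >= abs(dx):
        return "above" if dy < 0 else "below"
    return "left_of" if dx < 0 else "right_of"

def rescore(proposals):
    """Boost each (class, box, cnn_score) proposal by how strongly its
    spatial relations to the other proposals match the learned table."""
    out = []
    for i, (cls_i, box_i, score_i) in enumerate(proposals):
        boost = 1.0
        for j, (cls_j, box_j, _) in enumerate(proposals):
            if i != j:
                boost += spatial_prob.get(
                    (cls_i, cls_j, relation(box_i, box_j)), 0.0)
        out.append((cls_i, box_i, score_i * boost))
    return out
```

    A person proposal sitting above a bicycle proposal would have its score multiplied by 1.7 under this toy table, while configurations unseen in training are left unchanged.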

    FINE-GRAINED OBJECT DETECTION

    Get PDF
    Object detection plays a vital role in many real-world computer vision applications such as self-driving cars, unstaffed stores, and general-purpose robotic systems. Convolutional Neural Network (CNN) based deep learning has evolved to become the backbone of most computer vision algorithms, including object detection. Most research has focused on detecting objects that differ significantly, e.g. a car, a person, and a bird. A natural next step beyond general object detection is fine-grained object detection: distinguishing different types within a single class of objects. Fine-grained object detection is crucial to tasks such as automated retail checkout. This research developed deep learning models to detect 200 types of birds of similar size and shape. The models were trained and tested on the CUB-200-2011 dataset. To the best of our knowledge, by attaining a mean Average Precision (mAP) of 71.5%, we achieved an improvement of 5 percentage points over the previous best mAP of 66.2%.
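    The mAP figures quoted above come from averaging per-class Average Precision. A minimal sketch of the computation, assuming detections have already been matched to ground truth and sorted by descending confidence (this uses the raw area under the precision-recall curve; evaluation toolkits often apply interpolation on top of this):

```python
# Illustrative AP/mAP computation on pre-matched detections.
def average_precision(matches, num_gt):
    """matches: list of bools (True = correct detection), sorted by
    descending confidence. num_gt: number of ground-truth objects.
    Returns area under the precision-recall curve."""
    tp = fp = 0
    ap = prev_recall = 0.0
    for is_correct in matches:
        tp += is_correct
        fp += not is_correct
        recall = tp / num_gt
        precision = tp / (tp + fp)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap

def mean_average_precision(per_class):
    """per_class: list of (matches, num_gt) pairs, one per class.
    mAP is the unweighted mean of the per-class APs."""
    aps = [average_precision(m, n) for m, n in per_class]
    return sum(aps) / len(aps)
```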

    RadioGalaxyNET: Dataset and Novel Computer Vision Algorithms for the Detection of Extended Radio Galaxies and Infrared Hosts

    Full text link
    Creating radio galaxy catalogues from next-generation deep surveys requires automated identification of associated components of extended sources and their corresponding infrared hosts. In this paper, we introduce RadioGalaxyNET, a multimodal dataset, and a suite of novel computer vision algorithms designed to automate the detection and localization of multi-component extended radio galaxies and their corresponding infrared hosts. The dataset comprises 4,155 instances of galaxies in 2,800 images with both radio and infrared channels. Each instance provides information about the extended radio galaxy class, its corresponding bounding box encompassing all components, the pixel-level segmentation mask, and the keypoint position of its corresponding infrared host galaxy. RadioGalaxyNET is the first dataset to include images from the highly sensitive Australian Square Kilometre Array Pathfinder (ASKAP) radio telescope, corresponding infrared images, and instance-level annotations for galaxy detection. We benchmark several object detection algorithms on the dataset and propose a novel multimodal approach to simultaneously detect radio galaxies and the positions of infrared hosts. Comment: Accepted for publication in PASA. The paper has 17 pages, 6 figures, 5 tables
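    An instance annotation of the kind described above, combining a class label, a bounding box over all components, a segmentation mask, and an infrared-host keypoint, might be laid out as follows. The field names and values are illustrative guesses, not the dataset's actual schema:

```python
# Hypothetical shape of one RadioGalaxyNET-style instance annotation.
instance = {
    "image_id": 17,
    "category": "FR-II",                 # example radio galaxy class label
    "bbox": [112.0, 80.5, 64.0, 48.0],   # x, y, width, height over all components
    "segmentation": "pixel-level mask (e.g. polygon or RLE)",
    "ir_host_keypoint": [141.2, 103.7],  # infrared host position (x, y)
}

def keypoint_inside_bbox(ann):
    """Sanity check: the infrared host keypoint would normally fall
    inside the radio bounding box that spans all components."""
    x, y, w, h = ann["bbox"]
    kx, ky = ann["ir_host_keypoint"]
    return x <= kx <= x + w and y <= ky <= y + h
```

    A check like this is useful when benchmarking multimodal detectors, since the radio bounding box and infrared keypoint come from different channels of the same image.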

    Learning visual tasks with selective attention

    Get PDF
    Knowing where to look in an image can significantly improve performance in computer vision tasks by eliminating irrelevant information from the rest of the input image, and by breaking down complex scenes into simpler and more familiar sub-components. We show that a framework for identifying multiple task-relevant regions can be learned in current state-of-the-art deep network architectures, resulting in significant gains in several visual prediction tasks. We demonstrate both directly and indirectly supervised models for selecting image regions and show how they improve performance over baselines by focusing on the right areas.

    Deep learning for real-world object detection


    An Approach Of Features Extraction And Heatmaps Generation Based Upon Cnns And 3D Object Models

    Get PDF
    The rapid advancements in artificial intelligence have enabled recent progress in self-driving vehicles. However, the dependence on 3D object models and annotations collected and owned by individual companies has become a major obstacle to the development of new algorithms. This thesis proposes an approach that directly uses graphics models created from open-source datasets as the virtual representation of real-world objects. The approach uses machine learning techniques to extract 3D feature points and to create annotations from graphics models for the recognition of dynamic objects, such as cars, and for the verification of stationary and variable objects, such as buildings and trees. Moreover, it generates heat maps for the elimination of stationary/variable objects in real-time images before working on the recognition of dynamic objects. The proposed approach helps to bridge the gap between the virtual and physical worlds and to facilitate the development of new algorithms for self-driving vehicles.

    Computer vision for plant and animal inventory

    Get PDF
    The population, composition, and spatial distribution of the plants and animals in a region are important data for natural resource management, conservation, and farming. Traditional ways of acquiring such data require human participation, and manual data processing is usually cumbersome, expensive, and time-consuming. Algorithms for automatic animal and plant inventory have therefore become a hot topic. We propose a series of computer vision methods for automated plant and animal inventory, to recognize, localize, categorize, track, and count different objects of interest, including vegetation, trees, fish, and livestock. We make use of different sensors, hardware platforms, neural network architectures, and pipelines to deal with the varied properties and challenges of these objects. (1) For vegetation analysis, we propose a fast multistage method to estimate coverage. The reference board is localized based on its edge and texture features, a K-means color model of the board is then generated, and finally the vegetation is segmented at pixel level using the color model. The proposed method is robust to lighting changes. (2) For tree counting in aerial images, we propose a novel method called the density transformer, or DENT, to learn and predict the density of trees at different positions. DENT uses an efficient multi-receptive-field network to extract visual features from different positions, and a transformer encoder filters and transfers useful contextual information across spatial positions. DENT significantly outperformed existing state-of-the-art CNN detectors and regressors both on our own dataset and on an existing cross-site dataset. (3) We propose a framework for fish classification using boat cameras. The framework contains two branches: one extracts contextual information from the whole image, and the other localizes the individual fish and normalizes their poses. The classification results from the two branches are weighted based on the clearness of the image and the familiarity of the context. Our system ranked in the top 1 percent in The Nature Conservancy Fisheries Monitoring competition. (4) We also propose a video-based pig counting algorithm using an inspection robot. We adopt a novel bottom-up keypoint tracking method and a novel spatial-aware temporal response filtering method to count the pigs. The proposed approach outperformed the other methods, and even human competitors, in our experiments. Includes bibliographical references.
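    The vegetation-coverage step in (1) can be sketched as a small K-means color model followed by per-pixel nearest-center labeling. This is a pure-Python toy under invented colors; the reference-board calibration described above is omitted:

```python
# Toy vegetation-coverage estimate: cluster RGB colors with K-means,
# then report the fraction of pixels nearest the vegetation center.
def dist2(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of RGB triples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def kmeans(colors, centers, iters=10):
    """Plain K-means on RGB triples with fixed initial centers."""
    for _ in range(iters):
        assign = [min(range(len(centers)), key=lambda k: dist2(c, centers[k]))
                  for c in colors]
        centers = [mean([c for c, a in zip(colors, assign) if a == k])
                   if k in assign else centers[k]
                   for k in range(len(centers))]
    return centers

def coverage(pixels, veg_center, other_center):
    """Fraction of pixels labeled vegetation (nearest the green center)."""
    veg = sum(1 for p in pixels
              if dist2(p, veg_center) < dist2(p, other_center))
    return veg / len(pixels)
```

    Seeding one center near green and one near the background color makes the pixel-level segmentation a simple nearest-center test, which is what gives the method its speed.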