
    Advances in Object and Activity Detection in Remote Sensing Imagery

    The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and assign each object instance a corresponding class label. Activity recognition, in turn, aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. Detecting, identifying, tracking, and understanding the behaviour of objects in images and videos taken by various cameras is an important and challenging problem. Together, the recognition of objects and their activities in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed applications that identify objects and their specific behaviours from airborne and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.

    Computer vision for plant and animal inventory

    The population, composition, and spatial distribution of the plants and animals in a region are important data for natural resource management, conservation and farming. The traditional ways to acquire such data require human participation, and manual data processing is usually cumbersome, expensive and time-consuming. Algorithms for automatic animal and plant inventory have therefore become an active research topic. We propose a series of computer vision methods for automated plant and animal inventory that recognize, localize, categorize, track and count different objects of interest, including vegetation, trees, fish and livestock. We make use of different sensors, hardware platforms, neural network architectures and pipelines to deal with the varied properties and challenges of these objects. (1) For vegetation analysis, we propose a fast multistage method to estimate the coverage. The reference board is localized based on its edge and texture features. A K-means color model of the board is then generated. Finally, the vegetation is segmented at pixel level using the color model. The proposed method is robust to changes in lighting conditions. (2) For tree counting in aerial images, we propose a novel method called density transformer, or DENT, to learn and predict the density of the trees at different positions. DENT uses an efficient multi-receptive field network to extract visual features from different positions. A transformer encoder is applied to filter and transfer useful contextual information across different spatial positions. DENT significantly outperformed the existing state-of-the-art CNN detectors and regressors on both a dataset we built ourselves and an existing cross-site dataset. (3) We propose a framework for a fish classification system using boat cameras. The framework contains two branches. One branch extracts the contextual information from the whole image. The other branch localizes all the individual fish and normalizes their poses. The classification results from the two branches are weighted based on the clarity of the image and the familiarity of the context. Our system ranked in the top 1 percent of The Nature Conservancy Fisheries Monitoring competition. (4) We also propose a video-based pig counting algorithm using an inspection robot. We adopt a novel bottom-up keypoint tracking method and a novel spatial-aware temporal response filtering method to count the pigs. The proposed approach outperformed the other methods and even human competitors in the experiments. Includes bibliographical references.
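    The K-means color model mentioned in step (1) can be illustrated in isolation. The following is a minimal sketch, not the authors' implementation: it fits cluster centers to sample pixels of a reference region and then tests each image pixel against those centers. The abstract does not say how the board's color model is applied during segmentation, so the distance threshold `tol`, the evenly spaced initialisation, and all function names are assumptions made for illustration.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm on RGB tuples; returns k cluster centers.

    Assumes len(points) >= k. Initialisation is deterministic (evenly
    spaced samples), which is a simplification chosen for this sketch.
    """
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Each center becomes the mean of its group; empty groups keep theirs.
        centers = [
            tuple(sum(channel) / len(g) for channel in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def matches_model(pixel, centers, tol):
    """True if the pixel lies within tol of any model color (assumed rule)."""
    return min(math.dist(pixel, c) for c in centers) <= tol


def segment(image, centers, tol=40.0):
    """Pixel-level binary mask over a 2D list of RGB tuples."""
    return [[matches_model(px, centers, tol) for px in row] for row in image]


# Hypothetical sample pixels from a two-tone reference region.
samples = [(250, 250, 250), (245, 248, 252), (20, 20, 20), (25, 18, 22)]
model = kmeans(samples, k=2)

# A tiny 2x2 image: the left column resembles the samples, the right is green.
image = [[(248, 249, 250), (60, 140, 50)],
         [(22, 20, 21), (70, 150, 40)]]
mask = segment(image, model)
```

    Here `mask` marks the pixels whose color matches the fitted model. Because the model is refitted from the reference region of each image, a scheme of this kind can adapt to the lighting of that image, which is one plausible reading of the robustness claim above.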

    PeopleNet: A Novel People Counting Framework for Head-Mounted Moving Camera Videos

    Traditional crowd counting techniques (optical flow or feature matching) have given way to deep learning (DL) models because they lack automatic feature extraction and produce low-precision results. Most of these models were tested on surveillance-scene crowd datasets captured by stationary cameras. Counting people in videos shot with a head-mounted moving camera is very challenging, mainly because the temporal information of the moving crowd is mixed with the induced camera motion. This study proposes a transfer learning-based PeopleNet model to tackle this problem. To this end, we made significant changes to the standard VGG16 model by disabling its top convolutional blocks and replacing its standard fully connected layers with new fully connected and dense layers. The strong transfer learning capability of the VGG16 network lets PeopleNet produce good-quality density maps, resulting in highly accurate crowd estimation. The performance of the proposed model was tested on a self-generated image database prepared from moving-camera video clips, as there is no public benchmark dataset for this task. The proposed framework gave promising results on various crowd categories such as dense, sparse, and average. To ensure versatility, we performed self- and cross-evaluation on various crowd counting models and datasets, which demonstrates the importance of the PeopleNet model in adverse defense of society.
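    In density-map crowd counting generally, the estimated count is the sum of the density map. The sketch below shows that relationship under stated assumptions: a target map is built by placing one normalised Gaussian kernel at each annotated head position, and the count is recovered by summing the map. The kernel size, `sigma`, the annotation coordinates, and the function names are hypothetical; the abstract does not describe how PeopleNet's density maps are generated.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalised 2D Gaussian kernel whose entries sum to 1."""
    half = size // 2
    raw = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def density_map(height, width, points, size=5, sigma=1.0):
    """Add one unit-mass kernel at each annotated (row, col) position."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                # Kernel mass falling outside the image is dropped.
                if 0 <= y < height and 0 <= x < width:
                    dmap[y][x] += kernel[dy][dx]
    return dmap


def count_from_density(dmap):
    """The estimated count is the sum over all density values."""
    return sum(map(sum, dmap))


# Hypothetical head annotations in a 20x20 frame, all away from the border.
heads = [(5, 5), (10, 12), (14, 6)]
dmap = density_map(20, 20, heads)
estimated = count_from_density(dmap)
```

    Because each kernel carries unit mass, `estimated` equals the number of annotated heads up to floating-point error as long as no kernel is clipped by the image border. A network trained to regress such maps yields a count in the same way, by summing its predicted map.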