R-Net: A Deep Network for Multi-oriented Vehicle Detection in Aerial Images and Videos
Vehicle detection is a significant and challenging task in aerial remote
sensing applications. Most existing methods detect vehicles with regular
rectangle boxes and fail to offer the orientation of vehicles. However, the
orientation information is crucial for several practical applications, such as
the trajectory and motion estimation of vehicles. In this paper, we propose a
novel deep network, called rotatable region-based residual network (R-Net),
to detect multi-oriented vehicles in aerial images and videos. More specially,
R-Net is utilized to generate rotatable rectangular target boxes in a half
coordinate system. First, we use a rotatable region proposal network (R-RPN) to
generate rotatable regions of interest (R-RoIs) from feature maps produced by a
deep convolutional neural network. Here, a proposed batch averaging rotatable
anchor (BAR anchor) strategy is applied to initialize the shape of vehicle
candidates. Next, we propose a rotatable detection network (R-DN) for the final
classification and regression of the R-RoIs. In R-DN, a novel rotatable
position sensitive pooling (R-PS pooling) is designed to keep the position and
orientation information simultaneously while downsampling the feature maps of
R-RoIs. In our model, R-RPN and R-DN can be trained jointly. We test our
network on two open vehicle detection image datasets, namely the DLR 3K Munich
Dataset and the VEDAI Dataset, demonstrating the high precision and robustness of
our method. In addition, further experiments on aerial videos show the good
generalization capability of the proposed method and its potential for vehicle
tracking in aerial videos. The demo video is available at
https://youtu.be/xCYD-tYudN0
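As a rough illustration of the rotatable-box geometry this abstract describes (a sketch of the general idea, not the authors' implementation; the function names are our own), a box parameterized by center, size, and angle can be converted to its four corners, and a BAR-style anchor shape can be taken as the batch average of ground-truth box sizes:

```python
import numpy as np

def rbox_to_corners(cx, cy, w, h, theta):
    """Corners of a rotatable box (center, width, height, angle in radians),
    listed counter-clockwise starting from the unrotated top-left corner."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    half = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                     [w / 2,  h / 2], [-w / 2,  h / 2]])
    # Rotate the half-extents, then translate to the box center.
    return half @ rot.T + np.array([cx, cy])

def bar_anchor(widths, heights):
    """Batch-averaged anchor shape: mean width/height over a batch of boxes."""
    return float(np.mean(widths)), float(np.mean(heights))
```

With `theta = 0` the corners reduce to the familiar axis-aligned rectangle, which makes the parameterization easy to sanity-check.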
Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset
Vehicle classification is a hot computer vision topic, with studies ranging
from ground-view up to top-view imagery. In remote sensing, the usage of
top-view images allows for understanding city patterns, vehicle concentration,
traffic management, and others. However, there are some difficulties when
aiming for pixel-wise classification: (a) most vehicle classification studies
use object detection methods, and most publicly available datasets are designed
for this task, (b) creating instance segmentation datasets is laborious, and
(c) traditional instance segmentation methods underperform on this task since
the objects are small. Thus, the present research objectives are: (1) propose a
novel semi-supervised iterative learning approach using GIS software, (2)
propose a box-free instance segmentation approach, and (3) provide a city-scale
vehicle dataset. The iterative learning procedure considered: (1) label a small
number of vehicles, (2) train on those samples, (3) use the model to classify
the entire image, (4) convert the image prediction into a polygon shapefile,
(5) correct some areas with errors and include them in the training data, and
(6) repeat until results are satisfactory. To separate instances, we considered
vehicle interior and vehicle borders, and the DL model was a U-Net with an
EfficientNet-B7 backbone. When the borders are removed, the vehicle interior
becomes isolated, allowing for unique object identification. To recover the
deleted 1-pixel borders, we proposed a simple method to expand each prediction.
The results show better pixel-wise metrics compared to Mask R-CNN (82%
against 67% IoU). On per-object analysis, the overall accuracy, precision,
and recall were greater than 90%. This pipeline applies to any remote sensing
target, being very efficient for segmentation and generating datasets.
Comment: 38 pages, 10 figures, submitted to journal
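The interior/border trick described above — separate touching vehicles by dropping their 1-pixel borders, then grow each prediction back — can be sketched in plain numpy (our own minimal helpers, not the paper's code, which uses GIS polygon shapefiles):

```python
import numpy as np

def label_components(mask):
    """4-connected component labeling of a binary interior mask."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for i, j in zip(*np.nonzero(mask)):
        if labels[i, j]:
            continue
        count += 1
        stack = [(i, j)]
        while stack:
            y, x = stack.pop()
            if labels[y, x]:
                continue
            labels[y, x] = count
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    stack.append((ny, nx))
    return labels, count

def expand_one_pixel(labels):
    """Grow every labeled instance by one pixel to recover removed borders."""
    out = labels.copy()
    for y, x in zip(*np.nonzero(labels == 0)):
        neigh = labels[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
        if neigh.max():
            out[y, x] = neigh.max()
    return out
```

Because the borders isolate each interior, the components come out as unique objects; the one-pixel expansion then approximates the deleted outline.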
MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos
Visual data collected from Unmanned Aerial Vehicles (UAVs) has opened a new
frontier of computer vision that requires automated analysis of aerial
images/videos. However, the existing UAV datasets primarily focus on object
detection. An object detector does not differentiate between the moving and
non-moving objects. Given a real-time UAV video stream, how can we both
localize and classify the moving objects, i.e., perform moving object
recognition (MOR)? MOR is one of the essential tasks supporting various UAV
vision-based applications, including aerial surveillance, search and rescue,
event recognition, and urban and rural scene understanding. To the best of our
knowledge, no labeled dataset is available for MOR evaluation in UAV videos.
Therefore, in this paper, we introduce MOR-UAV, a large-scale video dataset for
MOR in aerial videos. We achieve this by labeling axis-aligned bounding boxes
for moving objects which requires less computational resources than producing
pixel-level estimates. We annotate 89,783 moving object instances collected
from 30 UAV videos, consisting of 10,948 frames in various scenarios such as
weather conditions, occlusion, changing flying altitude and multiple camera
views. We assigned the labels for two categories of vehicles (car and heavy
vehicle). Furthermore, we propose a deep unified framework MOR-UAVNet for MOR
in UAV videos. Since this is the first attempt at MOR in UAV videos, we present
16 baseline results based on the proposed framework over the MOR-UAV dataset
through quantitative and qualitative experiments. We also analyze the
motion-salient regions in the network through multiple layer visualizations.
MOR-UAVNet works online at inference, as it requires only a few past frames.
Moreover, it does not require predefined target initialization from the user.
Experiments also demonstrate that the MOR-UAV dataset is quite challenging.
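The core MOR question — localize and classify only the objects that move — can be illustrated with a naive frame-differencing baseline (our own sketch for intuition, not the proposed MOR-UAVNet, which learns motion saliency end-to-end):

```python
import numpy as np

def moving_mask(prev_frame, frame, thresh=25):
    """Binary mask of pixels whose intensity changed by more than `thresh`."""
    diff = np.abs(frame.astype(int) - prev_frame.astype(int))
    return diff > thresh

def bounding_box(mask):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around all moving pixels,
    or None when nothing moved."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Such a baseline breaks down under camera motion and altitude changes — precisely the scenarios the MOR-UAV annotations cover — which is why a learned model is needed.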
A Deep Neural Network Model for Realtime Semantic-Segmentation Video Processing supported to Autonomous Vehicles
Traffic congestion is a huge problem, especially in urban areas during peak hours; it hampers any unmanned/autonomous vehicle and also contributes to environmental pollution. Managing and monitoring traffic flow is challenging: a solution must not only perform accurately and flexibly across routes but also keep installation costs low. In this paper, we propose a synthetic method that uses deep learning-based video processing to derive the density of traffic objects over infrastructure, which can supply useful information to autonomous vehicles in a smart control system. The idea is to use semantic segmentation, the process of linking each pixel in an image to a class label, to produce a masked map that supports collecting the class distribution within each frame. Moreover, an aerial dataset named Saigon Aerial with more than 110 samples is also created in this paper to support unique observations in the biggest city in Vietnam, Ho Chi Minh City. To present our idea, we evaluated different semantic segmentation models on two datasets: Saigon Aerial and UAVid. To track our models' performance, the F1 and Mean Intersection over Union metrics are also taken into account. The code and dataset are uploaded to GitHub and Kaggle repositories, respectively: Saigon Aerial Code, Saigon Aerial dataset.
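The per-frame class distribution this abstract derives from the segmentation mask amounts to a pixel histogram over class labels — a minimal sketch (the function name and class indices are our own assumptions):

```python
import numpy as np

def class_distribution(seg_mask, num_classes):
    """Fraction of pixels assigned to each class label in one segmented frame.

    seg_mask: 2-D integer array of per-pixel class labels.
    Returns an array of length num_classes summing to 1.
    """
    counts = np.bincount(seg_mask.ravel(), minlength=num_classes)
    return counts / seg_mask.size
```

Tracking, say, the fraction of pixels in a "vehicle" class across frames gives the kind of traffic-density signal the paper feeds to a smart control system.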
Underwater Aerial Vehicle Networks Based Image Analysis By Deep Learning Architecture Integrated With 5G System
With its astonishing ability to learn representations from data, deep neural networks (DNNs) have made efficient advances in the processing of pictures, time series, spoken language, audio, video, and many other types of data. In an effort to compile the volume of information generated in the remote sensing field's subfields, surveys and literature revisions explicitly concerning applications of DNN methods have been carried out. Aerial sensing research has recently been dominated by applications based on Unmanned Aerial Vehicles (UAVs). There has not yet been a literature review that integrates the "deep learning" and "UAV remote sensing" themes. This research proposes a novel technique for underwater aerial vehicle network-based image analysis by feature extraction and classification using DL methods. Here, UAV-based images are collected through a 5G module, and each image is processed for noise removal, smoothing, and normalization. Features of the processed image are extracted using multilayer extreme learning-based convolutional neural networks. The extracted deep features are then classified using recursive elimination-based radial basis function networks. The experimental analysis is carried out on numerous UAV image datasets in terms of accuracy, precision, recall, F-measure, RMSE, and MAP. The proposed method attained an accuracy of 96%, precision of 94%, recall of 85%, F-measure of 72%, RMSE of 48%, and MAP of 41%.
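The preprocessing stage described here — smoothing followed by normalization — can be sketched generically (a box filter and min-max scaling are our own stand-ins; the paper does not specify its filters):

```python
import numpy as np

def preprocess(img, k=3):
    """k-by-k box-filter smoothing followed by min-max normalization to [0, 1]."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    smooth = np.zeros(img.shape, dtype=float)
    # Sum the k*k shifted copies, then divide: a simple mean filter.
    for dy in range(k):
        for dx in range(k):
            smooth += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    smooth /= k * k
    lo, hi = smooth.min(), smooth.max()
    return (smooth - lo) / (hi - lo) if hi > lo else np.zeros_like(smooth)
```

Normalizing to a fixed range is the standard prerequisite before feeding images to a convolutional feature extractor, whatever the downstream classifier.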
Advances in Object and Activity Detection in Remote Sensing Imagery
The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and assign each object instance a corresponding class label. At the same time, activity recognition aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. It is a very important and challenging problem to detect, identify, track, and understand the behaviour of objects through images and videos taken by various cameras. Together, object and activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed application domains to identify objects and their specific behaviours from air and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.