
    R³-Net: A Deep Network for Multi-oriented Vehicle Detection in Aerial Images and Videos

    Vehicle detection is a significant and challenging task in aerial remote sensing applications. Most existing methods detect vehicles with regular rectangular boxes and fail to offer the orientation of vehicles. However, orientation information is crucial for several practical applications, such as the trajectory and motion estimation of vehicles. In this paper, we propose a novel deep network, called the rotatable region-based residual network (R³-Net), to detect multi-oriented vehicles in aerial images and videos. More specifically, R³-Net is used to generate rotatable rectangular target boxes in a half coordinate system. First, we use a rotatable region proposal network (R-RPN) to generate rotatable regions of interest (R-RoIs) from feature maps produced by a deep convolutional neural network. Here, a proposed batch averaging rotatable anchor (BAR anchor) strategy is applied to initialize the shapes of vehicle candidates. Next, we propose a rotatable detection network (R-DN) for the final classification and regression of the R-RoIs. In R-DN, a novel rotatable position-sensitive pooling (R-PS pooling) is designed to preserve both position and orientation information while downsampling the feature maps of R-RoIs. In our model, R-RPN and R-DN can be trained jointly. We test our network on two open vehicle detection image datasets, namely the DLR 3K Munich Dataset and the VEDAI Dataset, demonstrating the high precision and robustness of our method. In addition, further experiments on aerial videos show the good generalization capability of the proposed method and its potential for vehicle tracking in aerial videos. The demo video is available at https://youtu.be/xCYD-tYudN0
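The rotatable boxes described above are commonly parameterized as a center, a size, and an angle; a minimal sketch of converting such a parameterization (cx, cy, w, h, θ) to its four corner points — a generic rotated-box utility with hypothetical names, not R³-Net's actual implementation:

```python
import numpy as np

def rotated_box_corners(cx, cy, w, h, theta):
    """Return the 4 corners of a rotated rectangle.

    (cx, cy): center; (w, h): width/height; theta: rotation in radians.
    Generic rotated-box geometry, not the paper's exact code.
    """
    # Corners of an axis-aligned box centered at the origin.
    corners = np.array([[-w / 2, -h / 2],
                        [ w / 2, -h / 2],
                        [ w / 2,  h / 2],
                        [-w / 2,  h / 2]])
    # Standard 2-D rotation matrix.
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Rotate about the origin, then translate to the box center.
    return corners @ rot.T + np.array([cx, cy])

print(rotated_box_corners(0.0, 0.0, 4.0, 2.0, 0.0))
```

At θ = 0 this reduces to an ordinary axis-aligned box, which is why axis-aligned detectors are a special case of the rotated formulation.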

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Vehicle classification is a hot computer vision topic, with studies ranging from ground-view up to top-view imagery. In remote sensing, the use of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and more. However, there are several difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for that task; (b) creating instance segmentation datasets is laborious; and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are to: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure consisted of: (1) labeling a small number of vehicles, (2) training on those samples, (3) using the model to classify the entire image, (4) converting the image prediction into a polygon shapefile, (5) correcting some erroneous areas and including them in the training data, and (6) repeating until the results are satisfactory. To separate instances, we considered vehicle interiors and vehicle borders, and the DL model was a U-net with an EfficientNet-B7 backbone. When the borders are removed, the vehicle interiors become isolated, allowing for unique object identification. To recover the deleted 1-pixel borders, we proposed a simple method to expand each prediction. The results show better pixel-wise metrics compared to Mask-RCNN (82% against 67% IoU). On per-object analysis, the overall accuracy, precision, and recall were greater than 90%. This pipeline applies to any remote sensing target and is very efficient for segmentation and for generating datasets. Comment: 38 pages, 10 figures, submitted to journal
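The interior/border trick above can be illustrated with a generic connected-components sketch (a minimal SciPy-based illustration, not the paper's code): labeling the isolated interiors assigns one id per vehicle, and a small dilation grows each instance back over the removed 1-pixel border.

```python
import numpy as np
from scipy import ndimage

def separate_instances(interior_mask):
    """Assign a unique id to each connected vehicle interior,
    then dilate each instance by one pixel to recover the border.

    interior_mask: boolean array, True where the model predicted
    'vehicle interior' (borders already removed).
    Generic sketch of the interior/border idea, not the paper's code.
    """
    # Connected-components labeling: each isolated interior gets an id.
    labels, num = ndimage.label(interior_mask)
    # Grey dilation with a 3x3 window grows each labeled region by one
    # pixel, approximately recovering the deleted 1-pixel border.
    grown = ndimage.grey_dilation(labels, size=(3, 3))
    # Keep original labels where present; fill borders from the dilation.
    result = np.where(labels > 0, labels, grown)
    return result, num

mask = np.array([[1, 1, 0, 1],
                 [1, 1, 0, 1],
                 [0, 0, 0, 0]], dtype=bool)
inst, n = separate_instances(mask)
print(n)  # two separate vehicles
```

Where two instances touch after dilation, the pixel is resolved arbitrarily toward the larger label; the paper's expansion method may break such ties differently.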

    MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos

    Visual data collected from Unmanned Aerial Vehicles (UAVs) has opened a new frontier of computer vision that requires automated analysis of aerial images and videos. However, existing UAV datasets primarily focus on object detection, and an object detector does not differentiate between moving and non-moving objects. Given a real-time UAV video stream, how can we both localize and classify the moving objects, i.e., perform moving object recognition (MOR)? MOR is one of the essential tasks supporting various UAV vision-based applications, including aerial surveillance, search and rescue, event recognition, and urban and rural scene understanding. To the best of our knowledge, no labeled dataset is available for MOR evaluation in UAV videos. Therefore, in this paper, we introduce MOR-UAV, a large-scale video dataset for MOR in aerial videos. We achieve this by labeling axis-aligned bounding boxes for moving objects, which requires fewer computational resources than producing pixel-level estimates. We annotate 89,783 moving object instances collected from 30 UAV videos, consisting of 10,948 frames in various scenarios such as differing weather conditions, occlusion, changing flight altitude, and multiple camera views. We assigned labels for two categories of vehicles (car and heavy vehicle). Furthermore, we propose a deep unified framework, MOR-UAVNet, for MOR in UAV videos. Since this is the first attempt at MOR in UAV videos, we present 16 baseline results based on the proposed framework over the MOR-UAV dataset through quantitative and qualitative experiments. We also analyze the motion-salient regions in the network through multiple layer visualizations. MOR-UAVNet works online at inference, as it requires only a few past frames, and it does not require predefined target initialization from the user. Experiments also demonstrate that the MOR-UAV dataset is quite challenging.
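MOR-UAVNet itself is a learned model, but the underlying idea of localizing motion from a few past frames can be illustrated with a classical frame-differencing baseline (a minimal sketch with an assumed threshold, not the paper's method):

```python
import numpy as np

def motion_mask(frames, threshold=25):
    """Crude moving-object localization from a short frame history.

    frames: sequence of grayscale frames (H, W).
    Returns a boolean mask, True where the latest frame differs
    noticeably from the median of the preceding frames.
    Classical baseline for illustration, not MOR-UAVNet.
    """
    frames = np.asarray(frames, dtype=np.float32)
    background = np.median(frames[:-1], axis=0)  # static-scene estimate
    diff = np.abs(frames[-1] - background)       # per-pixel change
    return diff > threshold

# Toy example: a bright 'object' appears only in the last frame.
history = [np.zeros((4, 4)) for _ in range(3)]
current = np.zeros((4, 4))
current[1:3, 1:3] = 255.0
mask = motion_mask(history + [current])
print(mask.sum())  # 4 moving pixels
```

A full MOR system must also classify each moving region, which is where the learned framework goes beyond this kind of baseline.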

    A Deep Neural Network Model for Realtime Semantic-Segmentation Video Processing supported to Autonomous Vehicles

    Traffic congestion is a huge problem, especially in urban areas during peak hours; it poses a major challenge for unmanned/autonomous vehicles and also contributes to environmental pollution. Managing and monitoring traffic flow is challenging: a solution must not only perform accurately and flexibly on routes but also require the lowest installation costs. In this paper, we propose a synthetic method that uses deep learning-based video processing to derive the density of traffic objects over the infrastructure, which can provide useful information for autonomous vehicles in a smart control system. The idea is to use semantic segmentation, the process of linking each pixel in an image to a class label, to produce a masked map that supports collecting the class distribution within each frame. Moreover, an aerial dataset named Saigon Aerial, with more than 110 samples, is also created in this paper to support unique observation in the biggest city in Vietnam, Ho Chi Minh City. To demonstrate our idea, we evaluated different semantic segmentation models on two datasets: Saigon Aerial and UAVid. To track model performance, the F1 and Mean Intersection over Union (mIoU) metrics are taken into account. The code and dataset are uploaded to GitHub and Kaggle repositories, respectively: Saigon Aerial Code, Saigon Aerial dataset
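The mIoU metric mentioned above is computed per class from the overlap between predicted and ground-truth masks, then averaged; a minimal sketch (generic metric code, not the paper's evaluation script):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for semantic segmentation.

    pred, target: integer class-label arrays of the same shape.
    Classes absent from both prediction and ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:          # class absent everywhere: skip it
            continue
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

pred   = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=2))  # (1/2 + 2/3) / 2
```

Averaging per class rather than per pixel keeps large background classes from dominating the score, which matters when target objects (such as vehicles) are small.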

    Underwater Aerial Vehicle Networks Based Image Analysis By Deep Learning Architecture Integrated With 5G System

    With their astonishing ability to learn representations from data, deep neural networks (DNNs) have made efficient advances in the processing of images, time series, spoken language, audio, video, and many other types of data. In an effort to compile the volume of information generated in the subfields of remote sensing, surveys and literature reviews explicitly concerning applications of DNN methods have been carried out. Aerial sensing research has recently been dominated by applications based on Unmanned Aerial Vehicles (UAVs), yet there has not been a literature review that integrates the "deep learning" and "UAV remote sensing" themes. This research proposes a novel technique for underwater aerial vehicle network-based image analysis by feature extraction and classification using DL methods. Here, UAV-based images are collected through a 5G module, and each image is processed for noise removal, smoothing, and normalization. Features of the processed image are extracted using multilayer extreme learning-based convolutional neural networks. The extracted deep features are then classified using recursive elimination-based radial basis function networks. The experimental analysis is carried out on numerous UAV image datasets in terms of accuracy, precision, recall, F-measure, RMSE, and MAP. The proposed method attained an accuracy of 96%, precision of 94%, recall of 85%, F-measure of 72%, RMSE of 48%, and MAP of 41%.
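The preprocessing stage described above (noise removal, smoothing, normalization) can be sketched generically; the abstract does not specify the filter or parameters, so the Gaussian filter and min-max normalization here are assumptions for illustration:

```python
import numpy as np
from scipy import ndimage

def preprocess(image, sigma=1.0):
    """Denoise/smooth with a Gaussian filter, then normalize to [0, 1].

    image: 2-D grayscale array. Generic preprocessing sketch; the
    actual filter and parameters used in the paper are not specified.
    """
    smoothed = ndimage.gaussian_filter(image.astype(np.float32), sigma)
    lo, hi = smoothed.min(), smoothed.max()
    if hi == lo:                        # constant image: avoid divide-by-zero
        return np.zeros_like(smoothed)
    return (smoothed - lo) / (hi - lo)  # min-max normalization

noisy = np.random.default_rng(0).normal(128, 20, size=(8, 8))
out = preprocess(noisy)
print(out.min(), out.max())  # 0.0 1.0
```

Normalizing to a fixed range before feature extraction keeps inputs on a consistent scale across images captured under different lighting conditions.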

    Advances in Object and Activity Detection in Remote Sensing Imagery

    The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and assign each object instance a corresponding class label. At the same time, activity recognition aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. It is a very important and challenging problem to detect, identify, track, and understand the behaviour of objects through images and videos taken by various cameras. Together, object and activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed application domains to identify objects and their specific behaviours from airborne and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.