    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted a wide range of practical applications. In aerial video surveillance, monitoring broad areas remains challenging because several tasks, including mosaicking, change detection, and object detection, must be performed in real time. In this thesis work, a small-scale-UAV-based vision system for maintaining regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, the system creates an incremental geo-referenced mosaic of the area and classifies the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches the mosaic for changes (e.g., the disappearance of persons) using an algorithm based on histogram equalization and RGB Local Binary Patterns (RGB-LBP); if changes are found, the mosaic is updated. The second mode performs real-time classification with the same improved Faster R-CNN model and is intended for time-critical operations. Thanks to several design choices, the system runs in real time and performs mosaicking and change detection at low altitude, which allows even small objects to be classified. The proposed system was tested on the full set of challenging video sequences in the UAV Mosaicking and Change Detection (UMCD) dataset and on other public datasets. Evaluation with well-known performance metrics showed remarkable results in mosaic creation and updating, as well as in change detection and object detection.
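    The abstract does not detail the change-detection step; a minimal sketch of a histogram-equalization and RGB-LBP comparison between the reference mosaic and a new frame, under assumed block size, LBP parameters, and decision threshold, could look like the following (all names and values are illustrative, not the thesis' implementation).

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def rgb_lbp_descriptor(patch, points=8, radius=1):
    """Concatenate uniform-LBP histograms computed on each equalized RGB channel."""
    hists = []
    for c in range(3):
        channel = cv2.equalizeHist(patch[:, :, c])               # histogram equalization
        lbp = local_binary_pattern(channel, points, radius, method="uniform")
        hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)
        hists.append(hist)
    return np.concatenate(hists)

def detect_changes(reference, current, block=32, threshold=0.25):
    """Flag blocks whose RGB-LBP descriptors differ beyond a (hypothetical) threshold."""
    h, w = reference.shape[:2]
    changed = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            d_ref = rgb_lbp_descriptor(reference[y:y + block, x:x + block])
            d_cur = rgb_lbp_descriptor(current[y:y + block, x:x + block])
            # Chi-square distance between the two block descriptors
            dist = 0.5 * np.sum((d_ref - d_cur) ** 2 / (d_ref + d_cur + 1e-10))
            if dist > threshold:
                changed.append((x, y, block, block))
    return changed
```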

    Determination of Elevations for Excavation Operations Using Drone Technologies

    Using deep learning to rapidly estimate depth information from a single image has been studied in many settings, but it is new to construction-site elevation determination, and the challenges are not limited to the lack of datasets. This dissertation presents research on using drone ortho-imaging and deep learning to estimate construction-site elevations for excavation operations. It provides two flexible options for fast elevation determination: a low-high ortho-image-pair-based method and a single-frame ortho-image-based method. The research advances the use of ortho-imaging in construction surveying, strengthens CNNs (convolutional neural networks) for work with large-scale images, and contributes to dense image-pixel matching across different scales. The project has three major tasks. First, high-resolution ortho-image and elevation-map datasets were acquired using the low-high ortho-image-pair-based 3D-reconstruction method. In detail, a vertical drone path is first designed to capture a 2:1-scale ortho-image pair of a construction site at two different altitudes. Then, to simultaneously match pixel pairs and determine elevations, the developed pixel-matching and virtual-elevation algorithm proposes candidate pixel pairs in each virtual plane, and four-scaling patch feature descriptors are used to match them. Experimental results show that 92% of pixels in the pixel grid were strongly matched, with elevation accuracy within ±5 cm. Second, the acquired high-resolution datasets were used to train and test an ortho-image encoder and elevation-map decoder, in which max-pooling and up-sampling layers keep the ortho-image and elevation map in the same pixel coordinate system. This convolutional encoder-decoder was supplemented with an input ortho-image overlapping-disassembling and output elevation-map assembling algorithm that crops the high-resolution datasets into multiple small-patch datasets for model training and testing. Experimental results indicated that 128×128-pixel small patches gave the best elevation-estimation performance, with 21.22% of the selected points exactly matching the “ground truth” and 31.21% accurately matched within ±5 cm. Finally, vegetation was identified in the high-resolution ortho-images and removed from the corresponding elevation maps using the developed CNN-based image-classification model and a vegetation-removal algorithm. Experimental results showed that the developed CNN model, trained on 32×32-pixel ortho-image and class-label small-patch datasets, achieved 93% accuracy in identifying objects and localizing object edges.
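    The overlapping disassembling and assembling algorithm is not specified beyond the 128×128-pixel patch size; a minimal sketch under the assumption of fixed-stride overlapping crops and overlap averaging is given below (the stride value and function names are assumptions, not the dissertation's exact procedure).

```python
import numpy as np

def disassemble(image, patch=128, stride=96):
    """Crop a large ortho-image into overlapping square patches (stride < patch gives overlap)."""
    h, w = image.shape[:2]
    patches, origins = [], []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            patches.append(image[y:y + patch, x:x + patch])
            origins.append((y, x))
    return np.stack(patches), origins

def assemble(predictions, origins, out_shape, patch=128):
    """Reassemble predicted elevation-map patches, averaging values where patches overlap."""
    elevation = np.zeros(out_shape, dtype=np.float64)
    counts = np.zeros(out_shape, dtype=np.float64)
    for pred, (y, x) in zip(predictions, origins):
        elevation[y:y + patch, x:x + patch] += pred
        counts[y:y + patch, x:x + patch] += 1.0
    return elevation / np.maximum(counts, 1.0)
```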

    Siamese network based features fusion for adaptive visual tracking

    © Springer Nature Switzerland AG 2018. Visual object tracking is a popular but challenging problem in computer vision. The main challenge is the lack of prior knowledge about the tracking target, which may be supervised only by a bounding box given in the first frame. In addition, tracking suffers from many disturbances such as scale variations, deformations, partial occlusions, and motion blur. Solving such a challenging problem demands a tracking framework that adapts to different tracking scenes. This paper presents a novel approach to robust visual object tracking that fuses multiple features in a Siamese network. Hand-crafted appearance features and CNN features are combined so that each compensates for the other's weaknesses and reinforces its strengths. The proposed network proceeds as follows. First, different features are extracted from the tracking frames. Second, each extracted feature is passed through a Correlation Filter to learn a corresponding template, which is used to generate a response map. Finally, the multiple response maps are fused into a better response map, which helps locate the target more accurately. Comprehensive experiments were conducted on three benchmarks: Temple-Color, OTB50, and UAV123. Experimental results demonstrate that the proposed approach achieves state-of-the-art performance on these benchmarks.
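    The fusion step is described only at a high level; a minimal sketch of combining per-feature correlation-filter response maps by normalization and a weighted sum, then locating the target at the fused peak, might look like this (the min-max normalization and equal weights are assumptions, not the paper's exact scheme).

```python
import numpy as np

def fuse_response_maps(response_maps, weights=None):
    """Fuse per-feature correlation-filter response maps into a single map.

    Each map is min-max normalized so that hand-crafted and CNN responses are
    comparable, then combined by a (hypothetical) weighted sum.
    """
    if weights is None:
        weights = [1.0 / len(response_maps)] * len(response_maps)
    fused = np.zeros_like(response_maps[0], dtype=np.float64)
    for r, w in zip(response_maps, weights):
        r = r.astype(np.float64)
        r = (r - r.min()) / (r.max() - r.min() + 1e-12)   # min-max normalization
        fused += w * r
    return fused

def locate_target(fused_map):
    """Return the (row, col) of the peak response, i.e. the estimated target position."""
    return np.unravel_index(np.argmax(fused_map), fused_map.shape)
```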