2,523 research outputs found
Backbone Can Not be Trained at Once: Rolling Back to Pre-trained Network for Person Re-Identification
In person re-identification (ReID) task, because of its shortage of trainable
dataset, it is common to utilize fine-tuning method using a classification
network pre-trained on a large dataset. However, it is relatively difficult to
sufficiently fine-tune the low-level layers of the network due to the gradient
vanishing problem. In this work, we propose a novel fine-tuning strategy that
allows low-level layers to be sufficiently trained by rolling back the weights
of high-level layers to their initial pre-trained weights. Our strategy
alleviates the problem of gradient vanishing in low-level layers and robustly
trains the low-level layers to fit the ReID dataset, thereby increasing the
performance of ReID tasks. The improved performance of the proposed strategy is
validated via several experiments. Furthermore, without any add-ons such as
pose estimation or segmentation, our strategy exhibits state-of-the-art
performance using only vanilla deep convolutional neural network architecture.Comment: Accepted to AAAI 201
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows to monitor an area of interest by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of an area of interest and classifies all the known elements (e.g., persons) found on the ground by an improved Faster R-CNN architecture previously trained. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic by a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If present, the mosaic is updated. The second mode, allows to perform a real-time classification by using, again, our improved Faster R-CNN model, useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low-altitude, thus allowing the classification even of small objects. The proposed system was tested by using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection
Real-Time Online Human Tracking with a Stereo Camera for Person-Following Robots
Person-Following Robots have been studied for multiple decades now. Recently, person-following robots have relied on various sensors (e.g., radar, infrared, laser, ultrasonic, etc). However, these technologies lack the use of the most reliable information from visible colors (visible light cameras) for high-level perception; therefore, many of them are not stable when the robot is placed under complex environments (e.g., crowded scenes, occlusion, target disappearance, etc.). In this thesis, we are presenting three different approaches to track a human target for person-following robots in challenging situations (e.g., partial and full occlusions, appearance changes, pose changes, illumination changes, or distractor wearing the similar clothes, etc.) with a stereo depth camera. The newest tracker (SiamMDH, a Siamese convolutional neural network based tracker with temporary appearance model) implemented in this work achieves 98.92% accuracy with location error threshold 50 pixels and 92.94% success rate with IoU threshold 0.5 on our extensive person-following dataset
- …