1,684 research outputs found

    Smart environment monitoring through micro unmanned aerial vehicles

    Get PDF
    In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows to monitor an area of interest by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of an area of interest and classifies all the known elements (e.g., persons) found on the ground by an improved Faster R-CNN architecture previously trained. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic by a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If present, the mosaic is updated. The second mode, allows to perform a real-time classification by using, again, our improved Faster R-CNN model, useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low-altitude, thus allowing the classification even of small objects. The proposed system was tested by using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection

    People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting

    Get PDF
    In this paper we propose a technique to adapt a convolutional neural network (CNN) based object counter to additional visual domains and object types while still preserving the original counting function. Domain-specific normalisation and scaling operators are trained to allow the model to adjust to the statistical distributions of the various visual domains. The developed adaptation technique is used to produce a singular patch-based counting regressor capable of counting various object types including people, vehicles, cell nuclei and wildlife. As part of this study a challenging new cell counting dataset in the context of tissue culture and patient diagnosis is constructed. This new collection, referred to as the Dublin Cell Counting (DCC) dataset, is the first of its kind to be made available to the wider computer vision community. State-of-the-art object counting performance is achieved in both the Shanghaitech (parts A and B) and Penguins datasets while competitive performance is observed on the TRANCOS and Modified Bone Marrow (MBM) datasets, all using a shared counting model.Comment: 10 page

    Edge-Computing Deep Learning-Based Computer Vision Systems

    Get PDF
    Computer vision has become ubiquitous in today\u27s society, with applications ranging from medical imaging to visual diagnostics to aerial monitoring to self-driving vehicles and many more. Common to many of these applications are visual perception systems which consist of classification, localization, detection, and segmentation components, just to name a few. Recently, the development of deep neural networks (DNN) have led to great advancements in pushing state-of-the-art performance in each of these areas. Unlike traditional computer vision algorithms, DNNs have the ability to generalize features previously hand-crafted by engineers specific to the application; this assumption models the human visual system\u27s ability to generalize its surroundings. Moreover, convolutional neural networks (CNN) have been shown to not only match, but exceed performance of traditional computer vision algorithms as the filters of the network are able to learn important features present in the data. In this research we aim to develop numerous applications including visual warehouse diagnostics and shipping yard managements systems, aerial monitoring and tracking from the perspective of the drone, perception system model for an autonomous vehicle, and vehicle re-identification for surveillance and security. The deep learning models developed for each application attempt to match or exceed state-of-the-art performance in both accuracy and inference time; however, this is typically a trade-off when designing a network where one or the other can be maximized. We investigate numerous object-detection architectures including Faster R-CNN, SSD, YOLO, and a few other variations in an attempt to determine the best architecture for each application. We constrain performance metrics to only investigate inference times rather than training times as none of the optimizations performed in this research have any effect on training time. Further, we will also investigate re-identification of vehicles as a separate application add-on to the object-detection pipeline. Re-identification will allow for a more robust representation of the data while leveraging techniques for security and surveillance. We also investigate comparisons between architectures that could possibly lead to the development of new architectures with the ability to not only perform inference relatively quickly (or in close-to real-time), but also match the state-of-the-art in accuracy performance. New architecture development, however, depends on the application and its requirements; some applications need to run on edge-computing (EC) devices, while others have slightly larger inference windows which allow for cloud computing with powerful accelerators

    Close Formation Flight Missions Using Vision-Based Position Detection System

    Get PDF
    In this thesis, a formation flight architecture is described along with the implementation and evaluation of a state-of-the-art vision-based algorithm for solving the problem of estimating and tracking a leader vehicle within a close-formation configuration. A vision-based algorithm that uses Darknet architecture and a formation flight control law to track and follow a leader with desired clearance in forward, lateral directions are developed and implemented. The architecture is run on a flight computer that handles the process in real-time while integrating navigation sensors and a stereo camera. Numerical simulations along with indoor and outdoor actual flight tests demonstrate the capabilities of detection and tracking by providing a low cost, compact size and low weight solution for the problem of estimating the location of other cooperative or non-cooperative flying vehicles within a formation architecture

    Cyclist Detection, Tracking, and Trajectory Analysis in Urban Traffic Video Data

    Full text link
    The major objective of this thesis work is examining computer vision and machine learning detection methods, tracking algorithms and trajectory analysis for cyclists in traffic video data and developing an efficient system for cyclist counting. Due to the growing number of cyclist accidents on urban roads, methods for collecting information on cyclists are of significant importance to the Department of Transportation. The collected information provides insights into solving critical problems related to transportation planning, implementing safety countermeasures, and managing traffic flow efficiently. Intelligent Transportation System (ITS) employs automated tools to collect traffic information from traffic video data. In comparison to other road users, such as cars and pedestrians, the automated cyclist data collection is relatively a new research area. In this work, a vision-based method for gathering cyclist count data at intersections and road segments is developed. First, we develop methodology for an efficient detection and tracking of cyclists. The combination of classification features along with motion based properties are evaluated to detect cyclists in the test video data. A Convolutional Neural Network (CNN) based detector called You Only Look Once (YOLO) is implemented to increase the detection accuracy. In the next step, the detection results are fed into a tracker which is implemented based on the Kernelized Correlation Filters (KCF) which in cooperation with the bipartite graph matching algorithm allows to track multiple cyclists, concurrently. Then, a trajectory rebuilding method and a trajectory comparison model are applied to refine the accuracy of tracking and counting. The trajectory comparison is performed based on semantic similarity approach. The proposed counting method is the first cyclist counting method that has the ability to count cyclists under different movement patterns. The trajectory data obtained can be further utilized for cyclist behavioral modeling and safety analysis

    Aerial Object Detection using Learnable Bounding Boxes

    Get PDF
    Current methods in computer vision and object detection rely heavily on neural networks and deep learning. This active area of research is used in applications such as autonomous driving, aerial imaging, defense and surveillance. State-of-the-art object detection methods rely on rectangular shaped, horizontal/vertical bounding boxes drawn over an object to accurately localize its position. Such orthogonal bounding boxes ignore object pose, resulting in reduced object localization, and limiting downstream tasks such as object understanding and tracking. To overcome these limitations, this research presents object detection improvements that aid tighter and more precise detections. In particular, we modify the object detection anchor box definition to firstly include rotations along with height and width and secondly to allow arbitrary four corner point shapes. Further, the introduction of new anchor boxes gives the model additional freedom to model objects which are centered about a 45-degree axis of rotation. The resulting network allows minimum compromises in speed and reliability while providing more accurate localization. We present results on the DOTA dataset, showing the value of the flexible object boundaries, especially with rotated and non-rectangular objects

    Application-aware optimization of Artificial Intelligence for deployment on resource constrained devices

    Get PDF
    Artificial intelligence (AI) is changing people's everyday life. AI techniques such as Deep Neural Networks (DNN) rely on heavy computational models, which are in principle designed to be executed on powerful HW platforms, such as desktop or server environments. However, the increasing need to apply such solutions in people's everyday life has encouraged the research for methods to allow their deployment on embedded, portable and stand-alone devices, such as mobile phones, which exhibit relatively low memory and computational resources. Such methods targets both the development of lightweight AI algorithms and their acceleration through dedicated HW. This thesis focuses on the development of lightweight AI solutions, with attention to deep neural networks, to facilitate their deployment on resource constrained devices. Focusing on the computer vision field, we show how putting together the self learning ability of deep neural networks with application-specific knowledge, in the form of feature engineering, it is possible to dramatically reduce the total memory and computational burden, thus allowing the deployment on edge devices. The proposed approach aims to be complementary to already existing application-independent network compression solutions. In this work three main DNN optimization goals have been considered: increasing speed and accuracy, allowing training at the edge, and allowing execution on a microcontroller. For each of these we deployed the resulting algorithm to the target embedded device and measured its performance
    corecore