7,328 research outputs found

    A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images

    Semantic segmentation is the pixel-wise labelling of an image. Since the problem is defined at the pixel level, determining image-level class labels alone is not sufficient; localising them at the original image pixel resolution is necessary. Boosted by the extraordinary ability of convolutional neural networks (CNNs) to create semantic, high-level and hierarchical image features, a vast number of deep learning-based 2D semantic segmentation approaches have been proposed within the last decade. In this survey, we focus on recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images. We start with an analysis of the public image sets and leaderboards for 2D semantic segmentation, with an overview of the techniques employed in performance evaluation. In examining the evolution of the field, we chronologically categorise the approaches into three main periods, namely the pre- and early deep learning era, the fully convolutional era, and the post-FCN era. We technically analyse the solutions put forward in terms of solving the fundamental problems of the field, such as fine-grained localisation and scale invariance. Before drawing our conclusions, we present a table of methods from all mentioned eras, with a brief summary of each approach that explains its contribution to the field. We conclude the survey by discussing the current challenges of the field and to what extent they have been solved.
    Comment: Updated with new studies.
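
    As a concrete illustration of the evaluation techniques such surveys review: the metric most commonly reported on 2D segmentation leaderboards is mean Intersection-over-Union (mIoU). The sketch below is a minimal NumPy implementation for integer label maps; it is illustrative and not taken from the survey.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection-over-Union over the classes present in either map.

    pred, gt: integer label maps of identical shape (H, W)."""
    ious = []
    for c in range(num_classes):
        pred_c, gt_c = pred == c, gt == c
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:  # class absent from both maps: skip it
            continue
        inter = np.logical_and(pred_c, gt_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 example with two classes: IoUs are 1/2 and 2/3, mIoU ~= 0.583.
pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 0], [1, 1]])
print(mean_iou(pred, gt, num_classes=2))
```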

    Automatic classification of abandoned objects for surveillance of public premises

    One of the core components of any visual surveillance system is object classification, where detected objects are classified into different categories of interest. Although in airports or train stations abandoned objects are mainly luggage or trolleys, none of the existing works in the literature have attempted to classify or recognize trolleys. In this paper, we analyzed and classified images of trolley(s), bag(s), single person(s), and group(s) of people using various shape features on a number of uncluttered and cluttered images, and applied multiframe integration to overcome partial occlusions and obtain better recognition results. We also tested the proposed techniques on data extracted from a well-recognized and recent data set, the PETS 2007 benchmark data set [16]. Our experimental results show that the features extracted are invariant to the data set and classification scheme chosen. For our four-class object recognition problem, we achieved an average recognition accuracy of 70%. © 2008 IEEE
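
    The paper's multiframe integration step is one concrete, easily reproduced idea: rather than trusting a single-frame decision that may be corrupted by partial occlusion, per-frame classifications of the same tracked object are fused over time. The sketch below uses simple majority voting as one plausible fusion rule; the paper's exact scheme may differ.

```python
from collections import Counter

def integrate_multiframe(frame_labels):
    """Fuse per-frame class decisions for one tracked object by majority vote.

    frame_labels: class names predicted for the same object in successive
    frames, e.g. by a shape-feature classifier."""
    if not frame_labels:
        raise ValueError("no per-frame decisions to integrate")
    label, votes = Counter(frame_labels).most_common(1)[0]
    return label, votes / len(frame_labels)

# Occlusion corrupts two single-frame decisions, but the fused
# decision is still 'trolley' with confidence ~0.67.
print(integrate_multiframe(
    ["trolley", "trolley", "bag", "trolley", "person", "trolley"]))
```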

    Video analytics for security systems

    This study has been conducted to develop robust event detection and object tracking algorithms that can be implemented in real-time video surveillance applications. The aim of the research has been to produce an automated video surveillance system that is able to detect and report potential security risks with minimum human intervention. Since the algorithms are designed to be implemented in real-life scenarios, they must be able to cope with strong illumination changes and occlusions. The thesis is divided into two major sections. The first section deals with event detection and edge-based tracking, while the second section describes colour measurement methods developed to track objects in crowded environments.

    The event detection methods presented in the thesis mainly focus on detection and tracking of objects that become stationary in the scene. Objects such as baggage left in public places or vehicles parked illegally can pose a serious security threat. A new pixel-based classification technique has been developed to detect objects of this type in cluttered scenes. Once detected, edge-based object descriptors are obtained and stored as templates for tracking purposes. The consistency of these descriptors is examined using an adaptive edge-orientation-based technique. Objects are tracked, and alarm events are generated if the objects are found to be stationary in the scene after a certain period of time. To evaluate the full capabilities of the pixel-based classification and adaptive edge-orientation-based tracking methods, the model is tested on several hours of real-life video surveillance scenarios recorded at different locations and times of day, from our own and publicly available databases (i-LIDS, PETS, MIT, ViSOR). The performance results demonstrate that the combination of pixel-based classification and adaptive edge-orientation-based tracking achieved a success rate of over 95%. The results obtained also show better detection and tracking performance when compared with other available state-of-the-art methods.

    In the second part of the thesis, colour-based techniques are used to track objects in crowded video sequences under severe occlusion. A novel Adaptive Sample Count Particle Filter (ASCPF) technique is presented that improves the performance of the standard Sample Importance Resampling Particle Filter by up to 80% in terms of computational cost. An appropriate particle range is obtained for each object, and the concept of adaptive samples is introduced to keep the computational cost down. The objective is to keep the number of particles to a minimum and only to increase it up to the maximum as and when required. Variable standard deviation values for state vector elements have been exploited to cope with heavy occlusion. The technique has been tested on different video surveillance scenarios with variable object motion, strong occlusion and changes in object scale. Experimental results show that the proposed method not only tracks objects with accuracy comparable to existing particle filter techniques but is up to five times faster.

    Tracking objects in a multi-camera environment is discussed in the final part of the thesis. The ASCPF technique is deployed within a multi-camera environment to track objects across different camera views. Such environments can pose difficult challenges, such as changes in object scale and colour features as objects move from one camera view to another. Variable standard deviation values of the ASCPF have been utilised to cope with sudden colour and scale changes. As the object moves from one scene to another, the number of particles, together with the spread value, is increased to a maximum to reduce any effects of scale and colour change. Promising results are obtained when the ASCPF technique is tested on live feeds from four different camera views. It was found that not only did the ASCPF method successfully track the moving object across different views, but it also maintained the real-time frame rate due to its reduced computational cost, indicating that the method is a potential practical solution for multi-camera tracking applications.
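
    The core mechanism of the ASCPF is easy to state: hold the particle count at a per-object minimum while tracking is easy, and grow it toward a per-object maximum (together with wider state-noise standard deviations) when occlusion or scale change degrades tracking. The sketch below is an illustrative version of that adaptation rule; the thresholds, step size and confidence measure are assumptions, not values from the thesis.

```python
def adapt_particle_count(n, confidence, n_min, n_max,
                         low=0.3, high=0.7, step=50):
    """Grow the sample count toward n_max when tracking confidence drops
    (e.g. under occlusion); shrink it back toward n_min to save compute.
    Thresholds and step size are illustrative assumptions."""
    if confidence < low:
        return min(n + step, n_max)
    if confidence > high:
        return max(n - step, n_min)
    return n

def adapt_noise_std(base_std, confidence, max_scale=3.0):
    """Widen the state-transition noise when confidence is low, so particles
    spread out and can re-acquire a heavily occluded or rescaled target."""
    confidence = min(max(confidence, 0.0), 1.0)
    return base_std * (1.0 + (max_scale - 1.0) * (1.0 - confidence))

# Example: confidence collapses under occlusion, so the filter spends more.
n = adapt_particle_count(100, confidence=0.1, n_min=50, n_max=400)
print(n, adapt_noise_std(2.0, confidence=0.1))  # 150 particles, wider noise
```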

    Using Unmanned Aerial Systems for Deriving Forest Stand Characteristics in Mixed Hardwoods of West Virginia

    Forest inventory information is a principal driver of forest management decisions. Information gathered through these inventories provides a summary of the condition of forested stands. The means by which remote sensing aids land managers are changing rapidly. Imagery produced from unmanned aerial systems (UAS) offers high temporal and spatial resolution for small-scale forest management. UAS imagery is less expensive and easier to coordinate to meet project needs than traditional manned aerial imagery. This study focused on producing an efficient and approachable workflow for deriving forest stand board volume estimates from UAS imagery in mixed hardwood stands of West Virginia. A supplementary aim of this project was to evaluate which season was best for collecting imagery for forest inventory. True-color imagery was collected with a DJI Phantom 3 Professional UAS and processed in Agisoft PhotoScan Professional. Automated tree crown segmentation was performed with Trimble eCognition Developer's multi-resolution segmentation function, with manual optimization of parameters through an iterative process. Individual tree volume metrics were derived from field data relationships, and volume estimates were processed in EZ CRUZ forest inventory software. The software, at best, correctly segmented 43% of the individual tree crowns. No correlation between season of imagery acquisition and quality of segmentation was found. Volume and other stand characteristics were not accurately estimated, owing to the poor segmentation. However, the imagery was able to capture canopy gaps consistently and provide a visualization of forest health. Difficulties, successes and the time required for these procedures were thoroughly noted.
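
    The abstract notes that individual tree volumes were derived from field data relationships and then rolled up to stand-level board volume. A minimal sketch of that final step is shown below, using a generic allometric form V = a * DBH^b * H^c with placeholder coefficients; the study's actual fitted relationships are not given here.

```python
def tree_volume_bdft(dbh_in, height_ft, a=0.002, b=2.0, c=1.0):
    """Hypothetical allometric board-foot volume model, V = a * DBH^b * H^c.

    dbh_in: diameter at breast height (inches); height_ft: merchantable
    height (feet). a, b, c are placeholders to be fitted to field data."""
    return a * dbh_in**b * height_ft**c

# Stand-level estimate: sum per-tree volumes over the segmented crowns,
# each crown having been matched to field-derived DBH/height metrics.
trees = [(14.2, 62.0), (18.5, 71.0), (11.0, 55.0)]  # (DBH in, height ft)
print(sum(tree_volume_bdft(d, h) for d, h in trees))
```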

    Visual Crowd Analysis: Open Research Problems

    Over the last decade, there has been a remarkable surge in interest in automated crowd monitoring within the computer vision community. Modern deep-learning approaches have made it possible to develop fully automated vision-based crowd-monitoring applications. However, despite the magnitude of the issue at hand, the significant technological advancements, and the consistent interest of the research community, there are still numerous challenges that need to be overcome. In this article, we delve into six major areas of visual crowd analysis, emphasizing the key developments in each of these areas. We outline the crucial unresolved issues that must be tackled in future works, in order to ensure that the field of automated crowd monitoring continues to progress and thrive. Several surveys related to this topic have been conducted in the past. Nonetheless, this article thoroughly examines and presents a more intuitive categorization of works, while also concisely covering the latest breakthroughs within the field, incorporating studies carried out within the last few years. By carefully choosing prominent works with significant contributions in terms of novelty or performance gains, this paper presents a more comprehensive exposition of advancements in the current state-of-the-art.
    Comment: Accepted in AI Magazine, published by Wiley Periodicals LLC on behalf of the Association for the Advancement of Artificial Intelligence.
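
    One of the key developments such articles cover is crowd counting, where modern deep models regress a density map whose integral is the crowd count. The toy sketch below shows that paradigm's central identity (count = sum of the density map) on a synthetic target; it is illustrative and not code from any surveyed work.

```python
import numpy as np

def count_from_density_map(density):
    """In density-map crowd counting, the map integrates (sums) to the count,
    because each head is rendered in the target as a unit-mass Gaussian."""
    return float(np.sum(density))

def unit_gaussian(shape, cy, cx, sigma=2.0):
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
    return g / g.sum()  # normalize to unit mass

# Three people at known head positions on a 64x64 grid -> count of 3.
target = sum(unit_gaussian((64, 64), cy, cx)
             for cy, cx in [(10, 12), (30, 40), (50, 20)])
print(round(count_from_density_map(target)))  # 3
```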

    FlowerPhenoNet: Automated Flower Detection from Multi-View Image Sequences Using Deep Neural Networks for Temporal Plant Phenotyping Analysis

    A phenotype is the composite of an observable expression of a genome for traits in a given environment. The trajectories of phenotypes computed from an image sequence, and the timing of important events in a plant's life cycle, can be viewed as temporal phenotypes indicative of the plant's growth pattern and vigor. In this paper, we introduce a novel method called FlowerPhenoNet, which uses deep neural networks to detect flowers from multi-view image sequences for high-throughput temporal plant phenotyping analysis. Following flower detection, a set of novel flower-based phenotypes are computed, e.g., the day of emergence of the first flower in a plant's life cycle, the total number of flowers present in the plant at a given time, the highest number of flowers bloomed in the plant, the growth trajectory of a flower, and the blooming trajectory of a plant. To develop a new algorithm and facilitate performance evaluation based on experimental analysis, a benchmark dataset is indispensable. Thus, we introduce a benchmark dataset called FlowerPheno, which comprises image sequences of three flowering plant species, namely sunflower, coleus, and canna, captured by a visible-light camera in a high-throughput plant phenotyping platform from multiple view angles. The experimental analyses on the FlowerPheno dataset demonstrate the efficacy of FlowerPhenoNet.
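
    Several of the flower-based phenotypes listed above follow directly from per-day detection counts. The sketch below assumes a simple mapping from imaging day to the number of flowers detected that day (an assumed interface, not the paper's API) and derives the day of first-flower emergence, the peak flower count, and the blooming trajectory.

```python
def temporal_flower_phenotypes(daily_counts):
    """Derive temporal phenotypes from per-day flower-detection counts.

    daily_counts: dict mapping imaging day (int) to the number of flowers
    detected in the plant on that day, e.g. by a detector such as
    FlowerPhenoNet (assumed interface, not the paper's API)."""
    days = sorted(daily_counts)
    first_emergence = next((d for d in days if daily_counts[d] > 0), None)
    return {
        "first_flower_day": first_emergence,
        "max_flowers": max(daily_counts.values(), default=0),
        "blooming_trajectory": [(d, daily_counts[d]) for d in days],
    }

print(temporal_flower_phenotypes({1: 0, 3: 0, 5: 1, 7: 4, 9: 3}))
# {'first_flower_day': 5, 'max_flowers': 4, 'blooming_trajectory': [...]}
```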