15 research outputs found

    ON THE USE OF DEEP NEURAL NETWORKS FOR THE DETECTION OF SMALL VEHICLES IN ORTHO-IMAGES

    Get PDF
    International audienceThis paper addresses the question of the detection of small targets (vehicles) in ortho-images. This question differs from the general task of detecting objects in images by several aspects. First, the vehicles to be detected are small, typically smaller than 20x20 pixels. Second, due to the multifarious-ness of the landscapes of the earth, several pixel structures similar to that of a vehicle might emerge (roof tops, shadow patterns, rocks, buildings), whereas within the vehicle class the inter-class variability is limited as they all look alike from afar. Finally, the imbalance between the vehicles and the rest of the picture is enormous in most cases. Specifically, this paper is focused on the detection tasks introduced by the VeDAI dataset [1]. This work supports an extensive study of the problems one might face when applying deep neural networks with low resolution and scarce data and proposes some solutions. One of the contributions of this paper is a network severely outperforming the state-of-the-art while being much simpler to implement and a lot faster than competitive approaches. We also list the limitations of this approach and provide several new ideas to further improve our results

    Online fault detection based on typicality and eccentricity data analytics

    Get PDF
    Fault detection is a task of major importance in industry nowadays, since that it can considerably reduce the risk of accidents involving human lives, in addition to production and, consequently, financial losses. Therefore, fault detection systems have been largely studied in the past few years, resulting in many different methods and approaches to solve such problem. This paper presents a detailed study on fault detection on industrial processes based on the recently introduced eccentricity and typicality data analytics (TEDA) approach. TEDA is a recursive and non-parametric method, firstly proposed to the general problem of anomaly detection on data streams. It is based on the measures of data density and proximity from each read data point to the analyzed data set. TEDA is an online autonomous learning algorithm that does not require a priori knowledge about the process, is completely free of user- and problem-defined parameters, requires very low computational effort and, thus, is very suitable for real-time applications. The results further presented were generated by the application of TEDA to a pilot plant for industrial process

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Full text link
    Vehicle classification is a hot computer vision topic, with studies ranging from ground-view up to top-view imagery. In remote sensing, the usage of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and others. However, there are some difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for this task, (b) creating instance segmentation datasets is laborious, and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure considered: (1) label a small number of vehicles, (2) train on those samples, (3) use the model to classify the entire image, (4) convert the image prediction into a polygon shapefile, (5) correct some areas with errors and include them in the training data, and (6) repeat until results are satisfactory. To separate instances, we considered vehicle interior and vehicle borders, and the DL model was the U-net with the Efficient-net-B7 backbone. When removing the borders, the vehicle interior becomes isolated, allowing for unique object identification. To recover the deleted 1-pixel borders, we proposed a simple method to expand each prediction. The results show better pixel-wise metrics when compared to the Mask-RCNN (82% against 67% in IoU). On per-object analysis, the overall accuracy, precision, and recall were greater than 90%. This pipeline applies to any remote sensing target, being very efficient for segmentation and generating datasets.Comment: 38 pages, 10 figures, submitted to journa

    Object Detection in 20 Years: A Survey

    Full text link
    Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

    Recognizing Objects And Reasoning About Their Interactions

    Get PDF
    The task of scene understanding involves recognizing the different objects present in the scene, segmenting the scene into meaningful regions, as well as obtaining a holistic understanding of the activities taking place in the scene. Each of these problems has received considerable interest within the computer vision community. We present contributions to two aspects of visual scene understanding. First we explore multiple methods of feature selection for the problem of object detection. We demonstrate the use of Principal Component Analysis to detect avifauna in field observation videos. We improve on existing approaches by making robust decisions based on regional features and by a feature selection strategy that chooses different features in different parts of the image. We then demonstrate the use of Partial Least Squares to detect vehicles in aerial and satellite imagery. We propose two new feature sets; Color Probability Maps are used to capture the color statistics of vehicles and their surroundings, and Pairs of Pixels are used to capture captures the structural characteristics of objects. A powerful feature selection analysis based on Partial Least Squares is employed to deal with the resulting high dimensional feature space (almost 70,000 dimensions). We also propose an Incremental Multiple Kernel Learning (IMKL) scheme to detect vehicles in a traffic surveillance scenario. Obtaining task and scene specific datasets of visual categories is far more tedious than obtaining a generic dataset of the same classes. Our IMKL approach initializes on a generic training database and then tunes itself to the classification task at hand. Second, we develop a video understanding system for scene elements, such as bus stops, crosswalks, and intersections, that are characterized more by qualitative activities and geometry than by intrinsic appearance. The domain models for scene elements are not learned from a corpus of video, but instead, naturally elicited by humans, and represented as probabilistic logic rules within a Markov Logic Network framework. Human elicited models, however, represent object interactions as they occur in the 3D world rather than describing their appearance projection in some specific 2D image plane. We bridge this gap by recovering qualitative scene geometry to analyze object interactions in the 3D world and then reasoning about scene geometry, occlusions and common sense domain knowledge using a set of meta-rules
    corecore