820 research outputs found

    Optimum Pipeline for Visual Terrain Classification Using Improved Bag of Visual Words and Fusion Methods

    We propose an optimum pipeline and develop a hybrid representation to produce an effective and efficient visual terrain classification system. The bag of visual words (BOVW) framework has emerged as a promising and effective paradigm for visual terrain classification. The method comprises four main steps: (1) feature extraction, (2) codebook generation, (3) feature coding, and (4) pooling and normalization. Recent research has primarily focused on feature extraction, developing new handcrafted descriptors specific to visual terrain. However, the effects of the other steps on visual terrain classification are still unknown. At the same time, fusion methods are often used to boost classification performance by exploiting the complementarity of diverse features. We provide a comprehensive study of all steps in the BOVW framework and of different fusion methods for visual terrain classification. Multiple approaches in each step and their effects are then explored on the visual terrain dataset. Finally, the feature preprocessing technique, improved BOVW framework, and fusion method are combined into an optimum pipeline for visual terrain classification. The hybrid representation produced by this pipeline performs effectively and rapidly on the terrain dataset, outperforming current methods, and is robust to diverse noises and illumination changes.
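
    As an illustration of the four BOVW steps listed above, the sketch below strings together SIFT feature extraction, k-means codebook generation, hard-assignment feature coding, and average pooling with L2 normalization. The descriptor choice, codebook size, and other parameters are assumptions for illustration, not the configuration used in the paper.

```python
# Illustrative BOVW pipeline covering the four steps named in the abstract:
# SIFT descriptors, k-means codebook, hard vector quantization, and average
# pooling with L2 normalization. All parameters are assumptions.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def extract_features(images):
    """Step 1: local feature extraction (SIFT here as a stand-in descriptor)."""
    sift = cv2.SIFT_create()
    all_descs = []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, descs = sift.detectAndCompute(gray, None)
        all_descs.append(descs if descs is not None else np.empty((0, 128)))
    return all_descs

def build_codebook(all_descs, k=256):
    """Step 2: codebook generation by k-means over the pooled descriptors."""
    stacked = np.vstack(all_descs)
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(stacked)

def encode(descs, codebook):
    """Steps 3-4: hard-assignment coding, then average pooling and L2 norm."""
    k = codebook.n_clusters
    if len(descs) == 0:
        return np.zeros(k)
    words = codebook.predict(descs)
    hist = np.bincount(words, minlength=k).astype(float)
    hist /= len(descs)                             # average pooling
    return hist / (np.linalg.norm(hist) + 1e-12)   # L2 normalization
```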

    On Semantic Segmentation and Path Planning for Autonomous Vehicles within Off-Road Environments

    There are many challenges involved in creating a fully autonomous vehicle capable of safely navigating through off-road environments. In this work, we focus on two of the most prominent such challenges, namely scene understanding and path planning. Scene understanding is a challenging computer vision task in which recent advances in convolutional neural networks (CNN) have achieved results that notably surpass prior traditional feature-driven approaches. Here, we build on recent work in urban road-scene understanding, training a state-of-the-art CNN architecture for the task of classifying off-road scenes. We analyse the effects of transfer learning and training data set size on CNN performance, evaluating multiple configurations of the network at multiple points during the training cycle and investigating in depth how the training process is affected. We compare this CNN to a more traditional feature-driven approach with a Support Vector Machine (SVM) classifier and demonstrate state-of-the-art results on this particularly challenging problem of off-road scene understanding. We then expand on this with the addition of multi-channel RGBD data, which we encode in multiple configurations for CNN input. We evaluate each of these configurations on our own off-road RGBD data set and compare performance to that of the network model trained using RGB data. Next, we investigate end-to-end navigation, whereby a machine learning algorithm is optimised to predict the vehicle control inputs of a human driver. After evaluating such a technique in an off-road environment and identifying several limitations, we propose a new approach in which a CNN learns to predict the vehicle path visually, combining a novel approach to automatic training data creation with a state-of-the-art CNN architecture to map a predicted route directly onto image pixels. We then evaluate this approach using our off-road data set and demonstrate effectiveness surpassing existing end-to-end methods.
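
    As a rough illustration of the transfer-learning setup described above, the sketch below fine-tunes an ImageNet-pretrained segmentation CNN on an off-road labelling task, using a smaller learning rate for the pretrained backbone than for the new classification head. The architecture, class count, and learning rates are assumptions and are not taken from the thesis.

```python
# Illustrative transfer learning for off-road scene labelling: a standard
# segmentation model with an ImageNet-pretrained backbone is fine-tuned on a
# (hypothetical) off-road dataset. Configuration values are assumptions.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_CLASSES = 8  # assumed number of off-road classes (sky, grass, track, ...)

model = deeplabv3_resnet50(weights_backbone="IMAGENET1K_V1",
                           num_classes=NUM_CLASSES)

# Common transfer-learning heuristic: lower learning rate for the pretrained
# backbone than for the freshly initialised segmentation head.
optimizer = torch.optim.SGD(
    [
        {"params": model.backbone.parameters(), "lr": 1e-4},
        {"params": model.classifier.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
criterion = nn.CrossEntropyLoss(ignore_index=255)  # 255 = unlabelled pixels

def train_step(images, labels):
    """One fine-tuning step: images are Bx3xHxW tensors, labels BxHxW ids."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]   # BxCxHxW class scores
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```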

    Recent Advances in Image Restoration with Applications to Real World Problems

    In the past few decades, imaging hardware has improved tremendously in terms of resolution, enabling the widespread use of images in many diverse applications in Earth and planetary missions. However, practical issues associated with image acquisition still affect image quality. Issues such as blurring, measurement noise, mosaicing artifacts, and low spatial or spectral resolution can seriously affect the accuracy of these applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, including image super-resolution, image fusion to enhance spatial, spectral, and temporal resolutions, and the generation of synthetic images using deep learning techniques. Some practical applications are also included.

    Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles

    Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems. In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles. For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal. The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain, when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE.
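
    The occupancy grid mapping step mentioned above can be illustrated with a minimal 2D log-odds grid that accumulates per-point obstacle classifications over successive scans. The cell size, log-odds increments, and threshold below are illustrative assumptions, not the values used in the thesis.

```python
# Minimal 2D occupancy grid for fusing point-wise obstacle classifications
# over time, in the spirit of the global mapping step described above.
# World extent, cell size, and log-odds increments are assumptions.
import numpy as np

class OccupancyGrid:
    def __init__(self, size_m=100.0, cell_m=0.2):
        n = int(size_m / cell_m)
        self.cell_m = cell_m
        self.origin = size_m / 2.0          # vehicle starts at the grid centre
        self.log_odds = np.zeros((n, n))

    def update(self, points_xy, is_obstacle, l_occ=0.85, l_free=-0.4):
        """points_xy: Nx2 world coordinates; is_obstacle: N boolean labels
        produced by the point-wise classifier."""
        idx = ((points_xy + self.origin) / self.cell_m).astype(int)
        inside = (idx >= 0).all(axis=1) & (idx < self.log_odds.shape[0]).all(axis=1)
        idx, obs = idx[inside], is_obstacle[inside]
        # Accumulate evidence for obstacle and traversable cells separately.
        np.add.at(self.log_odds, (idx[obs, 0], idx[obs, 1]), l_occ)
        np.add.at(self.log_odds, (idx[~obs, 0], idx[~obs, 1]), l_free)

    def obstacle_mask(self, threshold=1.0):
        """Cells treated as obstacles after accumulating enough evidence."""
        return self.log_odds > threshold
```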

    Semantic Segmentation in Long-term Visual Localization

    This thesis has five main goals. First, it surveys the datasets used for long-term visual localization and selects viable datasets for further evaluation. Next, one of the current state-of-the-art pipelines is selected and enhanced; carefully fine-tuning the method's parameters yields better localization results. Furthermore, it is shown that dynamic objects in an image are unnecessary for long-term visual localization, because they do not contain any helpful information and can be ignored. The fourth goal of this thesis is to embed semantic segmentation information into the SuperPoint keypoint detector and descriptor by editing the training data. Finally, new state-of-the-art results on a selected dataset are achieved by applying a novel keypoint filtering approach based on semantic segmentation information. This work demonstrates the importance of analysing the underlying image information in long-term visual localization and in keypoint detection in general.
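
    The semantic keypoint filtering idea can be sketched as follows: keypoints whose pixel falls on a semantic class considered dynamic are discarded before matching. The class ids below follow the Cityscapes convention as an assumption; the actual label set and the SuperPoint detector used in the thesis are not reproduced here.

```python
# Illustrative semantic keypoint filtering: drop keypoints lying on classes
# considered dynamic (and hence uninformative for long-term localization).
# Class ids follow the Cityscapes train-id convention as an assumption.
import numpy as np

DYNAMIC_CLASS_IDS = {11, 12, 13, 14, 15, 16, 17, 18}  # person, rider, car, ...

def filter_keypoints(keypoints, descriptors, seg_mask):
    """keypoints: Nx2 array of (x, y) pixel coordinates,
    descriptors: NxD array, seg_mask: HxW array of semantic class ids.
    Returns only the keypoints/descriptors not lying on dynamic objects."""
    xs = keypoints[:, 0].round().astype(int)
    ys = keypoints[:, 1].round().astype(int)
    labels = seg_mask[ys, xs]
    keep = ~np.isin(labels, list(DYNAMIC_CLASS_IDS))
    return keypoints[keep], descriptors[keep]
```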

    Machine Learning in Image Analysis and Pattern Recognition

    This book charts the progress in applying machine learning, including deep learning, to a broad range of image analysis and pattern recognition problems and applications. In it, we have assembled original research articles making unique contributions to the theory, methodology, and applications of machine learning in image analysis and pattern recognition.

    Visual Recognition of Human Rights Violations

    This thesis is concerned with the automation of human rights violation recognition in images. Solving this problem would be extremely beneficial to human rights organisations and investigators, who are often interested in identifying and documenting potential violations of human rights within images; it would allow them to avoid the overwhelming task of analysing large volumes of images manually. However, visual recognition of human rights violations is challenging and previously unattempted. Using computer vision, this thesis forges the notion of visual recognition of human rights violations while strongly considering the constraints on usability and flexibility imposed by real practice. Firstly, image datasets of human rights violations suitable for training and testing modern visual representations, such as convolutional neural networks (CNNs), are introduced for the first time. Secondly, we develop and apply transfer learning models specific to the human rights violation recognition problem. Various fusion methods are proposed for performing an equivalence and complementarity analysis of object-centric and scene-centric deep image representations for the task of human rights violation recognition. Additionally, a web demo for predicting human rights violations that may be used directly by human rights advocates and analysts is developed. Next, the problem of recognising displaced people from still images is considered. To solve this, a novel mechanism centred on the level of control each person feels over the situation is developed. By leveraging this mechanism, typical image classification is turned into a uniform framework that infers potentially displaced people from images. Finally, a human-centric approach for recognising rich information about two emotional states is proposed. The derived global emotional traits are harnessed alongside a data-driven CNN classifier to efficiently infer two of the most widespread modern abuses of human rights: child labour and displaced populations.
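
    The fusion of object-centric and scene-centric deep image representations can be illustrated with a simple late-fusion sketch: penultimate-layer features from two pretrained backbones are concatenated and fed to a linear classifier. The backbones, feature dimensions, and class count below are assumptions for illustration, not the models used in the thesis.

```python
# Illustrative late fusion of object-centric and scene-centric features:
# concatenate penultimate-layer activations from two backbones and train a
# small classifier on top. Backbones and class count are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class LateFusionClassifier(nn.Module):
    def __init__(self, scene_backbone, num_classes=4):
        super().__init__()
        # Object-centric branch: ImageNet-pretrained ResNet-50.
        object_backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
        # Strip the final fc layers so both branches output 2048-d features.
        self.object_net = nn.Sequential(*list(object_backbone.children())[:-1])
        self.scene_net = nn.Sequential(*list(scene_backbone.children())[:-1])
        self.classifier = nn.Linear(2048 * 2, num_classes)

    def forward(self, x):
        obj = torch.flatten(self.object_net(x), 1)
        scn = torch.flatten(self.scene_net(x), 1)
        return self.classifier(torch.cat([obj, scn], dim=1))

# Usage sketch: pass a scene-centric ResNet-50 (e.g. one pretrained on a
# scene dataset such as Places, loaded separately) as scene_backbone, then
# fine-tune the linear classifier on the target labels.
```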

    Very High Resolution (VHR) Satellite Imagery: Processing and Applications

    Recently, growing interest has emerged in the use of remote sensing imagery to provide synoptic maps of water quality parameters in coastal and inland water ecosystems; to monitor complex land ecosystems for biodiversity conservation; for precision agriculture in the management of soils, crops, and pests; for urban planning; for disaster monitoring; etc. However, for these maps to achieve their full potential, it is important to engage in periodic monitoring and analysis of multi-temporal changes. In this context, very high resolution (VHR) satellite-based optical, infrared, and radar imaging instruments provide reliable information to implement spatially-based conservation actions. Moreover, they enable observations of parameters of our environment at broader spatial and finer temporal scales than those allowed through field observation alone. In this sense, recent very high resolution satellite technologies and image processing algorithms present the opportunity to develop quantitative techniques that have the potential to improve upon traditional techniques in terms of cost, mapping fidelity, and objectivity. Typical applications include multi-temporal classification, recognition and tracking of specific patterns, multisensor data fusion, analysis of land/marine ecosystem processes, environmental monitoring, etc. This book aims to collect new developments, methodologies, and applications of very high resolution satellite data for remote sensing. The selected works provide the research community with the most recent advances on all aspects of VHR satellite remote sensing.