3,159 research outputs found

    Autonomous navigation for guide following in crowded indoor environments

    No full text
    The requirements for assisted living are rapidly changing as the number of elderly patients over the age of 60 continues to increase. This rise places a high level of stress on nurse practitioners, who must care for more patients than they can manage. As this trend is expected to continue, new technology will be required to help care for patients. Mobile robots present an opportunity to help alleviate the stress on nurse practitioners by monitoring and performing remedial tasks for elderly patients. In order to produce mobile robots with the ability to perform these tasks, however, many challenges must be overcome. The hospital environment requires a high level of safety to prevent patient injury. Any facility that uses mobile robots, therefore, must be able to ensure that no harm will come to patients whilst in a care environment. This requires the robot to build a high level of understanding of the environment and the people in close proximity to the robot. Hitherto, most mobile robots have used vision-based sensors or 2D laser range finders. 3D time-of-flight sensors have recently been introduced and provide dense 3D point clouds of the environment at real-time frame rates, giving mobile robots previously unavailable dense information in real time. In this thesis, I investigate the use of time-of-flight cameras for mobile robot navigation in crowded environments. A unified framework that allows the robot to follow a guide through an indoor environment safely and efficiently is presented. Each component of the framework is analyzed in detail, with real-world scenarios illustrating its practical use. Time-of-flight cameras are relatively new sensors and therefore have inherent problems that must be overcome to obtain consistent and accurate data. I propose a novel and practical probabilistic framework to overcome many of these inherent problems. The framework fuses multiple depth maps with color information, forming a reliable and consistent view of the world. In order for the robot to interact with the environment, contextual information is required. To this end, I propose a region-growing segmentation algorithm that groups points based on surface characteristics, namely surface normal and surface curvature. The segmentation process creates a distinct set of surfaces; however, only a limited amount of contextual information is available to allow for interaction. Therefore, a novel classifier is proposed that uses spherical harmonics to differentiate people from all other objects. The added ability to identify people allows the robot to find potential candidates to follow. However, for safe navigation, the robot must continuously track all visible objects to obtain positional and velocity information. A multi-object tracking system is investigated to track visible objects reliably using multiple cues: shape and color. The tracking system allows the robot to react to the dynamic nature of people by building an estimate of the motion flow. This flow provides the robot with the information needed to determine where, and at what speeds, it is safe to drive. In addition, a novel search strategy is proposed to allow the robot to recover a guide who has left the field of view. To achieve this, a search map is constructed, with areas of the environment ranked according to how likely they are to reveal the guide's true location. The robot can then approach the most likely search area to recover the guide.
Finally, all of the presented components are combined to follow a guide through an indoor environment. The results achieved demonstrate the efficacy of the proposed components
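
To make the segmentation step concrete, here is a minimal sketch of curvature-seeded region growing over a point cloud (a generic formulation with illustrative thresholds and neighborhood size; unit normals are assumed, and this is not the thesis's actual code):

```python
import numpy as np
from scipy.spatial import cKDTree

def region_grow(points, normals, curvature,
                k=16, angle_thresh_deg=10.0, curv_thresh=0.05):
    """Group points into smooth surface patches (sketch).

    Seeds are visited flattest-first; a neighbor joins the current
    region if its (unit) normal deviates from the seed's by less than
    angle_thresh_deg, and low-curvature members keep growing the
    region. Thresholds here are illustrative, not the thesis's values.
    """
    tree = cKDTree(points)
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    labels = np.full(len(points), -1, dtype=int)
    region = 0
    for idx in np.argsort(curvature):        # flattest points first
        if labels[idx] != -1:
            continue
        seeds = [idx]
        labels[idx] = region
        while seeds:
            s = seeds.pop()
            _, nbrs = tree.query(points[s], k=k)
            for n in nbrs:
                if labels[n] == -1 and abs(np.dot(normals[s], normals[n])) >= cos_thresh:
                    labels[n] = region
                    if curvature[n] < curv_thresh:
                        seeds.append(n)
        region += 1
    return labels
```

Each resulting label is one candidate surface; a downstream classifier (spherical harmonics in the thesis) then decides which segments are people.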

    Bringing Lunar LiDAR Back Down to Earth: Mapping Our Industrial Heritage through Deep Transfer Learning

    Get PDF
    This is the final version, available on open access from MDPI via the DOI in this record. This article presents a novel deep learning method for semi-automated detection of historic mining pits using aerial LiDAR data. The recent emergence of national-scale remotely sensed datasets has created the potential to greatly increase the rate of analysis and recording of cultural heritage sites. However, the time and resources required to process these datasets in traditional desktop surveys present a near insurmountable challenge. The use of artificial intelligence to carry out preliminary processing of vast areas could enable experts to prioritize their prospection focus; however, success so far has been hindered by the lack of large training datasets in this field. This study develops an innovative transfer learning approach, utilizing a deep convolutional neural network initially trained on Lunar LiDAR datasets and reapplied here in an archaeological context. Recall rates of 80% and 83% were obtained on the 0.5 m and 0.25 m resolution datasets respectively, with false positive rates maintained below 20%. These results are state of the art and demonstrate that this model is an efficient, effective tool for semi-automated detection of this type of archaeological object. Further tests indicated strong potential for detection of other types of archaeological objects when trained accordingly
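
The transfer-learning recipe can be sketched in a few lines; here a torchvision ImageNet backbone stands in for the article's lunar-LiDAR-pretrained network, and the two-class head, learning rate, and freezing policy are illustrative assumptions:

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load a pretrained feature extractor (stand-in for the lunar model).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                 # keep transferred features

# Replace the classification head for the new task: pit vs. background.
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head is trained on the scarce archaeological labels.
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```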

    Crowd Counting with Decomposed Uncertainty

    Full text link
    Research in neural networks in the field of computer vision has achieved remarkable accuracy for point estimation. However, the uncertainty in the estimation is rarely addressed. Uncertainty quantification accompanying point estimation can lead to more informed decisions and can even improve prediction quality. In this work, we focus on uncertainty estimation in the domain of crowd counting. With increasing occurrences of heavily crowded events such as political rallies, protests, and concerts, automated crowd analysis is becoming an increasingly crucial task. The stakes can be very high in many of these real-world applications. We propose a scalable neural network framework with quantification of decomposed uncertainty using a bootstrap ensemble. We demonstrate that the proposed uncertainty quantification method provides additional insight into the crowd counting problem and is simple to implement. We also show that our proposed method achieves state-of-the-art performance on many benchmark crowd counting datasets. Comment: Accepted at AAAI 2020 (Main Technical Track)
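
A minimal sketch of how a bootstrap ensemble yields decomposed uncertainty (the heteroscedastic mean/log-variance heads are a common convention assumed here for illustration, not necessarily the paper's exact formulation):

```python
import numpy as np

def decomposed_uncertainty(models, x):
    """Predict a crowd count with epistemic/aleatoric decomposition.

    Each ensemble member, trained on its own bootstrap resample, is
    assumed to map an image x to (count_mean, count_log_var).
    """
    mus, sigmas2 = [], []
    for m in models:
        mu, log_var = m(x)
        mus.append(mu)
        sigmas2.append(np.exp(log_var))
    mus = np.asarray(mus)
    epistemic = mus.var(axis=0)              # disagreement across members
    aleatoric = np.mean(sigmas2, axis=0)     # average predicted data noise
    return mus.mean(axis=0), epistemic, aleatoric
```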

    Utilization and experimental evaluation of occlusion aware kernel correlation filter tracker using RGB-D

    Get PDF
    Unlike deep learning, which requires large training datasets, correlation filter-based trackers such as the Kernelized Correlation Filter (KCF) use implicit properties of tracked images (circulant matrices) for training in real time. Despite their practical application in tracking, a need exists for a better theoretical, mathematical, and experimental understanding of the fundamentals of KCF. This thesis first details a working prototype of the tracker and investigates its effectiveness in real-time applications, with supporting visualizations. We further address some of the drawbacks of the tracker in cases of occlusion, scale change, object rotation, out-of-view targets, and model drift with our novel RGB-D kernel correlation tracker. We also study the use of a particle filter to improve the tracker's accuracy. Our results are experimentally evaluated (a) on a standard dataset and (b) in real time using a Microsoft Kinect V2 sensor. We believe this work will set the basis for better understanding the effectiveness of kernel-based correlation filter trackers and for further defining some of their possible advantages in tracking
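
The circulant-matrix trick at the heart of KCF reduces ridge regression to element-wise Fourier-domain operations. A linear-kernel sketch (the Gaussian-kernel variant in KCF adds one kernel-correlation step; the regularizer value is illustrative):

```python
import numpy as np

def train(x, y, lam=1e-4):
    """Fit dual coefficients for a template patch x and desired response y."""
    xf = np.fft.fft2(x)
    kf = np.conj(xf) * xf / x.size           # linear kernel auto-correlation
    alphaf = np.fft.fft2(y) / (kf + lam)     # ridge solve, element-wise
    return alphaf, xf

def detect(alphaf, xf, z):
    """Response map over a search patch z; its peak gives the target shift."""
    kzf = np.conj(xf) * np.fft.fft2(z) / z.size
    return np.real(np.fft.ifft2(alphaf * kzf))
```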

    Image-based Decision Support Systems: Technical Concepts, Design Knowledge, and Applications for Sustainability

    Get PDF
    Unstructured data accounts for 80-90% of all data generated, with image data contributing the largest portion. In recent years, the field of computer vision, fueled by deep learning techniques, has made significant advances in exploiting this data to generate value. However, computer vision models alone are often not sufficient for value creation. In these cases, image-based decision support systems (IB-DSSs), i.e., decision support systems that rely on images and computer vision, can be used to create value by combining human and artificial intelligence. Despite their potential, there is little work on IB-DSSs so far. In this thesis, we develop technical foundations and design knowledge for IB-DSSs and demonstrate the possible positive effect of IB-DSSs on environmental sustainability. The theoretical contributions of this work are based on and evaluated in a series of artifacts in practical use cases. First, we use technical experiments to demonstrate the feasibility of innovative approaches to exploit images for IB-DSSs. We show the feasibility of deep-learning-based computer vision and identify future research opportunities based on one of our practical use cases. Building on this, we develop and evaluate a novel approach for combining human and artificial intelligence for value creation from image data. Second, we develop design knowledge that can serve as a blueprint for future IB-DSSs. We perform two design science research studies to formulate generalizable principles for purposeful design: one for IB-DSSs and one for the subclass of image-mining-based decision support systems (IM-DSSs). While IB-DSSs can provide decision support based on single images, IM-DSSs are suitable when large amounts of image data are available and required for decision-making. Third, we demonstrate the viability of applying IB-DSSs to enhance environmental sustainability by performing life cycle assessments for two practical use cases: one in which the IB-DSS enables a prolonged product lifetime and one in which the IB-DSS facilitates an improvement of manufacturing processes. We hope this thesis will contribute to expanding the use and effectiveness of image-based decision support systems in practice and will provide directions for future research

    Drone Obstacle Avoidance and Navigation Using Artificial Intelligence

    Get PDF
    This thesis presents the implementation and integration of a robust obstacle avoidance and navigation module with ArduPilot. It explores the problems in the current obstacle avoidance solution and tries to mitigate them with a new design. Given recent innovations in artificial intelligence, it also explores opportunities to enable and improve obstacle avoidance and navigation functionality using AI techniques. Understanding the different types of sensors used for both navigation and obstacle avoidance is required for implementing the design, and a study of these sensors is presented as background. A study of an autonomous car was conducted to better understand autonomy and learn how it solves the problems of obstacle avoidance and navigation. The implementation part of the thesis focuses on the design of a robust obstacle avoidance module, tested with obstacle avoidance sensors such as a Garmin lidar and an Intel RealSense R200. Image segmentation is used to verify the possibility of using a convolutional neural network to better understand the nature of obstacles. Similarly, end-to-end control from a single camera input using a deep neural network is used to verify the possibility of using AI for navigation. In the end, a robust obstacle avoidance library is developed and tested both in the simulator and on a real drone. Image segmentation is implemented, deployed, and tested, and the possibility of end-to-end control is verified with a proof of concept
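
As a rough illustration of the end-to-end idea, a small convolutional policy can map a single forward-camera frame directly to velocity commands, in the spirit of NVIDIA's PilotNet; the architecture and output convention below are assumptions, not the thesis's exact network:

```python
import torch.nn as nn

class EndToEndPolicy(nn.Module):
    """Single camera frame -> [yaw_rate, forward_speed] (illustrative)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(48, 2)

    def forward(self, frame):                # frame: (N, 3, H, W)
        return self.head(self.features(frame).flatten(1))
```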

    Fast Rotated Bounding Box Annotations for Object Detection

    Get PDF
    Traditionally, object detection models use a large amount of annotated data, and axis-aligned bounding boxes (AABBs) are often chosen as the image annotation technique for both training and predictions. The purpose of annotating the objects in the images is to indicate the regions of interest with the corresponding labels. Accurate object annotations help computer vision models understand the distinct patterns of image features to recognize and localize different classes of objects. However, AABBs are often a poor fit for elongated object instances. It is also challenging to localize objects with AABBs in densely packed aerial images because of overlapping adjacent bounding boxes. Alternatively, rectangular annotations that can be oriented diagonally, also known as rotated bounding boxes (RBBs), can provide a much tighter fit for elongated objects and reduce the potential bounding box overlap between adjacent objects. However, RBBs are much more time-consuming and tedious to annotate than AABBs for large datasets. In this work, we propose a novel annotation tool named FastRoLabelImg (Fast Rotated LabelImg) for producing high-quality RBB annotations with low time and effort. The tool generates accurate RBB proposals for objects of interest as the annotator makes progress through the dataset. It can also adapt available AABBs to generate RBB proposals. Furthermore, a multipoint box drawing system is provided to reduce manual RBB annotation time compared to existing methods. Across three diverse datasets, we show that the proposal generation methods can achieve a maximum workload reduction of 88.9%. Through a participant study, we also show that our proposed manual annotation method is twice as fast as the existing system at the same accuracy. Lastly, we publish RBB annotations for two public datasets to motivate future research that will contribute to developing more competent object detection algorithms capable of RBB predictions
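
A generic multipoint-to-RBB step can be sketched with OpenCV's minimum-area rectangle fit (an illustrative stand-in; FastRoLabelImg's own drawing system is not described at this level in the abstract):

```python
import numpy as np
import cv2

def points_to_rbb(points):
    """Fit a rotated bounding box to annotator-clicked points.

    cv2.minAreaRect returns ((cx, cy), (w, h), angle); boxPoints
    converts that rectangle into its four corner coordinates.
    """
    pts = np.asarray(points, dtype=np.float32)
    rect = cv2.minAreaRect(pts)
    return rect, cv2.boxPoints(rect)

# Example: three clicks along an elongated object's outline.
rect, corners = points_to_rbb([(10, 40), (90, 10), (100, 35)])
```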

    Automatic Rural Road Centerline Extraction from Aerial Images for a Forest Fire Support System

    Get PDF
    In recent decades, Portugal has been severely affected by forest fires, which have caused massive damage both environmentally and socially. Having a well-structured and precise mapping of rural roads is critical to helping firefighters mitigate these events. The traditional process of extracting rural road centerlines from aerial images is extremely time-consuming and tedious because the mapping operator has to manually label the road area and extract the road centerline. A frequent challenge in extracting rural road centerlines is the high environmental complexity and the road occlusions caused by vehicles, shadows, wild vegetation, and trees, which produce heterogeneous segments that require further refinement. This dissertation proposes an approach to automatically detect rural road segments and extract road centerlines from aerial images. The proposed method consists of two main steps. In the first step, an architecture based on a deep learning model (DeepLabV3+) is used to extract road feature maps and detect the rural roads. In the second step, the prediction is first optimized to improve road connections and to remove small white artifacts from the image predicted by the neural network. Finally, a morphological approach is used to extract the rural road centerlines from the previously detected roads, using thinning algorithms such as the Zhang-Suen and Guo-Hall methods. With the automation of these two stages, it is now possible to detect and extract road centerlines from complex rural environments automatically and faster than with traditional methods, and to integrate the resulting data into a Geographical Information System (GIS), allowing the creation of real-time mapping applications
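
The second stage can be sketched with standard morphology (threshold and size parameters below are illustrative; scikit-image's 2D skeletonize implements Zhang-Suen-style thinning):

```python
from skimage.morphology import binary_closing, remove_small_objects, skeletonize

def extract_centerline(road_prob, thresh=0.5, min_size=200):
    """Turn a DeepLabV3+ road-probability map into 1-pixel centerlines."""
    mask = road_prob > thresh                     # binarize the prediction
    mask = binary_closing(mask)                   # reconnect small road breaks
    mask = remove_small_objects(mask, min_size)   # drop small white artifacts
    return skeletonize(mask)                      # thin to the centerline
```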