Search CORE

1,230 research outputs found

Effective image enhancement and fast object detection for improved UAV applications

Author: Tian Yikun
Publication venue
Publication date: 01/01/2023
Field of study

As an emerging field, unmanned aerial vehicles (UAVs) feature from interdisciplinary techniques in science, engineering and industrial sectors. The massive applications span from remote sensing, precision agriculture, marine inspection, coast guarding, environmental monitoring, natural resources monitoring, e.g. forest, land and river, and disaster assessment, to smart city, intelligent transportation and logistics and delivery. With the fast growing demands from a wide range of application sectors, there is always a bottleneck how to improve the efficiency and efficacy of UAV in operation. Often, smart decision making is needed from the captured footages in a real-time manner, yet this is severely affected by the poor image quality, ineffective object detection and recognition models, and lack of robust and light models for supporting the edge computing and real deployment. In this thesis, several innovative works have been focused and developed to tackle some of the above issues. First of all, considering the quality requirements of the UAV images, various approaches and models have been proposed, yet they focus on different aspects and produce inconsistent results. As such, the work in this thesis has been categorised into denoising and dehazing focused, followed by comprehensive evaluation in terms of both qualitative and quantitative assessment. These will provide valuable insights and useful guidance to help the end user and research community. For fast and effective object detection and recognition, deep learning based models, especially the YOLO series, are popularly used. However, taking the YOLOv7 as the baseline, the performance is very much affected by a few factors, such as the low quality of the UAV images and the high-level of demanding of resources, leading to unsatisfactory performance in accuracy and processing speed. As a result, three major improvements, namely transformer, CIoULoss and the GhostBottleneck module, are introduced in this work to improve feature extraction, decision making in detection and recognition, and running efficiency. Comprehensive experiments on both publicly available and self-collected datasets have validated the efficiency and efficacy of the proposed algorithm. In addition, to facilitate the real deployment such as edge computing scenarios, embedded implementation of the key algorithm modules is introduced. These include the creative implementation on the Xavier NX platform, in comparison to the standard workstation settings with the NVIDIA GPUs. As a result, it has demonstrated promising results with improved performance in reduced resources consumption of the CPU/GPU usage and enhanced frame rate of real-time processing to benefit the real-time deployment with the uncompromised edge computing. Through these innovative investigation and development, a better understanding has been established on key challenges associated with UAV and Simultaneous Localisation and Mapping (SLAM) based applications, and possible solutions are presented. Keywords: Unmanned aerial vehicles (UAV); Simultaneous Localisation and Mapping (SLAM); denoising; dehazing; object detection; object recognition; deep learning; YOLOv7; transformer; GhostBottleneck; scene matching; embedded implementation; Xavier NX; edge computing.As an emerging field, unmanned aerial vehicles (UAVs) feature from interdisciplinary techniques in science, engineering and industrial sectors. The massive applications span from remote sensing, precision agriculture, marine inspection, coast guarding, environmental monitoring, natural resources monitoring, e.g. forest, land and river, and disaster assessment, to smart city, intelligent transportation and logistics and delivery. With the fast growing demands from a wide range of application sectors, there is always a bottleneck how to improve the efficiency and efficacy of UAV in operation. Often, smart decision making is needed from the captured footages in a real-time manner, yet this is severely affected by the poor image quality, ineffective object detection and recognition models, and lack of robust and light models for supporting the edge computing and real deployment. In this thesis, several innovative works have been focused and developed to tackle some of the above issues. First of all, considering the quality requirements of the UAV images, various approaches and models have been proposed, yet they focus on different aspects and produce inconsistent results. As such, the work in this thesis has been categorised into denoising and dehazing focused, followed by comprehensive evaluation in terms of both qualitative and quantitative assessment. These will provide valuable insights and useful guidance to help the end user and research community. For fast and effective object detection and recognition, deep learning based models, especially the YOLO series, are popularly used. However, taking the YOLOv7 as the baseline, the performance is very much affected by a few factors, such as the low quality of the UAV images and the high-level of demanding of resources, leading to unsatisfactory performance in accuracy and processing speed. As a result, three major improvements, namely transformer, CIoULoss and the GhostBottleneck module, are introduced in this work to improve feature extraction, decision making in detection and recognition, and running efficiency. Comprehensive experiments on both publicly available and self-collected datasets have validated the efficiency and efficacy of the proposed algorithm. In addition, to facilitate the real deployment such as edge computing scenarios, embedded implementation of the key algorithm modules is introduced. These include the creative implementation on the Xavier NX platform, in comparison to the standard workstation settings with the NVIDIA GPUs. As a result, it has demonstrated promising results with improved performance in reduced resources consumption of the CPU/GPU usage and enhanced frame rate of real-time processing to benefit the real-time deployment with the uncompromised edge computing. Through these innovative investigation and development, a better understanding has been established on key challenges associated with UAV and Simultaneous Localisation and Mapping (SLAM) based applications, and possible solutions are presented. Keywords: Unmanned aerial vehicles (UAV); Simultaneous Localisation and Mapping (SLAM); denoising; dehazing; object detection; object recognition; deep learning; YOLOv7; transformer; GhostBottleneck; scene matching; embedded implementation; Xavier NX; edge computing

STAX (Strathclyde Repository)

Mono-hydra:Real-time 3D scene graph construction from monocular camera input with IMU

Author: Nex F.
Udugama U. V. B. L.
Vosselman G.
Publication venue: ArXiv.org
Publication date: 10/08/2023
Field of study

University of Twente Research Information

RoboChain: A Secure Data-Sharing Framework for Human-Robot Interaction

Author: Ferrer Eduardo Castelló
Hardjono Thomas
Pentland Alex
Rudovic Ognjen
Publication venue
Publication date: 14/02/2018
Field of study

Robots have potential to revolutionize the way we interact with the world around us. One of their largest potentials is in the domain of mobile health where they can be used to facilitate clinical interventions. However, to accomplish this, robots need to have access to our private data in order to learn from these data and improve their interaction capabilities. Furthermore, to enhance this learning process, the knowledge sharing among multiple robot units is the natural step forward. However, to date, there is no well-established framework which allows for such data sharing while preserving the privacy of the users (e.g., the hospital patients). To this end, we introduce RoboChain - the first learning framework for secure, decentralized and computationally efficient data and model sharing among multiple robot units installed at multiple sites (e.g., hospitals). RoboChain builds upon and combines the latest advances in open data access and blockchain technologies, as well as machine learning. We illustrate this framework using the example of a clinical intervention conducted in a private network of hospitals. Specifically, we lay down the system architecture that allows multiple robot units, conducting the interventions at different hospitals, to perform efficient learning without compromising the data privacy.Comment: 7 pages, 6 figure

arXiv.org e-Print Archive

DSpace@MIT

Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration

Author: Enan Sadman Sakib
Fulton Michael
Sattar Junaed
Publication venue
Publication date: 12/07/2022
Field of study

In this paper, we present a motion-based robotic communication framework that enables non-verbal communication among autonomous underwater vehicles (AUVs) and human divers. We design a gestural language for AUV-to-AUV communication which can be easily understood by divers observing the conversation unlike typical radio frequency, light, or audio based AUV communication. To allow AUVs to visually understand a gesture from another AUV, we propose a deep network (RRCommNet) which exploits a self-attention mechanism to learn to recognize each message by extracting maximally discriminative spatio-temporal features. We train this network on diverse simulated and real-world data. Our experimental evaluations, both in simulation and in closed-water robot trials, demonstrate that the proposed RRCommNet architecture is able to decipher gesture-based messages with an average accuracy of 88-94% on simulated data, 73-83% on real data (depending on the version of the model used). Further, by performing a message transcription study with human participants, we also show that the proposed language can be understood by humans, with an overall transcription accuracy of 88%. Finally, we discuss the inference runtime of RRCommNet on embedded GPU hardware, for real-time use on board AUVs in the field

arXiv.org e-Print Archive

Utilizing Reinforcement Learning and Computer Vision in a Pick-And-Place Operation for Sorting Objects in Motion

Author: Sand Kristoffer
Solberg Trygve Andre Olsøy
Publication venue: 'University of Agder'
Publication date: 01/01/2023
Field of study

This master's thesis studies the implementation of advanced machine learning (ML) techniques in industrial automation systems, focusing on applying machine learning to enable and evolve autonomous sorting capabilities in robotic manipulators. In particular, Inverse Kinematics (IK) and Reinforcement Learning (RL) are investigated as methods for controlling a UR10e robotic arm for pick-and-place of moving objects on a conveyor belt within a small-scale sorting facility. A camera-based computer vision system applying YOLOv8 is used for real-time object detection and instance segmentation. Perception data is utilized to ascertain optimal grip points, specifically through an implemented algorithm that outputs optimal grip position, angle, and width. As the implemented system includes testing and evaluation on a physical system, the intricacies of hardware control, specifically the reverse engineering of an OnRobot RG6 gripper is elaborated as part of this study. The system is implemented on the Robotic Operating System (ROS), and its design is in particular driven by high modularity and scalability in mind. The camera-based vision system serves as the primary input, while the robot control is the output. The implemented system design allows for the evaluation of motion control employing both IK and RL. Computation of IK is conducted via MoveIt2, while the RL model is trained and computed in NVIDIA Isaac Sim. The high-level control of the robotic manipulator was accomplished with use of Proximal Policy Optimization (PPO). The main result of the research is a novel reward function for the pick-and-place operation that takes into account distance and orientation from the target object. In addition, the provided system administers task control by independently initializing pick-and-place operation phases for each environment. The findings demonstrate that PPO was able to significantly enhance the velocity, accuracy, and adaptability of industrial automation. Our research shows that accurate control of the robot arm can be reached by training the PPO Model purely by applying a digital twin simulation

Agder University Research Archive

Real-time performance-focused on localisation techniques for autonomous vehicle: a review

Author: Lu Yongqiang
Ma Hongjie
Smart Edward
Yu Hui
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/05/2021
Field of study

Portsmouth University Research Portal (Pure)

Recommended from our members

Real-time smart and standalone vision/IMU navigation sensor

Author: D Nister
D Scaramuzza
F Fraundorfer
Gianfranco Visentin
JE Dennis Jr
K Konolige
Lounis Chermak
M Hwangbo
M Maimone
Mark Richardson
MI Lourakis
MLA Lourakis
Nabil Aouf
S Tanathong
SN Sinha
SY Chen
Y Tian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

In this paper, we present a smart, standalone, multi-platform stereo vision/IMU-based navigation system, providing ego-motion estimation. The real-time visual odometry algorithm is run on a nano ITX single-board computer (SBC) of 1.9 GHz CPU and 16-core GPU. High-resolution stereo images of 1.2 megapixel provide high-quality data. Tracking of up to 750 features is made possible at 5 fps thanks to a minimal, but efficient, features detection–stereo matching–feature tracking scheme runs on the GPU. Furthermore, the feature tracking algorithm benefits from assistance of a 100 Hz IMU whose accelerometer and gyroscope data provide inertial features prediction enhancing execution speed and tracking efficiency. In a space mission context, we demonstrate robustness and accuracy of the real-time generated 6-degrees-of-freedom trajectories from our visual odometry algorithm. Performance evaluations are comparable to ground truth measurements from an external motion capture system

City Research Online

Crossref

Springer - Publisher Connector

Cranfield CERES

Enabling Multi-LiDAR Sensing in GNSS-Denied Environments: SLAM Dataset, Benchmark, and UAV Tracking with LiDAR-as-a-camera

Author: Ha Sier
Publication venue
Publication date: 29/08/2023
Field of study

The rise of Light Detection and Ranging (LiDAR) sensors has profoundly impacted industries ranging from automotive to urban planning. As these sensors become increasingly affordable and compact, their applications are diversifying, driving precision, and innovation. This thesis delves into LiDAR's advancements in autonomous robotic systems, with a focus on its role in simultaneous localization and mapping (SLAM) methodologies and LiDAR as a camera-based tracking for Unmanned Aerial Vehicles (UAV). Our contributions span two primary domains: the Multi-Modal LiDAR SLAM Benchmark, and the LiDAR-as-a-camera UAV Tracking. In the former, we have expanded our previous multi-modal LiDAR dataset by adding more data sequences from various scenarios. In contrast to the previous dataset, we employ different ground truth-generating approaches. We propose a new multi-modal multi-lidar SLAM-assisted and ICP-based sensor fusion method for generating ground truth maps. Additionally, we also supplement our data with new open road sequences with GNSS-RTK. This enriched dataset, supported by high-resolution LiDAR, provides detailed insights through an evaluation of ten configurations, pairing diverse LiDAR sensors with state-of-the-art SLAM algorithms. In the latter contribution, we leverage a custom YOLOv5 model trained on panoramic low-resolution images from LiDAR reflectivity (LiDAR-as-a-camera) to detect UAVs, demonstrating the superiority of this approach over point cloud or image-only methods. Additionally, we evaluated the real-time performance of our approach on the Nvidia Jetson Nano, a popular mobile computing platform. Overall, our research underscores the transformative potential of integrating advanced LiDAR sensors with autonomous robotics. By bridging the gaps between different technological approaches, we pave the way for more versatile and efficient applications in the future

UTUPub

Towards the simulation of cooperative perception applications by leveraging distributed sensing infrastructures

Author: Oliveira Miguel Ferreira
Publication venue
Publication date: 09/10/2023
Field of study

With the rapid development of Automated Vehicles (AV), the boundaries of their function alities are being pushed and new challenges are being imposed. In increasingly complex and dynamic environments, it is fundamental to rely on more powerful onboard sensors and usually AI. However, there are limitations to this approach. As AVs are increasingly being integrated in several industries, expectations regarding their cooperation ability is growing, and vehicle-centric approaches to sensing and reasoning, become hard to integrate. The proposed approach is to extend perception to the environment, i.e. outside of the vehicle, by making it smarter, via the deployment of wireless sensors and actuators. This will vastly improve the perception capabilities in dynamic and unpredictable scenarios and often in a cheaper way, relying mostly in the use of lower cost sensors and embedded devices, which rely on their scale deployment instead of centralized sensing abilities. Consequently, to support the development and deployment of such cooperation actions in a seamless way, we require the usage of co-simulation frameworks, that can encompass multiple perspectives of control and communications for the AVs, the wireless sensors and actuators and other actors in the environment. In this work, we rely on ROS2 and micro-ROS as the underlying technologies for integrating several simulation tools, to construct a framework, capable of supporting the development, test and validation of such smart, cooperative environments. This endeavor was undertaken by building upon an existing simulation framework known as AuNa. We extended its capabilities to facilitate the simulation of cooperative scenarios by incorporat ing external sensors placed within the environment rather than just relying on vehicle-based sensors. Moreover, we devised a cooperative perception approach within this framework, showcasing its substantial potential and effectiveness. This will enable the demonstration of multiple cooperation scenarios and also ease the deployment phase by relying on the same software architecture.Com o rápido desenvolvimento dos Veículos Autónomos (AV), os limites das suas funcional idades estão a ser alcançados e novos desafios estão a surgir. Em ambientes complexos e dinâmicos, é fundamental a utilização de sensores de alta capacidade e, na maioria dos casos, inteligência artificial. Mas existem limitações nesta abordagem. Como os AVs estão a ser integrados em várias indústrias, as expectativas quanto à sua capacidade de cooperação estão a aumentar, e as abordagens de perceção e raciocínio centradas no veículo, tornam-se difíceis de integrar. A abordagem proposta consiste em extender a perceção para o ambiente, isto é, fora do veículo, tornando-a inteligente, através do uso de sensores e atuadores wireless. Isto irá melhorar as capacidades de perceção em cenários dinâmicos e imprevisíveis, reduzindo o custo, pois a abordagem será baseada no uso de sensores low-cost e sistemas embebidos, que dependem da sua implementação em grande escala em vez da capacidade de perceção centralizada. Consequentemente, para apoiar o desenvolvimento e implementação destas ações em cooperação, é necessária a utilização de frameworks de co-simulação, que abranjam múltiplas perspetivas de controlo e comunicação para os AVs, sensores e atuadores wireless, e outros atores no ambiente. Neste trabalho será utilizado ROS2 e micro-ROS como as tecnologias subjacentes para a integração das ferramentas de simulação, de modo a construir uma framework capaz de apoiar o desenvolvimento, teste e validação de ambientes inteligentes e cooperativos. Esta tarefa foi realizada com base numa framework de simulação denominada AuNa. Foram expandidas as suas capacidades para facilitar a simulação de cenários cooperativos através da incorporação de sensores externos colocados no ambiente, em vez de depender apenas de sensores montados nos veículos. Além disso, concebemos uma abordagem de perceção cooperativa usando a framework, demonstrando o seu potencial e eficácia. Isto irá permitir a demonstração de múltiplos cenários de cooperação e também facilitar a fase de implementação, utilizando a mesma arquitetura de software

Repositório Científico do Instituto Politécnico do Porto