63 research outputs found

    Tactical ISR/C2 Integration with AI/ML Augmentation

    NPS NRP Project Presentation
    NAVPLAN 2021 specifies Distributed Maritime Operations (DMO) with a tactical grid to connect distributed nodes with processing at the tactical edge to include Artificial Intelligence/Machine Learning (AI/ML) in support of Expeditionary Advanced Base Operations (EABO) and Littoral Operations in a Contested Environment (LOCE). Joint All-Domain Command and Control (JADC2) is the concept for sensor integration. However, Intelligence, Surveillance and Reconnaissance (ISR) and Command and Control (C2) hardware and software have yet to be fully defined, tools integrated, and configurations tested. This project evaluates options for ISR and C2 integration into a Common Operational Picture (COP) with AI/ML for decision support on tactical clouds in support of DMO, EABO, LOCE and JADC2 objectives.
    Commander, Naval Surface Forces (CNSF)
    U.S. Fleet Forces Command (USFF)
    This research is supported by funding from the Naval Postgraduate School, Naval Research Program (PE 0605853N/2098). https://nps.edu/nrp
    Chief of Naval Operations (CNO)
    Approved for public release. Distribution is unlimited.

    Autonomous Drones for Trail Navigation using DNNs

    This thesis proposes the design and implementation of a prototype drone stack that can autonomously navigate a forest trail without prior knowledge of the surrounding area. It uses a three-level vision system: (i) a deep neural network (DNN) that estimates the view orientation and lateral offset of the vehicle with respect to the trail center, (ii) a DNN for object detection, and (iii) a guidance system for obstacle avoidance. The Micro Aerial Vehicle (MAV) was assembled from hardware parts available in the lab. The trail-following algorithm is based on the TrailNet neural network, which was retrained and enriched with a newly created dataset of footage from the forested area of the Ilisia University Campus, adapting the model to the local vegetation. For object detection, recent well-known algorithms were compared and evaluated in terms of accuracy and efficiency on NVIDIA’s Jetson TX2 Dev Kit board. Finally, an experimental flight with specific parameters is proposed for evaluating correct operation.
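
    The first vision level above (view orientation plus lateral offset relative to the trail center) can be illustrated with a minimal sketch. It assumes, hypothetically, that each of the two DNN heads outputs a (left, center, right) probability triple and that the two are blended linearly into one steering correction. The function name, the gains `beta_view` and `beta_offset`, and the sign convention are illustrative assumptions, not taken from the thesis.

```python
def steering_angle(view_probs, offset_probs, beta_view=10.0, beta_offset=10.0):
    """Blend two hypothetical 3-class DNN heads into one steering correction.

    view_probs, offset_probs: (left, center, right) probabilities.
    beta_view, beta_offset: assumed gains in degrees; tune per vehicle.
    A positive result means the vehicle is judged to be rotated or
    shifted to the right of the trail center (assumed convention).
    """
    v_left, _, v_right = view_probs
    o_left, _, o_right = offset_probs
    return beta_view * (v_right - v_left) + beta_offset * (o_right - o_left)


if __name__ == "__main__":
    # Centered and aligned with the trail: no correction expected.
    print(steering_angle((0.1, 0.8, 0.1), (0.2, 0.6, 0.2)))
    # Strongly rotated to the left: negative correction under this convention.
    print(steering_angle((0.7, 0.2, 0.1), (0.2, 0.6, 0.2)))
```

    Using the difference between the right and left probabilities makes the correction vanish when the network is symmetric about the center class, which keeps the vehicle from oscillating on a straight trail.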

    Vision-based Learning for Drones: A Survey

    Drones as advanced cyber-physical systems are undergoing a transformative shift with the advent of vision-based learning, a field that is rapidly gaining prominence due to its profound impact on drone autonomy and functionality. Different from existing task-specific surveys, this review offers a comprehensive overview of vision-based learning in drones, emphasizing its pivotal role in enhancing their operational capabilities under various scenarios. We start by elucidating the fundamental principles of vision-based learning, highlighting how it significantly improves drones' visual perception and decision-making processes. We then categorize vision-based control methods into indirect, semi-direct, and end-to-end approaches from the perception-control perspective. We further explore various applications of vision-based drones with learning capabilities, ranging from single-agent systems to more complex multi-agent and heterogeneous system scenarios, and underscore the challenges and innovations characterizing each area. Finally, we explore open questions and potential solutions, paving the way for ongoing research and development in this dynamic and rapidly evolving field. With the growth of large language models (LLMs) and embodied intelligence, vision-based learning for drones provides a promising but challenging road towards artificial general intelligence (AGI) in the 3D physical world.

    Positioning in 5G and 6G Networks—A Survey

    Determining the position of ourselves or our assets has always been important to humans. Technology has helped us, from sextants to outdoor global positioning systems, but real-time indoor positioning has been a challenge. Among the various solutions, network-based positioning became an option with the arrival of 5G mobile networks. The new radio technologies, minimized end-to-end latency, specialized control protocols, and booming computation capacities at the network edge offered the opportunity to leverage the overall capabilities of the 5G network for positioning, both indoors and outdoors. This paper provides an overview of network-based positioning, from the basics to advanced, state-of-the-art machine-learning-supported solutions. One of the main contributions is the detailed comparison of machine learning techniques used for network-based positioning. Since new requirements are already in place for 6G networks, our paper makes a leap towards positioning with 6G networks. In order to also highlight the practical side of the topic, application examples from different domains are presented with a special focus on industrial and vehicular scenarios.

    Virtual Reality via Object Pose Estimation and Active Learning: Realizing Telepresence Robots with Aerial Manipulation Capabilities

    This paper presents a novel telepresence system for advancing aerial manipulation in dynamic and unstructured environments. The proposed system features not only a haptic device but also a virtual reality (VR) interface that provides real-time 3D displays of the robot’s workspace as well as haptic guidance to its remotely located operator. To realize this, multiple sensors, namely a LiDAR, cameras, and IMUs, are utilized. For processing the acquired sensory data, pose estimation pipelines are devised for industrial objects of both known and unknown geometries. We further propose an active learning pipeline to increase the sample efficiency of a pipeline component that relies on a Deep Neural Network (DNN) based object detector. All these algorithms jointly address various challenges encountered during the execution of perception tasks in industrial scenarios. In the experiments, exhaustive ablation studies are provided to validate the proposed pipelines. Methodologically, these results commonly suggest how an awareness of the algorithms’ own failures and uncertainty (“introspection”) can be used to tackle the encountered problems. Moreover, outdoor experiments are conducted to evaluate the effectiveness of the overall system in enhancing aerial manipulation capabilities. In particular, with flight campaigns over days and nights, from spring to winter, and with different users and locations, we demonstrate over 70 robust executions of pick-and-place, force application and peg-in-hole tasks with the DLR cable-Suspended Aerial Manipulator (SAM). As a result, we show the viability of the proposed system in future industrial applications.


    In recent years, there has been a significant amount of research on human activity classification relying either on Inertial Measurement Unit (IMU) data or on data from static cameras providing a third-person view. There has been relatively less work using wearable cameras, which provide an egocentric (first-person) view of the environment as seen by the wearer. Using only IMU data limits the variety and complexity of the activities that can be detected. Deep machine learning has achieved great success in image and video processing in recent years, and neural network based models provide improved accuracy in multiple fields of computer vision. However, there has been relatively less work on designing specific models to improve the performance of egocentric image/video tasks. As deep neural networks keep improving accuracy in computer vision tasks, their robustness and resilience should be improved as well so that they can be applied in safety-critical areas such as autonomous driving.
    Motivated by these considerations, the first part of the thesis addresses the problem of human activity detection and classification from egocentric cameras. First, a new method is presented to count the number of footsteps and compute the total traveled distance by using data from the IMU sensors and camera of a smartphone. By incorporating data from multiple sensor modalities and calculating the length of each step, instead of using preset stride lengths and assuming equal-length steps, the proposed method provides much higher accuracy than commercially available step counting apps. After footstep counting, more complicated human activities, such as the steps of preparing a recipe and sitting on a sofa, are taken into consideration. Multiple classification methods, both non-deep-learning and deep-learning-based, are presented, which employ both egocentric camera and IMU data. Then, a Genetic Algorithm-based approach is employed to set the parameters of an activity classification network autonomously, and its performance is compared with that of empirically set parameters. Next, a new framework is introduced to reduce the computational cost of human temporal activity recognition from egocentric videos while maintaining the accuracy at a comparable level. The actor-critic model of reinforcement learning is applied to optical flow data to locate a bounding box around the region of interest, which is then used for clipping a sub-image from a video frame. A shallow and a deeper 3D convolutional neural network are designed to process the original image and the clipped image region, respectively.
    Next, a systematic method is introduced that autonomously and simultaneously optimizes multiple parameters of any deep neural network by using a bi-generative adversarial network (Bi-GAN) guiding a genetic algorithm (GA). The proposed Bi-GAN allows the autonomous exploitation and choice of the number of neurons for the fully-connected layers, and the number of filters for the convolutional layers, from a large range of values. The Bi-GAN involves two generators, and two different models compete and improve each other progressively with a GAN-based strategy to optimize the networks during a GA evolution. In this analysis, three different neural network layer types and datasets are taken into consideration: (i) 3D convolutional layers for the ModelNet40 dataset, a dataset of 3D point clouds where the goal is shape classification over 40 shape classes; (ii) LSTM layers for the UCI HAR dataset, which is composed of Inertial Measurement Unit (IMU) data captured during the activities of standing, sitting, laying, walking, walking upstairs and walking downstairs, performed by 30 subjects, with the 3-axial linear acceleration and 3-axial angular velocity collected at a constant rate of 50 Hz; and (iii) 2D convolutional layers for the Chars74k dataset, which contains 64 classes (0-9, A-Z, a-z), with 7705 characters obtained from natural images, 3410 hand-drawn characters using a tablet PC and 62992 synthesised characters from computer fonts, giving a total of over 74K images.
    In the final part of the thesis, the robustness and resilience of neural network models are investigated with respect to adversarial examples (AEs) and autonomous driving conditions. The transferability of adversarial examples across a wide range of real-world computer vision tasks, including image classification, explicit content detection, optical character recognition (OCR), and object detection, is investigated. This represents the cybercriminal’s situation, where an ensemble of different detection mechanisms needs to be evaded all at once. A novel Dispersion Reduction (DR) attack is designed, a practical attack that overcomes existing attacks’ limitation of requiring task-specific loss functions by targeting the “dispersion” of internal feature maps. In the autonomous driving scenario, adversarial machine learning attacks against the complete visual perception pipeline are studied. A novel attack technique, tracker hijacking, that can effectively fool Multi-Object Tracking (MOT) using AEs on object detection is presented. Using this technique, successful AEs on as few as one single frame can move an existing object into or out of the headway of an autonomous vehicle to cause potential safety hazards.
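
    The footstep part of the abstract above contrasts summing individually estimated step lengths with multiplying a step count by a preset stride. The sketch below illustrates only that contrast. The peak-based step detector, its threshold and minimum gap, the 0.7 m default stride, and the per-step lengths are all hypothetical placeholders; the abstract does not say how the thesis detects steps or estimates their lengths from IMU and camera data.

```python
def detect_steps(accel_mag, threshold=11.0, min_gap=15):
    """Return sample indices of step-like peaks in accelerometer magnitude.

    accel_mag: magnitudes in m/s^2 at a fixed sampling rate.
    threshold, min_gap: assumed tuning values (min_gap is in samples).
    """
    steps = []
    last = -min_gap
    for i in range(1, len(accel_mag) - 1):
        is_peak = accel_mag[i] > accel_mag[i - 1] and accel_mag[i] >= accel_mag[i + 1]
        if is_peak and accel_mag[i] > threshold and i - last >= min_gap:
            steps.append(i)
            last = i
    return steps


def distance_per_step(step_lengths_m):
    """Total distance when each step has its own estimated length (meters)."""
    return sum(step_lengths_m)


def distance_fixed_stride(num_steps, stride_m=0.7):
    """Total distance under the equal-length-steps assumption."""
    return num_steps * stride_m


if __name__ == "__main__":
    # Synthetic signal: gravity baseline with two impact peaks.
    signal = [9.8] * 5 + [12.0] + [9.8] * 20 + [12.5] + [9.8] * 5
    steps = detect_steps(signal)
    lengths = [0.62, 0.71]  # hypothetical per-step estimates in meters
    print(len(steps), distance_per_step(lengths), distance_fixed_stride(len(steps)))
```

    The two totals differ whenever the walker's steps are uneven, which is the source of error that per-step length estimation is meant to remove.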

    Agilicious: Open-source and open-hardware agile quadrotor for vision-based flight

    Autonomous, agile quadrotor flight raises fundamental challenges for robotics research in terms of perception, planning, learning, and control. A versatile and standardized platform is needed to accelerate research and let practitioners focus on the core problems. To this end, we present Agilicious, a codesigned hardware and software framework tailored to autonomous, agile quadrotor flight. It is completely open source and open hardware and supports both model-based and neural network–based controllers. Also, it provides high thrust-to-weight and torque-to-inertia ratios for agility, onboard vision sensors, graphics processing unit (GPU)–accelerated compute hardware for real-time perception and neural network inference, a real-time flight controller, and a versatile software stack. In contrast to existing frameworks, Agilicious offers a unique combination of flexible software stack and high-performance hardware. We compare Agilicious with prior works and demonstrate it on different agile tasks, using both model-based and neural network–based controllers. Our demonstrators include trajectory tracking at up to 5g and 70 kilometers per hour in a motion capture system, and vision-based acrobatic flight and obstacle avoidance in both structured and unstructured environments using solely onboard perception. Last, we demonstrate its use for hardware-in-the-loop simulation in virtual reality environments. Because of its versatility, we believe that Agilicious supports the next generation of scientific and industrial quadrotor research.

    A Cost-Effective Person-Following System for Assistive Unmanned Vehicles with Deep Learning at the Edge

    The vital statistics of the last century highlight a sharp increase in the average age of the world population, with a consequent growth in the number of older people. Service robotics applications have the potential to provide systems and tools that support older adults in living autonomously and self-sufficiently in their homes in everyday life, thereby avoiding the need for third parties to monitor them. In this context, we propose a cost-effective modular solution to detect and follow a person in an indoor, domestic environment. We exploited the latest advancements in deep learning optimization techniques, and we compared different neural network accelerators to provide a robust and flexible person-following system at the edge. Our proposed cost-effective and power-efficient solution can be fully integrated with pre-existing navigation stacks and lays the foundations for the development of fully autonomous and self-contained service robotics applications.

    State of the art in vision-based localization techniques for autonomous navigation systems

    Failure Analysis in Next-Generation Critical Cellular Communication Infrastructures

    The advent of communication technologies marks a transformative phase in critical infrastructure construction, where the meticulous analysis of failures becomes paramount in achieving the fundamental objectives of continuity, security, and availability. This survey enriches the discourse on failures, failure analysis, and countermeasures in the context of the next-generation critical communication infrastructures. Through an exhaustive examination of existing literature, we discern and categorize prominent research orientations, namely those focusing on resource depletion, security vulnerabilities, and system availability concerns. We also analyze constructive countermeasures tailored to address identified failure scenarios and their prevention. Furthermore, the survey emphasizes the imperative for standardization in addressing failures related to Artificial Intelligence (AI) within the ambit of the sixth-generation (6G) networks, accounting for the forward-looking perspective for the envisioned intelligence of 6G network architecture. By identifying new challenges and delineating future research directions, this survey can help guide stakeholders toward unexplored territories, fostering innovation and resilience in critical communication infrastructure development and failure prevention.