606 research outputs found

    A Robust Object Detection System for Driverless Vehicles through Sensor Fusion and Artificial Intelligence Techniques

    Since the early 1990s, various research domains have been concerned with autonomous driving, leading to the widespread implementation of numerous advanced driver assistance features. However, fully automated vehicles have not yet been introduced to the market. The process of autonomous driving can be outlined through the following stages: environment perception, ego-vehicle localization, trajectory estimation, path planning, and vehicle control. Environment perception is partially based on computer vision algorithms that detect and track surrounding objects. Object detection for autonomous vehicles is challenging for several reasons, such as the presence of multiple dynamic objects in the same scene, interaction between objects, real-time speed requirements, and diverse weather conditions (e.g., rain, snow, and fog). Although many studies have addressed object detection for autonomous vehicles, improving detection performance in diverse driving scenes remains an active research area. This thesis aims to develop novel methods for the detection and 3D localization of surrounding dynamic objects in driving scenes under different rainy weather conditions. First, owing to the frequent occurrence of rain and its negative effect on object detection, a real-time lightweight deraining network is proposed; it processes each image independently. Rain streaks and the accumulation of rain streaks introduce distinct visual degradation effects in captured images. The proposed deraining network removes both from images by applying two stages in sequence: rain streak removal and rain streak accumulation removal. 
The rain streak removal stage is based on a Residual Network (ResNet) to maintain real-time performance and avoid adding computational complexity. Furthermore, recursive computation allows network parameters to be shared. Meanwhile, distant rain streaks accumulate and induce a distortion similar to fog, so it can be mitigated in a way similar to defogging. The accumulation removal stage relies on a transmission-guided lightweight network (TGL-Net). The proposed deraining network was evaluated on five datasets with synthetic rain of different properties and two other datasets with real rainy scenes. Second, a novel sensory system is proposed that achieves real-time detection of multiple dynamic objects in driving scenes. The system uses a monocular camera and a 2D Light Detection and Ranging (LiDAR) sensor in a complementary fusion approach. YOLOv3, a baseline real-time object detection algorithm, is used to detect and classify objects in images captured by the camera; detected objects are surrounded by bounding boxes to localize them within the frames. Since objects in a driving scene are dynamic and often occlude each other, an algorithm has been developed to differentiate objects whose bounding boxes overlap. Moreover, the locations of bounding boxes within frames (in pixels) are converted into real-world angular coordinates. A 2D LiDAR was used to obtain depth measurements while keeping computational requirements low, saving resources for other autonomous driving operations. A novel technique has been developed and tested for processing 2D LiDAR measurements and mapping them to the corresponding bounding boxes. The detection accuracy of the proposed system was manually evaluated in different real-time scenarios. 
Finally, the effectiveness of the proposed deraining network was validated in terms of its impact on object detection in de-rained images. Compared with existing baseline deraining networks, the proposed network runs 2.23× faster than the average running time of the baselines while achieving a 1.2× improvement when tested on different synthetic datasets. Moreover, tests on the LiDAR measurements showed an average error of ±0.04 m in real driving scenes. Deraining and object detection were also tested jointly, and performing deraining ahead of object detection yielded a 1.45× enhancement in object detection precision.
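
    The camera and 2D LiDAR fusion step described in this abstract (converting bounding-box pixel positions to angles, then matching LiDAR ranges to each box) can be illustrated with a minimal sketch. It is not the thesis's method: it assumes a pinhole camera with a known horizontal field of view, a LiDAR mounted at the camera position with the same forward axis, and the median range as the depth estimate. The names `pixel_to_angle` and `box_depth` and all parameters are hypothetical.

```python
import math
from statistics import median


def pixel_to_angle(u, image_width, hfov_deg):
    """Horizontal angle in degrees of pixel column u (pinhole model, assumed).

    0 is the optical axis; positive angles are to the right of the image centre.
    """
    # Focal length in pixels derived from the horizontal field of view.
    focal_px = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return math.degrees(math.atan((u - image_width / 2) / focal_px))


def box_depth(box, scan, image_width, hfov_deg):
    """Estimate the depth of one bounding box from a 2D LiDAR scan.

    box  -- (x_min, x_max) horizontal extent of the box in pixels
    scan -- iterable of (angle_deg, range_m) pairs in the camera's angular frame
    Returns the median range of the beams falling inside the box, or None.
    """
    left = pixel_to_angle(box[0], image_width, hfov_deg)
    right = pixel_to_angle(box[1], image_width, hfov_deg)
    ranges = [r for angle, r in scan if left <= angle <= right and r > 0]
    return median(ranges) if ranges else None
```

    The median is used here only because it tolerates a few beams that hit the background behind a partially occluded object; the abstract does not say how the ranges are aggregated.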

    Modelling, Simulation and Data Analysis in Acoustical Problems

    Modelling and simulation in acoustics are currently gaining importance. With the development and improvement of innovative computational techniques and the growing need for predictive models, an impressive boost has been observed in several research and application areas, such as noise control, indoor acoustics, and industrial applications. This led us to propose a special issue on “Modelling, Simulation and Data Analysis in Acoustical Problems”, as we believe in the importance of these topics in modern acoustics studies. In total, 81 papers were submitted and 33 of them were published, with an acceptance rate of 37.5%. Judging by the number of papers submitted, this is a trending topic in the scientific and academic community, and this special issue aims to provide a future reference for the research that will be developed in coming years.

    Trustworthy Edge Machine Learning: A Survey

    The convergence of Edge Computing (EC) and Machine Learning (ML), known as Edge Machine Learning (EML), has become a highly regarded research area by utilizing distributed network resources to perform joint training and inference in a cooperative manner. However, EML faces various challenges due to resource constraints, heterogeneous network environments, and diverse service requirements of different applications, which together affect the trustworthiness of EML in the eyes of its stakeholders. This survey provides a comprehensive summary of definitions, attributes, frameworks, techniques, and solutions for trustworthy EML. Specifically, we first emphasize the importance of trustworthy EML within the context of Sixth-Generation (6G) networks. We then discuss the necessity of trustworthiness from the perspective of challenges encountered during deployment and real-world application scenarios. Subsequently, we provide a preliminary definition of trustworthy EML and explore its key attributes. Following this, we introduce fundamental frameworks and enabling technologies for trustworthy EML systems, and provide an in-depth literature review of the latest solutions to enhance the trustworthiness of EML. Finally, we discuss corresponding research challenges and open issues. Comment: 27 pages, 7 figures, 10 tables.

    Recent Developments in Video Surveillance

    With surveillance cameras installed everywhere and continuously streaming thousands of hours of video, how can that huge amount of data be analyzed or even be useful? Is it possible to search those countless hours of video for subjects or events of interest? Shouldn’t the presence of a car stopped at a railroad crossing trigger an alarm system to prevent a potential accident? In the chapters selected for this book, experts in video surveillance provide answers to these questions and other interesting problems, skillfully blending research experience with practical real-life applications. Academic researchers will find a reliable compilation of relevant literature in addition to pointers to current advances in the field. Industry practitioners will find useful hints about state-of-the-art applications. The book also provides directions for open problems where further advances can be pursued.

    Low complexity in-loop perceptual video coding

    Traditional broadcast video is today complemented by user-generated content, as portable devices support video coding. Similarly, computing is becoming ubiquitous, with the Internet of Things (IoT) incorporating heterogeneous networks to communicate with personal and/or infrastructure devices. In both cases, the emphasis is on bandwidth and processor efficiency, which has meant increasing the signalling options in video encoding. Consequently, the assessment of pixel differences applies a uniform cost to be processor efficient, whereas the Human Visual System (HVS) has non-uniform sensitivity based upon lighting, edges and textures. Existing perceptual assessments are natively incompatible and processor demanding, making perceptual video coding (PVC) unsuitable for these environments. This research enables existing perceptual assessment at the native level using low-complexity techniques, before producing new pixel-based image quality assessments (IQAs). To manage these IQAs, a framework was developed and implemented in the High Efficiency Video Coding (HEVC) encoder. This resulted in bit redistribution, where more bits and smaller partitioning were allocated to perceptually significant regions. Using an HEVC-optimised processor, the timing increase was < +4% and < +6% for video streaming and recording applications respectively, 1/3 of that of an existing low-complexity PVC solution. Future work should be directed towards perceptual quantisation, which offers the potential for perceptual coding gain.

    Lidar-based scene understanding for autonomous driving using deep learning

    With over 1.35 million fatalities related to traffic accidents worldwide, autonomous driving was foreseen at the beginning of this century as a feasible solution to improve safety on our roads. Moreover, it is meant to disrupt our transportation paradigm, reducing congestion, pollution, and costs while increasing the accessibility, efficiency, and reliability of transportation for both people and goods. Although some advances have gradually been transferred into commercial vehicles in the form of Advanced Driving Assistance Systems (ADAS), such as adaptive cruise control, blind spot detection or automatic parking, the technology is far from mature. A full understanding of the scene is needed so that vehicles are aware of their surroundings, knowing the existing elements of the scene as well as their motion, intentions and interactions. In this PhD dissertation, we explore new approaches for understanding driving scenes from 3D LiDAR point clouds by using Deep Learning methods. To this end, in Part I we analyze the scene from a static perspective using independent frames to detect the neighboring vehicles. Next, in Part II we develop new ways of understanding the dynamics of the scene. Finally, in Part III we apply all the developed methods to accomplish higher-level challenges such as segmenting moving obstacles while obtaining their rigid motion vector over the ground. More specifically, in Chapter 2 we develop a 3D vehicle detection pipeline based on a multi-branch deep-learning architecture and propose a Front view (FR-V) and a Bird’s Eye view (BE-V) as 2D representations of the 3D point cloud to serve as input for training our models. Later on, in Chapter 3 we apply and further test this method on two real use cases: pre-filtering moving obstacles while creating maps to better localize ourselves on subsequent days, and vehicle tracking. 
From the dynamic perspective, in Chapter 4 we learn from the 3D point cloud a novel dynamic feature that resembles optical flow from RGB images. To do so, we develop a new approach that leverages RGB optical flow as pseudo ground truth for training while using only 3D LiDAR data at inference time. Additionally, in Chapter 5 we explore the benefits of combining classification and regression learning problems to tackle the optical flow estimation task in a joint coarse-and-fine manner. Lastly, in Chapter 6 we bring together the previous methods and demonstrate that these independent tasks can guide the learning of more challenging problems, such as segmentation and motion estimation of moving vehicles from our own moving perspective.
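
    The Bird’s Eye view representation mentioned in this abstract projects a 3D point cloud onto a 2D top-down grid. The sketch below is a generic illustration of that idea, not the dissertation's encoding: the grid extent, the cell size, and the choice of storing the maximum height per cell are all assumptions, and `birds_eye_view` is a hypothetical name.

```python
def birds_eye_view(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.5):
    """Rasterise (x, y, z) LiDAR points into a top-down max-height grid.

    Rows index x (forward distance) and columns index y (lateral offset).
    Each cell stores the highest z seen in it; cells with no points hold None.
    Ranges and cell size are illustrative defaults, in metres.
    """
    rows = int((x_range[1] - x_range[0]) / cell)
    cols = int((y_range[1] - y_range[0]) / cell)
    grid = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        # Discard points outside the area covered by the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp indices to guard against floating-point rounding at the edges.
        r = min(int((x - x_range[0]) / cell), rows - 1)
        c = min(int((y - y_range[0]) / cell), cols - 1)
        if grid[r][c] is None or z > grid[r][c]:
            grid[r][c] = z
    return grid
```

    A grid like this can be fed to a 2D convolutional network as a single-channel image; practical encodings often add further channels such as point density or reflectance.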

    Solar-Powered Deep Learning-Based Recognition System of Daily Used Objects and Human Faces for Assistance of the Visually Impaired

    This paper introduces a novel low-cost solar-powered wearable assistive technology (AT) device, whose aim is to provide continuous, real-time object recognition to help visually impaired (VI) people find objects in daily life. The system consists of three major components: a miniature low-cost camera, a system on module (SoM) computing unit, and an ultrasonic sensor. The first is worn on the user’s eyeglasses and acquires real-time video of the nearby space. The second is worn as a belt and runs deep learning-based methods and spatial algorithms that process the video coming from the camera, performing object detection and recognition. The third assists in positioning the objects found in the surrounding space. The developed device provides audible descriptive sentences as feedback to the user, describing the objects recognized and their position relative to the user’s gaze. After a proper power consumption analysis, a wearable solar harvesting system, integrated with the developed AT device, has been designed and tested to extend the energy autonomy in the different operating modes and scenarios. Experimental results obtained with the developed low-cost AT device have demonstrated accurate and reliable real-time object identification, with an 86% correct recognition rate and a 215 ms average time interval (in the high-speed SoM operating mode) for the image processing. The proposed system is capable of recognizing the 91 objects offered by the Microsoft Common Objects in Context (COCO) dataset plus several custom objects and human faces. In addition, a simple and scalable methodology for using image datasets and training Convolutional Neural Networks (CNNs) is introduced to add objects to the system and increase its repertoire. It is also demonstrated that comprehensive training with 100 images per targeted object achieves 89% recognition rates, while fast training with only 12 images achieves acceptable recognition rates of 55%.

    Mind the Gap: Developments in Autonomous Driving Research and the Sustainability Challenge

    Scientific knowledge on autonomous-driving technology is expanding at a faster-than-ever pace. As a result, the risk of information overload is particularly high for researchers, who can struggle to overcome the gap between information processing requirements and information processing capacity. We address this issue by adopting a multi-granulation approach to latent knowledge discovery and synthesis in large-scale research domains. The proposed methodology combines citation-based community detection methods and topic modeling techniques to give a concise but comprehensive overview of how the autonomous vehicle (AV) research field is conceptually structured. Thirteen core thematic areas are extracted and presented by mining the large data-rich environments resulting from 50 years of AV research. The analysis demonstrates that this research field is strongly oriented towards examining the technological developments needed to enable the widespread rollout of AVs, whereas it largely overlooks the wide-ranging sustainability implications of this sociotechnical transition. On account of these findings, we call for a broader engagement of AV researchers with the sustainability concept and we invite them to increase their commitment to conducting systematic investigations into the sustainability of AV deployment. Sustainability research is urgently required to produce an evidence-based understanding of what new sociotechnical arrangements are needed to ensure that the systemic technological change introduced by AV-based transport systems can fulfill societal functions while meeting the urgent need for more sustainable transport solutions.