39 research outputs found

    GPT-4V as Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events

    The recognition and understanding of traffic incidents, particularly traffic accidents, is a topic of paramount importance in the realm of intelligent transportation systems and intelligent vehicles. This area has continually captured the extensive focus of both the academic and industrial sectors. Identifying and comprehending complex traffic events is highly challenging, primarily due to the intricate nature of traffic environments, diverse observational perspectives, and the multifaceted causes of accidents. These factors have persistently impeded the development of effective solutions. The advent of large vision-language models (VLMs) such as GPT-4V has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model's capacity to understand these complex traffic situations. We observe that GPT-4V demonstrates remarkable cognitive, reasoning, and decision-making ability in certain classic traffic events. Concurrently, we also identify certain limitations of GPT-4V, which constrain its understanding in more intricate scenarios. These limitations merit further exploration and resolution.

    Intelligent Transportation Systems Using External Infrastructure: A Literature Survey

    The main problems in transportation are accidents, increasingly slow traffic flow, and pollution. An intelligent transportation system (ITS) using external infrastructure can overcome these problems. For this reason, the number of such systems is increasing dramatically, and an adequate overview is therefore required. To the best of our knowledge, no current systematic review of existing ITS solutions exists. To fill this knowledge gap, our paper provides an overview of existing ITS that use external infrastructure worldwide. Accordingly, this paper addresses current questions and challenges. For this purpose, we performed a literature review of documents that describe existing ITS solutions from 2009 until today. We categorized the results according to technology levels and analyzed their hardware system setup and value-added contributions. In doing so, we made the ITS solutions comparable and highlighted past development alongside current trends. We analyzed more than 357 papers, including 52 test bed projects. In summary, current ITSs can deliver accurate information about individuals in traffic situations in real-time. However, further research into ITS should focus on more reliable perception of the traffic using modern sensors, plug-and-play mechanisms, and secure real-time distribution of the digital twins in a decentralized manner. By addressing these topics, the development of intelligent transportation systems will be able to take a step towards its comprehensive roll-out. Comment: 18 Pages, 4 Tables, 5 Figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Real-Time And Robust 3D Object Detection with Roadside LiDARs

    This work aims to address the challenges in autonomous driving by focusing on the 3D perception of the environment using roadside LiDARs. We design a 3D object detection model that can detect traffic participants in roadside LiDAR data in real-time. Our model uses an existing 3D detector as a baseline and improves its accuracy. To prove the effectiveness of our proposed modules, we train and evaluate the model on three different vehicle and infrastructure datasets. To show the domain adaptation ability of our detector, we train it on an infrastructure dataset from China and perform transfer learning on a different dataset recorded in Germany. We conduct several sets of experiments and ablation studies for each module in the detector, which show that our model outperforms the baseline by a significant margin, while the inference speed is at 45 Hz (22 ms). We make a significant contribution with our LiDAR-based 3D detector that can be used for smart city applications to provide connected and automated vehicles with a far-reaching view. Vehicles that are connected to the roadside sensors can get information about other vehicles around the corner to improve their path and maneuver planning and to increase road traffic safety. Comment: arXiv admin note: substantial text overlap with arXiv:2204.0013

    Vision Language Models in Autonomous Driving and Intelligent Transportation Systems

    The applications of Vision-Language Models (VLMs) in the fields of Autonomous Driving (AD) and Intelligent Transportation Systems (ITS) have attracted widespread attention due to their outstanding performance and the ability to leverage Large Language Models (LLMs). By integrating language data, vehicles and transportation systems are able to deeply understand real-world environments, improving driving safety and efficiency. In this work, we present a comprehensive survey of the advances in language models in this domain, encompassing current models and datasets. Additionally, we explore the potential applications and emerging research directions. Finally, we thoroughly discuss the challenges and research gaps. The paper aims to provide researchers with the current work and future trends of VLMs in AD and ITS.

    3D Understanding of Deformable Linear Objects: Datasets and Transferability Benchmark

    Deformable linear objects are widely present in our everyday lives. It is often challenging even for humans to visually understand them, as the same object can be entangled so that it appears completely different. Examples of deformable linear objects include blood vessels and wiring harnesses, vital to the functioning of their corresponding systems, such as the human body and a vehicle. However, no point cloud datasets exist for studying 3D deformable linear objects. Therefore, we introduce two point cloud datasets, PointWire and PointVessel. We evaluated state-of-the-art methods on the proposed large-scale 3D deformable linear object benchmarks. Finally, we analyzed the generalization capabilities of these methods by conducting transferability experiments on the PointWire and PointVessel datasets.

    RT-DLO: Real-Time Deformable Linear Objects Instance Segmentation

    Deformable Linear Objects (DLOs) such as cables, wires, ropes, and elastic tubes are abundant in both domestic and industrial environments. Unfortunately, robotic systems handling DLOs are rare and have limited capabilities due to the challenging nature of perceiving them. Hence, we propose a novel approach named RT-DLO for real-time instance segmentation of DLOs. First, the DLOs are semantically segmented from the background. Afterward, a novel method to separate the DLO instances is applied. It generates a graph representation of the scene from the semantic mask, where the graph nodes are sampled from the DLOs' center-lines and the graph edges are selected based on topological reasoning. RT-DLO is experimentally evaluated against both DLO-specific and general-purpose instance segmentation deep learning approaches, achieving overall better performance in terms of accuracy and inference time.

    A Survey of Robotics Control Based on Learning-Inspired Spiking Neural Networks

    Biological intelligence processes information using impulses or spikes, which enables living creatures to perceive and act in the real world exceptionally well and to outperform state-of-the-art robots in almost every aspect of life. To make up the deficit, emerging hardware technologies and software knowledge in the fields of neuroscience, electronics, and computer science have made it possible to design biologically realistic robots controlled by spiking neural networks (SNNs), inspired by the mechanism of brains. However, a comprehensive review on controlling robots based on SNNs is still missing. In this paper, we survey the developments of the past decade in the field of spiking neural networks for control tasks, with particular focus on the fast emerging robotics-related applications. We first highlight the primary impetuses of SNN-based robotics tasks in terms of speed, energy efficiency, and computation capabilities. We then classify those SNN-based robotic applications according to different learning rules and explicate those learning rules with their corresponding robotic applications. We also briefly present some existing platforms that offer an interaction between SNNs and robotics simulations for exploration and exploitation. Finally, we conclude our survey with a forecast of future challenges and some associated potential research topics in terms of controlling robots based on SNNs.

    TUMTraf Event: Calibration and Fusion Resulting in a Dataset for Roadside Event-Based and RGB Cameras

    Event-based cameras are predestined for Intelligent Transportation Systems (ITS). They provide very high temporal resolution and dynamic range, which can eliminate motion blur and improve detection performance at night. However, event-based images lack color and texture compared to images from a conventional RGB camera. Considering that, data fusion between event-based and conventional cameras can combine the strengths of both modalities. For this purpose, extrinsic calibration is necessary. To the best of our knowledge, no targetless calibration between event-based and RGB cameras can handle multiple moving objects, nor does data fusion optimized for the domain of roadside ITS exist. Furthermore, synchronized event-based and RGB camera datasets considering roadside perspective are not yet published. To fill these research gaps, based on our previous work, we extended our targetless calibration approach with clustering methods to handle multiple moving objects. Furthermore, we developed an early fusion, simple late fusion, and a novel spatiotemporal late fusion method. Lastly, we published the TUMTraf Event Dataset, which contains more than 4,111 synchronized event-based and RGB images with 50,496 labeled 2D boxes. During our extensive experiments, we verified the effectiveness of our calibration method with multiple moving objects. Furthermore, compared to a single RGB camera, we increased the detection performance by up to +9 % mAP in the day and up to +13 % mAP during the challenging night with our presented event-based sensor fusion methods. The TUMTraf Event Dataset is available at https://innovation-mobility.com/tumtraf-dataset. Comment: 18 pages, 10 figures, 6 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.