
    leave a trace - A People Tracking System Meets Anomaly Detection

    Full text link
    Video surveillance has always had a negative connotation, among other reasons because of the loss of privacy and because it does not automatically increase public safety. If it were able to detect atypical (i.e. dangerous) situations in real time, autonomously and anonymously, this could change. A prerequisite is the reliable automatic detection of potentially dangerous situations from video data. Classically, this is done by object extraction and tracking; from the derived trajectories, we then want to identify dangerous situations by detecting atypical trajectories. For ethical reasons, however, it is better to develop such a system on data in which no people are threatened or harmed, and in which they know that such a tracking system is installed. Another important point is that these situations occur rarely in real, public CCTV areas and may be captured properly even less often. In the artistic project leave a trace, the tracked objects, people in the atrium of an institutional building, become actors and thus part of the installation. Real-time visualisation allows these actors to interact, which in turn creates many atypical interaction situations on which we can develop our situation detection. The data set has evolved over three years and is therefore huge. In this article we describe the tracking system and several approaches for the detection of atypical trajectories.
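    As a hedged aside, the sketch below shows one generic way such atypical-trajectory scoring can be set up: a trajectory is compared against a reference set of previously observed trajectories and flagged when it lies far from its nearest neighbours. The Hausdorff distance, the parameter k, and the toy data are illustrative assumptions, not the approaches developed in the article.

```python
# Minimal trajectory anomaly-scoring sketch (illustrative, not the article's method).
import numpy as np

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two (N, 2) point sequences."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def anomaly_score(query: np.ndarray, reference: list[np.ndarray], k: int = 3) -> float:
    """Mean distance from the query trajectory to its k closest reference trajectories."""
    dists = sorted(hausdorff(query, r) for r in reference)
    return float(np.mean(dists[:k]))

# Example: straight walks form the reference; a zig-zag path scores much higher.
rng = np.random.default_rng(0)
reference = [np.c_[np.linspace(0, 10, 50), rng.normal(0, 0.1, 50)] for _ in range(20)]
typical = np.c_[np.linspace(0, 10, 50), rng.normal(0, 0.1, 50)]
atypical = np.c_[np.linspace(0, 10, 50), 2.0 * np.sin(np.linspace(0, 6, 50))]
print(anomaly_score(typical, reference), anomaly_score(atypical, reference))
```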

    Vision-based safe autonomous UAV landing with panoramic sensors

    Get PDF
    The remarkable growth of unmanned aerial vehicles (UAVs) has also raised concerns about safety measures during their missions. To advance towards safer autonomous aerial robots, this thesis strives to develop a safe autonomous UAV landing solution, a vital part of every UAV operation. The project proposes a vision-based framework for monitoring the landing area by leveraging the omnidirectional view of a single panoramic camera pointing upwards to detect and localize any person within the landing zone. It then sends this information to approaching UAVs, which either hover and wait or adaptively search for a better position to land. We utilize and fine-tune the YOLOv7 object detection model, an XGBoost model for localizing nearby people, and the open-source ROS and PX4 frameworks for communication and drone control. We present both simulation and real-world indoor experimental results to demonstrate the capability of our methods.
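    The decision logic described above can be sketched as follows, with the detector and drone interfaces (YOLOv7, XGBoost, ROS, PX4) deliberately left out of scope; the safe radius, search radius, and candidate-spot strategy are illustrative assumptions, not the thesis' parameters.

```python
# Hedged sketch of a landing-zone decision rule: given estimated people positions
# on the ground plane around the camera, land, wait, or pick a free spot nearby.
from dataclasses import dataclass
import math

@dataclass
class Person:
    x: float  # metres on the ground plane, camera at the origin
    y: float

def landing_decision(people: list[Person], safe_radius: float = 3.0,
                     search_radius: float = 6.0) -> tuple[str, tuple[float, float]]:
    """Return ('land' | 'reposition' | 'wait', target_xy)."""
    if all(math.hypot(p.x, p.y) > safe_radius for p in people):
        return "land", (0.0, 0.0)
    # Try a ring of candidate spots; keep the first one clear of all people.
    for k in range(12):
        ang = 2 * math.pi * k / 12
        cx, cy = search_radius * math.cos(ang), search_radius * math.sin(ang)
        if all(math.hypot(p.x - cx, p.y - cy) > safe_radius for p in people):
            return "reposition", (cx, cy)
    return "wait", (0.0, 0.0)

print(landing_decision([Person(1.0, 0.5)]))   # person too close -> reposition
print(landing_decision([Person(8.0, 2.0)]))   # zone clear -> land
```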

    PL-EVIO: Robust Monocular Event-based Visual Inertial Odometry with Point and Line Features

    Full text link
    Event cameras are motion-activated sensors that capture pixel-level illumination changes instead of intensity images at a fixed frame rate. Compared with standard cameras, they can provide reliable visual perception during high-speed motion and in high-dynamic-range scenarios. However, event cameras output little information, or even only noise, when the relative motion between the camera and the scene is limited, for example when the camera is still, whereas standard cameras provide rich perception information in most scenarios, especially under good lighting conditions. The two sensors are therefore complementary. In this paper, we propose a robust, highly accurate, and real-time optimization-based monocular event-based visual-inertial odometry (VIO) method with event-corner features, line-based event features, and point-based image features. The proposed method leverages point-based features in natural scenes and line-based features in man-made scenes to provide additional structural constraints through well-designed feature management. Experiments on public benchmark datasets show that our method achieves superior performance compared with state-of-the-art image-based and event-based VIO. Finally, we use our method to demonstrate an onboard closed-loop autonomous quadrotor flight and large-scale outdoor experiments. Videos of the evaluations are presented on our project website: https://b23.tv/OE3QM6
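    To illustrate the general idea of mixing point and line constraints in one optimization (not PL-EVIO's actual event-based formulation), the toy sketch below stacks point residuals and point-to-line residuals into a single least-squares pose refinement on a planar 2-D scene; the landmarks, the line segment, and the 3-DoF pose parameterization are all assumptions made for brevity.

```python
# Toy joint point + line least-squares pose refinement (illustrative only).
import numpy as np
from scipy.optimize import least_squares

landmarks = np.array([[2.0, 1.0], [3.0, -1.0], [4.0, 0.5]])   # point features (world frame)
line_seg = np.array([[1.0, 2.0], [5.0, 2.0]])                  # a structural line (world frame)

def transform(pose, pts):
    """World -> sensor frame for a 2-D pose (x, y, yaw)."""
    x, y, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s], [s, c]])
    return (pts - [x, y]) @ R

def residuals(pose, obs_pts, obs_line):
    # Point residuals: predicted minus observed positions in the sensor frame.
    r_pts = (transform(pose, landmarks) - obs_pts).ravel()
    # Line residuals: distance of the transformed segment endpoints from the
    # observed line a*x + b*y + c = 0 (normalised so a^2 + b^2 = 1).
    a, b, c = obs_line
    p = transform(pose, line_seg)
    r_line = p @ [a, b] + c
    return np.concatenate([r_pts, r_line])

# Simulate observations from a ground-truth pose, then recover it from scratch.
true_pose = np.array([0.5, -0.2, 0.1])
obs_pts = transform(true_pose, landmarks)
p = transform(true_pose, line_seg)
d = p[1] - p[0]
n = np.array([-d[1], d[0]]) / np.linalg.norm(d)
obs_line = np.array([n[0], n[1], -n @ p[0]])

est = least_squares(residuals, x0=np.zeros(3), args=(obs_pts, obs_line))
print(est.x)   # approximately [0.5, -0.2, 0.1]
```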

    3D Multiple Object Tracking on Autonomous Driving: A Literature Review

    Full text link
    3D multi-object tracking (3D MOT) is a pivotal domain within autonomous driving that has seen a surge in scholarly interest and commercial promise in recent years. Despite its significance, 3D MOT confronts a host of formidable challenges, including abrupt changes in object appearance, pervasive occlusion, small targets, data sparsity, missed detections, and the unpredictable initiation and termination of object motion trajectories. Numerous methodologies have emerged to grapple with these issues, yet 3D MOT remains a difficult problem that warrants further exploration. This paper undertakes a comprehensive examination, assessment, and synthesis of the research landscape in this domain, remaining attuned to the latest developments in 3D MOT while suggesting prospective avenues for future investigation. Our exploration commences with a systematic exposition of key facets of 3D MOT and its associated domains, including problem delineation, classification, methodological approaches, fundamental principles, and empirical studies. We then categorize these methodologies into distinct groups and dissect each group with regard to its challenges, underlying rationale, progress, merits, and demerits. Furthermore, we present a concise recapitulation of experimental metrics and an overview of prevalent datasets, facilitating quantitative comparison for a more intuitive assessment. Lastly, we discuss the prevailing research landscape, highlighting extant challenges and charting possible directions for 3D MOT research, and present a structured and lucid roadmap to guide forthcoming endeavors in this field. Comment: 24 pages, 6 figures, 2 tables
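    As a small concrete note on the experimental metrics mentioned above, the CLEAR-MOT accuracy score (MOTA) commonly used in MOT evaluation aggregates misses, false positives, and identity switches relative to the total number of ground-truth objects; the numbers in the example are invented for illustration.

```python
# MOTA = 1 - (FN + FP + IDSW) / GT; it can be negative for very poor trackers.
def mota(false_negatives: int, false_positives: int, id_switches: int,
         num_gt: int) -> float:
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

# Example: 120 misses, 80 false positives, 15 ID switches over 1000 ground-truth boxes.
print(mota(120, 80, 15, 1000))   # 0.785
```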

    3D Video Object Detection with Learnable Object-Centric Global Optimization

    Full text link
    We explore long-term temporal visual correspondence-based optimization for 3D video object detection in this work. Visual correspondence refers to one-to-one mappings for pixels across multiple images. Correspondence-based optimization is the cornerstone of 3D scene reconstruction but is less studied in 3D video object detection, because moving objects violate multi-view geometry constraints and are treated as outliers during scene reconstruction. We address this issue by treating objects as first-class citizens during correspondence-based optimization. In this work, we propose BA-Det, an end-to-end optimizable object detector with object-centric temporal correspondence learning and featuremetric object bundle adjustment. Empirically, we verify the effectiveness and efficiency of BA-Det for multiple baseline 3D detectors under various setups. Our BA-Det achieves SOTA performance on the large-scale Waymo Open Dataset (WOD) with only marginal computation cost. Our code is available at https://github.com/jiaweihe1996/BA-Det. Comment: CVPR202
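    To make the object-centric bundle adjustment idea more tangible (this is a classic geometric toy, not BA-Det's featuremetric formulation), the sketch below jointly refines per-frame object translations and object-frame 3D points by minimizing pinhole reprojection error; the intrinsics, the box-shaped object, and the translation-only motion are assumptions chosen for brevity.

```python
# Toy object-centric bundle adjustment over object translations and object points.
import numpy as np
from scipy.optimize import least_squares

fx = fy = 500.0
cx = cy = 320.0   # pinhole intrinsics (assumed)

def project(pts_cam):
    """Pinhole projection of (N, 3) camera-frame points to pixel coordinates."""
    return np.c_[fx * pts_cam[:, 0] / pts_cam[:, 2] + cx,
                 fy * pts_cam[:, 1] / pts_cam[:, 2] + cy]

# Toy object: 4 points in the object frame, observed in 3 frames while the
# object translates in front of a static camera (rotation omitted for brevity).
obj_pts = np.array([[-0.5, -0.2, 0.0], [0.5, -0.2, 0.0],
                    [0.5, 0.2, 0.0], [-0.5, 0.2, 0.0]])
true_trans = np.array([[0.0, 0.0, 8.0], [0.3, 0.0, 7.5], [0.6, 0.0, 7.0]])
obs = np.stack([project(obj_pts + t) for t in true_trans])   # (3, 4, 2) pixel observations

def residuals(x):
    trans = x[:9].reshape(3, 3)   # per-frame object translation
    pts = x[9:].reshape(4, 3)     # object-frame point positions
    pred = np.stack([project(pts + t) for t in trans])
    return (pred - obs).ravel()

# Start from perturbed values and refine both motion and shape jointly.
x0 = np.concatenate([true_trans.ravel() + 0.1, obj_pts.ravel() + 0.05])
sol = least_squares(residuals, x0)
print(sol.cost)   # close to 0: reprojection error after refinement
```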

    The always best positioned paradigm for mobile indoor applications

    Get PDF
    In this dissertation, methods for personal positioning in outdoor and indoor environments are investigated. The Always Best Positioned paradigm, which has the goal of providing self-positioning that is as gap-free as possible, is defined. Furthermore, the localization toolkit LOCATO is presented, which makes it easy to realize positioning systems that follow the paradigm. New algorithms were developed that particularly address the robustness of positioning systems with respect to the Always Best Positioned paradigm. With the help of this toolkit, three example positioning systems were implemented, each designed for different applications and requirements: a low-cost system, which can be used in conjunction with user-adaptive public displays; a so-called opportunistic system, which enables positioning with room-level accuracy in any building that provides a WiFi infrastructure; and a high-accuracy (metre-level) system for instrumented environments, which works with active RFID tags and infrared beacons. Furthermore, a new and unique evaluation method for positioning systems is presented, which uses step-accurate natural walking traces as ground truth. Finally, six location-based services are presented, which were realized either with the tools provided by LOCATO or with one of the example positioning systems.
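    In the spirit of the opportunistic, room-level WiFi positioning mentioned above (the actual LOCATO algorithms are not reproduced here), the sketch below shows the common fingerprinting baseline: each room is represented by recorded RSSI fingerprints, and a live scan is assigned to the room with the closest fingerprint. The access-point names, RSSI values, and the floor value for missing APs are made-up examples.

```python
# Nearest-neighbour WiFi fingerprinting for room-level localization (illustrative sketch).
import math

# AP identifier -> RSSI (dBm); APs missing from a scan are treated as a weak floor value.
fingerprints = {
    "room_101": {"ap:aa": -45, "ap:bb": -70, "ap:cc": -85},
    "room_102": {"ap:aa": -68, "ap:bb": -48, "ap:cc": -80},
    "hallway":  {"ap:aa": -60, "ap:bb": -60, "ap:cc": -60},
}

def rssi_distance(scan: dict[str, int], ref: dict[str, int], floor: int = -95) -> float:
    aps = set(scan) | set(ref)
    return math.sqrt(sum((scan.get(a, floor) - ref.get(a, floor)) ** 2 for a in aps))

def locate(scan: dict[str, int]) -> str:
    return min(fingerprints, key=lambda room: rssi_distance(scan, fingerprints[room]))

print(locate({"ap:aa": -50, "ap:bb": -72}))   # -> 'room_101'
```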

    Virtual road signs : a thesis presented in partial fulfilment of the requirements for the degree of Master of Technology in Computer Systems Engineering at Massey University, Palmerston North, New Zealand

    Get PDF
    Conventional road signs are subject to a number of problems and limitations. They are unable to disseminate dynamic information to road users, their visibility is heavily dependent on environmental conditions, they are expensive to maintain, and they are frequently the target of vandals and thieves. Virtual road signs (VRS) differ from conventional signs in that they exist only in an information database; no physical signs exist on the roadside. By projecting virtual signs into a driver's field of view at the correct time, virtual road signs attempt to mimic conventional road signs. In addition, their visibility is independent of weather and traffic conditions, they can be tailored to specific driver and vehicle needs (such as truck drivers), and they cannot be vandalised like physical signs. This thesis examines many of the major technical design decisions that must be made in implementing a virtual road sign system. A software prototype was designed and written to implement an experimental VRS system. The prototype served as a testbed to assess the technical feasibility of a VRS system and to investigate alternative VRS designs. One limitation of the project was the lack of a suitable display device that could display virtual signs inside a vehicle in real time; therefore, the project examined only the proof of concept. A test world was created around a university campus in which virtual signs were "erected" to target a visitor to the campus. The prototype used a handheld GPS receiver to track a vehicle as it was driven around the campus. A Kalman filter was implemented to filter the GPS data and predict the motion of the vehicle when GPS data was unavailable. A laptop PC provided onboard processing capability inside the test vehicle. The prototype shows that technical implementation of virtual road signs is potentially feasible, subject to limitations in current display devices such as heads-up displays. Potential applications include signs custom-designed for tourists to indicate places of interest, bilingual signage, and aiding co-drivers in rally driving. Before large-scale implementation can be considered, however, much research is needed, particularly with respect to the system's acceptability to the public and road authorities.
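    A minimal constant-velocity Kalman filter in the spirit of the GPS smoothing and prediction described above is sketched below; the state layout [x, y, vx, vy], the time step, and the noise covariances are assumptions, not the thesis' exact model or tuning. Prediction steps carry the estimate through GPS outages, and update steps fuse new fixes when they arrive.

```python
# Constant-velocity Kalman filter sketch for smoothing GPS fixes and bridging outages.
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)  # motion model
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)                                 # GPS measures position only
Q = np.eye(4) * 0.1    # process noise (assumed)
R = np.eye(2) * 5.0    # GPS measurement noise (assumed)

x = np.zeros(4)        # state estimate [x, y, vx, vy]
P = np.eye(4) * 100.0  # state covariance

def predict():
    global x, P
    x = F @ x
    P = F @ P @ F.T + Q

def update(z):
    global x, P
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P

# Two GPS fixes, then an outage: the filter keeps extrapolating the vehicle's motion.
for z in ([0.0, 0.0], [5.0, 0.1], None, None):
    predict()
    if z is not None:
        update(np.array(z))
    print(x[:2])
```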

    End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

    Get PDF
    End-to-end approaches to autonomous driving commonly rely on expert demonstrations. Although humans are good drivers, they are not good coaches for end-to-end algorithms that demand dense on-policy supervision. In contrast, automated experts that leverage privileged information can efficiently generate large-scale on-policy and off-policy demonstrations. However, existing automated experts for urban driving make heavy use of hand-crafted rules and perform suboptimally even on driving simulators, where ground-truth information is available. To address these issues, we train a reinforcement learning expert that maps bird's-eye view images to continuous low-level actions. While setting a new performance upper bound on CARLA, our expert is also a better coach that provides informative supervision signals for imitation learning agents to learn from. Supervised by our reinforcement learning coach, a baseline end-to-end agent with monocular camera input achieves expert-level performance. Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark, and state-of-the-art performance on the more challenging CARLA LeaderBoard.
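    The supervision pattern described above, where a privileged coach labels every state the learner visits with its own action, can be sketched on a toy 1-D task as below; the linear policy, the dynamics, and the coach's rule are stand-ins for illustration only and are unrelated to the paper's networks or to CARLA.

```python
# Dense on-policy supervision sketch: the learner drives, the coach labels every
# visited state, and the learner is refit to those labels each iteration.
import numpy as np

rng = np.random.default_rng(0)

def coach_action(state):
    """Privileged expert: steer the state toward 0 (toy stand-in for the RL coach)."""
    return -0.5 * state

def rollout(policy_w, steps=50):
    """Run the current learner and record (state, coach label) pairs."""
    s, states, labels = 5.0, [], []
    for _ in range(steps):
        states.append(s)
        labels.append(coach_action(s))           # dense on-policy supervision
        a = policy_w * s + rng.normal(0, 0.05)   # learner's (noisy) action
        s = s + a                                # toy dynamics
    return np.array(states), np.array(labels)

w = 0.0   # the learner starts with a useless policy
for it in range(5):
    S, A = rollout(w)
    w = float(np.dot(S, A) / np.dot(S, S))       # least-squares fit of a = w * s
    print(f"iteration {it}: w = {w:.3f}")         # quickly matches the coach's -0.5
```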

    Understanding a Dynamic World: Dynamic Motion Estimation for Autonomous Driving Using LIDAR

    Full text link
    In a society that is heavily reliant on personal transportation, autonomous vehicles present an increasingly intriguing technology. They have the potential to save lives, promote efficiency, and enable mobility. However, before this vision becomes a reality, there are a number of challenges that must be solved. One key challenge involves problems in dynamic motion estimation, as it is critical for an autonomous vehicle to have an understanding of the dynamics in its environment in order to operate safely on the road. Accordingly, this thesis presents several algorithms for dynamic motion estimation for autonomous vehicles. We focus on methods using light detection and ranging (LIDAR), a prevalent sensing modality used by autonomous vehicle platforms, due to its advantages over other sensors, such as cameras, including lighting invariance and the fidelity of its 3D geometric data. First, we propose a dynamic object tracking algorithm. The proposed method takes as input a stream of LIDAR data from a moving object collected by a multi-sensor platform. It generates an estimate of the object's trajectory over time and a point cloud model of its shape. We formulate the problem similarly to simultaneous localization and mapping (SLAM), allowing us to leverage existing techniques. Unlike prior work, we properly handle a stream of sensor measurements observed over time by deriving our algorithm using a continuous-time estimation framework. We evaluate our proposed method on a real-world dataset that we collect. Second, we present a method for scene flow estimation from a stream of LIDAR data. Inspired by optical flow and scene flow from the computer vision community, our framework can estimate dynamic motion in the scene without relying on segmentation and data association, while still rivaling the results of state-of-the-art object tracking methods. We design our algorithms to exploit a graphics processing unit (GPU), enabling real-time performance. Third, we leverage deep learning tools to build a feature learning framework that allows us to train an encoding network to estimate features from a LIDAR occupancy grid. The learned feature space describes the geometric and semantic structure of any location observed by the LIDAR data. We formulate the training process so that distances in this learned feature space are meaningful in comparing the similarity of different locations. Accordingly, we demonstrate that using this feature space improves our estimate of the dynamic motion in the environment over time. In summary, this thesis presents three methods to aid in understanding a dynamic world for autonomous vehicle applications with LIDAR: a novel object tracking algorithm, a real-time scene flow estimation method, and a feature learning framework to aid in dynamic motion estimation. We demonstrate the performance of all our proposed methods on a collection of real-world datasets. PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147587/1/aushani_1.pd
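    As a much-simplified stand-in for the scene flow idea above (the thesis' GPU formulation and learned features are not shown), the sketch below approximates per-point flow between two LIDAR scans by the displacement to the nearest neighbour in the later scan; the synthetic box-shaped "car" and the point layout are assumptions made so the toy example is easy to verify.

```python
# Nearest-neighbour scene flow between two synthetic LIDAR scans (illustrative sketch).
import numpy as np

def nn_scene_flow(scan_t: np.ndarray, scan_t1: np.ndarray) -> np.ndarray:
    """For each point in scan_t (N, 3), the offset to its nearest neighbour
    in scan_t1 (M, 3) serves as a crude per-point flow estimate."""
    d = np.linalg.norm(scan_t[:, None, :] - scan_t1[None, :, :], axis=-1)
    return scan_t1[d.argmin(axis=1)] - scan_t

rng = np.random.default_rng(1)
static = rng.uniform(-10.0, 0.0, size=(200, 3))           # background points, away from the car
car_t = np.array([[5, 0, 0], [8, 0, 0], [5, 2, 0], [8, 2, 0],
                  [5, 0, 2], [8, 0, 2], [5, 2, 2], [8, 2, 2]], float)  # box corners of a "car"
scan_t = np.vstack([static, car_t])
scan_t1 = np.vstack([static, car_t + [1.0, 0.0, 0.0]])    # the car moved 1 m forward

flow = nn_scene_flow(scan_t, scan_t1)
print(np.abs(flow[:200]).max())    # 0.0: the static background does not move
print(flow[200:].mean(axis=0))     # [1. 0. 0.]: the car's motion is recovered
```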