
    Human Motion Trajectory Prediction: A Survey

    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research. Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages

    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc., and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication

    Detect the unexpected: a science for surveillance

    Purpose – The purpose of this paper is to outline a strategy for research development focused on addressing the neglected role of visual perception in real-life tasks such as policing surveillance and command and control settings. Approach – The scale of the surveillance task in the modern control room is expanding as technology increases input capacity at an accelerating rate. The authors review recent literature highlighting the difficulties that apply to modern surveillance and give examples of how poor detection of the unexpected can be, and how surprising this deficit can be. Perceptual phenomena such as change blindness are linked to the perceptual processes undertaken by law-enforcement personnel. Findings – A scientific programme is outlined for how detection deficits can best be addressed in the context of a multidisciplinary collaborative agenda between researchers and practitioners. The development of a cognitive research field specifically examining the occurrence of perceptual “failures” provides an opportunity for policing agencies to relate laboratory findings in psychology to their own fields of day-to-day enquiry. Originality/value – The paper shows, with examples, where interdisciplinary research may best be focussed on evaluating practical solutions and on generating useable guidelines on procedure and practice. It also argues that these processes should be investigated in real and simulated context-specific studies to confirm the validity of the findings in these new applied scenarios.

    Automatically detecting road sign text from natural scene video

    Automatic detection of text on road signs can help drivers stay aware of the traffic situation and surrounding environment by reminding them of the signs ahead. Current systems can only detect constrained road signs or perform unsatisfactorily when dealing with complex scenes in practical use. This paper first reviews the existing techniques used for text detection in natural scenes. A novel system that detects text on road signs in natural scene video is then proposed. Our detailed approaches and methodology give a promising solution to this problem, reducing the running time and improving the recognition rate. © 2006 IEEE

    Vision Language Models in Autonomous Driving and Intelligent Transportation Systems

    The applications of Vision-Language Models (VLMs) in the fields of Autonomous Driving (AD) and Intelligent Transportation Systems (ITS) have attracted widespread attention due to their outstanding performance and the ability to leverage Large Language Models (LLMs). By integrating language data, vehicles and transportation systems are able to deeply understand real-world environments, improving driving safety and efficiency. In this work, we present a comprehensive survey of the advances in language models in this domain, encompassing current models and datasets. Additionally, we explore the potential applications and emerging research directions. Finally, we thoroughly discuss the challenges and research gaps. The paper aims to provide researchers with the current work and future trends of VLMs in AD and ITS.

    Temporal Mapping of Surveillance Video for Indexing and Summarization

    This work converts surveillance video into a temporal-domain image, called a temporal profile, that is scrollable and scalable so that human operators can quickly search long surveillance videos. Such a profile is sampled with linear pixel lines located at critical locations in the video frames. It carries precise time stamps for targets passing through those locations in the field of view, shows target shapes for identification, and facilitates target search in long videos. In this paper, we first study the projection and shape properties of dynamic scenes in the temporal profile so as to set sampling lines. Then, we design methods to capture target motion and preserve target shapes for target recognition in the temporal profile. The profile also provides a uniform resolution of large crowds passing through, which makes it effective for target counting and flow measurement. We also align multiple sampling lines to visualize the spatial information missed in a single-line temporal profile. Finally, we achieve real-time adaptive background removal and robust target extraction to ensure long-term surveillance. Compared to the original video or the shortened video, this temporal profile reduces the data by one dimension while keeping the majority of information for further video investigation. As an intermediate indexing image, the profile image can be transmitted via network much faster than video for online video search tasks by multiple operators. Because the temporal profile can abstract passing targets with efficient computation, an even more compact digest of the surveillance video can be created.
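    The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale frames stored as nested lists, a single vertical sampling line, and a fixed (not adaptive) background, and the names `temporal_profile` and `passing_event_starts` are hypothetical.

```python
def temporal_profile(frames, column):
    """Stack one vertical pixel line per frame into an H x T image.

    frames: list of T frames, each a list of H rows of W gray values.
    column: x position of the (hypothetical) vertical sampling line.
    """
    height = len(frames[0])
    # Row r of the profile holds pixel (r, column) from every frame in order,
    # so the horizontal axis of the profile is time.
    return [[frame[row][column] for frame in frames] for row in range(height)]


def passing_event_starts(profile, background, threshold=30):
    """Return frame indices at which a target starts crossing the line.

    background: list of H gray values for the empty sampling line.
    Simplification: a fixed background and threshold, whereas the abstract
    describes adaptive background removal.
    """
    n_rows = len(profile)
    n_frames = len(profile[0])
    occupied = [
        any(abs(profile[r][t] - background[r]) > threshold for r in range(n_rows))
        for t in range(n_frames)
    ]
    # A passing event starts where the line switches from empty to occupied.
    return [t for t in range(n_frames) if occupied[t] and (t == 0 or not occupied[t - 1])]


# Synthetic clip: 10 frames of 4 x 6 pixels, all dark.
frames = [[[0] * 6 for _ in range(4)] for _ in range(10)]
# A bright target covers rows 1-2 of column 3 during frames 4 and 5.
for t in (4, 5):
    for r in (1, 2):
        frames[t][r][3] = 200

profile = temporal_profile(frames, column=3)
starts = passing_event_starts(profile, background=[0, 0, 0, 0])
```

    Each column of `profile` corresponds to one frame, so the index of a detected start maps directly to a time stamp, and the full clip of T x H x W values is reduced to an H x T image.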

    Overview of contextual tracking approaches in information fusion

    Proceedings of: Geospatial InfoFusion III. 2-3 May 2013, Baltimore, Maryland, United States. Many information fusion solutions work well in the intended scenarios, but the applications, supporting data, and capabilities change over varying contexts. One example is weather data for electro-optical target trackers, for which standards have evolved over decades. Operating conditions such as technology changes, sensor/target variations, and the contextual environment can inhibit performance if not included in the initial systems design. In this paper, we seek to define and categorize different types of contextual information. We describe five contextual information categories that support target tracking: (1) domain knowledge from a user to aid the information fusion process through selection, cueing, and analysis, (2) environment-to-hardware processing for sensor management, (3) known distribution of entities for situation/threat assessment, (4) historical traffic behavior for situation awareness patterns of life (POL), and (5) road information for target tracking and identification. Appropriate characterization and representation of contextual information is needed for future high-level information fusion systems design to take advantage of the large data content available for a priori knowledge target tracking algorithm construction, implementation, and application.