Learning Behavioural Context
The original publication is available at www.springerlink.co
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research. Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 page
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Detect the unexpected: a science for surveillance
Purpose – The purpose of this paper is to outline a strategy for research development focused on addressing the neglected role of visual perception in real-life tasks such as policing surveillance and command and control settings. Approach – The scale of the surveillance task in the modern control room is expanding as technology increases input capacity at an accelerating rate. The authors review recent literature highlighting the difficulties that apply to modern surveillance and give examples of how poor detection of the unexpected can be, and how surprising this deficit can be. Perceptual phenomena such as change blindness are linked to the perceptual processes undertaken by law-enforcement personnel. Findings – A scientific programme is outlined for how detection deficits can best be addressed in the context of a multidisciplinary collaborative agenda between researchers and practitioners. The development of a cognitive research field specifically examining the occurrence of perceptual “failures” provides an opportunity for policing agencies to relate laboratory findings in psychology to their own fields of day-to-day enquiry. Originality/value – The paper shows, with examples, where interdisciplinary research may best be focussed on evaluating practical solutions and on generating useable guidelines on procedure and practice. It also argues that these processes should be investigated in real and simulated context-specific studies to confirm the validity of the findings in these new applied scenarios.
Automatically detecting road sign text from natural scene video
Automatic detection of text on road signs can help drivers stay aware of the traffic situation and surrounding environment by reminding them of the signs ahead. Current systems can only detect constrained road signs or produce unsatisfactory performance when dealing with complex scenes in practical use. This paper first reviews the existing techniques used for text detection in natural scenes. A novel system which detects text on road signs from natural scene video is then proposed. Our detailed approaches and methodology give a promising solution to this problem in order to reduce the running time and improve the recognition rate. © 2006 IEEE
Vision Language Models in Autonomous Driving and Intelligent Transportation Systems
The applications of Vision-Language Models (VLMs) in the fields of Autonomous
Driving (AD) and Intelligent Transportation Systems (ITS) have attracted
widespread attention due to their outstanding performance and the ability to
leverage Large Language Models (LLMs). By integrating language data,
vehicles and transportation systems are able to deeply understand real-world
environments, improving driving safety and efficiency. In this work, we present
a comprehensive survey of the advances in language models in this domain,
encompassing current models and datasets. Additionally, we explore the
potential applications and emerging research directions. Finally, we thoroughly
discuss the challenges and research gaps. The paper aims to provide researchers
with the current work and future trends of VLMs in AD and ITS.
Temporal Mapping of Surveillance Video for Indexing and Summarization
This work converts surveillance video to a temporal-domain image called a temporal profile that is scrollable and scalable for quick searching of long surveillance video by human operators. Such a profile is sampled with linear pixel lines located at critical locations in the video frames. It has precise time stamps on the target passing events through those locations in the field of view, shows target shapes for identification, and facilitates the target search in long videos. In this paper, we first study the projection and shape properties of dynamic scenes in the temporal profile so as to set sampling lines. Then, we design methods to capture target motion and preserve target shapes for target recognition in the temporal profile. It also provides a uniform resolution of large crowds passing through, so that it is powerful in target counting and flow measuring. We also align multiple sampling lines to visualize the spatial information missed in a single-line temporal profile. Finally, we achieve real-time adaptive background removal and robust target extraction to ensure long-term surveillance. Compared to the original video or the shortened video, this temporal profile reduces the data by one dimension while keeping the majority of information for further video investigation. As an intermediate indexing image, the profile image can be transmitted via network much faster than video for online video search tasks by multiple operators. Because the temporal profile can abstract passing targets with efficient computation, an even more compact digest of the surveillance video can be created.
Overview of contextual tracking approaches in information fusion
Proceedings of: Geospatial InfoFusion III. 2-3 May 2013 Baltimore, Maryland, United States. Many information fusion solutions work well in the intended scenarios, but the applications, supporting data, and capabilities change over varying contexts. One example is weather data for electro-optical target trackers, for which standards have evolved over decades. The operating conditions (technology changes, sensor/target variations, and the contextual environment) can inhibit performance if not included in the initial systems design. In this paper, we seek to define and categorize different types of contextual information. We describe five contextual information categories that support target tracking: (1) domain knowledge from a user to aid the information fusion process through selection, cueing, and analysis, (2) environment-to-hardware processing for sensor management, (3) known distribution of entities for situation/threat assessment, (4) historical traffic behavior for situation awareness patterns of life (POL), and (5) road information for target tracking and identification. Appropriate characterization and representation of contextual information is needed for future high-level information fusion systems design to take advantage of the large data content available for a priori knowledge target tracking algorithm construction, implementation, and application. Publicad