Drive Video Analysis for the Detection of Traffic Near-Miss Incidents
Because of their recent introduction, self-driving cars and advanced driver
assistance system (ADAS) equipped vehicles have had little opportunity to
learn the dangerous traffic scenarios (including near-miss incidents) that
provide normal drivers with strong motivation to drive safely. Accordingly, as
a means of providing learning depth, this paper presents a novel traffic
database that contains information on a large number of traffic near-miss
incidents that were obtained by mounting driving recorders in more than 100
taxis over the course of a decade. The study makes the following two main
contributions: (i) In order to assist automated systems in detecting near-miss
incidents based on database instances, we created a large-scale traffic
near-miss incident database (NIDB) that consists of video clips of dangerous
events captured by monocular driving recorders. (ii) To illustrate the
applicability of NIDB traffic near-miss incidents, we provide two primary
database-related improvements: parameter fine-tuning using various near-miss
scenes from NIDB, and the incorporation of foreground/background separation
into the motion representation. Then, using our new database in conjunction with a monocular
driving recorder, we developed a near-miss recognition method that provides
automated systems with a performance level that is comparable to a human-level
understanding of near-miss incidents (64.5% vs. 68.4% at near-miss recognition,
61.3% vs. 78.7% at near-miss detection).

Comment: Accepted to ICRA 2018
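The foreground/background separation step can be pictured with a short sketch. The following is a minimal illustration, not the authors' pipeline: it uses OpenCV background subtraction to mask dense optical flow so that only foreground motion (other vehicles, pedestrians) feeds the motion representation. The file name and all parameters are placeholder assumptions.

```python
import cv2
import numpy as np

# Minimal sketch: keep only foreground optical flow from driving-recorder
# video as a motion representation for a downstream near-miss classifier.
cap = cv2.VideoCapture("drive_recorder_clip.mp4")  # hypothetical clip
bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=200,
                                                   detectShadows=False)

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

foreground_flows = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Dense optical flow between consecutive frames (Farneback method).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # Foreground mask: moving objects versus the mostly static road scene.
    fg_mask = bg_subtractor.apply(frame) > 0

    # Zero out background flow so the representation focuses on the
    # objects that could be involved in a near-miss.
    foreground_flows.append(flow * fg_mask[..., None])

    prev_gray = gray

cap.release()
# foreground_flows would then be stacked/encoded and fed to the
# near-miss recognition network (e.g., a fine-tuned CNN).
```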
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
We address the challenging task of anticipating human-object interaction in
first person videos. Most existing methods ignore how the camera wearer
interacts with the objects, or simply consider body motion as a separate
modality. In contrast, we observe that the international hand movement reveals
critical information about the future activity. Motivated by this, we adopt
intentional hand movement as a future representation and propose a novel deep
network that jointly models and predicts the egocentric hand motion,
interaction hotspots and future action. Specifically, we consider the future
hand motion as the motor attention, and model this attention using latent
variables in our deep model. The predicted motor attention is further used to
characterise the discriminative spatio-temporal visual features for predicting
actions and interaction hotspots. We present extensive experiments
demonstrating the benefit of the proposed joint model. Importantly, our model
produces new state-of-the-art results for action anticipation on both EGTEA
Gaze+ and the EPIC-Kitchens datasets. Our project page is available at
https://aptx4869lm.github.io/ForecastingHOI
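As a rough illustration of treating motor attention as a latent variable, here is a hedged PyTorch sketch, not the authors' implementation: a Gaussian latent is sampled with the reparameterisation trick, decoded into an attention map, and used to pool backbone features for action prediction. Module names, dimensions, and the class count are assumptions; in the paper's setting, training would additionally supervise the attention with future hand motion and regularise the latent (e.g., with a KL term).

```python
import torch
import torch.nn as nn

class MotorAttentionHead(nn.Module):
    """Sketch: latent motor attention weighting visual features."""

    def __init__(self, feat_dim=512, latent_dim=32, num_actions=106):
        super().__init__()
        # Encoder producing the parameters of the latent attention variable.
        self.to_mu = nn.Conv2d(feat_dim, latent_dim, kernel_size=1)
        self.to_logvar = nn.Conv2d(feat_dim, latent_dim, kernel_size=1)
        # Decoder mapping a latent sample to a single-channel attention map.
        self.to_attn = nn.Conv2d(latent_dim, 1, kernel_size=1)
        self.classifier = nn.Linear(feat_dim, num_actions)

    def forward(self, feats):
        # feats: (B, C, H, W) spatio-temporal features from a video backbone.
        mu, logvar = self.to_mu(feats), self.to_logvar(feats)
        # Reparameterisation trick: sample the latent motor attention.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        attn = torch.sigmoid(self.to_attn(z))       # (B, 1, H, W)
        # Attention-weighted pooling of the visual features.
        weighted = (feats * attn).mean(dim=(2, 3))  # (B, C)
        return self.classifier(weighted), attn, mu, logvar


# Usage with dummy backbone features.
feats = torch.randn(2, 512, 7, 7)
logits, attn, mu, logvar = MotorAttentionHead()(feats)
```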