Causality inspired retrieval of human-object interactions from video
Notwithstanding recent advances in machine vision,
video activity recognition from multiple cameras remains
a challenging task as many real-world interactions cannot be
automatically recognised for many reasons, such as partial
occlusion or coverage black-spots. In this paper we propose a
new technique that infers the unseen relationship between two
individuals captured by different cameras and uses it to retrieve
relevant video clips if there is a likely interaction between
the two individuals. We introduce a human-object interaction
(HOI) model integrating the causal relationship between the
humans and the objects. For this, we first extract the key frames
and generate labels or annotations using state-of-the-art
image captioning models. Next, we use the Stanford CoreNLP parser
to extract SVO (subject, verb, object) triples and encode the
descriptions into vector form for HOI inference. To calculate
the HOI co-existence and the possible causality score, we use
transfer entropy. From our experimentation, we
found that integrating causal relations into the content indexing
process and using transfer entropy to calculate the causality
score leads to an improvement in retrieval performance.
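
The abstract does not say how the transfer entropy is estimated, so the following is only a minimal sketch of one common choice: a plug-in (frequency count) estimate with a history length of one step, TE(X -> Y) = sum over p(y_next, y_prev, x_prev) * log2[ p(y_next | y_prev, x_prev) / p(y_next | y_prev) ]. The function name `transfer_entropy` and the binary "HOI present in key frame" sequences are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Both arguments are equal-length sequences of discrete symbols, for
    example 0/1 flags marking whether a given HOI appears in each key frame
    (a hypothetical encoding chosen for this sketch).
    """
    if len(source) != len(target):
        raise ValueError("sequences must have equal length")
    n = len(target) - 1  # number of (previous, next) transitions
    if n < 1:
        return 0.0

    triples = Counter()   # counts of (y_next, y_prev, x_prev)
    pairs_yx = Counter()  # counts of (y_prev, x_prev)
    pairs_yy = Counter()  # counts of (y_next, y_prev)
    singles = Counter()   # counts of y_prev
    for t in range(n):
        y_next, y_prev, x_prev = target[t + 1], target[t], source[t]
        triples[(y_next, y_prev, x_prev)] += 1
        pairs_yx[(y_prev, x_prev)] += 1
        pairs_yy[(y_next, y_prev)] += 1
        singles[y_prev] += 1

    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        # p(y_next | y_prev, x_prev): prediction using the source's past
        p_with_source = count / pairs_yx[(y_prev, x_prev)]
        # p(y_next | y_prev): prediction using only the target's own past
        p_without_source = pairs_yy[(y_next, y_prev)] / singles[y_prev]
        te += p_joint * log2(p_with_source / p_without_source)
    return te


# Hypothetical example: camera B shows the interaction one frame after camera A.
camera_a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
camera_b = [0] + camera_a[:-1]
print(transfer_entropy(camera_a, camera_b))
```

A score near zero suggests that the source sequence adds no predictive information about the target beyond the target's own past, while a larger score is compatible with a directed influence. With short sequences the plug-in estimate is biased, so any threshold used for retrieval would need to be calibrated on real data.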