Learning Situation Hyper-Graphs for Video Question Answering
Answering questions about complex situations in videos requires not only
capturing the presence of actors, objects, and their relations but also the
evolution of these relationships over time. A situation hyper-graph is a
representation that describes situations as scene sub-graphs for video frames
and hyper-edges for connected sub-graphs and has been proposed to capture all
such information in a compact structured form. In this work, we propose an
architecture for Video Question Answering (VQA) that enables answering
questions related to video content by predicting situation hyper-graphs, coined
Situation Hyper-Graph based Video Question Answering (SHG-VQA). To this end, we
train a situation hyper-graph decoder to implicitly identify graph
representations with actions and object/human-object relationships from the
input video clip, and use cross-attention between the predicted situation
hyper-graphs and the question embedding to predict the correct answer. The
proposed method is trained in an end-to-end manner and optimized by a VQA loss
with the cross-entropy function and a Hungarian matching loss for the situation
graph prediction. The effectiveness of the proposed architecture is extensively
evaluated on two challenging benchmarks: AGQA and STAR. Our results show that
learning the underlying situation hyper-graphs helps the system to
significantly improve its performance for novel challenges of video
question-answering tasks.
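The abstract above couples a cross-entropy VQA loss with a Hungarian matching loss for set-based situation-graph prediction. A minimal, hypothetical sketch of the Hungarian matching step (names, shapes, and the cost definition are illustrative assumptions, not the authors' implementation):

```python
# Hypothetical sketch: Hungarian matching between predicted graph elements
# and ground-truth labels, as used in set-prediction losses. The cost of
# assigning prediction i to target j is the negative probability that
# prediction i gives to target j's class.
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_probs, target_labels):
    """Return (pred_idx, target_idx) for the minimum-cost assignment.

    pred_probs:    (num_preds, num_classes) class probabilities per prediction.
    target_labels: (num_targets,) ground-truth class indices.
    """
    cost = -pred_probs[:, target_labels]          # (num_preds, num_targets)
    pred_idx, target_idx = linear_sum_assignment(cost)
    return pred_idx, target_idx

# Toy example: 3 predicted elements, 2 ground-truth elements (classes 0 and 2).
probs = np.array([[0.1, 0.8, 0.1],
                  [0.7, 0.2, 0.1],
                  [0.2, 0.1, 0.7]])
targets = np.array([0, 2])
pred_idx, target_idx = hungarian_match(probs, targets)
# Matched pairs can then be scored with cross-entropy; unmatched
# predictions are typically assigned a "no object" class.
```

In DETR-style set prediction (which this loss resembles), the matched pairs feed the classification loss while unmatched predictions are pushed toward a background class; the abstract does not state whether SHG-VQA does exactly this.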
The IPIN 2019 Indoor Localisation Competition—Description and Results
IPIN 2019 Competition, sixth in a series of IPIN competitions, was held at the CNR Research Area of Pisa (IT), integrated into the program of the IPIN 2019 Conference. It included two on-site real-time Tracks and three off-site Tracks. The four Tracks presented in this paper were set in the same environment, made of two buildings close together for a total usable area of 1000 m² outdoors and 6000 m² indoors over three floors, with a total path length exceeding 500 m. IPIN competitions, based on the EvAAL framework, have aimed at comparing the accuracy performance of personal positioning systems in fair and realistic conditions: past editions of the competition were carried out in big conference settings, university campuses and a shopping mall. Positioning accuracy is computed while the person carrying the system under test walks at normal walking speed, uses lifts, goes up and down stairs, or briefly stops at given points. Results presented here are a showcase of state-of-the-art systems tested side by side in real-world settings as part of the on-site real-time competition Tracks. Results for off-site Tracks allow a detailed and reproducible comparison of the most recent positioning and tracking algorithms in the same environment as the on-site Tracks.
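The EvAAL framework mentioned above is commonly reported as scoring systems by the third quartile (75th percentile) of point-wise positioning error along the evaluation path. A minimal sketch of that metric, with toy coordinates as an illustrative assumption (the real competition also applies floor-change penalties not modelled here):

```python
# Sketch of an EvAAL-style accuracy score: the third quartile of
# point-wise horizontal positioning error, in metres. The trajectory
# below is a toy example, not competition data.
import numpy as np

def evaal_score(estimated, ground_truth):
    """75th percentile of Euclidean point errors between two (N, 2) arrays."""
    errors = np.linalg.norm(estimated - ground_truth, axis=1)
    return float(np.percentile(errors, 75))

# Four reference points with per-point errors of 1, 2, 3 and 4 metres.
gt  = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
est = gt + np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]])
score = evaal_score(est, gt)
```

Using the third quartile rather than the mean makes the score robust to a few large outliers while still penalising systems that are frequently inaccurate.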
Deep Learning For Automated Real-Time Detection And Segmentation Of Intestinal Lesions In Colonoscopies
Master's thesis, MASTER OF SCIENCE (RSH-FOS