Learning to Localize and Align Fine-Grained Actions to Sparse Instructions
Automatic generation of textual video descriptions that are time-aligned with
video content is a long-standing goal in computer vision. The task is
challenging due to the difficulty of bridging the semantic gap between the
visual and natural language domains. This paper addresses the task of
automatically generating an alignment between a set of instructions and a
first-person video demonstrating an activity. The sparse descriptions and ambiguity
of written instructions create significant alignment challenges. The key to our
approach is the use of egocentric cues to generate a concise set of action
proposals, which are then matched to recipe steps using object recognition and
computational linguistic techniques. We obtain promising results on both the
Extended GTEA Gaze+ dataset and the Bristol Egocentric Object Interactions
Dataset.
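The abstract does not give implementation details, but the matching step it describes, linking visually detected action proposals to recipe steps via recognized objects and linguistic cues, can be sketched roughly as below. All data structures and names here are illustrative assumptions, not the authors' pipeline.

```python
# Hedged sketch: matching temporal action proposals to recipe steps by
# object-noun overlap. Proposal/step contents are hypothetical examples.

def jaccard(a, b):
    """Jaccard similarity between two sets of object nouns."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Object labels recognized inside each temporal action proposal (assumed).
proposals = [
    {"start": 2.0, "end": 6.5, "objects": ["knife", "tomato"]},
    {"start": 8.0, "end": 15.0, "objects": ["pan", "oil", "stove"]},
]

# Nouns extracted from each written instruction step (assumed).
steps = [
    {"text": "Dice the tomato.", "nouns": ["tomato"]},
    {"text": "Heat oil in a pan.", "nouns": ["oil", "pan"]},
]

# Greedy assignment: each proposal goes to the step with the highest noun overlap.
for p in proposals:
    scores = [jaccard(p["objects"], s["nouns"]) for s in steps]
    best = max(range(len(steps)), key=lambda i: scores[i])
    print(f"{p['start']:.1f}-{p['end']:.1f}s -> step {best}: {steps[best]['text']}")
```

A real alignment would also enforce the temporal ordering of steps (e.g. via dynamic programming) rather than matching each proposal independently.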
Contextual Media Retrieval Using Natural Language Queries
The widespread integration of cameras in hand-held and head-worn devices as
well as the ability to share content online enables a large and diverse visual
capture of the world that millions of users build up collectively every day. We
envision these images as well as associated meta information, such as GPS
coordinates and timestamps, to form a collective visual memory that can be
queried while automatically taking the ever-changing context of mobile users
into account. As a first step towards this vision, in this work we present
Xplore-M-Ego: a novel media retrieval system that allows users to query a
dynamic database of images and videos using spatio-temporal natural language
queries. We evaluate our system using a new dataset of real user queries as
well as through a usability study. One key finding is that there is a
considerable amount of inter-user variability, for example in the resolution of
spatial relations in natural language utterances. We show that our retrieval
system can cope with this variability using personalisation through an online
learning-based retrieval formulation.
Comment: 8 pages, 9 figures, 1 table
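For context, once a spatio-temporal query has been parsed into a reference place, a radius, and a time window, the underlying retrieval reduces to filtering geo- and time-tagged media. A minimal sketch of that filtering step is below; the query parsing and the paper's online personalisation are omitted, and all names are assumptions.

```python
# Hedged sketch: filtering GPS/time-tagged media for a parsed query such as
# "photos taken near the cathedral last weekend". All names are assumptions.
from dataclasses import dataclass
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

@dataclass
class MediaItem:
    path: str
    lat: float
    lon: float
    timestamp: datetime

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS points, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def retrieve(items, ref_lat, ref_lon, radius_km, start, end):
    """Return items within radius_km of the reference point and inside [start, end]."""
    return [
        m for m in items
        if haversine_km(m.lat, m.lon, ref_lat, ref_lon) <= radius_km
        and start <= m.timestamp <= end
    ]
```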
The insider on the outside: a novel system for the detection of information leakers in social networks
Confidential information is all too easily leaked by naive users posting comments. In this paper we introduce DUIL, a system for Detecting Unintentional Information Leakers. The value of DUIL lies in its ability to detect those responsible for information leakage that occurs through comments posted on news articles in a public environment, when those articles have withheld material non-public information. DUIL comprises several artefacts, each designed to analyse a different aspect of this challenge: the information, the user(s) who posted the information, and the user(s) who may be involved in the dissemination of that information. We present a design science analysis of DUIL as an information system artefact composed of social, information, and technology artefacts. We demonstrate the performance of DUIL on real data crawled from several Facebook news pages spanning two years of news articles.
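The abstract stays at the design-science level, but the basic signal it relies on, comments that reveal withheld details before those details become public, can be illustrated with a toy check like the one below. The fields, matching rule, and threshold-free logic are assumptions for illustration, not DUIL's actual artefacts.

```python
# Hedged sketch: flagging comments that mention withheld terms before the
# official disclosure time. Field names and matching rule are assumptions.
from datetime import datetime

def flag_potential_leakers(comments, withheld_terms, disclosure_time):
    """Return (user, text) pairs posted before disclosure that mention a withheld term."""
    flagged = []
    for c in comments:  # each comment: {"user": str, "text": str, "posted": datetime}
        if c["posted"] < disclosure_time and any(
            term.lower() in c["text"].lower() for term in withheld_terms
        ):
            flagged.append((c["user"], c["text"]))
    return flagged
```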
EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding
Object understanding in egocentric visual data is arguably a fundamental
research topic in egocentric vision. However, existing object datasets are
either non-egocentric or have limitations in object categories, visual content,
and annotation granularities. In this work, we introduce EgoObjects, a
large-scale egocentric dataset for fine-grained object understanding. Its Pilot
version contains over 9K videos collected by 250 participants from 50+
countries using 4 wearable devices, and over 650K object annotations from 368
object categories. Unlike prior datasets containing only object category
labels, EgoObjects also annotates each object with an instance-level
identifier, and includes over 14K unique object instances. EgoObjects was
designed to capture the same object under diverse background complexities,
surrounding objects, distance, lighting and camera motion. In parallel to the
data collection, we conducted data annotation by developing a multi-stage
federated annotation process to accommodate the growing nature of the dataset.
To bootstrap research on EgoObjects, we present a suite of 4 benchmark tasks
around egocentric object understanding, including novel instance-level and
classical category-level object detection. We also introduce 2 novel continual
learning object detection tasks. The dataset and API are available at
https://github.com/facebookresearch/EgoObjects.
Comment: ICCV 2023 final version and supplement. See more details on the project
page: https://github.com/facebookresearch/EgoObjects
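To illustrate the difference between the category-level and instance-level detection targets, the sketch below groups annotations both ways, assuming a COCO-style JSON layout with an extra instance identifier field. The file name and field names are hypothetical; the actual schema and loading API are defined in the linked repository.

```python
# Hedged sketch: grouping annotations by category versus by instance identifier,
# assuming a COCO-style layout with an "instance_id" field. File and field names
# are assumptions; see the EgoObjects repository for the real schema and API.
import json
from collections import defaultdict

with open("egoobjects_annotations.json") as f:  # hypothetical file name
    data = json.load(f)

by_category = defaultdict(list)
by_instance = defaultdict(list)
for ann in data["annotations"]:
    by_category[ann["category_id"]].append(ann)   # category-level detection target
    by_instance[ann["instance_id"]].append(ann)   # instance-level detection target

print(f"{len(by_category)} categories, {len(by_instance)} unique instances")
```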