The Global Patch Collider

Author: Christoph Rhemann
Pushmeet Kohli
Sean Ryan Fanello
Shahram Izadi
Shenlong Wang
Publication date: 30/04/2020

This paper proposes a novel, extremely efficient, fully parallelizable, task-specific algorithm for the computation of global point-wise correspondences in images and videos. Our algorithm, the Global Patch Collider, is based on detecting unique collisions between image points using a collection of learned tree structures that act as conditional hash functions. In contrast to conventional approaches that rely on pairwise distance computation, our algorithm isolates distinctive pixel pairs that hit the same leaf during traversal through multiple learned tree structures. The split functions stored at the intermediate nodes of the trees are trained to ensure that only visually similar patches, or their geometrically or photometrically transformed versions, fall into the same leaf node. The matching process involves passing all pixel positions in the images under analysis through the tree structures. We then compute matches by isolating points that uniquely collide with each other, i.e., fall into the same otherwise empty leaf in multiple trees. Our algorithm is linear in the number of pixels but can be made constant time on a parallel computation architecture, as the tree traversal for individual image points is decoupled. We demonstrate the efficacy of our method by using it to perform optical flow matching and stereo matching on some challenging benchmarks. Experimental results show that our method is not only extremely computationally efficient but also able to match or outperform state-of-the-art methods that are much more complex.

CiteSeerX
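
The unique-collision step described in the Global Patch Collider abstract can be illustrated with a minimal sketch. It assumes the learned trees have already been traversed, so each point is represented only by the leaf id it reached in each tree; the function name `unique_collisions`, the `min_trees` threshold, and the input layout are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict


def unique_collisions(leaves_a, leaves_b, min_trees=2):
    """Return (i, j) pairs that uniquely share a leaf in at least min_trees trees.

    leaves_a[t][i] is the leaf id reached by point i of image A in tree t;
    leaves_b[t][j] is the same for point j of image B. (Assumed layout.)
    """
    votes = Counter()
    for tree_a, tree_b in zip(leaves_a, leaves_b):
        # Group the points of both images by the leaf they reached in this tree.
        buckets = defaultdict(lambda: ([], []))
        for i, leaf in enumerate(tree_a):
            buckets[leaf][0].append(i)
        for j, leaf in enumerate(tree_b):
            buckets[leaf][1].append(j)
        # A collision is unique when the leaf holds exactly one point per image.
        for pts_a, pts_b in buckets.values():
            if len(pts_a) == 1 and len(pts_b) == 1:
                votes[(pts_a[0], pts_b[0])] += 1
    return sorted(pair for pair, n in votes.items() if n >= min_trees)


# Two trees, three points per image.
leaves_a = [[0, 1, 2], [5, 6, 7]]
leaves_b = [[0, 1, 1], [5, 9, 6]]
matches = unique_collisions(leaves_a, leaves_b, min_trees=2)
```

Each tree is processed with one pass over the points, so the cost grows linearly with the number of points, and the per-point leaf lookups are independent of each other, which is the property the abstract relies on for parallel execution.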

In-air gestures around unmodified mobile devices

Author: Jie Song
Gábor Sörös
Fabrizio Pece
Sean Ryan Fanello
Shahram Izadi
Cem Keskin
Otmar Hilliges
Publication venue: ACM Press
Publication date: 14/12/2001

International audience (Commission Notice on agreements of minor importance which do not appreciably restrict competition under Article 81(1) of the Treaty establishing the European Community (de minimis), OJ C 368, 22 Dec. 2001; Decision of 25 July 2001, Deutsche Post - interception of cross-border mail, OJ L 331, 15 Dec. 2001)

Crossref

Fusion4D: Real-time Performance Capture of Challenging Scenes

Author: Davidson Philip
Degtyarev Yury
Dou Mingsong
Izadi Shahram
Khamis Sameh
Kim David
Kohli Pushmeet
Kowdle Adarsh
Orts-Escolano Sergio
Rhemann Christoph
Ryan Fanello Sean
Tankovich Vladimir
Taylor Jonathan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real time. Our algorithm supports both incremental reconstruction, which improves the surface estimate over time, and parameterization of the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods that require orders of magnitude more processing time and many more RGBD cameras.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Touché: Data-Driven Interactive Sword Fighting in Virtual Reality

Author: Anand
Asadi-Aghbolaghi Maryam
Bratman Michael
Broersen Jan
Büttner Michael
Catmull Edwin
Ciccone Loïc
Clavet Simon
Dehesa Javier
DeVault David
Diederik
Erol Kutluhan
Escalera Sergio
Fanello Sean Ryan
Keskin Cem
Kotseruba Iuliia
Kovar Lucas
Lee Yongjoon
Lu Liang
Michael
Mizuguchi Mark
Nair Vinod
Neverova Natalia
Pepe Felipe
Sagar Mark
Taubert Nick
Taylor Graham W.
Treuille Adrien
Tsironi Eleni
Wang Sy Bor
Witkin Andrew
Wong Sebastien C.
Xu Deyou
Yu Fisher
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/04/2020

OPUS

Crossref

One-shot learning for real-time action recognition.

Author: FANELLO SEAN RYAN
GORI ILARIA
METTA GIORGIO
ODONE FRANCESCA
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The goal of the paper is to develop a one-shot real-time learning and recognition system for 3D actions. We use RGBD images, combine motion and appearance cues, and map them into a new overcomplete space. The proposed method relies on descriptors based on 3D Histogram of Flow (3DHOF) and on Global Histogram of Oriented Gradient (GHOG); adaptive sparse coding (SC) is further applied to capture high-level patterns. We add effective on-line video segmentation and finally the recognition of actions through linear SVMs. The main contribution of the paper is a real-time system for one-shot action modeling; moreover, we highlight the effectiveness of sparse coding techniques to represent 3D actions. We obtain very good results on the ChaLearn Gesture Dataset and with a Kinect sensor.

Archivio istituzionale della ricerca - Università di Genova

Weakly supervised strategies for natural object recognition in robotics

Author: CILIBERTO CARLO
FANELLO SEAN RYAN
METTA GIORGIO
NATALE LORENZO
Publication venue: 2013 IEEE International Conference on Robotics and Automation
Publication date: 01/01/2013

Archivio istituzionale della ricerca - Università di Genova

Keep It Simple And Sparse: Real-Time Action Recognition

Author: Fanello SEAN RYAN
Gori Ilaria
Metta Giorgio
Odone Francesca
Publication date: 01/01/2013

Sparsity has been shown to be one of the most important properties for visual recognition purposes. In this paper we show that sparse representation plays a fundamental role in achieving one-shot learning and real-time recognition of actions. We start from RGBD images, combine motion and appearance cues, and extract state-of-the-art features in a computationally efficient way. The proposed method relies on descriptors based on 3D Histograms of Scene Flow (3DHOFs) and Global Histograms of Oriented Gradient (GHOGs); adaptive sparse coding is applied to capture high-level patterns from data. We then propose simultaneous on-line video segmentation and recognition of actions using linear SVMs. The main contribution of the paper is an effective real-time system for one-shot action modeling and recognition; the paper highlights the effectiveness of sparse coding techniques to represent 3D actions. We obtain very good results on three different datasets: a benchmark dataset for one-shot action learning (the ChaLearn Gesture Dataset), an in-house dataset acquired by a Kinect sensor including complex actions and gestures differing by small details, and a dataset created for human-robot interaction purposes. Finally, we demonstrate that our system is also effective in a human-robot interaction setting and propose a memory game, “All Gestures You Can”, to be played against a humanoid robot.

Archivio istituzionale della ricerca - Università di Genova

Visual recognition for humanoid robots

Author: Ciliberto Carlo
Fanello Sean Ryan
Metta Giorgio
Noceti Nicoletta
Odone Francesca
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Visual perception is a fundamental component for most robotics systems operating in human environments. Specifically, visual recognition is a prerequisite to a large variety of tasks such as tracking, manipulation, and human–robot interaction. As a consequence, the lack of successful recognition often becomes a bottleneck for the application of robotics systems to real-world situations. In this paper we aim at improving the robot visual perception capabilities in a natural, human-like fashion, with very few constraints on the acquisition scenario. In particular, our goal is to build and analyze a learning system that can rapidly be re-trained in order to incorporate new evidence if available. To this end, we review the state-of-the-art coding–pooling pipelines for visual recognition and propose two modifications which allow us to improve the quality of the representation while maintaining real-time performance: a coding scheme, Best Code Entries (BCE), and a new pooling operator, Mid-Level Classification Weights (MLCW). The former focuses entirely on sparsity and improves the stability and computational efficiency of the coding phase; the latter increases the discriminability of the visual representation, and therefore the overall recognition accuracy of the system, by exploiting data supervision. The proposed pipeline is assessed from a qualitative perspective on a Human–Robot Interaction (HRI) application on the iCub platform. Quantitative evaluation of the proposed system is performed both on in-house robotics data-sets (iCubWorld) and on established computer vision benchmarks (Caltech-256, PASCAL VOC 2007). As a byproduct of this work, we provide for the robotics community an implementation of the proposed visual recognition pipeline which can be used as a perceptual layer for more complex robotics applications.

Archivio istituzionale della ricerca - Università di Genova

Ask the Image: Supervised Pooling to Preserve Feature Locality

Author: Ciliberto C.
Fanello SEAN RYAN
Metta Giorgio
Noceti Nicoletta
Odone Francesca
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

In this paper we propose a weighted supervised pooling method for visual recognition systems. We combine a standard Spatial Pyramid Representation, which is commonly adopted to encode spatial information, with a Feature Space Representation that favors semantic information in an appropriate feature space. For the latter, we propose a weighted pooling strategy exploiting data supervision to weigh each local descriptor coherently with its likelihood to belong to a given object class. The two representations are then combined adaptively with Multiple Kernel Learning. Experiments on common benchmarks (Caltech-256 and PASCAL VOC-2007) show that our image representation improves the current visual recognition pipeline and is competitive with similar state-of-the-art pooling methods. We also evaluate our method on a real Human-Robot Interaction setting, where the pure Spatial Pyramid Representation does not provide sufficient discriminative power, obtaining a remarkable improvement.

Crossref

Archivio istituzionale della ricerca - Università di Genova
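
The weighted pooling idea in the abstract of "Ask the Image" (each local descriptor contributes in proportion to its likelihood of belonging to an object class) can be sketched as a weighted average of descriptor codes. This is only an illustration under stated assumptions: the function name `weighted_pool`, the normalization by the sum of weights, and the way the per-descriptor likelihoods are obtained are all hypothetical, and the pooling operator in the paper may differ.

```python
def weighted_pool(codes, weights):
    """Pool local descriptor codes into one vector using per-descriptor weights.

    codes:   list of equal-length feature vectors, one per local descriptor.
    weights: one non-negative weight per descriptor, for example an assumed
             class likelihood produced by some supervised scorer.
    """
    if len(codes) != len(weights):
        raise ValueError("codes and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    dim = len(codes[0])
    # Weighted average per dimension: descriptors with higher likelihood dominate.
    return [
        sum(w * code[d] for code, w in zip(codes, weights)) / total
        for d in range(dim)
    ]


codes = [[1.0, 0.0], [0.0, 1.0]]
weights = [3.0, 1.0]  # the first descriptor is judged three times as likely
pooled = weighted_pool(codes, weights)
```

With uniform weights this reduces to plain average pooling, which makes the effect of the supervision easy to compare against an unweighted baseline.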