Search CORE

6,724 research outputs found

A Deep-structured Conditional Random Field Model for Object Silhouette Tracking

Author: Azimifar Zohreh
Shafiee Mohammad
Wong Alexander
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 04/08/2015
Field of study

In this work, we introduce a deep-structured conditional random field (DS-CRF) model for the purpose of state-based object silhouette tracking. The proposed DS-CRF model consists of a series of state layers, where each state layer spatially characterizes the object silhouette at a particular point in time. The interactions between adjacent state layers are established by inter-layer connectivity dynamically determined based on inter-frame optical flow. By incorporate both spatial and temporal context in a dynamic fashion within such a deep-structured probabilistic graphical model, the proposed DS-CRF model allows us to develop a framework that can accurately and efficiently track object silhouettes that can change greatly over time, as well as under different situations such as occlusion and multiple targets within the scene. Experiment results using video surveillance datasets containing different scenarios such as occlusion and multiple targets showed that the proposed DS-CRF approach provides strong object silhouette tracking performance when compared to baseline methods such as mean-shift tracking, as well as state-of-the-art methods such as context tracking and boosted particle filtering.Comment: 17 page

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Cooperative Adaptive Control for Cloud-Based Robotics

Author: Slotine Jean-Jacques E.
Wensing Patrick M.
Publication venue
Publication date: 08/03/2018
Field of study

This paper studies collaboration through the cloud in the context of cooperative adaptive control for robot manipulators. We first consider the case of multiple robots manipulating a common object through synchronous centralized update laws to identify unknown inertial parameters. Through this development, we introduce a notion of Collective Sufficient Richness, wherein parameter convergence can be enabled through teamwork in the group. The introduction of this property and the analysis of stable adaptive controllers that benefit from it constitute the main new contributions of this work. Building on this original example, we then consider decentralized update laws, time-varying network topologies, and the influence of communication delays on this process. Perhaps surprisingly, these nonidealized networked conditions inherit the same benefits of convergence being determined through collective effects for the group. Simple simulations of a planar manipulator identifying an unknown load are provided to illustrate the central idea and benefits of Collective Sufficient Richness.Comment: ICRA 201

arXiv.org e-Print Archive

Crossref

Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots

Author: Hua Gang
Yang Jiaolong
Zhao Chunshui
Zhou Chen
Publication venue
Publication date: 14/08/2017
Field of study

Safety is paramount for mobile robotic platforms such as self-driving cars and unmanned aerial vehicles. This work is devoted to a task that is indispensable for safety yet was largely overlooked in the past -- detecting obstacles that are of very thin structures, such as wires, cables and tree branches. This is a challenging problem, as thin objects can be problematic for active sensors such as lidar and sonar and even for stereo cameras. In this work, we propose to use video sequences for thin obstacle detection. We represent obstacles with edges in the video frames, and reconstruct them in 3D using efficient edge-based visual odometry techniques. We provide both a monocular camera solution and a stereo camera solution. The former incorporates Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter enjoys a novel, purely vision-based solution. Experiments demonstrated that the proposed methods are fast and able to detect thin obstacles robustly and accurately under various conditions.Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Visio

arXiv.org e-Print Archive

Crossref

Recommended from our members

Image Understanding and Robotics Research at Columbia University

Author: Allen Peter K.
Boult Terrance E.
Ibrahim Hussein
Kender John R.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1988
Field of study

The research investigations of the Vision/Robotics Laboratory at Columbia University reflect the diversity of interests of its four faculty members, two staff programmers and 15 Ph.D. students. Several of the projects involve either a visiting computer science post-doc, other faculty members in the department or the university, or researchers at AT&T Bell Laboratories or Philips laboratories. We list below a summary of our interest and results, together with the principal researchers associated with them. Since it is difficult to separate those aspects of robotic research that are purely visual from those that are vision-like (for example, tactile sensing) or vision-related (for example, integrated vision-robotic systems), we have listed all robotic research that is not purely manipulative

Columbia University Academic Commons

ModDrop: adaptive multi-modal gesture recognition

Author: Nebout Florian
Neverova Natalia
Taylor Graham W.
Wolf Christian
Publication venue
Publication date: 06/06/2015
Field of study

We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Futhermore, the proposed ModDrop training technique ensures robustness of the classifier to missing signals in one or several channels to produce meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature by experiments on the same dataset augmented with audio.Comment: 14 pages, 7 figure

arXiv.org e-Print Archive

HAL

Hal-Diderot