35 research outputs found

Interactive Object Learning and Recognition with Multiclass Support Vector Machines

Author: Ales Ude
Publication venue: IntechOpen
Publication date: 01/03/2010

IntechOpen

Crossref

Active Vision for Scene Understanding

Author: Grotz Markus
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2021

Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot's view in order to explore interaction possibilities of the scene.

KITopen

Directory of Open Access Books (DOAB)

Active Vision for Scene Understanding

Author: Grotz Markus

Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot's view in order to explore interaction possibilities of the scene.

OAPEN Library

Gaze control for visually guided manipulation

Author: Nunez-Varela Jose Ignacio
Publication date: 01/12/2013

Human studies have shown that gaze shifts are mostly driven by the task. One explanation is that fixations gather information about task-relevant properties, where task relevance is signalled by reward. This thesis primarily pursues an engineering science goal: to determine what mechanisms a rational decision maker could employ to select a gaze location optimally, or near optimally, given limited information and limited computation time. To do so, we formulate and characterise three computational models of gaze shifting (implemented on a simulated humanoid robot), which use lookahead to imagine the informational effects of possible gaze fixations. Our first model selects the gaze that most reduces uncertainty in the scene (Unc), the second maximises expected rewards by reducing uncertainty (Rew+Unc), and the third maximises the expected gain in cumulative reward by reducing uncertainty (Rew+Unc+Gain). We also present an account that integrates a visual search process into the Rew+Unc+Gain gaze scheme. Our secondary goal concerns the way in which humans might select the next gaze location. We compare the hand-eye coordination timings of our models to previously published human data, and we provide evidence that only the models that incorporate both uncertainty and reward (Rew+Unc and Rew+Unc+Gain) match human data.

University of Birmingham Research Archive, E-theses Repository

Suivi Multi-Locuteurs avec des Informations Audio-Visuelles pour la Perception des Robots

Author: Ban Yutong
Publication venue: HAL CCSD
Publication date: 10/05/2019

Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information about its surroundings and enables the robot to give feedback. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, and what they are talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. We use different modalities of the robot's perception system to achieve this goal. Like seeing and hearing for a human being, audio and visual information are the critical cues for a robot in a conversational scenario. Advances in computer vision and audio processing over the last decade have revolutionized robot perception abilities. This thesis makes the following contributions. We first develop a variational Bayesian framework for tracking multiple objects. The framework gives closed-form, tractable solutions, which makes the tracking process efficient. It is first applied to visual multiple-person tracking. Birth and death processes are built jointly with the framework to deal with the varying number of people in the scene. Furthermore, we exploit the complementarity of vision and robot motor information. On the one hand, the robot's active motion can be integrated into the visual tracking system to stabilize the tracking. On the other hand, visual information can be used to perform motor servoing. Audio and visual information are then combined in the variational framework to estimate the smooth trajectories of speaking people and to infer the acoustic status of a person: speaking or silent. In addition, we apply the model to acoustic-only speaker localization and tracking. Online dereverberation techniques are applied first, and their output is passed to the tracking system. Finally, a variant of the acoustic speaker tracking model based on the von Mises distribution is proposed, which is specifically adapted to directional data. All the proposed methods are validated on datasets specific to each application.

Mosaic-based navigation for autonomous underwater vehicles

Author: A. Bernardino, J. Santos-Victor, N.R. Gracias, S. van der Zwaan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Learned manipulation on robot arms with parallel-jaw grippers

Author: Hannuksela Kuisma
Publication date: 10/05/2023

Trepo - Institutional Repository of Tampere University

Tracking Moving Objects by a Mobile Camera

Author: Lee Sang Wook, Wohn Kwangyoen
Publication venue: ScholarlyCommons
Publication date: 01/01/1987

A system for video tracking of a moving object by a robot-held camera is presented, and efficient tracking methods are proposed. We describe our initial implementation of a system that is capable of tracking a single moving object against a highly textured background. A pyramid-based image processor, PVM-1, is employed to support fast algorithms for locating the moving object from the difference image. Object tracking is accomplished in the static look-and-move mode by the translational motion of a CCD camera mounted on the robot arm. The implementation of tracking filters and the effective use of multi-resolution processing for object searching are discussed. Finally, a method for dynamic look-and-move tracking is proposed as a future improvement to tracking performance.

ScholarlyCommons@Penn

Design of A Saccadic Active Vision System

Author: Wong Winnie Sze-Wing
Publication venue: University of Waterloo
Publication date: 01/01/2006

Human vision is remarkable. Because high-acuity photoreceptors are concentrated mainly in the eye's central fovea region, we view the world efficiently by redirecting the fovea between points of interest using eye movements called saccades. Part I describes a saccadic vision system prototype design. The dual-resolution saccadic camera detects objects of interest in a scene by processing low-resolution image information; it then revisits salient regions in high resolution. The end product is a dual-resolution image in which background information is displayed in low resolution and salient areas are captured in high acuity. This yields a resource-efficient active vision system. Part II describes CMOS image sensor designs for active vision. Specifically, this discussion focuses on methods to determine regions of interest and achieve high dynamic range on the sensor.

University of Waterloo's Institutional Repository