29 research outputs found

    Towards gestural understanding for intelligent robots

    Get PDF
    Fritsch JN. Towards gestural understanding for intelligent robots. Bielefeld: UniversitĂ€t Bielefeld; 2012.A strong driving force of scientific progress in the technical sciences is the quest for systems that assist humans in their daily life and make their life easier and more enjoyable. Nowadays smartphones are probably the most typical instances of such systems. Another class of systems that is getting increasing attention are intelligent robots. Instead of offering a smartphone touch screen to select actions, these systems are intended to offer a more natural human-machine interface to their users. Out of the large range of actions performed by humans, gestures performed with the hands play a very important role especially when humans interact with their direct surrounding like, e.g., pointing to an object or manipulating it. Consequently, a robot has to understand such gestures to offer an intuitive interface. Gestural understanding is, therefore, a key capability on the way to intelligent robots. This book deals with vision-based approaches for gestural understanding. Over the past two decades, this has been an intensive field of research which has resulted in a variety of algorithms to analyze human hand motions. Following a categorization of different gesture types and a review of other sensing techniques, the design of vision systems that achieve hand gesture understanding for intelligent robots is analyzed. For each of the individual algorithmic steps – hand detection, hand tracking, and trajectory-based gesture recognition – a separate Chapter introduces common techniques and algorithms and provides example methods. The resulting recognition algorithms are considering gestures in isolation and are often not sufficient for interacting with a robot who can only understand such gestures when incorporating the context like, e.g., what object was pointed at or manipulated. Going beyond a purely trajectory-based gesture recognition by incorporating context is an important prerequisite to achieve gesture understanding and is addressed explicitly in a separate Chapter of this book. Two types of context, user-provided context and situational context, are reviewed and existing approaches to incorporate context for gestural understanding are reviewed. Example approaches for both context types provide a deeper algorithmic insight into this field of research. An overview of recent robots capable of gesture recognition and understanding summarizes the currently realized human-robot interaction quality. The approaches for gesture understanding covered in this book are manually designed while humans learn to recognize gestures automatically during growing up. Promising research targeted at analyzing developmental learning in children in order to mimic this capability in technical systems is highlighted in the last Chapter completing this book as this research direction may be highly influential for creating future gesture understanding systems

    Context-based Visual Feedback Recognition

    Get PDF
    PhD thesisDuring face-to-face conversation, people use visual feedback (e.g.,head and eye gesture) to communicate relevant information and tosynchronize rhythm between participants. When recognizing visualfeedback, people often rely on more than their visual perception.For instance, knowledge about the current topic and from previousutterances help guide the recognition of nonverbal cues. The goal ofthis thesis is to augment computer interfaces with the ability toperceive visual feedback gestures and to enable the exploitation ofcontextual information from the current interaction state to improvevisual feedback recognition.We introduce the concept of visual feedback anticipationwhere contextual knowledge from an interactive system (e.g. lastspoken utterance from the robot or system events from the GUIinterface) is analyzed online to anticipate visual feedback from ahuman participant and improve visual feedback recognition. Ourmulti-modal framework for context-based visual feedback recognitionwas successfully tested on conversational and non-embodiedinterfaces for head and eye gesture recognition.We also introduce Frame-based Hidden-state Conditional RandomField model, a new discriminative model for visual gesturerecognition which can model the sub-structure of a gesture sequence,learn the dynamics between gesture labels, and can be directlyapplied to label unsegmented sequences. The FHCRF model outperformsprevious approaches (i.e. HMM, SVM and CRF) for visual gesturerecognition and can efficiently learn relevant contextualinformation necessary for visual feedback anticipation.A real-time visual feedback recognition library for interactiveinterfaces (called Watson) was developed to recognize head gaze,head gestures, and eye gaze using the images from a monocular orstereo camera and the context information from the interactivesystem. Watson was downloaded by more then 70 researchers around theworld and was successfully used by MERL, USC, NTT, MIT Media Lab andmany other research groups

    From motion capture to interactive virtual worlds : towards unconstrained motion-capture algorithms for real-time performance-driven character animation

    Get PDF
    This dissertation takes performance-driven character animation as a representative application and advances motion capture algorithms and animation methods to meet its high demands. Existing approaches have either coarse resolution and restricted capture volume, require expensive and complex multi-camera systems, or use intrusive suits and controllers. For motion capture, set-up time is reduced using fewer cameras, accuracy is increased despite occlusions and general environments, initialization is automated, and free roaming is enabled by egocentric cameras. For animation, increased robustness enables the use of low-cost sensors input, custom control gesture definition is guided to support novice users, and animation expressiveness is increased. The important contributions are: 1) an analytic and differentiable visibility model for pose optimization under strong occlusions, 2) a volumetric contour model for automatic actor initialization in general scenes, 3) a method to annotate and augment image-pose databases automatically, 4) the utilization of unlabeled examples for character control, and 5) the generalization and disambiguation of cyclical gestures for faithful character animation. In summary, the whole process of human motion capture, processing, and application to animation is advanced. These advances on the state of the art have the potential to improve many interactive applications, within and outside virtual reality.Diese Arbeit befasst sich mit Performance-driven Character Animation, insbesondere werden Motion Capture-Algorithmen entwickelt um den hohen Anforderungen dieser Beispielanwendung gerecht zu werden. Existierende Methoden haben entweder eine geringe Genauigkeit und einen eingeschrĂ€nkten Aufnahmebereich oder benötigen teure Multi-Kamera-Systeme, oder benutzen störende Controller und spezielle AnzĂŒge. FĂŒr Motion Capture wird die Setup-Zeit verkĂŒrzt, die Genauigkeit fĂŒr Verdeckungen und generelle Umgebungen erhöht, die Initialisierung automatisiert, und BewegungseinschrĂ€nkung verringert. FĂŒr Character Animation wird die Robustheit fĂŒr ungenaue Sensoren erhöht, Hilfe fĂŒr benutzerdefinierte Gestendefinition geboten, und die AusdrucksstĂ€rke der Animation verbessert. Die wichtigsten BeitrĂ€ge sind: 1) ein analytisches und differenzierbares Sichtbarkeitsmodell fĂŒr Rekonstruktionen unter starken Verdeckungen, 2) ein volumetrisches Konturenmodell fĂŒr automatische Körpermodellinitialisierung in genereller Umgebung, 3) eine Methode zur automatischen Annotation von Posen und Augmentation von Bildern in großen Datenbanken, 4) das Nutzen von Beispielbewegungen fĂŒr Character Animation, und 5) die Generalisierung und Übertragung von zyklischen Gesten fĂŒr genaue Charakteranimation. Es wird der gesamte Prozess erweitert, von Motion Capture bis hin zu Charakteranimation. Die Verbesserungen sind fĂŒr viele interaktive Anwendungen geeignet, innerhalb und außerhalb von virtueller RealitĂ€t

    BlickpunktabhÀngige Computergraphik

    Get PDF
    Contemporary digital displays feature multi-million pixels at ever-increasing refresh rates. Reality, on the other hand, provides us with a view of the world that is continuous in space and time. The discrepancy between viewing the physical world and its sampled depiction on digital displays gives rise to perceptual quality degradations. By measuring or estimating where we look, gaze-contingent algorithms aim at exploiting the way we visually perceive to remedy visible artifacts. This dissertation presents a variety of novel gaze-contingent algorithms and respective perceptual studies. Chapter 4 and 5 present methods to boost perceived visual quality of conventional video footage when viewed on commodity monitors or projectors. In Chapter 6 a novel head-mounted display with real-time gaze tracking is described. The device enables a large variety of applications in the context of Virtual Reality and Augmented Reality. Using the gaze-tracking VR headset, a novel gaze-contingent render method is described in Chapter 7. The gaze-aware approach greatly reduces computational efforts for shading virtual worlds. The described methods and studies show that gaze-contingent algorithms are able to improve the quality of displayed images and videos or reduce the computational effort for image generation, while display quality perceived by the user does not change.Moderne digitale Bildschirme ermöglichen immer höhere Auflösungen bei ebenfalls steigenden Bildwiederholraten. Die RealitĂ€t hingegen ist in Raum und Zeit kontinuierlich. Diese Grundverschiedenheit fĂŒhrt beim Betrachter zu perzeptuellen Unterschieden. Die Verfolgung der Aug-Blickrichtung ermöglicht blickpunktabhĂ€ngige Darstellungsmethoden, die sichtbare Artefakte verhindern können. Diese Dissertation trĂ€gt zu vier Bereichen blickpunktabhĂ€ngiger und wahrnehmungstreuer Darstellungsmethoden bei. Die Verfahren in Kapitel 4 und 5 haben zum Ziel, die wahrgenommene visuelle QualitĂ€t von Videos fĂŒr den Betrachter zu erhöhen, wobei die Videos auf gewöhnlicher Ausgabehardware wie z.B. einem Fernseher oder Projektor dargestellt werden. Kapitel 6 beschreibt die Entwicklung eines neuartigen Head-mounted Displays mit UnterstĂŒtzung zur Erfassung der Blickrichtung in Echtzeit. Die Kombination der Funktionen ermöglicht eine Reihe interessanter Anwendungen in Bezug auf Virtuelle RealitĂ€t (VR) und Erweiterte RealitĂ€t (AR). Das vierte und abschließende Verfahren in Kapitel 7 dieser Dissertation beschreibt einen neuen Algorithmus, der das entwickelte Eye-Tracking Head-mounted Display zum blickpunktabhĂ€ngigen Rendern nutzt. Die QualitĂ€t des Shadings wird hierbei auf Basis eines Wahrnehmungsmodells fĂŒr jeden Bildpixel in Echtzeit analysiert und angepasst. Das Verfahren hat das Potenzial den Berechnungsaufwand fĂŒr das Shading einer virtuellen Szene auf ein Bruchteil zu reduzieren. Die in dieser Dissertation beschriebenen Verfahren und Untersuchungen zeigen, dass blickpunktabhĂ€ngige Algorithmen die DarstellungsqualitĂ€t von Bildern und Videos wirksam verbessern können, beziehungsweise sich bei gleichbleibender BildqualitĂ€t der Berechnungsaufwand des bildgebenden Verfahrens erheblich verringern lĂ€sst
    corecore