24 research outputs found

Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks

Author: Fink G. A.
Fritsch J.
McGuire P. C.
Ritter H.
Roethling F.
Sagerer G.
Steil J. J.
Wachsmuth S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

A major challenge for the realization of intelligent robots is to supply them with cognitive abilities that allow ordinary users to program them easily and intuitively. One way of such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable of establishing a common focus of attention and be able to use and integrate spoken instructions, visual perceptions, and non-verbal cues like gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS-robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and a modality fusion module to allow multi-modal task-oriented man-machine communication with respect to dextrous robot manipulation of objects. Comment: 7 pages, 8 figures

arXiv.org e-Print Archive

Crossref

Markov models for offline handwriting recognition: a survey

Author: A. Brakensiek
A. El-Yacoubi
A. Kundu
A. Vinciarelli
A. Viterbi
A.H.R. Ko
A.P. Dempster
A.W. Senior
E. Bocchieri
G.A. Fink
Gernot A. Fink
H. Bunke
H. Fujisawa
H. Xue
J. Cai
J. Coetzer
J.A. Pittman
L. Baum
L. Likforman-Sulem
L.M. Lorigo
M. Wienecke
N. Arica
O.D. Trier
P. Natarajan
P.D. Gader
R. Davis
R. Nopsuwanchai
R. Plamondon
R.M. Bozinovic
R.O. Duda
S. Günter
S. Madhvanath
S. Young
S.F. Chen
T. Steinherz
Thomas Plötz
U.V. Marti
W. Cho
X.D. Huang
Y. Li
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Implementation and Evaluation of Acoustic Distance Measures for Syllables

Author: Munier Christian
Publication venue: Bielefeld University
Publication date: 01/01/2011

Munier C. Implementation and Evaluation of Acoustic Distance Measures for Syllables. Bielefeld (Germany): Bielefeld University; 2011. In this work, several acoustic similarity measures for syllables are motivated and subsequently evaluated. The Mahalanobis distance, used as the local distance measure in a dynamic time warping approach to measuring acoustic distances, is able to discriminate syllables. It thus allows syllable classification with an accuracy that is common for the classification of small acoustic units (60 percent for a nearest-neighbor classification on a set of ten syllables using samples of a single speaker). This measure can be improved by several techniques, which however reduce its execution speed (using more mixture components when estimating covariances from a Gaussian mixture model, using full covariance matrices instead of diagonal ones). Experimental evaluation shows that a well-functioning syllable segmentation algorithm, one that yields accurate estimates of syllable boundaries, is essential for the correct computation of acoustic distances with the similarity measures developed in this work. Further approaches to similarity measures, motivated by their use in timbre classification of music pieces, do not show adequate syllable discrimination abilities.
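The approach named in this abstract (dynamic time warping with a Mahalanobis local distance, followed by nearest-neighbor classification) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, and a single inverse covariance matrix shared by all frames is assumed here, whereas the abstract describes covariances estimated from a Gaussian mixture model.

```python
import numpy as np


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance between two feature frames (assumed shared inv_cov)."""
    d = x - y
    return float(np.sqrt(d @ inv_cov @ d))


def dtw_distance(a, b, inv_cov):
    """Accumulated DTW cost between sequences a (n, d) and b (m, d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = mahalanobis(a[i - 1], b[j - 1], inv_cov)
            # Standard step pattern: insertion, deletion, or match.
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def nearest_neighbor(query, references, inv_cov):
    """Return the label of the reference sequence closest to the query.

    references is a list of (label, sequence) pairs (hypothetical layout).
    """
    return min(references, key=lambda r: dtw_distance(query, r[1], inv_cov))[0]
```

With an identity matrix as `inv_cov` the local distance reduces to the Euclidean distance; a full inverse covariance makes each frame comparison more expensive, in line with the speed trade-off the abstract mentions.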

Publications at Bielefeld University

Gamble: A Multiuser Game with an Embodied Conversational Agent

Author: Rehm Matthias
Wissner Michael
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2005

VBN

Emotion recognition from speech: An implementation in MATLAB

Author: Wusu-Ansah Maame Akua Afrakoma
Publication date: 01/01/2019

Capstone Project submitted to the Department of Engineering, Ashesi University in partial fulfillment of the requirements for the award of Bachelor of Science degree in Electrical and Electronic Engineering, April 2019. Human-computer interaction now focuses more on relating to human emotions. Recognizing human emotions from speech is an active research area, driven by the rise of robots and virtual reality. In this paper, emotion recognition from speech is implemented in MATLAB. Features are extracted from the pitch and 13 MFCCs of the audio files. Two classification methods are used and compared to determine which achieves the higher accuracy on the data set. Ashesi University

Ashesi Institutional Repository

Vision systems with the human in the loop

Author: Bauckhage Christian
Hanheide Marc
Kaster Thomas
Pfeiffer Michael
Sagerer Gerhard
Wrede Sebastian
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2005

The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge; they can adapt to their environment and will thus exhibit high robustness. This contribution presents vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval; the others are cognitive vision systems that constitute prototypes of visual active memories which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, we raise the issue of assessing cognitive systems. Experiences from psychologically evaluated human-machine interactions are reported, and the promising potential of psychologically based usability experiments is stressed.

University of Lincoln Institutional Repository

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Publications at Bielefeld University

An interactive interface for service robots

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

Crossref

From chatterbots to natural interaction: Face to face communication with Embodied Conversational Agents.

Author: André Elisabeth
Rehm Matthias
Publication date: 01/01/2005

OPUS Augsburg

VBN

Probabilistic Scene Modeling for Situated Computer Vision

Author: Swadzba Agnes
Wachsmuth Sven
Publication venue: Dagstuhl Seminar Proceedings. 08091 - Logic and Probability for Scene Interpretation
Publication date: 01/01/2008

Dagstuhl Research Online Publication Server