
    Computer Vision Systems, Second International Workshop, ICVS 2001 Vancouver, Canada, July 7-8, 2001, Proceedings


    Hand Gesture Interaction with Human-Computer

    Hand gestures are an important modality for human-computer interaction. Compared to many existing interfaces, hand gestures have the advantages of being easy to use, natural, and intuitive. Successful applications of hand gesture recognition include computer game control, human-robot interaction, and sign language recognition, to name a few. Vision-based recognition systems can give computers the capability of understanding and responding to hand gestures. The paper gives an overview of the field of hand gesture interaction with computers, and describes the early stages of a project on gestural command sets, an issue that has often been neglected. Currently, we have built a first prototype for exploring the use of pie and marking menus in gesture-based interaction. The purpose is to study whether such menus, with practice, could support the development of autonomous gestural command sets. The scenario is remote control of home appliances, such as TV sets and DVD players, which in the future could be extended to the more general scenario of ubiquitous computing in everyday situations. Some early observations are reported, mainly concerning problems with user fatigue and precision of gestures. Future work is discussed, such as introducing flow menus for reducing fatigue, and control menus for continuous control functions. The computer vision algorithms will also have to be developed further.
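
    As a rough illustration of how a gestural command could be resolved against a pie menu, the sketch below maps a tracked hand position to one of N menu slices around the menu centre. The coordinate convention, the number of slices, and the dead-zone radius are assumptions made for illustration, not details taken from the prototype described in the paper.

```python
import math

def select_pie_slice(center, hand_pos, num_slices=8, dead_zone=0.05):
    """Map a hand position to a pie-menu slice index, or None inside the dead zone.

    center, hand_pos: (x, y) in normalized image coordinates (y grows downward).
    """
    dx = hand_pos[0] - center[0]
    dy = hand_pos[1] - center[1]
    if math.hypot(dx, dy) < dead_zone:
        return None  # hand still near the menu centre: no selection yet
    # Angle measured clockwise from "up", wrapped to [0, 2*pi)
    angle = math.atan2(dx, -dy) % (2 * math.pi)
    slice_width = 2 * math.pi / num_slices
    # Offset by half a slice so slice 0 is centred on "up"
    return int(((angle + slice_width / 2) % (2 * math.pi)) // slice_width)

# Example: a hand detected up and to the right of the menu centre
print(select_pie_slice((0.5, 0.5), (0.6, 0.4)))  # -> 1 (the up-right slice of 8)
```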

    Visual Attention Mechanism for a Social Robot

    This paper describes a visual perception system for a social robot. The central part of this system is an artificial attention mechanism that discriminates the most relevant information from all the visual information perceived by the robot. It is composed of three stages. At the preattentive stage, the concept of saliency is implemented based on ‘proto-objects’ [37]. From these objects, different saliency maps are generated. Then, the semiattentive stage identifies and tracks significant items according to the tasks to accomplish. This tracking process allows the implementation of ‘inhibition of return’. Finally, the attentive stage fixes the field of attention on the most relevant object depending on the behaviours to carry out. Three behaviours have been implemented and tested, which allow the robot to detect visual landmarks in an initially unknown environment, and to recognize and capture the upper-body motion of people interested in interacting with it.
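
    A minimal sketch of the three-stage structure described above is given below: a pre-attentive combination of saliency maps, a semi-attentive suppression of recently attended locations (inhibition of return), and an attentive selection of the fixation point. The map sizes, the combination rule (a simple average), and the inhibition radius are placeholder assumptions rather than details of the robot's actual implementation.

```python
import numpy as np

def preattentive(saliency_maps):
    """Combine per-feature saliency maps into a single map (simple average)."""
    return np.mean(np.stack(saliency_maps), axis=0)

def semiattentive(saliency, inhibited, radius=15):
    """Suppress recently attended locations (inhibition of return)."""
    out = saliency.copy()
    h, w = out.shape
    for (y, x) in inhibited:
        y0, y1 = max(0, y - radius), min(h, y + radius)
        x0, x1 = max(0, x - radius), min(w, x + radius)
        out[y0:y1, x0:x1] = 0.0
    return out

def attentive(saliency):
    """Fix attention on the most salient remaining location."""
    return np.unravel_index(np.argmax(saliency), saliency.shape)

# One attention cycle over two hypothetical feature maps
maps = [np.random.rand(120, 160), np.random.rand(120, 160)]
inhibited = [(60, 80)]                      # a previously attended point
focus = attentive(semiattentive(preattentive(maps), inhibited))
print(focus)                                # (row, col) of the new fixation point
```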

    Visual Concept Detection in Images and Videos

    The rapid proliferation of digital images and videos makes content-based search in multimedia databases increasingly important. A prerequisite for effective image and video search is to analyze and index media content automatically. Current approaches in the field of image and video retrieval focus on semantic concepts serving as an intermediate description to bridge the “semantic gap” between the data representation and the human interpretation. Due to the large complexity and variability in the appearance of visual concepts, the detection of arbitrary concepts represents a very challenging task. In this thesis, the following aspects of visual concept detection systems are addressed.

    First, enhanced local descriptors for mid-level feature coding are presented. Based on the observation that scale-invariant feature transform (SIFT) descriptors with different spatial extents yield large performance differences, a novel concept detection system is proposed that combines feature representations for different spatial extents using multiple kernel learning (MKL). A multi-modal video concept detection system is presented that relies on Bag-of-Words representations for visual and, in particular, for audio features. Furthermore, a method for the SIFT-based integration of color information, called color moment SIFT, is introduced. Comparative experimental results demonstrate the superior performance of the proposed systems on the Mediamill and VOC Challenges.

    Second, an approach is presented that systematically utilizes the results of object detectors. Novel object-based features are generated from object detection results using different pooling strategies. For videos, detection results are assembled into object sequences, and a shot-based confidence score as well as further features, such as position, frame coverage or movement, are computed for each object class. These features are used as additional input for the support vector machine (SVM)-based concept classifiers, so that other related concepts can also profit from object-based features. Extensive experiments on the Mediamill, VOC and TRECVid Challenges show significant improvements in retrieval performance not only for the object classes, but in particular also for a large number of indirectly related concepts. Moreover, it has been demonstrated that a few object-based features are beneficial for a large number of concept classes. On the VOC Challenge, the additional use of object-based features led to a superior performance of 63.8% mean average precision (AP) for the image classification task. Furthermore, the generalization capabilities of concept models are investigated. It is shown that different source and target domains lead to a severe loss in concept detection performance; in these cross-domain settings, object-based features achieve a significant performance improvement. Since it is inefficient to run a large number of single-class object detectors, it is additionally demonstrated how a concurrent multi-class object detection system can be constructed to speed up the detection of many object classes in images.

    Third, a novel, purely web-supervised learning approach for modeling heterogeneous concept classes in images is proposed. Tags and annotations of multimedia data on the WWW are rich sources of information that can be employed for learning visual concepts. The presented approach is aimed at continuous long-term learning of appearance models and at improving these models periodically. For this purpose, several components have been developed: a crawling component, a multi-modal clustering component for spam detection and subclass identification, a novel learning component called “random savanna”, a validation component, an updating component, and a scalability manager. Only a single word describing the visual concept is required to initiate the learning process. Experimental results demonstrate the capabilities of the individual components.

    Finally, a generic concept detection system is applied to support interdisciplinary research efforts in the fields of psychology and media science. The psychological research question addressed is whether and how playing computer games with violent content may induce aggression. Therefore, novel semantic concepts, most notably “violence”, are detected in computer game videos to gain insights into the interrelationship between violent game events and the brain activity of a player. Experimental results demonstrate the excellent performance of the proposed automatic concept detection approach for such interdisciplinary research.
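
    To make the object-based features described above more concrete, the sketch below max-pools per-class detector confidences into a small feature vector and concatenates it with a Bag-of-Words representation before training an SVM concept classifier. The class names, the max-pooling rule, and the toy data are illustrative assumptions; this is not the exact pipeline used in the thesis.

```python
import numpy as np
from sklearn.svm import SVC

def object_based_features(detections, object_classes):
    """Max-pool detector confidences per object class into one feature vector.

    detections: list of (class_name, confidence) pairs for one image or shot.
    """
    feats = np.zeros(len(object_classes))
    for name, conf in detections:
        idx = object_classes.index(name)
        feats[idx] = max(feats[idx], conf)
    return feats

object_classes = ["person", "car", "dog"]          # placeholder detector classes
# Placeholder training data: Bag-of-Words vector + pooled object features per image
bow = np.random.rand(4, 50)
obj = np.stack([object_based_features([("person", 0.9)], object_classes),
                object_based_features([("car", 0.7)], object_classes),
                object_based_features([], object_classes),
                object_based_features([("dog", 0.8), ("person", 0.4)], object_classes)])
X = np.hstack([bow, obj])                          # combined input for the concept classifier
y = np.array([1, 0, 0, 1])                         # concept present / absent
clf = SVC(kernel="rbf").fit(X, y)
print(clf.decision_function(X))                    # signed concept scores (positive = present)
```

    One pooling rule per class (here, the maximum over detections) yields only one added dimension per object class, which keeps the object-based part of the feature vector small compared to the Bag-of-Words part.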

    Vector Disparity Sensor with Vergence Control for Active Vision Systems

    This paper presents an architecture for computing vector disparity for active vision systems as used in robotics applications. The control of the vergence angle of a binocular system allows us to efficiently explore dynamic environments, but requires a generalization of the disparity computation with respect to a static camera setup, where the disparity is strictly 1-D after image rectification. The interaction between vision and motor control allows us to develop an active sensor that achieves high accuracy of the disparity computation around the fixation point and a fast reaction time for the vergence control. In this contribution, we address the development of a real-time architecture for vector disparity computation using an FPGA device. We implement the disparity unit and the control module for vergence, version, and tilt to determine the fixation point. In addition, two different on-chip alternatives for the vector disparity engine are discussed, based on the luminance (gradient-based) and phase information of the binocular images. The multiscale versions of these engines are able to estimate the vector disparity at up to 32 fps on VGA-resolution images with very good accuracy, as shown using benchmark sequences with known ground truth. The performance of the presented approaches in terms of frame rate, resource utilization, and accuracy is discussed. On the basis of these results, our study indicates that the gradient-based approach offers the best trade-off for integration with the active vision system.
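
    As a rough illustration of the gradient-based (luminance) alternative, the sketch below estimates a 2-D vector disparity per pixel with a Lucas-Kanade-style least-squares fit on local image gradients. It is a plain, single-scale CPU reference in Python; the window size and conditioning threshold are assumptions, and it does not reflect the multiscale FPGA implementation described in the paper.

```python
import numpy as np

def gradient_vector_disparity(left, right, window=7):
    """Per-pixel 2-D disparity from a least-squares fit of the linearized
    brightness-constancy equation inside a local window."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    Iy, Ix = np.gradient(left)             # spatial gradients of the left image
    It = right - left                      # inter-ocular intensity difference
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w, 2))             # (dx, dy) per pixel
    for y in range(r, h - r):
        for x in range(r, w - r):
            ix = Ix[y - r:y + r + 1, x - r:x + r + 1].ravel()
            iy = Iy[y - r:y + r + 1, x - r:x + r + 1].ravel()
            it = It[y - r:y + r + 1, x - r:x + r + 1].ravel()
            A = np.stack([ix, iy], axis=1)
            ATA = A.T @ A
            if np.linalg.det(ATA) > 1e-6:  # skip textureless, ill-conditioned windows
                disp[y, x] = -np.linalg.solve(ATA, A.T @ it)
    return disp

# Toy usage: a smooth synthetic pair where the right view is shifted by one pixel in x
yy, xx = np.mgrid[0:60, 0:80].astype(np.float64)
left_img = np.sin(xx / 6.0) + np.cos(yy / 8.0)
right_img = np.sin((xx - 1.0) / 6.0) + np.cos(yy / 8.0)
print(gradient_vector_disparity(left_img, right_img)[30, 40])  # approximately (1, 0)
```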

    ‘Implicit Creation’ – Non-Programmer Conceptual Models for Authoring in Interactive Digital Storytelling

    Interactive Digital Storytelling (IDS) constitutes a research field that emerged from several areas of art, creation and computer science. It investigates technologies and possible artefacts that allow ‘highly-interactive’ experiences of digital worlds with compelling stories. However, the situation for story creators approaching ‘highly-interactive’ storytelling is complex. There is a gap between the available technology, which requires programming and prior knowledge in Artificial Intelligence, and established models of storytelling, which are too linear to have the potential to be highly interactive. This thesis reports on research that lays the ground for bridging this gap, leading to novel creation philosophies in future work. A design research process has been pursued, which centred on the suggestion of conceptual models explaining a) process structures of interdisciplinary development, b) interactive story structures including the user of the interactive story system, and c) the positioning of human authors within semi-automated creative processes. By means of ‘implicit creation’, storytelling and the modelling of simulated worlds are reconciled. The conceptual models are informed by an exhaustive literature review of established neighbouring disciplines. These are a) creative principles in different storytelling domains, such as screenwriting, video game writing, role-playing and improvisational theatre, b) narratological studies of story grammars and structures, and c) principles of designing interactive systems, in the areas of basic HCI design and models, discourse analysis in conversational systems, as well as game and simulation design. In a case study of artefact building, the initial models have been put into practice, evaluated and extended. These artefacts are a) a conceived authoring tool (‘Scenejo’) for the creation of digital conversational stories, and b) a serious game (‘The Killer Phrase Game’) developed as an application case. The study demonstrates how, starting out from linear storytelling, iterative steps of ‘implicit creation’ can lead to more variability and interactivity in the designed interactive story. In the concrete case, the steps included the abstraction of dialogues into conditional actions and the creation of a dynamic world model of the conversation. This process and artefact can be used as a model illustrating non-programmer approaches to ‘implicit creation’ in a learning process. The research demonstrates that the field of Interactive Digital Storytelling still has to be advanced further before general creative principles can be fully established, which is a long-term endeavour dependent upon environmental factors and further technological developments. The gap is not yet closed, but it can be better explained. The research results build groundwork for the education of prospective authors. Concluding the thesis, IDS-specific creative principles have been proposed for evaluation in future work.
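
    To illustrate what the ‘abstraction of dialogues into conditional actions’ over a dynamic world model could look like in the simplest case, the sketch below represents each authored line as a precondition, an utterance, and an effect on a shared state. The state variables and dialogue lines are invented for illustration and are not taken from Scenejo or The Killer Phrase Game.

```python
# Each authored line becomes a conditional action: a precondition on the world
# state, the utterance itself, and an effect that updates the state.
world = {"tension": 0, "topic": "opening"}   # hypothetical dynamic world model

actions = [
    {"condition": lambda s: s["topic"] == "opening",
     "utterance": "Let's hear your proposal.",
     "effect": lambda s: s.update(topic="proposal")},
    {"condition": lambda s: s["topic"] == "proposal" and s["tension"] < 3,
     "utterance": "That will never work!",          # a 'killer phrase'
     "effect": lambda s: s.update(tension=s["tension"] + 1)},
]

def step(state, actions):
    """Fire the first action whose precondition holds in the current state."""
    for action in actions:
        if action["condition"](state):
            print(action["utterance"])
            action["effect"](state)
            return True
    return False

# Run the conversation until no action fires or the tension limit is reached
while step(world, actions) and world["tension"] < 3:
    pass
```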

    Languages of games and play: A systematic mapping study

    Digital games are a powerful means for creating enticing, beautiful, educational, and often highly addictive interactive experiences that impact the lives of billions of players worldwide. We explore what informs the design and construction of good games in order to learn how to speed up game development. In particular, we study to what extent languages, notations, patterns, and tools can offer experts the theoretical foundations, systematic techniques, and practical solutions they need to raise their productivity and improve the quality of games and play. Despite the growing number of publications on this topic, there is currently no overview describing the state of the art that relates research areas, goals, and applications. As a result, efforts and successes are often one-off, lessons learned go overlooked, language reuse remains minimal, and opportunities for collaboration and synergy are lost. We present a systematic map that identifies relevant publications and gives an overview of research areas and publication venues. In addition, we categorize research perspectives along common objectives, techniques, and approaches, illustrated by summaries of selected languages. Finally, we distill challenges and opportunities for future research and development.