
    Modelling and tracking objects with a topology preserving self-organising neural network

    Human gestures form an integral part of our everyday communication. We use gestures not only to reinforce meaning, but also to describe the shape of objects, to play games, and to communicate in noisy environments. Vision systems that exploit gestures are often limited by inaccuracies inherent in handcrafted models. These models are generated from a collection of training examples, which requires segmentation and alignment. Segmentation in gesture recognition typically involves manual intervention, a time-consuming process that is feasible only for a limited set of gestures. Ideally, gesture models should be acquired automatically via a learning scheme that enables the acquisition of detailed behavioural knowledge from topological and temporal observation alone. The research described in this thesis is motivated by a desire to provide a framework for the unsupervised acquisition and tracking of gesture models. In any learning framework, the initialisation of the shapes is crucial; hence, it would be beneficial to have a robust model, not prone to noise, that can automatically establish correspondences across the set of shapes. In the first part of this thesis, we develop a framework for building statistical 2D shape models by extracting, labelling and corresponding landmark points using only topological relations derived from competitive Hebbian learning. The method is based on the assumption that correspondence can be treated as an unsupervised classification problem in which landmark points are the cluster centres (nodes) in a high-dimensional vector space. The approach is novel in that the network can be used in cases where the topological structure of the input pattern is not known a priori, so no topology of fixed dimensionality is imposed on the network.
In the second part, we propose an approach that minimises user intervention in the adaptation process, which otherwise requires the number of nodes needed to represent an object to be specified a priori, by using an automatic criterion for maximum node growth. Furthermore, this model is used to represent motion in image sequences by initialising a suitable segmentation that separates the object of interest from the background. The segmentation system assumes some tolerance to illumination changes, input images from ordinary cameras and webcams, low-to-medium background clutter (extremely cluttered backgrounds are avoided), and objects at close range from the camera. In the final part, we extend the framework to the automatic modelling and unsupervised tracking of 2D hand gestures in a sequence of k frames. The aim is to use the tracked frames as training examples in order to build the model and maintain correspondences. To do so, we add an active step to the Growing Neural Gas (GNG) network, which we call Active Growing Neural Gas (A-GNG), that takes into consideration not only the geometrical position of the nodes, but also the underlying local feature structure of the image and the distance vector between successive images. The quality of our model is measured by calculating the topographic product, a topology-preserving measure that quantifies neighbourhood preservation. In our system we have applied specific restrictions on the velocity and appearance of the gestures to simplify the motion analysis in the gesture representation. The proposed framework has been validated on applications related to sign language. The work has great potential in Virtual Reality (VR) applications, where the learning and representation of gestures becomes natural without the need for expensive wearable cable sensors.
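The growth mechanism underlying this work can be illustrated with a minimal sketch of the standard Growing Neural Gas algorithm (Fritzke, 1995), on which A-GNG builds. This is not the thesis's A-GNG (it omits the image-feature and inter-frame terms, and the removal of isolated nodes); the parameter values are illustrative defaults, not the ones used in the thesis.

```python
import numpy as np

def gng(data, max_nodes=20, eps_b=0.2, eps_n=0.006, a_max=50,
        growth_interval=100, alpha=0.5, d=0.995, n_iter=5000, seed=0):
    """Minimal Growing Neural Gas sketch: competitive Hebbian edges
    plus periodic node insertion at the highest-error region."""
    rng = np.random.default_rng(seed)
    nodes = [data[rng.integers(len(data))].copy() for _ in range(2)]
    error = [0.0, 0.0]
    edges = {}  # (i, j) with i < j -> edge age
    for t in range(1, n_iter + 1):
        x = data[rng.integers(len(data))]
        dists = [np.sum((w - x) ** 2) for w in nodes]
        s1, s2 = np.argsort(dists)[:2]        # winner and runner-up
        error[s1] += dists[s1]
        nodes[s1] += eps_b * (x - nodes[s1])  # move winner toward input
        for (i, j) in list(edges):            # age winner's edges,
            if s1 in (i, j):                  # drag its neighbours along
                edges[(i, j)] += 1
                other = j if i == s1 else i
                nodes[other] += eps_n * (x - nodes[other])
        edges[tuple(sorted((s1, s2)))] = 0    # competitive Hebbian link
        edges = {e: a for e, a in edges.items() if a <= a_max}
        if t % growth_interval == 0 and len(nodes) < max_nodes:
            q = int(np.argmax(error))         # highest accumulated error
            nbrs = [j if i == q else i for (i, j) in edges if q in (i, j)]
            if nbrs:
                f = max(nbrs, key=lambda n: error[n])
                r = len(nodes)
                nodes.append(0.5 * (nodes[q] + nodes[f]))  # split edge q-f
                edges.pop(tuple(sorted((q, f))), None)
                edges[tuple(sorted((q, r)))] = 0
                edges[tuple(sorted((f, r)))] = 0
                error[q] *= alpha
                error[f] *= alpha
                error.append(error[q])
        error = [e * d for e in error]        # global error decay
    return np.array(nodes), list(edges)
```

Because nodes and edges are created purely from the input distribution, no topology of fixed dimensionality is imposed in advance, which is the property the thesis exploits for shape modelling.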

    Evaluation of different chrominance models in the detection and reconstruction of faces and hands using the growing neural gas network

    Physical traits such as the shape of the hand and face can be used for human recognition and identification in video surveillance systems and in biometric authentication smart-card systems, as well as in personal health care. However, the accuracy of such systems suffers from illumination changes, unpredictability, and variability in appearance (e.g. occluded faces or hands, cluttered backgrounds, etc.). This work evaluates different statistical and chrominance models in different environments with increasingly cluttered backgrounds, where changes in lighting are common and no occlusions are applied, in order to obtain a reliable neural network reconstruction of faces and hands without taking into account the structural and temporal kinematics of the hands. First, a statistical model is used for skin-colour segmentation to roughly locate hands and faces. Then a neural network is used to reconstruct the hands and faces in 3D. For the filtering and the reconstruction we have used the growing neural gas algorithm, which can preserve the topology of an object without restarting the learning process. Experiments were conducted on our own database, on four benchmark databases (Stirling’s, Alicante, Essex, and Stegmann’s), and on normal 2D videos of deaf individuals that are freely available in the BSL SignBank dataset. The results demonstrate the validity of our system in solving problems of face and hand segmentation and reconstruction under different environmental conditions.
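Chrominance-based skin segmentation of the kind evaluated here is often illustrated with fixed thresholds in the YCbCr colour space, where skin tones cluster compactly regardless of brightness. The sketch below uses the ITU-R BT.601 conversion and commonly quoted Cb/Cr ranges; the thresholds and colour space are illustrative assumptions, not the specific models compared in the paper.

```python
import numpy as np

def skin_mask_ycbcr(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """Rough skin-colour segmentation in the YCbCr chrominance plane.

    rgb: array of shape (..., 3) with 8-bit RGB values.
    Returns a boolean mask marking candidate skin pixels.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # BT.601 chrominance components (luma Y is deliberately ignored,
    # which gives some tolerance to illumination changes)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return ((cb_range[0] <= cb) & (cb <= cb_range[1]) &
            (cr_range[0] <= cr) & (cr <= cr_range[1]))
```

Discarding luma and thresholding only Cb/Cr is what makes such models comparatively robust to lighting, at the cost of false positives on skin-coloured backgrounds, which is why the paper pairs segmentation with a neural reconstruction stage.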

    Fast 2D/3D object representation with growing neural gas

    This work presents the design of a real-time system to model visual objects with the use of self-organising networks. The architecture of the system addresses multiple computer vision tasks such as image segmentation, optimal parameter estimation and object representation. We first develop a framework for building non-rigid shapes using the growth mechanism of the self-organising maps, and then we determine an optimal number of nodes, without overfitting or underfitting the network, based on information-theoretic considerations. We present experimental results for hands and faces, and we quantitatively evaluate the matching capabilities of the proposed method with the topographic product. The proposed method is easily extensible to 3D objects, as it offers similar features for efficient mesh reconstruction.
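The topographic product used here to evaluate matching quality (Bauer and Pawelzik, 1992) compares neighbourhood orderings in input space and map space; values near zero indicate that the map's dimensionality matches the data. The following is a simplified sketch of that measure, not the authors' implementation.

```python
import numpy as np

def topographic_product(weights, grid):
    """Topographic product of a map with node positions `weights`
    (input space) and lattice coordinates `grid` (map space).
    Values near 0 indicate good topology preservation."""
    n = len(weights)
    dv = np.linalg.norm(weights[:, None] - weights[None, :], axis=-1)
    da = np.linalg.norm(grid[:, None] - grid[None, :], axis=-1)
    total = 0.0
    for j in range(n):
        nn_v = np.argsort(dv[j]); nn_v = nn_v[nn_v != j]  # k-NN, input space
        nn_a = np.argsort(da[j]); nn_a = nn_a[nn_a != j]  # k-NN, map space
        # ratio of distances to the k-th map-space vs input-space neighbour,
        # measured in both spaces
        q = (dv[j, nn_a] / dv[j, nn_v]) * (da[j, nn_a] / da[j, nn_v])
        p3 = np.cumprod(q) ** (1.0 / (2 * np.arange(1, n)))
        total += np.sum(np.log(p3))
    return total / (n * (n - 1))
```

When the two neighbour orderings agree everywhere, every ratio is 1 and the product is exactly 0; systematic disagreement (a folded or dimensionally mismatched map) pushes it away from 0.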

    A Posture Sequence Learning System for an Anthropomorphic Robotic Hand

    The paper presents a cognitive architecture for posture learning of an anthropomorphic robotic hand. Our approach aims to allow the robotic system to perform complex perceptual operations, to interact with a human user, and to integrate its perceptions into a cognitive representation of the scene and the observed actions. The anthropomorphic robotic hand imitates the gestures acquired by the vision system in order to learn meaningful movements, to build its knowledge through different conceptual spaces, and to perform complex interaction with the human operator.

    An inertial motion capture framework for constructing body sensor networks

    Motion capture is the process of measuring and subsequently reconstructing the movement of an animated object or being in virtual space. Virtual reconstructions of human motion play an important role in numerous application areas such as animation, medical science, ergonomics, etc. While optical motion capture systems are the industry standard, inertial body sensor networks are becoming viable alternatives due to portability, practicality and cost. This thesis presents an innovative inertial motion capture framework for constructing body sensor networks through software environments, smartphones and web technologies. The first component of the framework is a unique inertial motion capture software environment aimed at providing an improved experimentation environment, accompanied by programming scaffolding and a driver development kit, for users interested in studying or engineering body sensor networks. The software environment provides a bespoke 3D engine for kinematic motion visualisations and a set of tools for hardware integration. The software environment is used to develop the hardware behind a prototype motion capture suit focused on low-power consumption and hardware-centricity. Additional inertial measurement units, which are available commercially, are also integrated to demonstrate the functionality of the software environment while providing the framework with additional sources of motion data. The smartphone is the most ubiquitous computing technology, and its worldwide uptake has prompted many advances in wearable inertial sensing technologies. Smartphones contain gyroscopes, accelerometers and magnetometers, a combination of sensors that is commonly found in inertial measurement units. This thesis presents a mobile application that investigates whether the smartphone is capable of inertial motion capture by constructing a novel omnidirectional body sensor network.
This thesis proposes a novel use for web technologies through the development of the Motion Cloud, a repository and gateway for inertial data. Web technologies have the potential to replace motion capture file formats with online repositories and to set a new standard for how motion data is stored. From a single inertial measurement unit to a more complex body sensor network, the proposed architecture is extendable and facilitates the integration of any inertial hardware configuration. The Motion Cloud’s data can be accessed through an application programming interface or through a web portal that provides users with the functionality for visualising and exporting the motion data.
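The gyroscope–accelerometer combination found in smartphones and inertial measurement units is typically fused to estimate orientation. As a hedged illustration of that principle (a textbook complementary filter for a single pitch axis, not the thesis's own sensor-fusion pipeline), gyroscope integration provides smooth but drifting angles, while the accelerometer's gravity reading provides a noisy but drift-free reference:

```python
import math

def complementary_filter(gyro_rates, accels, dt=0.01, alpha=0.98):
    """Estimate pitch (rad) by fusing gyroscope and accelerometer samples.

    gyro_rates: angular rate about the pitch axis (rad/s) per sample
    accels: (ax, az) accelerometer components (m/s^2) per sample
    alpha: trust placed in the integrated gyroscope estimate
    """
    pitch = 0.0
    for omega, (ax, az) in zip(gyro_rates, accels):
        gyro_pitch = pitch + omega * dt   # smooth, but drifts over time
        accel_pitch = math.atan2(ax, az)  # gravity direction: noisy, drift-free
        pitch = alpha * gyro_pitch + (1 - alpha) * accel_pitch
    return pitch
```

A body sensor network runs a filter of this kind (usually quaternion-based, often with magnetometer correction) per sensor node, and the resulting orientations drive the kinematic visualisation or are uploaded to a repository such as the Motion Cloud.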

    An Overview of Self-Adaptive Technologies Within Virtual Reality Training

    This overview presents the current state-of-the-art of self-adaptive technologies within virtual reality (VR) training. Virtual reality training and assessment is increasingly used in five key areas: medical, industrial & commercial training, serious games, rehabilitation and remote training such as Massive Open Online Courses (MOOCs). Adaptation can be applied to five core technologies of VR: haptic devices, stereo graphics, adaptive content, assessment and autonomous agents. Automation of VR training can contribute to automation of actual procedures, including remote and robotic-assisted surgery, which reduces injury and improves the accuracy of the procedure. Automated haptic interaction can enable tele-presence and virtual artefact tactile interaction from either remote or simulated environments. Automation, machine learning and data-driven features play an important role in providing trainee-specific adaptive training content. Data from trainee assessment can form an input to autonomous systems for customised training and automated difficulty levels that match individual requirements. Self-adaptive technology has previously been developed within individual technologies of VR training. One conclusion of this research is that no enhanced portable framework yet exists; it would be beneficial to combine automation of the core technologies into a reusable automation framework for VR training.
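The assessment-driven adaptation loop described above can be sketched as a simple performance-driven difficulty controller. This is a hypothetical illustration of the general idea (the function, target success rate and step size are assumptions for the example, not taken from any system in the overview):

```python
def adapt_difficulty(level, score, target=0.75, step=0.1, lo=0.0, hi=1.0):
    """One update of a data-driven difficulty controller.

    level: current difficulty in [lo, hi]
    score: trainee's assessment score in [0, 1] for the last session
    Raises difficulty when the score exceeds the target success rate,
    lowers it otherwise, proportionally to the gap.
    """
    if score > target:
        level += step * (score - target) / (1 - target)
    else:
        level -= step * (target - score) / target
    return min(hi, max(lo, level))
```

Feeding each session's assessment score back through a rule of this kind keeps the trainee near the target success rate, which is the essence of the trainee-specific adaptive content the overview discusses.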

    A review of theories and methods in the science of face-to-face social interaction

    For most of human history, face-to-face interactions have been the primary and most fundamental way to build social relationships, and even in the digital era they remain the basis of our closest bonds. These interactions are built on the dynamic integration and coordination of verbal and non-verbal information between multiple people. However, the psychological processes underlying face-to-face interaction remain difficult to study. In this Review, we discuss three ways the multimodal phenomena underlying face-to-face social interaction can be organized to provide a solid basis for theory development. Next, we review three types of theory of social interaction: theories that focus on the social meaning of actions, theories that explain actions in terms of simple behaviour rules, and theories that rely on rich cognitive models of the internal states of others. Finally, we address how different methods can be used to distinguish between theories, showcasing new approaches and outlining important directions for future research. Advances in how face-to-face social interaction can be studied, combined with a renewed focus on cognitive theories, could lead to a renaissance in social interaction research and advance scientific understanding of face-to-face interaction and its underlying cognitive foundations.