
    Probabilistic consolidation of grasp experience

    We present a probabilistic model for the joint representation of several sensory modalities and action parameters in a robotic grasping scenario. Our non-linear probabilistic latent variable model encodes relationships between grasp-related parameters, learns the importance of features, and expresses confidence in its estimates. The model learns associations between stable and unstable grasps that it experiences during an exploration phase. We demonstrate the applicability of the model for estimating grasp stability, correcting grasps, identifying objects based on tactile imprints, and predicting tactile imprints from object-relative gripper poses. We performed experiments on a real platform with both known and novel objects, i.e., objects the robot was trained on and previously unseen objects. Grasp correction had a 75% success rate on known objects and 73% on novel objects. In comparison, a traditional regression model succeeded in correcting grasps in only 38% of cases.
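    The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: fit a joint probabilistic model over concatenated grasp data and condition it on the observed modalities to fill in missing ones (e.g. estimate stability from pose and tactile features). A Gaussian mixture is used here as a simple stand-in for the paper's non-linear latent variable model; the feature dimensions and toy data are assumptions.

```python
# Minimal sketch (not the paper's model): a Gaussian mixture fitted over
# concatenated grasp data [gripper pose | tactile imprint | stability label],
# then conditioned on observed modalities to predict missing ones.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy "exploration phase": 500 grasp attempts (illustrative dimensions).
n = 500
pose = rng.normal(size=(n, 6))                     # 6-D gripper pose
tactile = np.tanh(pose @ rng.normal(size=(6, 8)))  # 8-D tactile imprint
tactile += 0.1 * rng.normal(size=(n, 8))
stability = (tactile.sum(axis=1, keepdims=True) > 0).astype(float)
joint = np.hstack([pose, tactile, stability])      # 15-D joint vector

gmm = GaussianMixture(n_components=5, covariance_type="full",
                      random_state=0).fit(joint)

def condition(gmm, x_obs, obs_idx, query_idx):
    """Moment-matched E[x_query | x_obs] under the fitted mixture."""
    preds, logp = [], []
    for k in range(gmm.n_components):
        mu_o = gmm.means_[k][obs_idx]
        mu_q = gmm.means_[k][query_idx]
        S_oo = gmm.covariances_[k][np.ix_(obs_idx, obs_idx)]
        S_qo = gmm.covariances_[k][np.ix_(query_idx, obs_idx)]
        diff = x_obs - mu_o
        S_oo_inv = np.linalg.inv(S_oo)
        preds.append(mu_q + S_qo @ S_oo_inv @ diff)
        logp.append(np.log(gmm.weights_[k])
                    - 0.5 * (diff @ S_oo_inv @ diff + np.linalg.slogdet(S_oo)[1]))
    resp = np.exp(np.array(logp) - max(logp))
    resp /= resp.sum()
    return sum(r * p for r, p in zip(resp, preds))

# Estimate grasp stability for a grasp given only its pose and tactile imprint.
print(condition(gmm, joint[0, :14], list(range(14)), [14]))
```

    The same conditioning mechanism can be run in the other direction, e.g. predicting a tactile imprint from an object-relative gripper pose, by swapping the observed and queried index sets.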

    Structured manifolds for motion production and segmentation: a structured Kernel Regression approach

    Steffen JF. Structured manifolds for motion production and segmentation: a structured Kernel Regression approach. Bielefeld (Germany): Bielefeld University; 2010.

    A Posture Sequence Learning System for an Anthropomorphic Robotic Hand

    The paper presents a cognitive architecture for posture learning of an anthropomorphic robotic hand. Our approach aims to allow the robotic system to perform complex perceptual operations, to interact with a human user, and to integrate its perceptions into a cognitive representation of the scene and the observed actions. The anthropomorphic robotic hand imitates the gestures acquired by the vision system in order to learn meaningful movements, to build its knowledge through different conceptual spaces, and to perform complex interactions with the human operator.

    Building Blocks for Cognitive Robots: Embodied Simulation and Schemata in a Cognitive Architecture

    Hemion N. Building Blocks for Cognitive Robots: Embodied Simulation and Schemata in a Cognitive Architecture. Bielefeld: Bielefeld University; 2013.

    Building robots with the ability to perform general intelligent action is a primary goal of artificial intelligence research. The traditional approach is to study and model fragments of cognition separately, with the hope that it will somehow be possible to integrate the specialist solutions into a functioning whole. However, while individual specialist systems demonstrate proficiency in their respective niches, current integrated systems remain clumsy in their performance. Recent findings in neurobiology and psychology demonstrate that many regions of the brain are involved not only in one but in a variety of cognitive tasks, suggesting that the cognitive architecture of the brain uses generic computations in a distributed network, instead of specialist computations in local modules. Designing the cognitive architecture for a robot based on these findings could lead to more capable integrated systems.

    In this thesis, theoretical background on the concept of embodied cognition is provided, and fundamental mechanisms of cognition are discussed that are hypothesized across theories. Based on this background, a view of how to connect elements of the different theories is proposed, providing enough detail to allow computational modeling. The view proposes a network of generic building blocks as the central component of a cognitive architecture. Each building block learns an internal model for its inputs. Given partial inputs or cues, the building blocks can collaboratively restore missing components, providing the basis for embodied simulation, which in theories of embodied cognition is hypothesized to be a central mechanism of cognition and the basis for many cognitive functions. In simulation experiments, it is demonstrated how the building blocks can be autonomously learned by a robot from its sensorimotor experience, and that the mechanism of embodied simulation allows the robot to solve multiple tasks simultaneously.

    In summary, this thesis investigates how to develop cognitive robots under the paradigm of embodied cognition. It provides a description of a novel cognitive architecture and thoroughly discusses its relation to a broad body of interdisciplinary literature on embodied cognition. The thesis hence promotes the view that the cognitive system houses a network of active elements, which organize the agent's experiences and collaboratively carry out many cognitive functions. In the long run, it will be inevitable to study complete cognitive systems, such as the cognitive architecture described in this thesis, instead of only studying small learning systems separately, in order to answer the question of how to build truly autonomous cognitive robots.
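    The thesis itself is not reproduced here, so the following is only a rough illustration of the pattern-completion idea described above: a "building block" that learns an internal model of its sensorimotor inputs and, when cued with a partial input, restores the missing components. The block is approximated by a small MLP trained to reconstruct full vectors from randomly masked ones; the toy data, masking scheme, and network size are assumptions, not Hemion's actual implementation.

```python
# Denoising-autoencoder-style stand-in for one "building block":
# learn to reconstruct complete sensorimotor vectors from partial cues.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Toy sensorimotor experience: proprioception (4 dims) and vision (4 dims)
# correlated through a shared latent cause.
latent = rng.normal(size=(2000, 2))
proprio = latent @ rng.normal(size=(2, 4))
vision = np.sin(latent @ rng.normal(size=(2, 4)))
experience = np.hstack([proprio, vision])

def mask(x, p=0.5, rng=rng):
    """Zero out a random subset of dimensions (the 'partial cue')."""
    return x * (rng.random(x.shape) > p)

block = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
block.fit(mask(experience), experience)   # learn to complete partial inputs

# "Embodied simulation": cue the block with vision only and read out the
# predicted proprioceptive state.
cue = experience[:5].copy()
cue[:, :4] = 0.0                          # hide proprioception
restored = block.predict(cue)
print(np.abs(restored[:, :4] - experience[:5, :4]).mean())
```

    In the thesis, many such blocks would be connected in a network and complete each other's inputs collaboratively; the sketch shows only a single block to keep the mechanism visible.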

    Gesture recognition using principal component analysis, multi-scale theory, and hidden Markov models

    In this thesis, a dynamic gesture recognition system is presented which requires no special hardware other than a webcam. The system is based on a novel method combining Principal Component Analysis (PCA) with hierarchical multi-scale theory and Discrete Hidden Markov Models (DHMMs). We use a hierarchical decision tree based on multi-scale theory. Firstly, we convolve all members of the training data with a Gaussian kernel, which blurs differences between images and reduces their separation in feature space. This reduces the number of eigenvectors needed to describe the data. A principal component space is computed from the convolved data. We divide the data in this space into several clusters using the k-means algorithm. Then the level of blurring is reduced and PCA is applied to each of the clusters separately. A new principal component space is formed from each cluster. Each of these spaces is then divided into clusters and the process is repeated. We thus produce a tree of principal component spaces where each level of the tree represents a different degree of blurring. The search time is then proportional to the depth of the tree, which makes it possible to search hundreds of gestures with very little computational cost. The output of the decision tree is then input into the DHMM recogniser to recognise temporal information.
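    As an illustration of the hierarchy described above, here is a minimal sketch of building such a multi-scale PCA tree: blur the images, fit PCA, split the projected data with k-means, and recurse on each cluster with less blur. The blur schedule, number of principal components, branching factor, and toy data are illustrative assumptions; the DHMM stage that consumes the tree's output is not shown.

```python
# Sketch of the multi-scale PCA / k-means decision tree (illustrative parameters).
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def build_pca_tree(images, sigmas=(4.0, 2.0, 1.0), n_components=8, k=3):
    """images: (n, h, w) array; returns a nested dict of PCA spaces."""
    if len(sigmas) == 0 or len(images) <= k:
        return {"leaf": True, "members": images}
    # Blur at the current scale, project with PCA, then split with k-means.
    blurred = np.stack([gaussian_filter(im, sigmas[0]) for im in images])
    flat = blurred.reshape(len(images), -1)
    pca = PCA(n_components=min(n_components, len(images) - 1)).fit(flat)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pca.transform(flat))
    children = [build_pca_tree(images[km.labels_ == c], sigmas[1:], n_components, k)
                for c in range(k)]
    return {"leaf": False, "sigma": sigmas[0], "pca": pca, "kmeans": km,
            "children": children}

def descend(tree, image):
    """Follow the tree to a leaf; cost is proportional to the tree's depth."""
    node = tree
    while not node["leaf"]:
        flat = gaussian_filter(image, node["sigma"]).reshape(1, -1)
        cluster = node["kmeans"].predict(node["pca"].transform(flat))[0]
        node = node["children"][int(cluster)]
    return node["members"]

# Toy usage with random 32x32 "gesture frames".
rng = np.random.default_rng(0)
frames = rng.random((60, 32, 32))
tree = build_pca_tree(frames)
print(len(descend(tree, frames[0])))
```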

    Deep Clustering and Deep Network Compression

    The use of deep learning has grown increasingly in recent years, thereby becoming a much-discussed topic across a diverse range of fields, especially in computer vision, text mining, and speech recognition. Deep learning methods have proven to be robust in representation learning and have attained extraordinary achievements. Their success is primarily due to the ability of deep learning to discover and automatically learn feature representations by mapping input data into abstract and composite representations in a latent space. Deep learning's ability to deal with high-level representations of data has inspired us to make use of learned representations, aiming to enhance unsupervised clustering and to evaluate the characteristic strength of internal representations in order to compress and accelerate deep neural networks.

    Traditional clustering algorithms attain limited performance as the dimensionality increases. Therefore, the ability to extract high-level representations provides beneficial components that can support such clustering algorithms. In this work, we first present DeepCluster, a clustering approach embedded in a deep convolutional auto-encoder (DCAE). We introduce two clustering methods, namely DCAE-Kmeans and DCAE-GMM. DeepCluster allows data points to be grouped into their respective clusters in the latent space under a joint cost function, by simultaneously optimizing the clustering objective and the DCAE objective, producing stable representations that are appropriate for the clustering process. Both qualitative and quantitative evaluations of the proposed methods are reported, showing the efficiency of deep clustering on several public datasets in comparison to previous state-of-the-art methods.

    Following this, we propose a new version of the DeepCluster model that includes varying degrees of discriminative power. This introduces a mechanism which enables the imposition of regularization techniques and the involvement of a supervision component. The key idea of our approach is to distinguish the discriminatory power of numerous structures when searching for a compact structure to form robust clusters. The effectiveness of injecting various levels of discriminatory power into the learning process is investigated, alongside an exploration and analytical study of the discriminatory power obtained through the use of two discriminative attributes: data-driven discriminative attributes with the support of regularization techniques, and supervision discriminative attributes with the support of the supervision component. An evaluation is provided on four different datasets.

    The use of neural networks in various applications is accompanied by a dramatic increase in computational costs and memory requirements. Making use of the characteristic strength of learned representations, we propose an iterative pruning method that simultaneously identifies the critical neurons and prunes the model during training, without involving any pre-training or fine-tuning procedures. We introduce a majority voting technique to compare the activation values among neurons and assign a voting score to evaluate their importance quantitatively. This mechanism effectively reduces model complexity by eliminating the less influential neurons, and aims to determine a subset of the whole model that can represent the reference model with far fewer parameters within the training process. Empirically, we demonstrate that our pruning method is robust across various scenarios, including fully-connected networks (FCNs), sparsely-connected networks (SCNs), and convolutional neural networks (CNNs), using two public datasets.

    Moreover, we also propose a novel framework to measure the importance of individual hidden units by computing a measure of relevance to identify the most critical filters and prune them, in order to compress and accelerate CNNs. Unlike existing methods, we introduce the use of the activation of feature maps to detect valuable information and the essential semantic parts, with the aim of evaluating the importance of feature maps, inspired by recent work on neural network interpretability. A majority voting technique based on the degree of alignment between a semantic concept and individual hidden unit representations is utilized to evaluate feature maps' importance quantitatively. We also propose a simple yet effective method to estimate new convolution kernels based on the remaining crucial channels to accomplish effective CNN compression. Experimental results show the effectiveness of our filter selection criteria, which outperform the state-of-the-art baselines.

    To conclude, we present a comprehensive, detailed review of time-series data analysis, with emphasis on deep time-series clustering (DTSC), and a founding contribution to the area of applying deep clustering to time-series data by presenting the first case study in the context of movement behavior clustering utilizing the DeepCluster method. The results are promising, showing that the latent space encodes sufficient patterns to facilitate accurate clustering of movement behaviors. Finally, we identify the state of the art and present an outlook on the important field of DTSC from five perspectives.
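    The abstract describes DCAE-Kmeans as jointly optimizing an auto-encoder objective and a clustering objective on the latent codes. The sketch below illustrates that kind of joint cost on toy data, using a small fully-connected auto-encoder as a stand-in for the convolutional one; the architecture, the weighting term `lam`, the hard nearest-centroid assignment, and all numbers are assumptions rather than the thesis' exact formulation.

```python
# Joint reconstruction + k-means-style clustering loss on latent codes
# (fully-connected stand-in for the convolutional auto-encoder).
import torch
import torch.nn as nn

torch.manual_seed(0)

class DCAE(nn.Module):
    def __init__(self, in_dim=784, latent_dim=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

x = torch.rand(512, 784)                                 # toy data standing in for images
model = DCAE()
centroids = torch.randn(8, 10, requires_grad=True)       # 8 cluster centres in latent space
opt = torch.optim.Adam(list(model.parameters()) + [centroids], lr=1e-3)
lam = 0.1                                                # weight of the clustering term

for step in range(200):
    z, recon = model(x)
    # Hard assignment of each latent code to its nearest centroid.
    assign = torch.cdist(z, centroids).argmin(dim=1)
    loss = nn.functional.mse_loss(recon, x) \
         + lam * ((z - centroids[assign]) ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))
```

    DCAE-GMM would replace the hard nearest-centroid term with a Gaussian-mixture likelihood over the latent codes; the overall pattern of a shared, jointly optimized cost remains the same.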

    Gesture recognition with application in music arrangement

    This thesis studies the interaction with music synthesis systems using hand gestures. Traditionally, users of such systems were limited to input devices such as buttons, pedals, faders, and joysticks. The use of gestures allows the user to interact with the system in a more intuitive way. Without the constraint of input devices, the user can simultaneously control more elements within the music composition, thus increasing the level of the system's responsiveness to the musician's creative thoughts. A working system of this concept is implemented, employing computer vision and machine intelligence techniques to recognise the user's gestures.
    Dissertation (MSc), Computer Science, University of Pretoria, 2006.