
    Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues

    In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions which we show to be applicable to different object detection mechanisms. In order to demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we can show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach, especially when processing resources are constrained.

    Early Turn-taking Prediction with Spiking Neural Networks for Human Robot Collaboration

    Turn-taking is essential to the structure of human teamwork. Humans are typically aware of team members' intention to keep or relinquish their turn before a turn switch, where the responsibility of working on a shared task is shifted. Future co-robots are also expected to provide such competence. To that end, this paper proposes the Cognitive Turn-taking Model (CTTM), which leverages cognitive models (i.e., Spiking Neural Network) to achieve early turn-taking prediction. The CTTM framework can process multimodal human communication cues (both implicit and explicit) and predict human turn-taking intentions at an early stage. The proposed framework is tested on a simulated surgical procedure, where a robotic scrub nurse predicts the surgeon's turn-taking intention. It was found that the proposed CTTM framework outperforms the state-of-the-art turn-taking prediction algorithms by a large margin. It also outperforms humans when presented with partial observations of communication cues (i.e., less than 40% of full actions). This early prediction capability enables robots to initiate turn-taking actions at an early stage, which facilitates collaboration and increases overall efficiency. Comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA) 201

    Deep fusion of multi-channel neurophysiological signal for emotion recognition and monitoring

    How to fuse multi-channel neurophysiological signals for emotion recognition is emerging as a hot research topic in the community of Computational Psychophysiology. Nevertheless, prior feature-engineering-based approaches require extracting various domain-knowledge-related features at a high time cost. Moreover, traditional fusion methods cannot fully utilise correlation information between different channels and frequency components. In this paper, we design a hybrid deep learning model in which a Convolutional Neural Network (CNN) is utilised for extracting task-related features and for mining inter-channel and inter-frequency correlation, while a Recurrent Neural Network (RNN) is concatenated to integrate contextual information from the frame cube sequence. Experiments are carried out on a trial-level emotion recognition task using the DEAP benchmarking dataset. Experimental results demonstrate that the proposed framework outperforms the classical methods on both emotional dimensions, Valence and Arousal.

    Cortical control of forelimb movement

    Cortical control of movement is mediated by widespread projections impacting many nervous system regions in a top-down manner. Although much knowledge about cortical circuitry has been accumulated from local cortical microcircuits, cortico-cortical and cortico-subcortical networks, how cortex communicates with regions closer to motor execution, including the brainstem, is less well understood. In this dissertation, we investigate the organization of cortico-medulla projections and their roles in controlling forelimb movement. We focus on anatomical and functional relationships between cortex and lateral rostral medulla (LatRM), a region in caudal brainstem which is shown to be key in the control of forelimb movement. Our findings reveal the precise anatomical and functional organization between different cortical regions and matched postsynaptic neurons in the caudal brainstem, tuned to different phases of one carefully orchestrated behavior. These findings advance our knowledge of circuit mechanisms involved in the control of body movements, and unravel the logic of how the top-level control region in the mammalian nervous system – the cortex – intersects with a high degree of specificity with command centers in the brainstem and beyond.

    Towards a data-driven treatment of epilepsy: computational methods to overcome low-data regimes in clinical settings

    Epilepsy is the most common neurological disorder, affecting around 1 % of the population. One third of patients with epilepsy are drug-resistant. If the epileptogenic zone can be localized precisely, curative resective surgery may be performed. However, only 40 to 70 % of patients remain seizure-free after surgery. Presurgical evaluation, which in part aims to localize the epileptogenic zone (EZ), is a complex multimodal process that requires subjective clinical decisions, often relying on a multidisciplinary team’s experience. Thus, the clinical pathway could benefit from data-driven methods for clinical decision support. In the last decade, deep learning has seen great advancements due to the improvement of graphics processing units (GPUs), the development of new algorithms and the large amounts of generated data that become available for training. However, using deep learning in clinical settings is challenging as large datasets are rare due to privacy concerns and expensive annotation processes. Methods to overcome the lack of data are especially important in the context of presurgical evaluation of epilepsy, as only a small proportion of patients with epilepsy end up undergoing surgery, which limits the availability of data to learn from. This thesis introduces computational methods that pave the way towards integrating data-driven methods into the clinical pathway for the treatment of epilepsy, overcoming the challenge presented by the relatively small datasets available. We used transfer learning from general-domain human action recognition to characterize epileptic seizures from video–telemetry data. We developed a software framework to predict the location of the epileptogenic zone given seizure semiologies, based on retrospective information from the literature. We trained deep learning models using self-supervised and semi-supervised learning to perform quantitative analysis of resective surgery by segmenting resection cavities on brain magnetic resonance images (MRIs). Throughout our work, we shared datasets and software tools that will accelerate research in medical image computing, particularly in the field of epilepsy.