1,799 research outputs found

    Multimodal Data Fusion based on the Global Workspace Theory

    Get PDF
    We propose a novel neural network architecture, the Global Workspace Network (GWN), which addresses the challenge of dynamic and unspecified uncertainties in multimodal data fusion. The GWN is a model of attention across modalities that evolves through time, inspired by the well-established Global Workspace Theory from cognitive science. The GWN achieved an average F1 score of 0.92 for discriminating between pain patients and healthy participants, and an average F1 score of 0.75 for the further classification of three pain levels for a patient, both on the multimodal EmoPain dataset captured from people with chronic pain and healthy people performing different types of exercise movements in unconstrained settings. In these tasks, the GWN significantly outperforms the typical fusion approach of merging by concatenation. We further provide extensive analysis of the behaviour of the GWN and its ability to address uncertainties (hidden noise) in multimodal data.
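
    The abstract contrasts attention across modalities with the baseline of merging by concatenation. As a rough illustration of that contrast (not the authors' GWN implementation; the transformer encoder, layer sizes, and the two example modality dimensions are assumptions), a minimal PyTorch sketch might look like this:

```python
# Minimal sketch: attention-based multimodal fusion vs. fusion by concatenation.
# NOT the authors' GWN code; layer sizes and module choices are illustrative.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, modality_dims, d_model=64, n_heads=4, n_classes=2):
        super().__init__()
        # One linear projection per modality maps it into a shared token space.
        self.proj = nn.ModuleList(nn.Linear(d, d_model) for d in modality_dims)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.attend = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, modalities):  # list of (batch, dim) tensors
        tokens = torch.stack([p(m) for p, m in zip(self.proj, modalities)], dim=1)
        fused = self.attend(tokens).mean(dim=1)   # attention across modality tokens
        return self.head(fused)

class ConcatFusion(nn.Module):
    """Baseline: merge modalities by simple concatenation."""
    def __init__(self, modality_dims, n_classes=2):
        super().__init__()
        self.head = nn.Linear(sum(modality_dims), n_classes)

    def forward(self, modalities):
        return self.head(torch.cat(modalities, dim=-1))

# Hypothetical example: two modalities (e.g. movement and muscle-activity features).
x = [torch.randn(8, 30), torch.randn(8, 12)]
print(AttentionFusion([30, 12])(x).shape, ConcatFusion([30, 12])(x).shape)
```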

    Multimodal Bayesian Network for Artificial Perception

    Get PDF
    In order to make machines perceive their external environment coherently, multiple sources of sensory information derived from several different modalities can be used (e.g. cameras, LIDAR, stereo, RGB-D, and radars). All these different sources of information can be efficiently merged to form a robust perception of the environment. Some of the mechanisms that underlie this merging of sensor information are highlighted in this chapter, showing that, depending on the type of information, different combination and integration strategies can be used and that prior knowledge is often required to interpret the sensory signals efficiently. The notion that perception involves Bayesian inference is an increasingly popular position taken by a considerable number of researchers. Bayesian models have provided insights into many perceptual phenomena, showing that they are a valid approach for dealing with real-world uncertainties and for robust classification, including classification in time-dependent problems. This chapter addresses the use of Bayesian networks applied to sensory perception in the following areas: mobile robotics, autonomous driving systems, advanced driver assistance systems, sensor fusion for object detection, and EEG-based mental state classification.
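
    The chapter itself is a survey, but the core idea that sensor reliability and prior knowledge enter the fusion through Bayes' rule can be illustrated with a minimal Gaussian sensor-fusion sketch; the sensor variances and readings below are made-up numbers, not values from the chapter:

```python
# Minimal sketch of Bayesian fusion of two noisy sensor readings of the same
# quantity (e.g. distance to an obstacle). With Gaussian likelihoods, the
# posterior is the precision-weighted combination of the measurements.

def fuse_gaussian(mu1, var1, mu2, var2):
    """Fuse two independent Gaussian estimates of the same quantity."""
    var = 1.0 / (1.0 / var1 + 1.0 / var2)       # posterior variance
    mu = var * (mu1 / var1 + mu2 / var2)        # precision-weighted mean
    return mu, var

# Hypothetical example: LIDAR reads 10.2 m (low noise), radar reads 10.8 m (noisier).
mu, var = fuse_gaussian(10.2, 0.05, 10.8, 0.40)
print(f"fused estimate: {mu:.2f} m, variance: {var:.3f}")
```

    Because the posterior mean is a precision-weighted average, the fused estimate leans toward the less noisy sensor while still shrinking the overall uncertainty.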

    Semi-supervised Multimodal Representation Learning through a Global Workspace

    Full text link
    Recent deep learning models can efficiently combine inputs from different modalities (e.g., images and text) and learn to align their latent representations, or to translate signals from one domain to another (as in image captioning, or text-to-image generation). However, current approaches mainly rely on brute-force supervised training over large multimodal datasets. In contrast, humans (and other animals) can learn useful multimodal representations from only sparse experience with matched cross-modal data. Here we evaluate the capabilities of a neural network architecture inspired by the cognitive notion of a "Global Workspace": a shared representation for two (or more) input modalities. Each modality is processed by a specialized system (pretrained on unimodal data, and subsequently frozen). The corresponding latent representations are then encoded to and decoded from a single shared workspace. Importantly, this architecture is amenable to self-supervised training via cycle-consistency: encoding-decoding sequences should approximate the identity function. For various pairings of vision-language modalities and across two datasets of varying complexity, we show that such an architecture can be trained to align and translate between two modalities with very little need for matched data (from 4 to 7 times less than a fully supervised approach). The global workspace representation can be used advantageously for downstream classification tasks and for robust transfer learning. Ablation studies reveal that both the shared workspace and the self-supervised cycle-consistency training are critical to the system's performance.
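
    The central training signal here is cycle-consistency: encoding a unimodal latent into the shared workspace and decoding it back (a demi-cycle), or translating through the other modality and back (a full cycle), should approximate the identity. A minimal sketch of those two losses, assuming simple linear encoders/decoders and illustrative latent sizes rather than the paper's actual modules, is:

```python
# Sketch of demi-cycle and full-cycle consistency losses through a shared
# workspace. Latent sizes and linear maps are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_img, d_txt, d_ws = 128, 64, 32          # illustrative latent sizes
enc_img, dec_img = nn.Linear(d_img, d_ws), nn.Linear(d_ws, d_img)
enc_txt, dec_txt = nn.Linear(d_txt, d_ws), nn.Linear(d_ws, d_txt)

def cycle_losses(z_img, z_txt):
    # Demi-cycle: modality -> workspace -> same modality.
    demi = F.mse_loss(dec_img(enc_img(z_img)), z_img) \
         + F.mse_loss(dec_txt(enc_txt(z_txt)), z_txt)
    # Full cycle: image latent -> workspace -> text latent -> workspace -> image
    # latent (and the symmetric text cycle); neither term needs paired data.
    img_cycle = dec_img(enc_txt(dec_txt(enc_img(z_img))))
    txt_cycle = dec_txt(enc_img(dec_img(enc_txt(z_txt))))
    full = F.mse_loss(img_cycle, z_img) + F.mse_loss(txt_cycle, z_txt)
    return demi + full

loss = cycle_losses(torch.randn(16, d_img), torch.randn(16, d_txt))
loss.backward()   # self-supervised gradient from unpaired data
```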

    Human Motion Trajectory Prediction: A Survey

    Full text link
    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.
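
    The survey is a taxonomy rather than a single method, but a constant-velocity model, one of the simplest physics-based predictors in this literature, can be written in a few lines; the time step and prediction horizon below are arbitrary choices, not values from the survey:

```python
# Constant-velocity baseline for trajectory prediction. The sampling interval
# and horizon are illustrative; real benchmarks fix their own values.
import numpy as np

def constant_velocity_predict(track, dt=0.4, horizon=12):
    """track: (T, 2) array of observed x/y positions sampled every dt seconds."""
    track = np.asarray(track, dtype=float)
    velocity = (track[-1] - track[-2]) / dt           # last observed velocity
    steps = np.arange(1, horizon + 1)[:, None]        # (horizon, 1)
    return track[-1] + steps * velocity * dt          # (horizon, 2) future positions

observed = [[0.0, 0.0], [0.4, 0.1], [0.8, 0.2], [1.2, 0.3]]
print(constant_velocity_predict(observed)[:3])        # first three predicted points
```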

    Towards conscious-like behavior in computer game characters

    Get PDF
    Proceedings of: IEEE Symposium on Computational Intelligence and Games 2009 (CIG-2009), Milano, Italy, 7-10 September 2009. The main sources of inspiration for the design of more engaging synthetic characters are existing psychological models of human cognition. Usually, these models, and the associated Artificial Intelligence (AI) techniques, are based on partial aspects of the real complex systems involved in the generation of human-like behavior. Emotions, planning, learning, user modeling, set shifting, and attention mechanisms are some remarkable examples of features typically considered in isolation within classical AI control models. Artificial cognitive architectures aim at integrating many of these aspects together into effective control systems. However, the design of this sort of architecture is not straightforward. In this paper, we argue that current research efforts in the young field of Machine Consciousness (MC) could contribute to tackling complexity and provide a useful framework for the design of more appealing synthetic characters. This hypothesis is illustrated with the application of a novel consciousness-based cognitive architecture to the development of a First Person Shooter video game character. This work was supported by the Spanish Ministry of Education under CICYT grant TRA2007-67374-C02-02.

    Preserving Communication Context. Virtual workspace and interpersonal space in Japanese CSCW.

    Get PDF
    The past decade has seen the development of a perspective holding that technology is socially constructed (Mackenzie and Wacjman, 1985; Bijker, Hughes and Pinch, 1987; Bijker and Law, 1992). This paper examines the social construction of one group of technologies, systems for computer supported cooperative work (CSCW). It describes the design of CSCW in Japan, with particular attention to the influence of culture on the design process. Two case studies are presented to illustrate the argument that culture is an important factor in technology design, despite commonly held assumptions about the neutrality and objectivity of science and technology. The paper further argues that, by looking at CSCW systems as texts which reflect the context of their production and the society from which they come, we may be better able to understand the transformations that operate when these texts are “read” in the contexts of their implementation.

    Advanced Algorithms for 3D Medical Image Data Fusion in Specific Medical Problems

    Get PDF
    Image fusion is one of today's most common and still challenging tasks in medical imaging, and it plays a crucial role in all areas of medical care such as diagnosis, treatment and surgery. Three projects crucially dependent on image fusion are introduced in this thesis. The first project deals with 3D CT subtraction angiography of the lower limbs. It combines pre-contrast and contrast-enhanced data to extract the blood vessel tree. The second project fuses DTI and T1-weighted MRI brain data. The aim of this project is to combine the brain structural and functional information, providing improved knowledge about intrinsic brain connectivity. The third project deals with a time series of CT spine data in which metastases occur. In this project, the progression of metastases within the vertebrae is studied based on fusion of the successive elements of the image series. This thesis introduces a new methodology for classifying metastatic tissue. All the projects mentioned in this thesis have been solved by the medical image analysis group led by Prof. Jiří Jan. This dissertation concerns primarily the registration part of the first project and the classification part of the third project. The second project is described completely. The other parts of the first and third projects, including the specific preprocessing of the data, are introduced in detail in the dissertation thesis of my colleague Roman Peter, M.Sc.
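
    The first project's core operation, subtracting a registered pre-contrast CT volume from the contrast-enhanced one so that mainly the opacified vessels remain, can be sketched as follows; the volume shape, noise model and Hounsfield-unit threshold are illustrative assumptions, not the thesis method:

```python
# Illustrative subtraction step of CT subtraction angiography: once the two
# volumes are registered, the difference image is dominated by contrast-filled
# vessels. Threshold and synthetic data below are made-up values.
import numpy as np

def subtraction_angiography(contrast_vol, precontrast_vol, threshold_hu=100):
    difference = contrast_vol.astype(float) - precontrast_vol.astype(float)
    vessel_mask = difference > threshold_hu      # voxels enhanced by contrast agent
    return difference, vessel_mask

pre = np.random.normal(40, 10, size=(64, 64, 64))    # fake soft-tissue volume
post = pre.copy()
post[30:34, 30:34, :] += 300                          # fake contrast-filled vessel
_, mask = subtraction_angiography(post, pre)
print("vessel voxels found:", int(mask.sum()))
```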

    On consciousness, resting state fMRI, and neurodynamics

    Get PDF