76 research outputs found

    Social Saliency: Visual Psychophysics and Single-Neuron Recordings in Humans

    My thesis studies how people pay attention to other people and to the environment. How does the brain determine what is important, and what are the neural mechanisms underlying attention? What is special about salient social cues compared to salient non-social cues? In Chapter I, I review social cues that attract attention, with an emphasis on their neurobiology. I also review neurological and psychiatric links: the relationship between saliency, the amygdala, and autism. The first empirical chapter begins by noting that people constantly move through the environment. In Chapter II, I study the spatial cues that attract attention during locomotion using a cued speeded discrimination task. I found that when the motion was expansive, attention was attracted toward the singular point of the optic flow (the focus of expansion, FOE) in a sustained fashion. The more ecologically valid the motion features became (e.g., temporal expansion of each object, spatial depth structure implied by the distribution of object sizes), the stronger the attentional effects. However, compared to inanimate objects and cues, people preferentially attend to animals and faces, a process in which the amygdala is thought to play an important role. To directly compare social and non-social cues in the same experiment and to investigate the neural structures processing social cues, in Chapter III I employ a change detection task and test four rare patients with bilateral amygdala lesions. All four amygdala patients showed a normal pattern of reliably faster and more accurate detection of animate stimuli, suggesting that advantageous processing of social cues can be preserved even without the amygdala, a key structure of the “social brain”. People not only attend to faces, but also pay attention to others’ facial emotions and analyze faces in great detail. 
Humans have a dedicated system for processing faces, and the amygdala has long been associated with a key role in recognizing facial emotions. In Chapter IV, I study the neural mechanisms of emotion perception and find that single neurons in the human amygdala are selective for subjective judgments of others’ emotions. Lastly, people typically pay special attention to faces and people, but people with autism spectrum disorder (ASD) might not. To further study social attention and explore possible deficits of social attention in autism, in Chapter V I employ a visual search task and show that people with ASD have reduced attention, especially social attention, to target-congruent objects in the search array. This deficit cannot be explained by low-level visual properties of the stimuli and is independent of the amygdala, but it does depend on task demands. Overall, through visual psychophysics with concurrent eye tracking, my thesis identified and analyzed socially salient cues, comparing social vs. non-social cues and healthy vs. clinical populations. Neural mechanisms underlying social saliency were elucidated through electrophysiology and lesion studies. I finally propose further research questions based on the findings of my thesis and introduce my follow-up studies and preliminary results beyond its scope in the final section, Future Directions.

    Spoken command recognition for robotics

    In this thesis, I investigate spoken command recognition technology for robotics. While high robustness is expected, the distant and noisy conditions in which the system has to operate make the task very challenging. Unlike commercial systems, which all rely on a "wake-up" word to initiate the interaction, the pipeline proposed here directly detects and recognizes commands from the continuous audio stream. In order to keep the task manageable despite low-resource conditions, I propose to focus on a limited set of commands, thus trading off flexibility of the system against robustness. Domain and speaker adaptation strategies based on a multi-task regularization paradigm are explored first. More precisely, two different methods are proposed that rely on a tied loss function penalizing the distance between the outputs of several networks. The first method considers each speaker or domain as a task. A canonical task-independent network is jointly trained with task-dependent models, allowing both types of networks to improve by learning from one another. While an improvement of 3.2% in frame error rate (FER) is obtained for the task-independent network, this only partially carries over to the phone error rate (PER), with a 1.5% improvement. Similarly, a second method explores the parallel training of the canonical network with a privileged model that has access to i-vectors. This method proves less effective, with only a 1.2% improvement in FER. In order to make the developed technology more accessible, I also investigate the use of a sequence-to-sequence (S2S) architecture for command classification. The use of an attention-based encoder-decoder model reduces the classification error by 40% relative to a strong convolutional neural network (CNN)-hidden Markov model (HMM) baseline, showing the relevance of S2S architectures in this context. 
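The tied loss described above can be sketched as follows. The function name, the squared-distance penalty, and the weighting term `lam` are illustrative assumptions, not the thesis's exact formulation:

```python
import math

def tied_loss(canonical_out, task_out, targets, lam=0.1):
    """Cross-entropy of the task-dependent network's softmax outputs,
    plus a penalty tying them to the canonical (task-independent)
    network's outputs, so the two networks learn from one another.
    The squared-distance tie and the weight `lam` are assumptions."""
    n = len(targets)
    # Cross-entropy on the task-dependent outputs (targets are class indices)
    ce = -sum(math.log(task_out[i][targets[i]]) for i in range(n)) / n
    # Mean squared distance between the two networks' output distributions
    dim = len(canonical_out[0])
    tie = sum((c - t) ** 2
              for co, to in zip(canonical_out, task_out)
              for c, t in zip(co, to)) / (n * dim)
    return ce + lam * tie
```

With `lam = 0` this reduces to plain cross-entropy; increasing `lam` pulls the task-dependent model toward the canonical one.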
In order to improve the flexibility of the trained system, I also explored strategies for few-shot learning, which allow the set of commands to be extended with minimal data requirements. Retraining a model on the combination of original and new commands, I achieved 40.5% accuracy on the new commands with only 10 examples of each. This score goes up to 81.5% accuracy with a larger set of 100 examples per new command. An alternative strategy, based on model adaptation, achieved even better scores, 68.8% and 88.4% accuracy with 10 and 100 examples respectively, while being faster to train. This high performance comes at the expense of the original categories, though, on which accuracy deteriorated. These results are very promising, as the methods make it easy to extend an existing S2S model with minimal resources. Finally, a full spoken command recognition system (named iCubrec) has been developed for the iCub platform. The pipeline relies on a voice activity detection (VAD) system to provide a fully hands-free experience. By segmenting only the regions that are likely to contain commands, the VAD module also greatly reduces the computational cost of the pipeline. Command candidates are then passed to the deep neural network (DNN)-HMM command recognition system for transcription. The VoCub dataset was gathered specifically to train a DNN-based acoustic model for our task. Through multi-condition training with the CHiME4 dataset, an accuracy of 94.5% is reached on the VoCub test set. A filler model, complemented by a rejection mechanism based on a confidence score, is finally added to the system to reject non-command speech in a live demonstration of the system.
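As an illustration of the pipeline's first stage, a minimal energy-based voice activity detector might look like the following. The abstract does not specify the VAD algorithm used in iCubrec, so the energy threshold and minimum-segment length here are purely illustrative:

```python
def vad_segments(frame_energies, energy_thresh=0.1, min_len=3):
    """Return (start, end) frame-index pairs for contiguous runs of
    frames whose energy clears the threshold. Only these segments
    would be passed on to the DNN-HMM recognizer, which is how a VAD
    stage cuts the pipeline's computational cost."""
    segments, start = [], None
    for i, e in enumerate(frame_energies):
        if e >= energy_thresh and start is None:
            start = i                        # segment opens
        elif e < energy_thresh and start is not None:
            if i - start >= min_len:         # keep only long-enough runs
                segments.append((start, i))
            start = None
    if start is not None and len(frame_energies) - start >= min_len:
        segments.append((start, len(frame_energies)))
    return segments
```

Each returned segment is a candidate command region; everything outside the segments is skipped entirely by the recognizer.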

    Elucidating the efficacy and response to social cognitive training in recent-onset psychosis

    Neurocognitive deficits are one of the core features of psychosis spectrum disorders (PSD), and they are predictive of poor functional outcome and negative symptoms many years later (Green, Kern, Braff, & Mintz, 2000). Neurocognitive interventions (NCIs) have emerged over the last two decades as a strong potential supplementary treatment option to improve the cognitive deficits and functional decline affecting patients with PSD. Social cognitive training (SCT), involving e.g. facial stimuli, has gained considerably more attention in recent studies than computerized NCIs that use basic visual or auditory stimuli. This is due to the complex character of social cognition (SC), which draws on multiple brain structures involved in behaviors and perception beyond default cognitive function. SC is also tightly interlinked with psychosocial functioning. Although they are cost-effective and largely independent of clinical staff, technological approaches such as SCT are currently not integrated into routine clinical practice. Recent task-based studies have mapped the effects of SCT onto multiple brain regions such as the amygdala, putamen, medial prefrontal cortex, and postcentral gyrus (Ramsay & MacDonald III, 2015). Yet the degree to which alterations in brain function are associated with response to such interventions is still poorly understood. Importantly, resting-state functional connectivity (rsFC) may be a viable neuromarker, as it has shown greater sensitivity in distinguishing patients from healthy controls (HC) across neuroimaging studies and is relatively easy to acquire, especially in patients with acute symptoms (Kambeitz et al., 2015). In this dissertation, we employed 1) a univariate statistical approach to elucidate the efficacy of a 10-hour SCT in improving cognition, symptoms, and functioning, and in restoring rsFC, in patients undergoing SCT as compared to a treatment-as-usual (TAU) group, and 2) multivariate methods. 
In particular, we used a Support Vector Machine (SVM) approach to neuromonitor the recovery of rsFC in the SCT group compared to TAU. We also investigated the potential utility of rsFC as a baseline (T0) neuromarker capable of predicting role functioning approximately 2 months later. First, the current findings suggest that a 10-hour SCT can improve role functioning in recent-onset psychosis (ROP) patients. Second, we have shown intervention-specific rsFC changes within parts of the default mode and social cognitive networks. Moreover, patients with worse SC performance at T0 showed greater rsFC changes following the intervention, suggestive of a greater degree of rsFC restoration potential in patients with worse social cognitive deficits. Third, regarding the neuromonitoring results, only a greater transition from ROP toward "HC-like" SVM decision scores, based on the resting-state modality, was paralleled by intervention-specific, significantly greater improvement in global cognition and attention. Finally, we showed that early prediction of good versus poor role functioning is feasible at the individual-subject level using an rsFC-based linear SVM classifier, with a Balanced Accuracy (BAC) of 74%. This dissertation sheds light on the effects and feasibility of a relatively short computerized SCT, and on the potential utility of multivariate pattern analysis (MVPA) for better clinical stratification of predicted treatment response based on rsFC neuromarkers.
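The Balanced Accuracy used to report the classifier's performance is the mean of sensitivity and specificity, which corrects for unequal class sizes. A minimal implementation, assuming binary labels with 1 marking one outcome class (e.g. good role functioning), is:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (true-positive rate) and specificity
    (true-negative rate) for binary labels in {0, 1}. Unlike plain
    accuracy, this is not inflated when one class dominates."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)               # number of positive cases
    neg = len(y_true) - pos         # number of negative cases
    sensitivity = tp / pos
    specificity = tn / neg
    return 0.5 * (sensitivity + specificity)
```

A BAC of 0.74 therefore means the classifier does substantially better than the 0.5 expected from chance, regardless of how many patients fall into each outcome group.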

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    The use of extended reality and machine learning to improve healthcare and promote greenhealth

    With the Fourth Industrial Revolution, the spread of the Internet of Things, advances in Artificial Intelligence and Machine Learning, and the migration to Cloud Computing, the term "Intelligent Environments" is increasingly ceasing to be an idealization and becoming reality. Likewise, Extended Reality technologies have increased their presence in the technological world after a "hibernation period", since the popularization of the Metaverse concept and the entry of large computer companies such as Apple and Google into a market where Virtual Reality, Augmented Reality, and Mixed Reality had been dominated by companies with less experience in system development (e.g. Meta), less worldwide recognition (e.g. HTC Vive), or less financial support and market trust. This thesis studies the potential use of Extended Reality technologies to promote GreenHealth, as well as their use in Smart Hospitals, one of the variants of Smart Environments, incorporating Machine Learning and Computer Vision as a tool to support and improve healthcare, from the point of view of both the health professional and the patient, through a literature review and an analysis of the current situation. This results in a conceptual model suggesting technologies, selected for their potential, that could be used to achieve this scenario, followed by the development of prototypes of parts of the conceptual model for Extended Reality headsets as a proof of concept.

    Sonic interactions in virtual environments

    This book tackles the design of 3D spatial interactions from an audio-centered, audio-first perspective, providing the fundamental notions related to the creation and evaluation of immersive sonic experiences. The key elements that enhance the sensation of place in a virtual environment (VE) are:
    - Immersive audio: the computational aspects of the acoustical-space properties of Virtual Reality (VR) technologies
    - Sonic interaction: the human-computer interplay through auditory feedback in VEs
    - VR systems: natural support for multimodal integration, impacting different application domains
    Sonic Interactions in Virtual Environments will feature state-of-the-art research on real-time auralization, sonic interaction design in VR, quality of the experience in multimodal scenarios, and applications. Contributors and editors include interdisciplinary experts from the fields of computer science, engineering, acoustics, psychology, design, humanities, and beyond. Their mission is to shape an emerging field of study at the intersection of sonic interaction design and immersive media, embracing an archipelago of existing research spread across different audio communities, and to raise awareness among VR communities, researchers, and practitioners of the importance of sonic elements when designing immersive environments.

    Deception


    Nutrition for Brain Development

    High-quality primary-data publications and review articles have been selected for publication in this Special Issue. Collectively, they draw a comprehensive picture of some of the most relevant questions linking (healthy) nutrition to brain development and brain disorders.