Pathway to Future Symbiotic Creativity
This report presents a comprehensive view of our vision of the development
path of human-machine symbiotic art creation. We propose a classification
of creative systems into a five-class hierarchy, showing the pathway of
creativity evolving from human-mimicking artists (Turing Artists) to machine
artists in their own right. We begin with an overview of the limitations of
Turing Artists and then focus on the top two levels of the hierarchy, the
Machine Artists, emphasizing machine-human communication in art creation. In
art creation, machines need to understand humans' mental states, including
desires, appreciation, and emotions; humans, in turn, need to understand
machines' creative capabilities and limitations. The rapid development of
immersive environments, and their further evolution into the metaverse,
enables symbiotic art creation through unprecedented flexibility of
bi-directional communication between artists and art manifestation
environments. By examining the latest sensor and XR technologies, we
illustrate novel ways of collecting art data that form the basis of a new
form of bidirectional human-machine communication and understanding in art
creation. Building on such communication and understanding mechanisms, we
propose a novel framework for future Machine Artists, guided by the
philosophy that a human-compatible AI system should follow the
"human-in-the-loop" principle rather than the traditional "end-to-end"
dogma. By proposing a new form of inverse reinforcement learning model, we
outline the platform design of machine artists, demonstrate its functions,
and showcase examples of technologies we have developed. We also provide a
systematic exposition of the ecosystem for an AI-based symbiotic art form
and community, with an economic model built on NFT technology. Ethical
issues in the development of machine artists are also discussed.
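The inverse reinforcement learning idea mentioned above can be sketched in miniature: instead of being given a reward, the machine infers one from human demonstrations by matching the demonstrator's state-visitation (feature) expectations. The chain world, one-hot features, horizon, and learning rate below are illustrative assumptions for the sketch, not the platform described in the report.

```python
import numpy as np

# Toy maximum-entropy-style IRL on a 5-state chain: the "expert" always walks
# right toward state 4; the learner recovers a reward that explains this.
n_states, n_actions, horizon = 5, 2, 8

def step(s, a):
    """Deterministic transitions: action 0 moves left, action 1 moves right."""
    return max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)

# Expert state-visitation counts for one demonstration: 0,1,2,3,4,4,4,4.
mu_expert = np.zeros(n_states)
for t in range(horizon):
    mu_expert[min(t, n_states - 1)] += 1

w = np.zeros(n_states)  # reward weights over one-hot state features
for _ in range(300):
    # Soft value iteration under the current reward estimate.
    Q = np.zeros((n_states, n_actions))
    for _ in range(horizon):
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))  # stable logsumexp
        for s in range(n_states):
            for a in range(n_actions):
                Q[s, a] = w[s] + V[step(s, a)]
    policy = np.exp(Q - Q.max(axis=1, keepdims=True))
    policy /= policy.sum(axis=1, keepdims=True)
    # Expected state visitations of the learner's policy, starting from state 0.
    d = np.zeros(n_states)
    d[0] = 1.0
    mu = np.zeros(n_states)
    for _ in range(horizon):
        mu += d
        d_next = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                d_next[step(s, a)] += d[s] * policy[s, a]
        d = d_next
    # Gradient step: match the expert's visitation expectations.
    w += 0.05 * (mu_expert - mu)

print(int(np.argmax(w)))  # the inferred reward peaks at the demonstrated goal state
```

With the reward inferred this way, the learner reproduces the demonstrated behavior without the goal ever being specified explicitly, which is one sense in which the human stays "in the loop".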
Learning Biosignals with Deep Learning
The healthcare system, ubiquitously recognized as one of the most influential
systems in society, has been facing new challenges since the start of the decade.
The myriad of physiological data generated by individuals, namely within the
healthcare system, is placing a burden on physicians and reducing the
effectiveness of patient data collection. Information systems and, in particular,
novel deep learning (DL) algorithms offer a way to tackle this problem.
This thesis aims to impact biosignal research and industry by presenting DL
solutions that can empower this field. For this purpose, an extensive study
of how to incorporate and implement Convolutional Neural Networks (CNN), Recurrent
Neural Networks (RNN), and fully connected networks in biosignal studies is presented.
Different architecture configurations were explored for signal processing and decision
making and were implemented in three different scenarios: (1) biosignal learning and
synthesis; (2) electrocardiogram (ECG) biometric systems; and (3) ECG anomaly
detection systems. In (1), an RNN-based architecture was able to autonomously
replicate three types of biosignals with a high degree of confidence. In (2), three
CNN-based architectures and an RNN-based architecture (the same used in (1)) were
applied to biometric identification, reaching accuracies above 90% for electrode-based
datasets (Fantasia, ECG-ID and MIT-BIH) and 75% for an off-the-person dataset (CYBHi),
and to biometric authentication, achieving Equal Error Rates (EER) near 0% for
Fantasia and MIT-BIH and below 4% for CYBHi. In (3), an abstraction of the healthy,
clean ECG signal and the detection of deviations from it were built and tested in two
different scenarios: the presence of noise, using an autoencoder and a fully connected
network (reaching 99% accuracy for binary classification and 71% for multi-class),
and arrhythmia events, by adding an RNN to the previous architecture (57% accuracy
and 61% sensitivity).
In sum, these systems are shown to be capable of producing novel results.
Incorporating several AI systems into one could prove to be the next generation of
preventive medicine: with access to different physiological and anatomical states,
machines could produce more informed solutions to the issues one may face in the
future, increasing the performance of autonomous prevention systems usable in
everyday life and in remote places where access to medicine is limited. These
systems will also help in studying signal behaviour in real-life contexts, as
explainable AI could trigger this perception and link a network's inner states with
the biological traits.
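The noise-detection scenario (3) rests on a simple principle: an autoencoder trained only on clean beats reconstructs clean inputs well and deviant inputs poorly, so reconstruction error serves as an anomaly score. Below is a minimal NumPy sketch of that principle on synthetic pulse-shaped "beats"; the data, layer sizes, and training setup are invented for illustration and are not the thesis's ECG data or architectures.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 64  # samples per synthetic "beat"

def clean_beat():
    t = np.linspace(0, 1, T)
    return np.exp(-((t - 0.5) ** 2) / 0.002)  # crude R-peak-like pulse

# Training set: nearly clean beats only.
X = np.stack([clean_beat() + 0.01 * rng.standard_normal(T) for _ in range(200)])

# One-hidden-layer autoencoder (T -> 8 -> T), trained by plain gradient descent.
H = 8
W1 = 0.1 * rng.standard_normal((T, H)); b1 = np.zeros(H)
W2 = 0.1 * rng.standard_normal((H, T)); b2 = np.zeros(T)
lr = 0.01
for _ in range(300):
    Z = np.tanh(X @ W1 + b1)          # encoder
    Xh = Z @ W2 + b2                  # decoder
    err = Xh - X                      # reconstruction residual
    dW2 = Z.T @ err / len(X); db2 = err.mean(0)
    dZ = err @ W2.T * (1 - Z ** 2)    # backprop through tanh
    dW1 = X.T @ dZ / len(X); db1 = dZ.mean(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

def recon_error(x):
    z = np.tanh(x @ W1 + b1)
    return np.mean((z @ W2 + b2 - x) ** 2)

clean_err = recon_error(clean_beat())
noisy_err = recon_error(clean_beat() + 0.5 * rng.standard_normal(T))
print(clean_err < noisy_err)  # deviations from the learned "healthy" beat score higher
```

The bottleneck cannot represent the noise, so the reconstruction error of a corrupted beat stays high; thresholding that error yields the binary anomaly decision.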
Socio-Cognitive and Affective Computing
Social cognition focuses on how people process, store, and apply information about other people and social situations, and on the role that cognitive processes play in social interactions. The term cognitive computing, on the other hand, generally refers to new hardware and/or software that mimics the functioning of the human brain and helps to improve human decision-making. In this sense, it is a type of computing with the goal of discovering more accurate models of how the human brain/mind senses, reasons, and responds to stimuli. Socio-Cognitive Computing should be understood as a set of interdisciplinary theoretical frameworks, methodologies, methods, and hardware/software tools for modeling how the human brain mediates social interactions. In addition, Affective Computing is the study and development of systems and devices that can recognize, interpret, process, and simulate human affects, a fundamental aspect of socio-cognitive neuroscience. It is an interdisciplinary field spanning computer science, electrical engineering, psychology, and cognitive science. Physiological Computing is a category of technology in which electrophysiological data recorded directly from human activity are used to interface with a computing device. This technology becomes even more relevant when computing can be integrated pervasively into everyday life environments. Thus, Socio-Cognitive and Affective Computing systems should be able to adapt their behavior according to the Physiological Computing paradigm. This book integrates proposals from researchers who use signals from the brain and/or body to infer people's intentions and psychological state in smart computing systems. The design of such systems combines knowledge and methods of ubiquitous and pervasive computing, as well as physiological data measurement and processing, with those of socio-cognitive and affective computing.
Neural correlates of emotional valence for faces and words
Stimuli with negative emotional valence are especially apt to influence perception and action because of their crucial role in survival, a property that may not be precisely mirrored by positive emotional stimuli of equal intensity. The aim of this study was to identify the neural circuits differentially coding for positive and negative valence in the implicit processing of facial expressions and words, which are among the main means human beings use to express emotions. Thirty-six healthy subjects took part in an event-related fMRI experiment. We used an implicit emotional processing task with the visual presentation of negative, positive, and neutral faces and words as primary stimuli. Dynamic Causal Modeling (DCM) of the fMRI data was used to test effective brain connectivity within two different anatomo-functional models, for the processing of words and faces, respectively. In our models, the only areas showing a significant differential response to negative and positive valence across both face and word stimuli were early visual cortices, with faces eliciting stronger activations. For faces, DCM revealed that this effect was mediated by a facilitation of activity in the amygdala by positive faces and in the fusiform face area by negative faces; for words, the effect was mainly imputable to a facilitation of activity in the primary visual cortex by positive words. These findings support a role of early sensory cortices in discriminating the emotional valence of both faces and words, where the effect may be mediated chiefly by the subcortical/limbic visual route for faces, and rely more on the direct thalamic pathway to primary visual cortex for words.
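Dynamic Causal Modeling rests on a bilinear neural state equation, dz/dt = (A + Σ_j u_j B_j) z + C u, where A encodes intrinsic region-to-region coupling, each B_j encodes how an experimental input modulates connections, and C encodes driving inputs. A toy two-region simulation (with made-up parameter values, not the study's fitted models) shows how a modulatory input can facilitate downstream activity, the kind of connection-level effect reported here for the amygdala and fusiform face area:

```python
import numpy as np

# Bilinear DCM neural state equation: dz/dt = (A + u_mod * B) z + C * u_drive
A = np.array([[-1.0, 0.0],
              [ 0.4, -1.0]])   # intrinsic coupling: region 0 ("V1") drives region 1 ("FFA")
B = np.array([[0.0, 0.0],
              [0.6, 0.0]])     # modulatory input strengthens the V1 -> FFA connection
C = np.array([1.0, 0.0])       # driving input (visual stimulus) enters V1 only

def simulate(u_mod, dt=0.01, T=5.0):
    """Euler-integrate the neural states under a sustained stimulus."""
    z = np.zeros(2)
    peak = np.zeros(2)
    for _ in range(int(T / dt)):
        u_drive = 1.0
        dz = (A + u_mod * B) @ z + C * u_drive
        z = z + dt * dz
        peak = np.maximum(peak, z)
    return peak

baseline = simulate(u_mod=0.0)
modulated = simulate(u_mod=1.0)
print(modulated[1] > baseline[1])  # prints True: modulation facilitates downstream activity
```

In a full DCM analysis the A, B, and C parameters are estimated from the fMRI time series (via a haemodynamic forward model) rather than fixed by hand; this sketch only illustrates what a "facilitation of activity" means at the level of the state equation.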
A Probabilistic Approach to the Construction of a Multimodal Affect Space
Understanding affective signals from others is crucial for both human-human and human-agent interaction. The automatic analysis of emotion is by and large addressed as a pattern recognition problem grounded in early psychological theories of emotion. Suitable features are first extracted and then used as input to classification (discrete emotion recognition) or regression (continuous affect detection). In this thesis, differently from many computational models in the literature, we draw on a simulationist approach to the analysis of facially displayed emotions, e.g., in the course of a face-to-face interaction between an expresser and an observer. At the heart of this perspective lies the enactment of the perceived emotion in the observer. We propose a probabilistic framework based on a deep latent representation of a continuous affect space, which can be exploited for both the estimation and the enactment of affective states in a multimodal space. Namely, we consider the observed facial expression together with physiological activations driven by internal autonomic activity. The rationale behind the approach lies in the large body of evidence from affective neuroscience showing that when we observe emotional facial expressions, we react with congruent facial mimicry. Further, in more complex situations, affect understanding is likely to rely on a comprehensive representation grounding the reconstruction of the state of the body associated with the displayed emotion. We show that our approach can address such problems in a unified and principled perspective, thus avoiding ad hoc heuristics while minimising learning efforts. Moreover, our model improves the inferred belief through an inner loop of measurements and predictions within the central affect state-space, realising the dynamics of affect enactment. The results achieved so far have been obtained on two publicly available multimodal corpora.
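The "inner loop of measurements and predictions within the central affect state-space" can be illustrated, in a deliberately simplified linear-Gaussian form, as a recursive predict/update filter that fuses a facial channel and a physiological channel into one belief over a valence-arousal state. The dynamics, noise levels, and two-channel observation model below are illustrative assumptions, not the thesis's deep latent model:

```python
import numpy as np

# Kalman-style predict/update loop over a 2-D affect state (valence, arousal),
# fusing two noisy measurement channels: facial expression and physiology.
F = np.eye(2)                           # affect dynamics (assumed slowly varying)
Q = 0.01 * np.eye(2)                    # process noise
H = np.vstack([np.eye(2), np.eye(2)])   # both channels observe the full state
R = np.diag([0.5, 0.5, 0.2, 0.2])       # physiology assumed less noisy than the face

rng = np.random.default_rng(1)
true_state = np.array([0.8, -0.3])      # the expresser's actual valence/arousal
x, P = np.zeros(2), np.eye(2)           # observer's initial belief (mean, covariance)

for _ in range(50):
    # Predict: propagate the belief through the assumed dynamics.
    x = F @ x
    P = F @ P @ F.T + Q
    # Measure: noisy face + physiology readings of the same latent affect.
    z = H @ true_state + rng.standard_normal(4) * np.sqrt(np.diag(R))
    # Update: fuse the measurement into the belief.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P

print(np.round(x, 2))  # the belief should lie near the true affect state
```

The same measure/predict structure carries over to the thesis's nonlinear, learned state-space, where the deep latent model replaces the hand-set matrices above.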
Recent Advances in Signal Processing
Signal processing is a critical task in the majority of new technological inventions and challenges, across a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand; these five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.
Learning to Dream, Dreaming to Learn
The importance of sleep for healthy brain function is widely acknowledged. However, it remains mysterious how the sleeping brain, disconnected from the outside world and plunged into the fantastic experiences of dreams, is actively learning. A main feature of dreams is the generation of new realistic sensory experiences in absence of external input, from the combination of diverse memory elements. How do cortical networks host the generation of these sensory experiences during sleep? What function could these generated experiences serve?
In this thesis, we attempt to answer these questions using an original, computational approach inspired by modern artificial intelligence. In light of existing cognitive theories and experimental data, we suggest that cortical networks implement a generative model of the sensorium that is systematically optimized during wakefulness and sleep states. By performing network simulations on datasets of natural images, our results not only propose potential mechanisms for dream generation during sleep states, but suggest that dreaming is an essential feature for learning semantic representations throughout mammalian development.
Mimicking Short-Term Memory in Shape-Reconstruction Task Using an EEG-Induced Type-2 Fuzzy Deep Brain Learning Network
The paper attempts to model short-term memory (STM) for shape-reconstruction tasks by employing a 4-stage deep brain learning network (DBLN), where the first 2 stages are built with Hebbian learning and the last 2 stages with Type-2 fuzzy logic. The model is trained stage-wise independently, with the visual stimulus of the object geometry as the input of the first stage, EEG acquired from different cortical regions as input and output of the respective intermediate stages, and the recalled object geometry as the output of the last stage. Two error feedback loops are employed to train the proposed DBLN. The inner loop adapts the weights of the STM based on a measure of error in the model-predicted response with respect to the object shape recalled by the subject. The outer loop adapts the weights of the iconic (visual) memory based on a measure of error of the model-predicted response with respect to the desired object shape. In the test phase, the DBLN model reproduces the recalled object shape from the given input object geometry.
The motivation of the paper is to test the consistency of STM encoding (in terms of similarity in network weights) for repeated visual stimulation with the same geometric object. Experiments undertaken on healthy subjects yield high similarity in network weights, whereas patients with pre-frontal lobe amnesia yield significant discrepancy in the trained weights for any two trials with the same training object. This justifies the importance of the proposed DBLN model in the automated diagnosis of patients with learning difficulty. The novelty of the paper lies in the overall design of the DBLN model, with special emphasis on the last 2 stages of the network, built with vertical-slice-based type-2 fuzzy logic to handle uncertainty in function approximation (with noisy EEG data). The proposed technique outperforms state-of-the-art functional mapping algorithms with respect to the (pre-defined outer loop) error metric, computational complexity, and runtime.
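The Hebbian stages of such a network can be illustrated with the classic Oja-stabilized Hebbian rule, which strengthens weights in proportion to correlated pre- and post-synaptic activity while keeping them bounded; the stimulus statistics below are invented for the sketch and are not the paper's EEG data or its DBLN stages.

```python
import numpy as np

# Oja-stabilized Hebbian rule: dw = eta * y * (x - y * w). The weight vector
# converges to the leading principal direction of the input correlations.
rng = np.random.default_rng(0)
direction = np.array([0.6, 0.8])   # dominant correlation axis of the "stimulus"
# 2000 samples: strong signal along `direction` plus weak isotropic noise.
X = 0.1 * rng.standard_normal((2000, 2)) + np.outer(rng.standard_normal(2000), direction)

w = 0.1 * rng.standard_normal(2)
eta = 0.01
for x in X:
    y = w @ x                       # post-synaptic activity
    w += eta * y * (x - y * w)      # Hebbian growth + Oja's normalizing term

alignment = abs(w @ direction) / (np.linalg.norm(w) * np.linalg.norm(direction))
print(round(alignment, 3))  # near 1: the weights encode the dominant stimulus axis
```

Consistency of STM encoding across trials, as tested in the paper, then amounts to asking whether repeated stimulation drives such weights to similar values each time.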
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Deep learning (DL) enables the development of computer models that are
capable of learning, visualizing, optimizing, refining, and predicting data. In
recent years, DL has been applied in a range of fields, including audio-visual
data processing, agriculture, transportation prediction, natural language,
biomedicine, disaster management, bioinformatics, drug design, genomics, face
recognition, and ecology. To explore the current state of deep learning, it is
necessary to investigate the latest developments and applications of deep
learning in these disciplines. However, the literature is lacking in exploring
the applications of deep learning in all potential sectors. This paper thus
extensively investigates the potential applications of deep learning across all
major fields of study, as well as the associated benefits and challenges. As
evidenced in the literature, DL exhibits accuracy in prediction and analysis,
which makes it a powerful computational tool, and it has the ability to
articulate itself and optimize, making it effective in processing data with
no prior training. At the same time, deep learning necessitates massive
amounts of data for effective analysis and processing. To handle the
challenge of compiling huge amounts of medical, scientific, healthcare, and
environmental data for use in deep learning, gated architectures such as
LSTMs and GRUs can be utilized. For multimodal learning, the neural network
requires shared neurons for all activities together with specialized neurons
for particular tasks.
Comment: 64 pages, 3 figures, 3 tables
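The gated architectures mentioned above manage long sequences by learning, at each step, how much of the previous hidden state to keep. A minimal single-cell GRU forward pass in NumPy (untrained, randomly initialized weights, purely for illustration):

```python
import numpy as np

# A single GRU cell in NumPy, showing how gating blends old state and new input.
rng = np.random.default_rng(0)
D, H = 4, 8  # input and hidden sizes

def init(shape):
    return 0.1 * rng.standard_normal(shape)

Wz, Uz, bz = init((D, H)), init((H, H)), np.zeros(H)  # update gate parameters
Wr, Ur, br = init((D, H)), init((H, H)), np.zeros(H)  # reset gate parameters
Wh, Uh, bh = init((D, H)), init((H, H)), np.zeros(H)  # candidate-state parameters

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h):
    z = sigmoid(x @ Wz + h @ Uz + bz)   # update gate: how much new state to take
    r = sigmoid(x @ Wr + h @ Ur + br)   # reset gate: how much old state feeds the candidate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh + bh)
    return (1 - z) * h + z * h_tilde    # convex blend of old state and candidate

h = np.zeros(H)
for _ in range(16):                     # run over a 16-step random input sequence
    h = gru_step(rng.standard_normal(D), h)
print(h.shape)  # (8,)
```

Because the new state is a gate-weighted convex combination of the old state and a bounded candidate, the hidden activations stay bounded, which is part of why gated cells train stably on the long data streams the survey discusses.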