Feature Extraction
Feature extraction is a procedure aimed at selecting and transforming a data set in order to increase the performance of a pattern recognition or machine learning system. Nowadays, since the amount of available data and its dimensionality are growing exponentially, it is a fundamental procedure for avoiding overfitting and the curse of dimensionality while, in some cases, allowing an interpretative analysis of the data. The topic itself is a thriving discipline of study, and it is difficult to address every single feature extraction algorithm. Therefore, we provide an overview of the topic, introducing widely used techniques while at the same time presenting some domain-specific feature extraction algorithms. Finally, as a case study, we illustrate the vastness of the field by analysing the usage and impact of feature extraction in neuroimaging.
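As a concrete illustration of the dimensionality reduction this abstract describes, the sketch below applies principal component analysis (PCA), one of the most widely used feature extraction techniques, to project high-dimensional samples onto a few variance-preserving features. The data shape and component count are arbitrary toy choices, not taken from the paper.

```python
import numpy as np

def pca_features(X, k):
    """Extract k decorrelated features from d-dimensional samples
    by projecting onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                      # center each raw feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # scores on the top-k components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                   # 100 samples, 10 raw features
Z = pca_features(X, 2)                           # compressed to 2 features
print(Z.shape)                                   # (100, 2)
```

Because SVD orders components by singular value, the first extracted feature always carries at least as much variance as the second, which is what makes the compact representation useful downstream.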
DMRN+16: Digital Music Research Network One-day Workshop 2021
Queen Mary University of London, Tuesday 21st December 2021.

Keynote 1: Prof. Sophie Scott, Director, Institute of Cognitive Neuroscience, UCL. Title: "Sound on the brain - insights from functional neuroimaging and neuroanatomy". Abstract: In this talk I will use functional imaging and models of primate neuroanatomy to explore how sound is processed in the human brain. I will demonstrate that sound is represented cortically in different parallel streams. I will expand this to show how this can impact on the concept of auditory perception, which arguably incorporates multiple kinds of distinct perceptual processes. I will address the roles that subcortical processes play in this, and also the contributions from hemispheric asymmetries.

Keynote 2: Prof. Gus Xia, Assistant Professor at NYU Shanghai. Title: "Learning interpretable music representations: from human stupidity to artificial intelligence". Abstract: Gus has been leading the Music X Lab in developing intelligent systems that help people better compose and learn music. In this talk, he will show us the importance of music representation for both humans and machines, and how to learn better music representations via the design of inductive bias. Once we have interpretable music representations, the potential applications are limitless.
Self-Supervised Pretraining and Transfer Learning on fMRI Data with Transformers
Transfer learning is a machine learning technique founded on the idea that knowledge acquired by a model during "pretraining" on a source task can be transferred to the learning of a target task. Successful transfer learning can result in improved performance, faster convergence, and reduced demand for data. This technique is particularly desirable for the task of brain decoding in the domain of functional magnetic resonance imaging (fMRI), wherein even the most modern machine learning methods can struggle to decode labelled features of brain images. This challenge is due to the highly complex underlying signal, physical and neurological differences between brains, low data collection throughput, and other factors. Transfer learning is exciting in its potential to mitigate these challenges, but with this application still in its infancy, we must begin on the ground floor. The goals of this thesis were to design, implement, and evaluate a framework for pretraining and transfer learning on arbitrary fMRI datasets, then demonstrate its performance with respect to the literature, and achieve substantive progress toward generalized pretrained models of the brain. The primary contribution is our novel framework that achieves these goals, called BEAT, which stands for Bi-directional Encoders for Auditory Tasks. The design and implementation of BEAT include adapting state-of-the-art deep learning architectures to sequences of fMRI data, as well as a novel self-supervised pretraining task called Next Thought Prediction and several novel supervised brain decoding tasks. To evaluate BEAT, we pretrained on Next Thought Prediction and performed transfer learning to the brain decoding tasks, each of which is specific to one of three fMRI datasets. Demonstrating significant benefits of transfer learning, BEAT decoded instrumental timbre from one of the fMRI datasets, which standard methods failed to decode, in addition to achieving improved downstream performance.
Toward generalized pretrained models of the brain, BEAT learned Next Thought Prediction on one fMRI dataset and then successfully transferred that learning to a supervised brain decoding task on an entirely distinct dataset, with different participants and stimuli. To our knowledge, this is the first instance of transfer learning across participants and stimuli, a necessity for whole-brain pretrained models.
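The abstract names Next Thought Prediction but does not specify its implementation. By analogy with BERT's next-sentence prediction, one plausible sketch builds self-supervised training pairs by asking whether one window of fMRI timepoints directly follows another; all names, shapes, and the sampling scheme below are illustrative assumptions, not BEAT's published design.

```python
import numpy as np

def next_thought_pairs(run, win, rng):
    """Build (a, b, label) pairs for a BERT-style next-window objective:
    label 1 if window b directly follows window a in the fMRI run,
    label 0 if b was drawn from a random position instead."""
    T, V = run.shape                         # timepoints x voxels
    pairs = []
    for t in range(0, T - 2 * win, win):
        a = run[t:t + win]
        if rng.random() < 0.5:               # positive: the true next window
            b, label = run[t + win:t + 2 * win], 1
        else:                                # negative: a random window
            s = rng.integers(0, T - win)
            b, label = run[s:s + win], 0
        pairs.append((a, b, label))
    return pairs

rng = np.random.default_rng(0)
run = rng.normal(size=(200, 64))             # toy run: 200 TRs, 64 voxels
pairs = next_thought_pairs(run, win=10, rng=rng)
print(len(pairs), pairs[0][0].shape)         # 18 (10, 64)
```

A sequence encoder would then be trained to classify each pair, forcing it to learn temporal structure in the signal without any task labels.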
The neurobiology of cortical music representations
Music is undeniably one of humanity's defining traits: it has been documented since the earliest days of mankind, is present in all known cultures, and is perceivable by nearly all humans alike. Intrigued by its omnipresence, researchers of all disciplines began investigating music's mystical relationship and tremendous significance to humankind several hundred years ago. Only comparatively recently has the immense advancement of neuroscientific methods also enabled the examination of cognitive processes related to the processing of music. Within this neuroscience of music, the vast majority of research has focused on how music, as an auditory stimulus, reaches the brain and how it is initially processed, as well as on the tremendous effects it has on, and can evoke through, the human brain. However, the intermediate steps, that is, how the human brain transforms incoming signals into a seemingly specialized and abstract representation of music, have received less attention. Aiming to address this gap, the thesis presented here targeted these transformations, their possible underlying processes, and how both could be explained through computational models. To this end, four projects were conducted. The first two comprised the creation and implementation of two open-source toolboxes to, first, tackle problems inherent to auditory neuroscience, and thus also to neuroscientific music research, and, second, provide the basis for further advancements through standardization and automation. More precisely, this entailed deteriorated hearing thresholds and abilities in MRI settings and the difficult localization and parcellation of the human auditory cortex as the core structure involved in auditory processing. The third project focused on the human brain's apparent tuning to music by investigating functional and organizational principles of the auditory cortex and network with regard to the processing of different auditory categories of comparable social importance, more precisely whether the perception of music evokes a distinct and specialized pattern. To provide an in-depth characterization of the respective patterns, both the segregation and integration of auditory cortex regions were examined. In the fourth and final project, a highly multimodal approach that included fMRI, EEG, behavior, and models of varying complexity was used to evaluate how the aforementioned music representations are generated along the cortical hierarchy of auditory processing and how they are influenced by bottom-up and top-down processes. The results of projects 1 and 2 demonstrated the necessity of further improving MRI settings and of defining working models of the auditory cortex, as hearing thresholds and abilities seem to vary as a function of the data acquisition protocol used, and the localization and parcellation of the human auditory cortex diverge drastically depending on the approach they are based on. Project 3 revealed that the human brain is indeed tuned for music by means of a specialized representation, as music evoked a bilateral network with a right-hemispheric weighting that was not observed for the other included categories. The result of this specialized and hierarchical recruitment of anterior and posterior auditory cortex regions was an abstract music component
that is situated in anterior regions of the superior temporal gyrus and preferentially encodes music, regardless of whether it is sung or instrumental. The outcomes of project 4 indicated that even though the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex processing of music, anterior regions in particular yielded an abstract representation that varied considerably over time and could not be sufficiently explained by any of the tested models. The specialized and abstract properties of this representation were furthermore underlined by the predictive ability of the tested models: models based either on high-level features, such as behavioral representations and concepts, or on complex acoustic features always outperformed models based on single or simpler acoustic features. Additionally, factors known to influence auditory, and thus music, processing, such as musical training, apparently did not alter the observed representations. Together, the results of the projects suggest that the specialized and stable cortical representation of music is the outcome of sophisticated transformations of incoming sound signals along the cortical hierarchy of auditory processing, which generate a music component in anterior regions of the superior temporal gyrus by means of top-down processes that interact with acoustic features, guiding their processing.
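Project 4 compares predictive models of differing complexity, but the summary does not name the estimator. A common choice in auditory fMRI work is a voxel-wise ridge-regression encoding model mapping stimulus features to responses; the sketch below is a generic illustration on synthetic data, not the thesis's actual pipeline.

```python
import numpy as np

def fit_encoding_model(F, Y, alpha=1.0):
    """Voxel-wise ridge regression: predict each voxel's time course
    (columns of Y) from stimulus features F. Illustrative only."""
    d = F.shape[1]
    return np.linalg.solve(F.T @ F + alpha * np.eye(d), F.T @ Y)

rng = np.random.default_rng(0)
F = rng.normal(size=(120, 5))            # 120 TRs x 5 acoustic features
Y = F @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(120, 50))
W = fit_encoding_model(F, Y)             # one weight vector per voxel
score = np.mean([np.corrcoef(F @ W[:, v], Y[:, v])[0, 1] for v in range(50)])
print(W.shape)                           # (5, 50)
```

Comparing such scores across feature sets of different complexity is how one would operationalize the finding that high-level or complex acoustic features out-predict simple acoustic ones.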
Autoencoding sensory substitution
Tens of millions of people live blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions to assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. There are two reasons for this tedious training process: the elongated substituting audio signal, and the disregard for the compressive characteristics of the human hearing system.
To overcome these obstacles, we developed a novel class of SS methods, by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days, while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training.
Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements to the proposed model should yield hastened rehabilitation of the blind and, as a consequence, wider adoption of SS devices.
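For contrast with the learned autoencoder approach, the kind of hand-designed SS code this work aims to improve on (e.g. The vOICe) can be sketched in a few lines: image columns are scanned over time, row position maps to pitch, and pixel brightness maps to loudness. All parameters below are illustrative, not those of any deployed device or of this thesis.

```python
import numpy as np

def image_to_sound(img, duration=1.0, sr=8000, f_lo=200.0, f_hi=2000.0):
    """Minimal vOICe-style visual-to-auditory code: scan columns left to
    right over `duration` seconds; each row contributes a sine whose
    pitch rises with height and whose amplitude follows brightness."""
    rows, cols = img.shape
    freqs = np.geomspace(f_lo, f_hi, rows)[::-1]     # top row -> highest pitch
    n = int(sr * duration / cols)                    # samples per column
    t = np.arange(n) / sr
    signal = np.concatenate([
        (img[:, c, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(axis=0)
        for c in range(cols)
    ])
    peak = np.abs(signal).max()
    return signal / peak if peak > 0 else signal     # normalize to [-1, 1]

img = np.zeros((16, 16))
img[4, :] = 1.0                                      # a bright horizontal line
audio = image_to_sound(img)                          # -> a sustained high tone
print(audio.shape)                                   # (8000,)
```

The "elongated substituting audio signal" criticized in the abstract is visible here: the duration scales with the number of scanned columns, which is precisely what a learned, compressed code could shorten.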
Music On Canvas: A Quest to Generate Art That Evokes the Feeling of Music
Although the idea of connecting music and art dates back to ancient Greece, recent advancements in computing have made automating this feasible. This project represents a quest to transform music into art, using three methodologies where each is an improvement towards generating images that convey our feelings and imaginations during music listening. The three methods respectively involve:
1. An element-wise mapping of sound and colors
2. Using song tags
3. Tuning an Artificial Intelligence (AI) model to generate pictorial text captions.
To create artistic images, methods two and three utilize an existing text-to-image generative AI.
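The abstract does not detail method one's element-wise mapping of sound and colors. One simple assumed form, sketched below, places a note on a log-frequency axis to choose a hue and uses its loudness as brightness; all constants and the mapping itself are hypothetical.

```python
import colorsys
import math

def note_to_rgb(freq_hz, loudness, f_lo=27.5, f_hi=4186.0):
    """Hypothetical element-wise sound-to-color mapping: position on a
    log-frequency axis (piano range by default) sets the hue, loudness
    in [0, 1] sets the brightness (HSV value)."""
    pos = (math.log(freq_hz) - math.log(f_lo)) / (math.log(f_hi) - math.log(f_lo))
    hue = min(max(pos, 0.0), 1.0)                    # clamp to the hue circle
    val = min(max(loudness, 0.0), 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, val)

r, g, b = note_to_rgb(440.0, 0.8)                    # concert A, fairly loud
print(all(0.0 <= c <= 1.0 for c in (r, g, b)))       # True
```

Applied frame by frame to a spectral peak, such a rule yields the kind of direct, interpretable sound-to-color translation the first method describes, at the cost of the expressiveness the later AI-based methods add.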
Multi-Sensory Interaction for Blind and Visually Impaired People
This book conveys the visual elements of artwork to the visually impaired through various sensory elements, opening a new perspective for appreciating visual artwork. In addition, it explores a technique for expressing a color code by integrating patterns, temperatures, scents, music, and vibrations, and presents future research topics. A holistic experience using multi-sensory interaction was provided to convey the meaning and contents of the work through rich multi-sensory appreciation. A method that allows people with visual impairments to engage with artwork using a variety of senses, including touch, temperature, tactile pattern, and sound, helps them to appreciate artwork at a deeper level than can be achieved with hearing or touch alone. The development of such art-appreciation aids for the visually impaired will ultimately improve their cultural enjoyment and strengthen their access to culture and the arts. These new aids also expand opportunities for the non-visually impaired, as well as the visually impaired, to enjoy works of art, and break down the boundaries between the disabled and the non-disabled in the field of culture and the arts through continuous efforts to enhance accessibility. In addition, the developed multi-sensory expression and delivery tool can be used as an educational tool to increase product and artwork accessibility and usability through multi-modal interaction. Training the multi-sensory experiences introduced in this book may lead to more vivid visual imagery, or seeing with the mind's eye.